Google Unveils Two New Models for Image and Video Generation
Google has announced two new generative AI models: "Nano Banana 2 Lite" for image generation and "Gemini Omni Flash" for video generation and editing. Nano Banana 2 Lite generates an image in approximately 4 seconds at a cost of $0.034 per image, while Gemini Omni Flash makes video generation available through an API, which Google describes as a first. Google also recommends combining the two models into a workflow that turns still images into videos.

Google has announced two new generative AI models: "Nano Banana 2 Lite" for image generation and "Gemini Omni Flash" for video generation and editing. The latter is the company's first attempt to provide video generation capabilities through an API.
In generative AI, companies continue to compete on models that can handle multiple media types, including images, video, and audio. Within that competition, demand is growing for lightweight models that balance speed and cost so that creators and developers can use them easily. This announcement fits that trend.
Nano Banana 2 Lite generates an image in approximately 4 seconds, with a usage fee of $0.034 per image (approximately 5 yen). Gemini Omni Flash generates and edits videos from text instructions and is available to external developers through an API. According to Google, this is the first time a model capable of producing video from text has been offered via an API.
Google also recommends using the two models together. In this workflow, still images are first generated quickly with Nano Banana 2 Lite and then passed to Gemini Omni Flash, which converts them into animated video. With this kind of "model chaining," a content creation workflow can be completed from text input alone.
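The chaining pattern described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not Google's API: the article does not give endpoints, SDK names, or parameters, so the two model calls are passed in as plain callables, and the names `chain_text_to_video`, `fake_generate_image`, and `fake_animate_image` are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChainResult:
    """Holds the intermediate still image and the final video."""
    prompt: str
    image: bytes
    video: bytes


def chain_text_to_video(
    prompt: str,
    generate_image: Callable[[str], bytes],
    animate_image: Callable[[bytes, str], bytes],
) -> ChainResult:
    """Run text -> still image -> video using two injected model calls.

    Both callables are assumptions: in practice they would wrap whatever
    client the image and video models actually expose.
    """
    image = generate_image(prompt)        # step 1: text to still image
    video = animate_image(image, prompt)  # step 2: still image to video
    return ChainResult(prompt=prompt, image=image, video=video)


# Hypothetical stubs so the sketch runs without any network access.
def fake_generate_image(prompt: str) -> bytes:
    return b"IMG:" + prompt.encode("utf-8")


def fake_animate_image(image: bytes, instruction: str) -> bytes:
    return b"VID:" + image


result = chain_text_to_video("a cat on a beach", fake_generate_image, fake_animate_image)
```

Passing the model calls in as arguments keeps the chaining logic separate from any particular client library, so the stubs can be swapped for real calls once the actual interfaces are known.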
The significance of these two releases lies in practical gains in speed and cost. A generation time of about 4 seconds and a price of roughly 5 yen per image are aimed at operations that handle large volumes of images and at development environments where prototypes are created repeatedly. The video generation API, meanwhile, can be seen as a step toward opening video AI technology, previously usable only in limited environments, to more developers and services.
As pipelines connecting text to still images and still images to video become available end to end through APIs, applications in fields such as content creation, advertising, and entertainment could expand. Going forward, evaluation is likely to focus on practical matters such as actual usage costs, generation quality, and comparisons with competing services.
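To make the per-image figures concrete, the short Python sketch below scales them to a batch. It uses only the two numbers quoted in the article ($0.034 and about 4 seconds per image) and assumes images are generated one after another with no parallelism, retries, or video costs, none of which the article specifies. The function name `estimate_batch` is hypothetical.

```python
from typing import Tuple

PRICE_PER_IMAGE_USD = 0.034  # per-image fee quoted in the article
SECONDS_PER_IMAGE = 4.0      # approximate generation time quoted in the article


def estimate_batch(num_images: int) -> Tuple[float, float]:
    """Return (total cost in USD, total seconds) for a sequential batch.

    Assumes strictly sequential generation; real throughput may differ.
    """
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    total_cost = num_images * PRICE_PER_IMAGE_USD
    total_seconds = num_images * SECONDS_PER_IMAGE
    return total_cost, total_seconds


cost_usd, seconds = estimate_batch(1000)
minutes = seconds / 60
```

Under these assumptions, 1,000 images would cost about $34 and take roughly 67 minutes if generated sequentially.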
This article is an original work independently written and edited by the AI issue editorial team based on factual reporting. © AI issue. Unauthorized reproduction, redistribution, or use for AI training is prohibited.