Google demonstrated Gemini Omni during Google I/O 2026 as part of its push into AI-powered video generation. (Image: Google)
Google has unveiled Gemini Omni, a new multimodal AI model designed to generate and edit videos using combinations of text, images, audio, and video prompts. The announcement was made during Google I/O 2026, where the company described Omni as a major step toward turning Gemini into a fully creative AI system capable of understanding and producing multiple forms of media.
The first version of the model, called Gemini Omni Flash, is now rolling out through the Gemini app, Google Flow, and YouTube Shorts. Google says the model combines Gemini’s reasoning abilities with AI-powered content generation, allowing users to create cinematic-quality videos using natural language prompts.
AI video editing through conversation
One of Gemini Omni’s biggest features is conversational video editing. Rather than using conventional editing tools or timelines, users describe the changes they want in plain language.
Google showed examples where users changed sculptures into bubbles, converted mirrors into fluid, applied animations, or altered the environment while preserving the characters and the realistic physics of the clip. The company says each instruction builds on previous edits, allowing users to refine videos across multiple prompts without losing continuity.
According to Google, the model has a stronger understanding of movement, lighting, gravity, fluid dynamics, and object interactions, helping generate scenes that appear more realistic and physically accurate.
Gemini Omni combines text, images, video, and audio
Google says Gemini Omni can work with multiple types of inputs simultaneously. Users can upload photos, existing videos, drawings, voice references, and text prompts to create a single cohesive output.
For example, users can apply the visual style of one image to a video, synchronise visuals to music, or generate cinematic clips based on rough sketches and written instructions. The system can also create educational explainers and animated sequences from short prompts.
The company says Omni is designed to bridge the gap between AI-generated visuals and meaningful storytelling by combining creative generation with Gemini’s broader knowledge of science, history, and culture.
AI avatars and personalised content creation
Google is also introducing AI avatars as part of Gemini Omni. Users can create digital versions of themselves using their own appearance and voice to generate personalised videos.
The company says it is approaching these features cautiously due to concerns around deepfakes and misuse. Voice-based avatar generation will launch first, while additional editing features involving speech and audio manipulation are still being tested.
All videos generated through Gemini Omni will include Google’s invisible SynthID watermarking technology, allowing viewers to verify that the content was AI-generated.
Rolling out across Gemini and YouTube
Gemini Omni Flash is launching globally for Google AI Plus, Pro, and Ultra subscribers through the Gemini app and Google Flow. Google is also bringing the technology to YouTube Shorts and the YouTube Create app at no additional cost for creators.
The company says developer and enterprise API access will arrive in the coming weeks, allowing businesses and creators to integrate Gemini Omni into their own tools and workflows.
© IE Online Media Services Pvt Ltd