Google Veo 3.1: Improved prompt adherence, image-to-video and audio generation
Google has updated its Veo video generation model to version 3.1, focusing on better prompt adherence and stronger image-to-video capabilities. Veo 3.1 is available through Google's Gemini API and is integrated into the Flow video editor, bringing new practical tools for creators and editors.
The update builds on Veo 3 (introduced at Google I/O 2025) and adds a few notable improvements that make the model more useful for real-world video workflows:
- Stronger prompt adherence: Veo 3.1 aims to follow written prompts more reliably, reducing unexpected outputs.
- Image-to-video conversion: You can upload images as “ingredients” and have the model generate moving footage from them.
- Audio generation: Veo 3.1 generates audio alongside video. Veo 3 already supported native audio, so the change here is that audio now also accompanies the image-driven and editing features in Flow.
- Frame-to-Video in Flow: Flow's new feature lets you supply a first and last frame and have Veo generate the footage in between, with audio included.
Google demonstrated how these features work inside Flow, where Veo 3.1 can also be used to extend clips or insert objects into existing footage while producing corresponding audio. Sample outputs can still look uncanny, and realism varies with the prompt and subject, but the focus on editor-friendly features signals Google's intent to make Veo a practical tool for creators rather than only a generator of short-form social clips.
Key context and considerations:
- Availability: Veo 3.1 is available now through the Gemini API and is integrated into the Flow editor.
- Comparison: While Veo 3.1 improves utility, other models (for example, OpenAI's Sora 2) may still produce more photorealistic results in some cases.
- Use cases: The combination of image-to-video and audio generation is useful for concept prototyping, storyboarding, and creative editing workflows.
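For readers who want to try the Gemini API route, here is a minimal sketch of what a request might look like with the `google-genai` Python SDK. Video generation runs as a long-running job, so the helper submits a prompt and polls until the job finishes. The helper name `generate_clip`, the polling and timeout defaults, and the model ID `veo-3.1-generate-preview` are assumptions for illustration; check Google's documentation for the current model name and for image inputs.

```python
import time


def generate_clip(client, prompt, model="veo-3.1-generate-preview",
                  poll_seconds=10, timeout_seconds=600):
    """Submit a text prompt and wait for the first generated video.

    `client` is expected to behave like google.genai.Client. The model ID
    is an assumption and may differ from what your account exposes.
    """
    # Start the long-running generation job.
    operation = client.models.generate_videos(model=model, prompt=prompt)

    # Poll until the job reports completion or the deadline passes.
    deadline = time.monotonic() + timeout_seconds
    while not operation.done:
        if time.monotonic() > deadline:
            raise TimeoutError("video generation did not finish in time")
        time.sleep(poll_seconds)
        operation = client.operations.get(operation)

    # Return the first generated video entry from the finished job.
    return operation.response.generated_videos[0]
```

To use it, create a client with `from google import genai` and `client = genai.Client()` (which reads the API key from the environment), then call `generate_clip(client, "a paper boat drifting down a rainy street")`. Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object before spending API quota.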
For more details, see the original coverage: Engadget: Google's Veo 3.1 is better at generating videos from images.
Veo 3.1's improvements may help AI-assisted video editing fit more naturally into creative pipelines, but expect output quality to vary with the prompt and content. As with most generative systems, hands-on testing will reveal the best use cases.
Discussion: Will Veo 3.1’s image-to-video and audio features change your video workflow, or are you waiting for more realism? Share your thoughts below.