We know that AI tech is progressing at a rapid rate, and Google’s recent unveiling of the Veo 3 video generator highlighted this once again. Veo 3 is a step up in terms of realism, and adds audio for the first time. Much of the time, the outputs from Veo 3 are pretty much indistinguishable from real videos.
It’s not perfect yet, but Veo 3 is taking the internet by storm, with viral clips covering everything from street interviews to incompetent Stormtroopers. Depicting soldiers of the Galactic Empire is at least one way of ensuring consistency between clips, because of course they all look the same.
Google itself has provided a showreel of impressive-looking video clips as well, which include a sailor at sea and a classical violinist. You need to look very, very hard to tell that these videos are AI, and even then it’s not always possible.
But those are the end results. What about the creation process? If you pay Google for an AI subscription, then you can produce Veo 3 videos of your own, and there are a couple of ways to go about it, which I’ll get into here.
All this AI video creation needs to be put into context each time: There are question marks around the technology in terms of energy use, copyright infringement, the threat to the creative industries, and the spread of misinformation, all of which we’ve written about extensively before.
Creating videos with Veo 3
If you pay $20 a month for the Google AI Pro plan, then you get three Veo 3 video generations per day in the Gemini app, using the faster, lower-quality Veo 3 Fast model. If you’ve gone big with the $250-per-month Google AI Ultra plan, you get the “highest limits” for full Veo 3 access—Google doesn’t quantify this exactly, so there may be no hard ceiling, and it may well fluctuate based on demand. Each video is fixed at eight seconds long.
If you’re using the Flow and Whisk tools for your video creation, rather than the Gemini chatbot, it’s a bit different: You get 1,000 AI credits on the Pro plan per month, and 12,500 credits on the Ultra plan. A standard Veo 3 video will set you back 100 credits, and a Veo 3 Fast video is going to cost you 20 credits—and in these tools, the resolution can be upscaled to 1080p (it’s 720p if you’re using the Gemini app).
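To put those allowances in perspective, here is a rough sketch of how many clips each plan covers per month in Flow and Whisk. It uses only the credit figures quoted above, which Google may change; the function name is made up for illustration, and the sum assumes every credit goes on video and nothing is spent on retakes.

```python
# Credit figures as quoted in this article; Google may change them.
MONTHLY_CREDITS = {"Pro": 1_000, "Ultra": 12_500}
COST_PER_VIDEO = {"Veo 3": 100, "Veo 3 Fast": 20}


def videos_per_month(plan: str, model: str) -> int:
    """Whole videos a plan's monthly credits cover (hypothetical helper)."""
    return MONTHLY_CREDITS[plan] // COST_PER_VIDEO[model]


if __name__ == "__main__":
    for plan in MONTHLY_CREDITS:
        for model in COST_PER_VIDEO:
            count = videos_per_month(plan, model)
            print(f"{plan}: {count} x {model} videos per month")
```

On those numbers, a Pro subscriber can afford 10 full Veo 3 clips or 50 Veo 3 Fast clips a month, while Ultra stretches to 125 or 625.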
As per Google’s Josh Woodward, Veo 3 Fast is faster (obviously), less demanding in terms of processing, and sticks to the same 720p resolution as normal Veo 3. It’s not clear exactly what trade-offs there are in terms of quality (the Google team wouldn’t elaborate when I asked via email), but anecdotally, it seems some of the textures, lighting, and details aren’t as good. Inside the Flow app, there’s a label saying Veo 3 Fast is a fifth the quality of Veo 3.
To start making videos, if you’re a Google AI Pro subscriber like me, you need to head to the Gemini app on the web (mobile video creation is limited to Google AI Ultra subscribers for now). Click the model picker in the top left corner, then choose 2.5 Pro (preview) or whatever the latest model is by the time you’re reading this: You can then select Video in the text input box and you’re ready to do some prompting.
Previously, I used Veo 2 to try and recreate the old Sony TV ad, where thousands of colored bouncy balls get thrown down the streets of San Francisco. The results weren’t great, so I gave Veo 3 Fast the same challenge. As you can see below, I got a better video back. It could almost pass as something that had been filmed in real life (the sun through the trees is great), but it still ignores most of my prompt instructions, and is nowhere near as good as Sony’s ad.
This brings us back to the nature of generative AI, which is to mimic what it’s seen before. I’m guessing Veo 3 has been trained on lots and lots of street interview vox pops, and not many ads where bouncy balls are cascading down hills. It also highlights that it can take a lot of prompting to get what you want, and throughout my AI video tests, getting tools to follow the prompts is an ongoing challenge.
With only two Veo 3 generations left for the day, I tasked Veo 3 Fast with recreating the classic “welcome to Jurassic Park” scene in Spielberg’s movie. Again, it’s better than the Veo 2 effort, but there are problems with prompt adherence, and there are too many paleontologists. The dinosaurs (and the dinosaur sounds) are well done, though.
Using Flow to create longer movies
Google also offers Whisk for animations and Flow for longer movie projects, using the same AI models as you’ll find in the Gemini app, corresponding to the plan you’re subscribed to. If you don’t want to make videos of your own, or you’re a free Google Gemini user, you can watch what other people have made via Flow TV.
Once you get into Flow in your web browser, click New project to get started. You can then start prompting, using the settings button in the top right of the prompt box to choose the model you want to use—you’ll see how many credits the generation is going to use up as well, before you do any rendering.
I decided to splash out on 100 credits for a proper, full Veo 3 clip in an attempt to better reproduce the Jurassic Park scene and get my AI moviemaking career off the ground. I added a little more detail to the prompt, as well as some dialog, and what came out the other end was about on a par with the Veo 2 output. You actually get two generations to choose between, which you can see here and here.
Again, we have the usual problems, in that the AI generator doesn’t really know what it’s supposed to be doing here, or how to construct a scene beyond what it’s seen in other videos. Our intrepid adventurers are looking in the wrong direction when one of them delivers the “wow… would you look at that” line, and everything from the dinosaurs to the trees looks generic.
The difference with Flow and creating extended videos is that you can click Add to scene on any of these generated videos and start building something longer, made up of eight-second chunks. Scenes can then be extended and arranged as needed, with the same characters and environments carried over from one clip to the next.
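Because every clip is eight seconds long, the cost of a longer scene adds up quickly. The sketch below estimates the clips and credits needed for a scene of a given length. It assumes, and this is an assumption rather than something Google documents here, that each added clip costs the same as a fresh generation and that every take is usable. The function name is made up for illustration.

```python
import math

CLIP_SECONDS = 8  # each generated clip is fixed at eight seconds
# Per-clip credit costs as quoted in this article; Google may change them.
COST_PER_CLIP = {"Veo 3": 100, "Veo 3 Fast": 20}


def scene_cost(target_seconds: int, model: str) -> tuple[int, int]:
    """Return (clips, credits) for a scene of at least target_seconds.

    Assumes one paid generation per clip and no retakes (hypothetical helper).
    """
    clips = math.ceil(target_seconds / CLIP_SECONDS)
    return clips, clips * COST_PER_CLIP[model]


if __name__ == "__main__":
    for model in COST_PER_CLIP:
        clips, credits = scene_cost(60, model)
        print(f"60-second scene with {model}: {clips} clips, {credits} credits")
```

Under those assumptions, a one-minute scene takes eight clips: 800 credits with full Veo 3, or 160 with Veo 3 Fast, before any retakes.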
My attempts to get John Hammond to enter the scene didn’t really work. The original characters stayed in place well enough, but our new character appeared out of nowhere and all sound suddenly cut out because Flow had somehow switched me back to Veo 2. We also got a freak camera shake halfway through. It’s clear that I’m not going to be able to switch from tech journalism to AI film directing anytime soon, especially with just 1,000 credits per month.
Veo 3 is still at an early stage, and Google has put “experimental” labels all over it and the Flow interface. For now, you’re going to have to spend a lot of credits and a lot of time working on prompts to get something that’s consistent and realistic. It’s likely that hours of effort and trial runs went into the polished AI videos you see populating your social media feeds.