With each shot requiring a new set of prompts, it’s also hard to instill a sense of continuity throughout a video. The color, angle of the sun, and shapes of buildings are difficult for a video generation model to keep consistent. The video also lacks any close-ups of people, which Kahn says AI models still tend to struggle with.
“These technologies are always better on large-scale things right now as opposed to really nuanced human interaction,” he says. For this reason, Kahn imagines that early filmmaking applications of generative video might be for wide shots of landscapes or crowds.
Alex Mashrabov, an AI video expert who left his role as director of generative AI at Snap last year to found a new AI video company called Higgsfield AI, agrees about the current shortcomings of AI video. He also points out that good dialogue-heavy content is hard to produce with AI, as it tends to hinge on subtle facial expressions and body language.
Some content creators may be reluctant to adopt generative video simply because of the amount of time required to prompt the models again and again to get the end result right.