We are about 4-5 years away from someone being able to remake the final season of Game Of Thrones into something good. I can't wait.
That would be the "killer app" for AI. Simply input the scripts from the first 6 seasons into Claude and ask for a suitable climax.
@PeterStrmberg007 Or, if George ever finishes the books just ask an AI to adapt them into a script.
4-5 is a stretch. more like 1 or 2
@@NinetooNine We’re at the point where we need to ask AI to just finish the damn books for him.
@@RapidRealityCH buddy said “flawlessly” 😅
We need the quintessential Will Smith scarfing spaghetti shot.
@@miki_wiki12 Someone did it and it looks incredible. It won't generate Will Smith though, probably because of censorship issues.
🗿
I was thinking a noodle eating a plate full of tiny Will Smiths.
Indeed! Interestingly, that has become a pretty telling benchmark for accuracy, quality, and consistency of AI video generators.
"the rock eating rock"
This looks INSANE 😮
No. Previous video AIs looked insane with things appearing out of nowhere and changing size etc. This looks normal. Which is insane.
I didn't know you watch this stuff bro😂❤
We are not far from the Matrix becoming a reality 👀
busyworksbeatsssssss
@@nakatash1977 haha i love that
I am stunned and cannot move, not because of the AI video, but because I am, in fact, a carrot.
Always has been
Xd
One would imagine a carrot with a 9999 IQ would be able to think up a way to move. Not impressed.
🎖
@@dannes22 you can't think arms and legs into existence. If you had an IQ like mine, you'd know this
When this gets to VR it will be insane
I'm just gonna die before that. I have a few months or a couple of years left, maybe, so I can't enjoy that. Also no money and no job, no dream house to live in.
Yep Holodeck time.
That’s what I keep saying. Infinite worlds and “time travel” will be 100% possible :)
Welp, that's it folks. We literally can't trust videos as being real anymore. It's starting to get too realistic; I'm actually fatigued trying to detect the artifacts now. 1:41
They should really regulate this before it gets way too out of hand
I saw just a few things in a few examples. The panda and pterodactyl video had cars going backwards and melting into the bridge, but I think that, like CD-quality music being replaced by crappier-sounding MP3 and streaming audio, the average person is just going to get used to the small mistakes until they are ironed out of the models. It's a mindfk time to be alive, and we're at the start of the process.
@@YouCann0tSeeMe it will never be regulated, by the nature of it. we live in a time where power is exactly synonymous with information. ai is a tool that can synthesize all information. we are going to become eclipsed by the machine by all of the metrics we've created for ourselves, and it is going to become a baseline. tools like this will become the new default.
Yeah, I always thought about that.
And the future will only get worse.
There should be a law that anything generated with AI must be explicitly labeled somewhere, or else the platform risks being taken down or banned online. It's the only way to distinguish fake from real.
I say this because people will abuse what is fake and real.
Many videos have very obvious errors.
Watching this vs Sora, this is much better.
However, I'll believe Google when they provide a finished product for us users to actually use.
I'd bet it has a better chance of entering the Google graveyard than of getting a public release.
@@Trahloc it's already available to many users for testing, so no, I don't think this is going to the graveyard
@@Trahloc idk why people drop so much hate on Google. Google is the one that created AlphaGo and beat the best Go player in the world; they created AlphaFold and got a Nobel for it. They CREATED the transformer architecture and the reinforcement learning behind it, releasing everything in public research papers for anyone and everyone to use. Meanwhile, OpenAI is all secretive and shit, releasing failure after failure (not as a business, because clearly most people keep consuming their subpar products). In terms of LLMs, Google was behind, but it was also the latest to join the race and is now catching up to the competition. Imagen 3 is easily one of the top image generation models and NotebookLM is a home run, so idk why anyone would think Veo wouldn't follow the same standard.
I don't care what papers Google is publishing until they actually deliver
they literally have? you can use Veo 2 right now.
Interesting that Google waited to launch this just after Sora. Reverse OpenAI tactics, it seems.
I thought it was more like OpenAI caught wind that Google was close to releasing their version and rushed to get Sora out.
Sora is in good hands :)
10 or 20 years from now, if you want to watch a movie, you'll just prompt a movie AI, telling it what kind of movie with which actors you want and it'll just create it.
Yea we are only 2 years in 😅
bro what !!! 10 20 years ?? 😂😂😂😂😂😂😂😂😂😂 more like 2025
It will also monitor your bio (hardware is already moving in that direction; Apple Vision can already monitor pupil dilation) and will adjust the story based on your live reaction to it.
If you wanna watch soulless slop
@tonystarkagi Nah, there's not enough compute to go around yet. Also the videos aren't consistent enough yet. It will take a while before we get there but you are right it probably won't take 20 years.
The prompts are massive; D&D DMs' futures are bright.
It's not a stretch to understand why Google, with access to the most popular video platform on the planet, has the best video generation out there.
True, but any Tom, Dick or Harry with a YT Premium subscription can download the entire video library to their local hard drive, and get the exact same learning/processing access that Google has 🤷
@@d1p70 Are you for real? Do you have any idea how much data that would be?
@@d1p70 Google already has that.
@@sundog. LOL do you really think companies like ChatGPT+Microsoft cannot afford the physical and human resources required for a simple downloading and saving job?!!
@@sundog. 932 PB (petabytes) of data, if my memory serves me from another video about how Google is running out of storage space for YouTube videos.
man, this is getting out of control. imagine how many billions Hollywood pumped into producing movies, and now you can create such scenes with just a few prompts...
I'm looking forward to the end of Hollywood.
I'm not exactly looking forward to all the fake snuff films that will be made though.
It'll do to movies what file sharing and VSTs did to music. (Cheapen everything and make it disposable)
This is just how technology works. There is always exponential decay in costs. The first rockets cost multiple billions to build and fly to space; now SpaceX can travel to space for a few million. Televisions used to cost thousands, up to ten thousand; now you can get one for under $100 with free shipping. Same with any other technology. Love it.
Once it generates something recognizable from one of those movies it's trained on, it will end up in court. Fake snuff is bs. How are we supposed to know if it's real, and why did they train it with snuff to begin with?
@@Upgrayedddd Do you have any idea how accessible it is? It's out there. At some point soon the entire Internet will be the training data for these LLMs. They'll analyze EVERYTHING.
@ 10:33 - The real-world woman is obviously looking directly at herself in the mirror, but the mirror image is NOT looking directly back; it is instead looking to our lower left.
Also the candles are in the wrong place in the reflection
Also, the faces of the men in the reflection looked very warped/low detail.
Thank you, I came to say this! When you think about it though, this is how a lot of shows and movies are shot, so that the actor is looking directly at the camera. Very interesting how it's picked that up!
Also it doesn't matter
@@spinninglink I was just thinking that - there might be conflicts between physics it learned from real footage vs special FX from movies.
I hope they took this into account in tagging the data
STUNNING QUANTUM SHOCK
Easily the best vid generator, by far.
That muscle car's back tires were tilted like a drifting car's. That is crazy!
Great sequence, except for the last second of the scene after the right turn. The steering remained on full right lock after the turn, which should have spun the car. It should have counter-steered to catch the drift. I guess the AI trainer doesn't drive.
I mean, it's Google, and I would understand if they become #1 in the AI industry.
Aside from having money, they have 90% of the world's video data. Crazy how far we've come and are going. Can't wait to try this 🔥🔥🔥🔥
That's more than just data; that's understanding physics. No one has footage of a hippo that skates.
And after that the feds will force them to sell their tech. Just like it's happening right now with Chrome.
@@Breath.E That's not how training works; it can mix different subjects it understands together into one. It doesn't need your example specifically, just hippos and people skating
This is the point where the movie industry will start to use it.
I think it's still not quite there. Even then, they would still do a lot of editing and use other techniques, and this would just be one more tool in the bag. There is still low detail for things going on in the background, and so on.
@@johnshite4656 agree, the tech isn't quite there yet but there could be a limited version of it
@@johnshite4656 This is just a complement; it's not meant to do it all on its own, but in the future it will
@@shirowolff9147 exactly. First it'll do just 20% of the whole process, and the 20% they no longer have to do they'll dedicate to blending the transition between the AI and non-AI parts. Then it will be 50% and they won't need as much editing, then 80%, 90%, 99%, etc., until the AI becomes more original, creative, and professional than any human, and we humans will just ask it to generate things, because we'll know that if we try to change anything in the narrative or the visuals we'll be fucking it up, since the AI will have taken into account so many more things than any author could
00:11 This thing seems to BEE 🐝🐝🐝
“Bee” is a daughter of mine.
Best stay clear of my babies or treat them as though eternity awaits anything to corrupt them.
😂😂😂 Nice
Sora isn't close, this is a giant leap
Sora was revealed like a year ago
In the palace party(?) scene, putting aside how the room seems to be a weird mix of private dressing room and banquet hall, the right-hand mirror becomes a doorway into a different room in the second half of the pan
She also has her head slightly tilted but in the mirror it's at the wrong angle
Yes, dressing room or party, I wonder what the prompt was?!
Yessss Wes! New thumbnail picture looking not like satan! love it!
Really? He looks like he's being probed! 🤣
Of course the cat renders are the most convincing; we've been uploading videos of them non-stop since the 90s.
What's weird tho is I don't see any misspelled captions... Why are the cats not asking "can i haz more plzz???"
Shocking twist: the real peak-doom AI has toxoplasmosis!
Wow, as a storyteller I think this is the first time I've actually felt excited about having a new tool to tell stories with
I'm already exhausted by the idea of the barrier to entry being too low.
@@2beJT that's what I'm thinking
As a CGI artist, this is the first time I've actually felt I won't have a job much longer
@@2beJT Gatekeeping much?
Yup "If everyone every is special, no one is" @@2beJT
Excellent post, thanks for sharing this with us.
Mind blown by these video generations! Veo 2 is a game changer; the realism and detail are insane. Can't wait to see what people create with this tech. Wes, your examples are convincing. I'm sold!
This video looks like direct footage from Sam Altman's head... during his worst nightmare.
lol. OpenAI is cooked
And 2024 isn't even over yet. AI video will be flawless by the end of next year or maybe sometime in 2026. After that, if we're lucky, video lengths will be much longer and censorship will be nil. I can dream, can't I?
I think consistency and cost will still be limiting factors for the next few years, but I have friends in film who are still adamant that this will never replace them. I understand how they feel, and perhaps it’s just blind optimism, but I’m like… you guys really need to consider what you’re going to be doing in 5-10 years. And I don’t tell them that that’s probably their most optimistic scenario.
@@wonmoreminute do we even need streaming platforms when we have an AI storyteller generating whatever movies we want every night?
@@2beJT well, you won't have any money to afford them, since you and everyone else will be replaced by AI
Uncensored? Only for open-sourced models probably.
This stuff can be genuinely dangerous, so no, censorship will not be removed. There may be ways around it eventually, but no profit-motivated company is going to allow you to make fake r-ape or snuff films, or realistic fake combat footage from an ancient battlefield, or make world leaders do and say totally inappropriate things (unless it's Trump, and it would be weird if he _weren't_ saying inappropriate things...). Imagine people asking for detailed Colosseum fights, or mass decapitations, and so on. The fact that it's AI-generated won't make it any less gruesome. And if the training data ever contains video of REAL violence, like beheadings scraped from LiveLeak, then that's going to be really nasty. Imagine seeing some dude having his head removed with a knife and having no idea whether it's real or AI. Ick!
16:53 Crocodiles often rely on the element of surprise, lying in wait for prey to come close. Capybaras, by being in groups and staying alert, reduce this advantage. Capybaras live in groups, which provides safety in numbers: a group of capybaras is more likely to spot predators early, giving them a chance to flee or hide. Moreover, their communal lifestyle includes warning signals; if one capybara senses danger, it can alert the others. So maybe the crocodiles never really developed a taste for them, since it would be super rare to eat one...
My understanding is that capybaras most certainly do get eaten by big things in the Amazon such as crocodiles, anacondas, and leopards. But yes, they travel in herds, so it's usually a weak, young, or old one that gets picked off from the side occasionally. It's like zebras. They are an important food source both for people and for other animals in the Amazon and they are not immune. They're just giant rodents, after all.
zebras in Amazon? @@johnshite4656
I'm blown away by Veo 2's video generation capabilities. The level of realism and coherence is incredible. Can't wait to see what the future holds for AI-generated video!
Wow, these are surprisingly good!
Still having trouble with human movement. The karate scene at 19:23 seems to be the product of strict guardrails against "violent content" but even if you allowed raw violence I imagine it'd be a mess.
Now imagine when AI video can be quickly turned into an Unreal Engine scene, with the AI being intelligent enough to make all of the relevant assets, models, physics code, etc. Then have that in a loop structure, using the scene to make further video to iterate the scene.
But that would be 'the matrix'
Skip Unreal Engine and have the model simulate it all.
@@kumarmanchoju1129 Holodeck
You are correct. Hyperlapse is timelapse with a moving camera. While that shot looks great, it’s not technically hyperlapse.
For the first time I see something that would be just good enough for B-roll on pretty much any video. Which is a MASSIVE milestone
we are SO close!
I am blown away with this technology! 🎉🎉🎉
AR with genAI will be insane
Great one, thanks!
@WesRoth problems with the historical scene I have spotted:
- The lady faces the mirror straight on, but in the reflection her head is slightly turned towards the right.
- In the maquillage box on the left of the table there is a candle which does not belong with the makeup tools. It is not lit, and the stand of the candle took on the texture of the box in which it is positioned. It is out of place in my opinion.
Regardless, it is a really good video for an AI. Thank you for your videos.
The fact that most of those were Twitter posts from real people makes it mind-bogglingly good. The physics is 9.5/10, superb, barely distinguishable from real life. Even the ones that had some glitches/artifacts looked amazing, with only a few minor hiccups that, honestly, can be forgiven.
Looks good, more competition, better prices :D
thanks Wes
The hippo learning to ski: the icing on the cake is the girl looking at the hippo with a "WTF?" kind of expression
I wonder if one of the reasons all models seem to struggle with fast action (like the skateboard video) is that most videos people have made through the decades are captured at a relatively low frame rate, so while they make sense in motion to our eyes, looking at them frame by frame reveals a confused mess, which is actually what these models are dealing with.
17:52 car just casually going backwards against traffic, I can't lol
One of the distinguishing factors of Veo 2 is that it doesn't force the video clip into slow motion like literally all the competition does. Not to mention it's mind-blowingly realistic and coherent. I really hope these aren't just cherry-picked best cases, because... I don't even have words.
What is the price?
The cleaning streaks on the mirror in the 1800s scene are quite amazing
When will generators finally stop making all the videos in slow motion?
i think you're referring to the extremely smooth and unnatural camera movements, and yeah, that gives AI videos a certain shared vibe. That should be nearing its end though; models are going to gain more and more precise control over how the video works and how everything moves, so I bet you could ask it to make the camera move as if a human were holding the phone, for example, in a few months, if not already with Veo 2
Skateboard ai videos are awesome 😂
2:50 In that example it completely ignored all the stylization requested in the prompt, which said several times to make it stylized and abstract. So the video quality in that one is good, but the prompt adherence isn't great.
This is just beyond! I hope we all get access to this soon and that it's affordable.
ISSUES:
The Victorian model: a candle in the pen box.
The penguin paragliding: the strange object on the back of its head.
The ball in the coins: there wouldn't be dust in that.
Got busy after that...
In the mirror scene, the main issue I see is that the main subject and her reflection are not facing the same way. The reflection is looking at the camera, while the subject is looking straight ahead the entire shot.
Pretty sure in the one with the mirrors, the issue that sticks out to me is a slight difference in where her head is facing in the mirror versus outside it. She's far more straight-on when looking into the mirror, whereas in the reflection her head is rotated more to the side.
What needs to happen is dialogue. We need audio connected with video and the ability to generate mouth movements that coincide with specific speech. I haven't seen any examples of that yet, and 98% of our videos are dialogue-oriented.
10:52 As the camera slides behind the woman in the mirror, a man appears over her left shoulder. It seems he is being generated as he comes into view, and you can see his shoulder warping. I spot this because when I edit videos and need to use frame interpolation or speed warp for stabilization, this sort of thing happens from time to time, so I pick it up very quickly now. Outside of that, the entire scene is really, really well done.
i think the coins glitch was a physics issue. it knows the ball should displace some coins and therefore raise the level of the coins, but that's not really how it works in reality
The problem with the reflection/mirror output seems to be the reflection is at a different angle to the subject and the subject blinks but the reflection doesn't.
Wait, Google? Seriously? I was expecting this level of genius from Sora or Kling next year, not Google coming out of nowhere and stealing the show. Plot twist of the year!
I keep thinking about the fact that the people and "imagined" places don't exist. It's really loopy 🤯
Better than Sora you think? I’m on the waitlist. Can’t wait.
Competition good, we just want good ai models no matter from which side
This is much better than Sora
certainly is better. agreed.
Wow, that skateboarder. Did a 10 foot Ollie and he’s got the video to prove it.😎
7:50 The ability to generate real-time storytelling, with only minute glitches and scenes that are perfectly coherent and consistent, is just striking. Hollywood will be knocking on Google's door to make a sweet deal. Google has set the bar really high, because many of their generated videos have no glitches and the stories develop dynamically with perfect frame synchronization. I'm flabbergasted 😮😮 wow.
They are definitely getting better but we're still in uncanny valley, especially for longer clips.
Dialogue, consistency between scenes, and artistic control will be the key. If they are able to provide that, then they will be in the movie business.
But this level already lands them in the VFX-for-ads business. Ad agencies will salivate at this and will flood our minds with more and more creative work that steals our attention, then presents a product once it's stolen.
Who needs Hollywood anymore?
HEACK YEAAAHHHHH!
9:59 I hope we'll be able to keep these imperfections/hallucinations in the future using prompts. Sometimes I miss some of the strong early-stage hallucinations
Is video AI able to create a character and then, from then on, reference that character across multiple scenes? Can you set a sort of image tag for the AI to call back later and place into various scenes?
If Google combines Veo 2 and Willow, it's over for all the other AI video models. 😮
brutally amazing
This is amaaaazing! Storytelling will need consistency, including multiple characters (incl. clothes) and backgrounds from different angles, e.g. to cater for camera shots switching between characters sitting opposite each other in a restaurant, but I have no doubt this won't take long to achieve. I can't wait!!
The videos are amazing, but Google's ability to generate them isn't surprising. I mean, they own YouTube; they have tons of training data.
i thought that too, but remember that all content on Google and YouTube is completely public and can therefore be scraped. any company with enough resources could scrape all the data on Google and YouTube and train their AI on that data. ALL data. It's all available at the end of the day
@@alvaroluffy1 Not anymore. We used to access YouTube to generate summaries and transcripts in automated jobs, but they now aggressively block IPs and close accounts that use YouTube programmatically. They even block transcripts, so only Gemini can search over them.
@@alvaroluffy1 having the footage on your own servers, with all the analytics and so on, without having to scrape, still gives them a leg up.
thanks!
Dude, you definitely need to upgrade your channel to 4K. It's hard to tell anything from these samples in HD only.
Incredible.
This will perform better than other text-to-video models because it's trained on all the videos collectively available on YouTube. They can legally use any video connected to Google. Make it auto-train; even this video can be used for further training
The issue I see with the woman facing the mirror: I see the “real” woman blink (right when the video seems to skip a frame), but the reflection never blinks.
I wonder how Google will screw this up. They have managed to nerf everything AI that they have released with perhaps the exception of NotebookLM 🙂
Omg it's almost like Google hates making money or something. They keep making the same mistakes over and over again
By being too censored and woke, maybe, to the point of the model refusing to generate the level of violence and nudity present in an average HBO series. This will give the competition an opportunity to create a model that will be able to do it.
10:08 bro has become Dora at this point..
"I don't know where the problem is."
"can YOU spot the problem??"
10:20 When the truck is driving on the road, it has a video-game quality: it looks like a pool of water is moving with the tires rather than the tires driving through water that is already on the road. The uncanny valley is deeper than it appears.
Bro those prompts are a whole book 1:49 😂
10:40 missing camera in the mirror, Wes.
Only 10 years, then AI models will make movies like Avengers ❤
It's 2030 already
Does this mean that reversing the generation, going from video back to accurate instructions, is possible now? If that's the case, then I guess Google could also generate accurate instructions for robotic actions just from watching video
Incredible stuff.
(hyperlapse is a timelapse where the camera is moving)
@0:37 that speedometer though... still apparently struggling to get symbols and numbers right!
10:53 Mirror: it's the camera's perspective of the person, and it basically doesn't change as the "camera" moves the way it would in real life. If you watch it back carefully, it's still the exact same perspective, as though they haven't moved. The background is good, but the person themselves is stuck frozen, like you're walking past a picture, not a mirror.
the video with the mirror: it's the direction the reflection faces, and it doesn't change perspective
10:55 the candles.
10:55 I did notice (on the left side of the vanity) the weirdly placed candelabra in what seems to be the ink pen holder.
On the right side, focusing on the mirror reflection of the pen holder, there's a noticeable difference in the container and its contents. The original looks to be depicting those as pens, while the reflection shows them as more like paintbrush handles.
8:46 - The driver can "see" the reflections. Why are 'reflections' any different than 'steering wheel'?
Because it's not really, how is that hard to understand?
@Vinei yeah, same with the flamingos and their reflections.
Wow
Capybaras are just chill like that.
Absolutely crazy.
Yes, very good. Now let's see full Sora.
How is every model out there incapable of generating readable text!!???
Geezus... crazy impressive. This is where I was expecting SORA to be last week. Honestly, this is moving so fast. I think I need a career change.
We need a global registry of 'News' videos and the locations they were taken at, or no one will believe anything. I was born in '76. We used to whine about where our flying cars are; now we're just buckling up.
15:48 is neither hyperlapse nor timelapse.
So excited to make some crazy videos