NVIDIA’s New AI: 50x Smaller Virtual Worlds!
- Published on Jan 26, 2024
- ❤️ Check out Lambda here and sign up for their GPU Cloud: lambdalabs.com/papers
📝 The papers are available here:
research.nvidia.com/labs/toro...
image-sculpting.github.io/
github.com/ProjectNUWA/DragNUWA
people.eecs.berkeley.edu/~evo...
📝 My latest paper on simulations that look almost like reality is available for free here:
rdcu.be/cWPfD
Or this is the orig. Nature Physics link with clickable citations:
www.nature.com/articles/s4156...
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Putra Iskandar, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: / twominutepapers
Thumbnail background design: Felícia Zsolnai-Fehér - felicia.hu
Károly Zsolnai-Fehér's research works: cg.tuwien.ac.at/~zsolnai/
Twitter: / twominutepapers
#nvidia - Science & Technology
Extracting a character from a picture, changing the pose and putting it back in LOL that's almost indistinguishable from magic
Almost is the word for now. We might say this is nothing but magic after a few more papers.
I mean math is the closest thing you can get to magic in the real world, and all of this is thanks to math... so...
Magic²?
No way, I was just refreshing your channel to see if there was a new video. What a time to refresh the page!
Imagine like you can preview your haircut, style, etc. so it helps to see what actually fits you without trying it out. Insane!!
True
Clothes too. There might be an AI for this, just to style people with hair and clothing combinations.
If this truly gives us a simple reliable way to get consistent characters then, that's one hell of a holy grail.
I was thinking something like this would appear in time. In AI art, if it used a 3D model plus sample images instead of just images, it could probably do much better work. Small details it often misses, such as fingers, would no longer be a problem, and neither would specific themes or unconventional clothing.
@@sylfynfotho5522 Well at the speed that text to 3D is coming along It'll likely be this year. Stability AI is working on their own Open Source Text to 3D model.
And If Stable Diffusion is anything to go by.
The moment the community get their hands on the model we'll have a fully functional WebUI or Blender Addon within the month.
Yeah, imagine how future games such as GTA 7 will look and feel if the characters use tech like this.
@@MikkoRantalainen VR AI Morrowind is going to be a trip.
@@MikkoRantalainen GTA 7 is gonna be a decade away lol. By then we will be using a completely new method of technology
I love your enthusiasm when you say, 'What a time to be alive!' And yes, WHAT A TIME TO BE ALIVE!
You are too kind, thank you so much! What a time to be alive! 🙌📜
Moon Pie!
Truly one of the moments to be alive for.
Once again-the power of process! This video demonstrates yet another amazing quality about proper research: when we aren’t seeing gains in fidelity, we’re still seeing gains in efficiency and organization. 👏
I used to watch your videos back in 2018-2019. I'm happy that you have grown so much. I was checking the papers-we-love channel and found your channel again. Good to see you. And congratulations on 1.52 million subscribers.
Oh nice I like how you divide the papers! I think I commented a while ago with feedback, I think this is way less confusing now!
So you're saying we're going to be able to add infinite ducklings to images in the future
Very impressive progress ❤
Guys, can you believe it?! - Two Minute Papers is covering AI news, never seen that before!!!!!1
Looks really promising! 😁
I always wanted to watch a movie like Star Wars but actually be in the scenes three dimensionally and move around watching the actors perform. For example the famous Tattooine bar scene I wanted to walk around and look at all the aliens. Now I think AI is so good, in a very few short years that is actually possible. More scarily it will be the actors as they were and you won't be able to separate it from reality unless you switch off the hologram or take your VR specs or head set off. Yes what a time to be alive, but will we start to question what being alive actually means?
Ahh, def possible soonish. But doubt it’ll be consumer ready and affordable (like under 10k for set and gpu) in 2 years. I’d guess 5-10. Gpu progress is for sure incremental by design. They’re always trying to make it good enough to sell but not good enough to make it be great for a decade. It’s a very hard thing to do now with things being so good. And with AI literally 5-10x ing performance. Compute must be done somewhere…
The riddle about if Han Solo shot first would be finally scrutinized and solved!
Personally, I love your videos!
2:56 - Imagine this ability to move objects in 3D space used in OpenAI's new Sora model, and in VR with another AI working in real time to correctly split the same image into two view angles. Then we could view a hallucinated scene from text in virtual reality and even manipulate its objects in 3D space. How crazy is that going to be one day in the future? It would basically be like a real-time engine simulating real life along with physics, kind of like the Matrix in a way.
Imagine NERFing the whole Battle Star Galactica show and walking/flying through the scenes live in VR. 🤯
Bro! What a time to be alive!
Nvidia's going insane! 🎉
Ooh 15 secs!
This gives me the kind of feel my great grandmother had seeing the Internet before she passed (she was raised without electricity or running water) -- like I've lived far too long and have seen far too much in my days. And I'm not even 40 yet, lol. There's probably been more technological and economic shift from my childhood to now than there was during my great grandma's 100 years of life.
I'm waiting for a constantly updating NeRF, kinda like how LIDAR keeps updating
I'd like to see AI compress files from gigabytes to only a few kilobytes.
Probably reasonably possible to shrink something by 2 or 3 orders of magnitude, anything more than that would take such an absurd amount of assumptions by the AI it's probably impossible.
I am pretty excited about AI video compression though, you could have 360p video nicely upscaled to 4k+ because it would be pretty easy for an AI to understand what needs to be filled in and where. Take 50+ GB files and shrink them down to a few dozen MBs.
i'll be happy with gb to few mb
That would be possible only if the file had a very low Kolmogorov complexity.
For lossy data like audio/video/image/shape/3d shape it's already possible
For lossless data it's not needed more than gzip can provide, cause using AI will be much more expensive than buying more storage / throughput
Dude every week something new , AI AGE BABY!!!!!!!
Super!
Wow!
We are getting closer to games like GTA but with NPCs that look and act like real people.
2:55 shows very clearly that this modelling and extrapolation is inconsistent with its source material...at least in this demo.
In the starting view, we see a conventional astronaut in something like an Apollo/Skylab suit, with a PLSS (Portable Life Support System, the "backpack") clearly visible.
In the quick shot you blow past at 2:55, the PLSS has magically disappeared, even though its corners were clearly visible in the source shot.
I was willing to make some concessions to what it showed on the back of the PLSS because it's simply not in the one shot. But to evaporate it?
And that cat, rotated 90 degrees at 3:47? That's just tragic.
It's interesting, and it's good; it's certainly better than we had before. But it still has a long way to go if you have to QC all your imagery.
OTOH, I would have been REALLY impressed if this algorithm had the ability *and* volition to figure out how a cat's face should look...even approximately.
And the data compression is still quite impressive; it'll sure come in handy as we start turning the terabytes of data we generate every day into even more data.
Sure IS a great time to be alive.
Mass Effect Andromeda and Bethesda want to know your location
When the horse moved I fell off my chair
can I use your "what a time to be alive" for an electronic song refrain line? :)
Apple will definitely add this to their virtual faces/meetings on their VR headset before the end of the year.
Even so, imagine this on the Metaverse; even with some weirdness I would still use it.
before 40 mins gang.
When are we gonna see this for VR?
what if you ask a dinosaur to move? are there enough samples?
"The world is a simulation" mfs are jumping for joy right now.
In two more papers I will not need reality 😅
Very suspicious horse 🐴
I am really interested in making instant 3D worlds from 2 or 3 images. Thanks for covering this.
This guy always sounds like he could use some oxygen in his lungs.
It only works over short periods of time; it struggles with long ones.
by the end of the decade the internet will be a lot more scary
the internet will be unrecognizable from what it is now. typing comments on videos? so oldschool
In just a few years you'll be able to talk to anyone, celebrities, relatives,, dead, alive or made-up, and they will look and sound perfectly human. That's so crazy.. I can only imagine putting a headset on my mom and letting her talk to her own mom from when she was younger.. how will your everyday person react to such possibilities I wonder!
Cake alert!
This is insanity. Any sufficiently advanced technology is indistinguishable from magic
So many people who are unable to concentrate on the content because of the non-native speech patterns, it’s interesting.
its called racism
Maybe there will be an actual game made using one of these papers, nothing seems to be put into practice.
Games take years to design and develop. AI will speed that up, but none of these techniques are directly applicable to the types of games we see. Also games are usually built on existing 3rd party engines like Unreal, Unity, Godot, etc. None of those engines have implemented this stuff yet. It's gonna be a while, but it'll happen.
@@jackinthebox301 "It's gonna be a while"
how many years if you can guess? just a guess based on your knowledge
@@el-_-grando-_-_-scabandri While I do have a degree in game design, I never worked in the industry. It's not a great industry to work in, honestly. My general feeling is we'll see smaller indie titles utilizing AI in major ways within the next year or so. Maybe 18 months. Bigger titles will be 2-3 years easily.
The major issue is the quality of the assets that AI creates, not the quantity or ease. Same thing with its ability to code. It can create code, but games are absolutely enormous coding endeavors. Once it can tie together multiple systems inside an engine the floodgates will truly open.
Because AI is making progress so ridiculously fast it's impossible to keep up.
Games take years of development, there are little to no tools to integrate AI (especially brand new tech) into existing game engines, etc. Also, how would you ever convince some boomer project manager that AI is a good idea for a game? The first good games integrating any kind of neural network will almost certainly be indie games
My little AI is feelin' fine
Two more papers down the line!
I can't wait until AI is in my Vive
What a time to be alive!
I want to go off on virtual capers
Held on tight to my papers!
poetic.
It’ll happen. Soon you’ll be riding off into the virtual sunset my friend
photoshop 2025 is gonna be insane
wow
2 more papers, then maybe we can run this in Stable Diffusion AUTOMATIC1111
Not long in the future .....the Film Industry is in crisis
it has been. did you not notice all the box office bombs lately, the writers and actors strikes?
not to mention all of the streaming services are hemorrhaging money daily
I would say before was better than after
After had false colours
There was a more recognizable lion in the first pic
Wtf. You cant just drop the virtual persons like its no big deal.
That looks like we skipped 50 papers.
The problem I see from following these AI projects is that if you look too close, you see a ton of flaws. Like the horse's legs disappearing. And most of these projects never make it to a DALLE2 moment. They're cool research projects. But they're interesting to follow though.
Then we got Sora.
When it is possible, then it will be "just" a copy of a personality.
Have you heard about TRIPS? It's supposed to be even better than 3D GS
can you explain what that is? i tried searching it up but didn't find info on it
@@masterkc Trilinear Point Splatting (TRIPS), as opposed to 3D Gaussian Splatting (3D GS)
What annoys me is that these tools aren't easy to access, what we'll be capable of in a year bugs me, because at the moment these are just toys for academics to showcase.
Yeah, I know what you mean. All this tech is for the academics; they are the ones who will use this code and make software/extensions, which the big AI companies will use and sell to the masses at a later date (Photoshop 2025, SDXL, Midjourney 7, etc.)
@@masterkc Like, I want to use them to make holographic/depth map conversions of movies and TV shows I love, perhaps games too, and watch these on a multiview display. Some of these are open source and amazing, but the GPU power I need to do that, then put it in Blender and render a Z-pass, is ungodly.
The day these tools become easily accessible is the day everyone gets replaced in their job,
from accounting, finance, and programming to manual labor,
because this tech is far more complex than the tools used in the jobs above.
Anything else is dead easy to replace.
And no, you won't operate the AI; AI is capable of doing its own job automatically.
One person with a PhD is enough to do the work of 1,000 people.
Therefore unskilled AI bros won't get any jobs or make a quick cash grab either;
they're the first victims in the AI wars.
Hi
I hope Sims 6 will be great
Plot twist; we are NPC’s in The Sims 6, all of history up until this revelation has just been the loading screen.
Just so you know - Now you can be replaced with an avatar generated, animated and voiced by AI, with your memories and personality stored in a database.
Thats awesome, im excited for the future 😊
This guy speaks like an AI...
All those papers and no groundbreaking improvements in video games. AI in video game is still as miserable and graphics aren't photorealistic.
the pipeline from research paper to product is often 5 years at the least, he's covering the papers as they come out, if you want to see this used, you need to wait a while
Because new technology doesn't just magically appear in games.
Hell, Webp and other more efficient image formats have been around for 10+ years and still barely ever show up in games. People rather use 40+ year old formats like JPEG and PNG because they "work well enough". And don't even get me started on compression algorithms. Something 3x as fast that compresses 1.5 times better is still completely ignored because of lacking tool support.
New technology has to be developed, then people have to make tools to make it work with existing tools, then people have to develop games using said tools. Along with that, many developers are slow to adopt new technology because it's more shit to learn and many game devs actually aren't super technical people.
@@mgord9518 Hence shitty games
Dear fellow scholars I would like to have an internship this year and a job next year.
So can you all plz delay the upcoming papers a little bit. Maybe take a long and big holiday??
Can someone else do the voiceover, please 🙏🏼
rude
Is it bad that every time I see a new 2 minute papers ai video my first thought is usually “man, there’s going to be so much NSFW”?
You can’t tell me people aren’t going to find a way to use this technology for very spicy purposes at some point. It already happened with Stable Diffusion, it already happened with LLMs, it’s bound to happen to all the 3D stuff too :(
haha lust makes the world spin.
What's wrong with that though
GTA7
this is the last nail in the coffin for photoshop. generative fill was innovative, but too much of a broad brush, but this, this is precise and selective.
So how long until source code and weights are released? Tired of megacorps releasing "papers" without methodology.
Yiooooo
I fear we may have hit a point in which these AIs can't get any better due to exponentially more expensive compute.
this makes me so fucking depressed
Funny how the least realistic thing about the 3d avatars talking is their bad acting while reading the lines. :P
The cake is a lie.
Love the content but can't stomach the cadence
I don't like the idea of generating a moving avatar from only the audio of your voice. So much communication happens nonverbally that it would open the door to the software dictating the meaning of your conversation via its application of body gestures. They should only use your actual body movements to generate the avatars.
so..things we can't use?
useless then
Bye bye artists and AAA studios
This used to be a great channel, but now it's just 100% "Omg, replace me harder daddy".
Sad.
Don't worry, future breadline walkers, I won't let the door hit me.
update your narrator
Not very impressive
the selective photo editing is impressive, it removes the need for photoshop
Fascinating but how about giving us something actually useful? As it is, the technology is basically useless.
Most of today's useful technology was the useless kind years ago. Just take a look at AI-generated images from 2018 or so, then look at today's stuff.
Please stop taking 5 breaths per sentence.
Well, here comes the downfall for 3D modelers (realistic modelers, at least)
I'm not trying to be mean when I say this, but why can't this guy speak normally? I like what he's talking about but like why does he speak in morse code?
aandd.. fiiirrst... weee... why the hell do you talk like this lol.
he's a robot
God knows, how I'm tired of this channel's bait thumbs...
I really like the papers and the information you deliver, but can you please change your delivery and the way you talk? It's so annoying!!!!!!
I don't see the purpose of this. A virtual world, characters, making avatars, and changing video and text, for what purpose?
If we didn't have this, the world would be the same; games, reality, all the same.
Call me a luddite, but I don't see how the compute usage for this function is worthwhile. Add to that the negatives of eventual fakery, and apart from mediocre amateurs making mediocre, soulless content, it all seems a bit of a waste of time and resources.
your mind needs some creativity
The point is to develop the arsenal of tools that a multimodal AI will be capable of inferencing instantly. This technique will be fed into a database among a slew of others, then the model will mix and mash this technique with a dozen others featured on channels like this to generate whatever you fancy. However, the more realistic response is that people are just throwing ideas at a wall to see what this technology is capable of and where it can save costs. It's research: they are not sure what it can do or how coherently it can accomplish it, and these papers are the scientists' way of figuring that out.
Vr/Ar is where this application can thrive not to mention stable diffusion. Think of it this way...if this code can help improve current work flow of xyz by 3x then that's a major win.
ok, luddite