NVIDIA’s New Tech Runs A Virtual City!
- Published Jul 9, 2024
- ❤️ Check out Lambda here and sign up for their GPU Cloud: lambdalabs.com/paper
📝 The paper "NeRF-XL: Scaling NeRFs with Multiple GPUs" is available here:
research.nvidia.com/labs/toro...
📝 My paper on simulations that look almost like reality is available for free here:
rdcu.be/cWPfD
Or here is the original Nature Physics link with clickable citations:
www.nature.com/articles/s4156...
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: / twominutepapers
My research: cg.tuwien.ac.at/~zsolnai/
X/Twitter: / twominutepapers
Thumbnail design: Felícia Zsolnai-Fehér - felicia.hu
#nvidia - Science & Technology
Imagine using this at sport stadiums.
Combining the video of all the cameras and then being able to create "virtual cameras" anywhere on the field. You would be able to see the game from the players' perspective, then zoom out as the ball flies through the air, and as it descends the camera zooms back to a first-person perspective.
This has been my dream since I was in 9th grade. (I'm 49 right now, btw.) A stadium could sell an infinite number of "50 yard line" tickets, or just virtual super tickets, to SO many more people than are interested in cramming themselves into a brick-and-mortar stadium. This should be the future of PPV sports, in my opinion.
A roommate and I started on a project aimed at this about 10 years ago. Our plan was to have a bunch of stereo cameras pointed at the field, use disparity mapping to segment the players out of the video feeds, and then reconstruct their 3D geometry from the multiple angles. In hindsight this was extremely ambitious, but we got as far as being able to identify objects in calibrated feeds. This NeRF technique would definitely have been a replacement for the 3D reconstruction.
Are you South African?
The NFL did something similar last year, look up "NFL Toy Story Funday Football" october 1st 2023.
@@dannyarcher6370 what happened to this project?
@@UnicGamesReviews more Life.
Can't wait for tech like this to be integrated with google earth + street view. The possibilities are insane.
They only need a nuclear power plant and half a million gpu cards
"GTA 7: Earth" when?
@@dustinrobinson789 exactly. I want this tech to be used to render interior spaces too. I'm sick of going up to doors and not being able to go through them. (Looking at you, Cyberpunk.) Even though Bethesda is screwing the pooch lately, their Elder Scrolls series does not disappoint in the exploration factor. If there's a door, it opens, and there's an interior. I love that.
@@dustinrobinson789 Knowing Rockstar, we'll get it sometime before the heat death of the universe... maybe
Coupled with super resolution and VR, you can experience a virtual clone of Earth.
VR would be more interesting and immersive with this technology! Particularly when it is integrated with Street View
NeRFs don't reconstruct 3D well from all perspectives, only from viewpoints close to the camera views used to infer the 3D.
You would need pictures of the objects being rendered from all perspectives and detail levels, or else in VR you would get a blurry mess from some angles. It would be a lot of work to construct scenes suitable for VR, and it would take even more computing power. If you look at the video, you'll see there was no random camera movement; it followed fixed paths or rotated in place for a reason. You could use it for movies, though.
@@marsovac i was about to say that 😆 👍
Anything 3D in VR is basically more interesting, and works better, than any boring flatscreen device, to be real 😂
Man GTA 194 in the year 2997 is gonna be so awesome
I don't know, at this rate it will be GTA 10 in 2997
Legend has it that it will come out only some 2-4 years after that, but this time AI will leak which city it takes place in. I would have guessed Alderaan, if it's still "around".
in the year 29995
if man is still alive
you're going on the internet
and see Rockstar just re-released GTA 5
Is that the UE5 Matrix city?
Looks like it yes
At 4:11 you can see "MatrixCity", and yes, it is Unreal
What a time to be Alive!!
but it's moments before atomic annihilation ;)
I bet you're the same people that be saying the 90s was the good old days and that Gen Z is trash.
@@kongowhowilcanson8030 nah bruh I’m 17 big gyatt rizz
@@kongowhowilcanson8030 Imagine walking into a 4 Michelin star restaurant and telling the person in the first booth that they must think McDonald's is gourmet
@@BoyFromNyYT hell yeah brother
NeRF + Diffusion-based Super Resolution will be insane! What a time to be alive!
Diffusion models would be too slow, but perhaps with motion vectors a DLSS style solution could work.
I think NeRF or Gaussian splatting was part of the way text-to-video models like Sora were trained as well
It appears that a lot of the footage has a computer graphics look
@@Wobbothe3rd too slow? This system is far from real-time. NeRFs don't use motion vectors; they use wholly independent images with no temporal stability.
Imagine Google Maps where you can walk anywhere Street View is available. This is no longer the realm of sci-fi. This could accelerate the development of realistic games; virtual reality worlds are within our grasp. I hope sci-fi writers are alive to see this.
If I remember correctly, you can already do this in VR. There were a couple of apps, one named Wander.
@@john_blues I'll check it out
Isn't the point of Google Maps to show what is actually there? AI is just guessing.
@@heww3960 He said it's not AI, it's a hand-crafted algorithm. It's photogrammetry, but better.
@@john_blues Doesn't the Wander app show you a spherical picture? Like 360° YouTube videos?
Pretty certain that's the demo city in Matrix Awakens.
What makes you think that? Because it looks similar? -.-'
@@lawrencefrost9063 It is tho. Same map, same size
Yeah it is, they used screen shots from MatrixCity to generate the city.
I remember NeRFs popping up two years ago, and everybody was convinced this was the future of 3D. Except for a few good use cases, I don't see this going anywhere. Certainly not in games. NeRFs can't be animated, the lighting is static (making it impossible to rearrange or stitch NeRFs together), and to be of any use in simulation you would need to calculate additional meshes to interact with.
Yup, as a graphics programmer I only see this getting useful in games if these neural volumetric rendering techniques could be animated, relit, or edited by artists. (Collision might not be necessary if we use it to paint skies or HLODs.)
@@peremoyaserra2749 True. And dynamic objects need to cast shadows on NeRFs and NeRFs on dynamic objects. All the calculations that would be necessary to update effects like shadows, GI, and reflections in real time are just as heavy as real-time path tracing, and we already have that at home... without it completely falling apart at close distance.
No studio would throw away 30 years of established 3D workflows to reinvent a complete pipeline.
I get that the results look fascinating, but I have no idea why channels like Corridor Crew or Two Minute Papers, who should know better, keep selling it as "the future".
Looking at this comment section, people really need to calm down and stop holding on to their papers. GTA7 will probably look fine in 2050 but nerfs won't be part of it.
Skies aren't a bad idea tho. Static but far cheaper than volume rendering.
Ack! But I do hope that the next leap in NeRFs will calculate intermediate images from video footage, because this could be a real game changer for so-called 3D (stereoscopic) movies. Today we cannot move while watching "3D" content, but how cool would it be to watch stereoscopic (360°) footage where you can also move your head a little to the side?
Didn't NeRFs get obsoleted by Gaussian Splatting?
Kinda sorta, not really. Gaussian Splatting leaves weird artifacts everywhere, at least for now. Splats are a lot less resource-intensive to run after training, but there hasn't really been a Gaussian Splat XL or anything, just "4D Gaussian Splatting" for doing it in motion.
Google Earth about to go hard
Dude named Google earth:
@@LinkRammer😂
Dude named Google earth: THAT'S WHAT SHE SAID !!
Do NeRFs actually generate 3D information (i.e. 3D point positions), or do they just render a 2D image of the 3D scene from a requested view?
There is no 3D info with NeRFs, but image-to-3D or text-to-3D is possible using LLMs
A 2D image from a view... no collision info, etc.
@@Faizan29353 That's what I thought, but I couldn't remember. Thanks!
Shouldn't they get to work with Google Street View, collecting all its images to have a better experience there?
You're wrongly assuming that these companies want to create better experiences and services. That goal is only a means to make more money and gain more power, which is their main objective.
@@suibora Why would what I suggested mean less money and/or less power?
It will only make Street View better, and also allow a much more realistic and "fluid" experience out of Maps.
It might actually lead to various new products, making more money.
@@LiranBarsisa It would mean less money because these companies would have to collaborate, which is not free.
Also do you think google will give Nvidia its street data for free? Or do you think Nvidia will give google its algorithms for free? For them to do what you suggest it will cost both companies in the short term.
Sure, in the long term it may make the service better, but why would they spend money trying to improve something for non-monetary or power gain? I am playing devil's advocate here and showing how THEY perceive things.
Their intentions are secondarily to provide good services and PRIMARILY to make money. They won't provide good services if it won't benefit them in the relatively short term. Nobody plans for the future anymore; everything's about the next quarter's or FY earnings :(
@@suibora But every product requires money in the beginning, and also maintenance and other improvements.
If you hold yourself back from doing anything because it costs money, you won't get far as a company, including with existing products.
Besides, they can come into agreements that might not transfer money from one company to the other. A shared project for a common goal.
the thumbnail is a bit misleading
Amazing insights on the scalability of virtual environments! The distributed rendering process is a game-changer. Looking forward to seeing how this technology will evolve and its potential applications beyond gaming.
Lol this is so weird! I remember playing Myst IV years ago and thinking, "What if you had hundreds or more images? Maybe you could turn around and walk more freely... No, the game looks too good, it would never work on a computer." It sounds stupid, but I really tried to imagine a smart way to have pre-rendered images melded together in some cool way. Awesome
Imagine creating an entire video game level by stitching together some concept art. That's where this is headed.
The mind of the beast being built in front of our very eyes.
This would probably be very useful for TV show producers trying to get the best shot for some kind of documentary
Wait, so it's like Gaussian splatting, but it works with just a couple of photos and gives you a model of the world instead of a group of a bunch of different stuff that builds the model? This is amazing. Now anyone with (I assume) an NVIDIA GPU will be able to do 3D scans of objects
No, GS and NeRFs are mainly just different ways of "rendering". Neither method creates a 3D model internally, so photogrammetry is still needed.
GS kinda does create a 3D model; it's just that instead of surfaces it's lots of elliptical Gaussians. NeRF is more like raytracing into a neural model: for each pixel you ask the model what color it should be. That means you need to have the NeRF model in GPU memory to create each image. With GS you can create a 3D file you could load into a game engine or, e.g., Blender.
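The "ask the model what color each pixel should be" idea above is just NeRF's volume rendering loop. Here is a minimal sketch of it, with a hand-written toy density/color function standing in for the trained network; `toy_field` and `render_ray` are illustrative names, not from the paper:

```python
import numpy as np

def toy_field(points):
    """Stand-in for a trained NeRF MLP: maps 3D points to (density, RGB).
    Here: an opaque sphere of radius 1 at the origin, colored red."""
    r = np.linalg.norm(points, axis=-1)
    density = np.where(r < 1.0, 5.0, 0.0)           # dense inside the sphere
    rgb = np.tile([1.0, 0.0, 0.0], (len(points), 1))
    return density, rgb

def render_ray(origin, direction, n_samples=64, near=0.0, far=4.0):
    """Classic NeRF volume rendering: sample points along the ray, query
    the field at each point, then alpha-composite front to back."""
    t = np.linspace(near, far, n_samples)
    delta = t[1] - t[0]
    points = origin + t[:, None] * direction
    density, rgb = toy_field(points)
    alpha = 1.0 - np.exp(-density * delta)                          # per-segment opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))   # transmittance
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)                     # pixel color

# A ray through the sphere composites to red; one that misses stays black.
hit = render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))
miss = render_ray(np.array([0.0, 3.0, -3.0]), np.array([0.0, 0.0, 1.0]))
```

The real model replaces `toy_field` with a neural network, which is why the whole model has to sit in GPU memory for every rendered frame.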
Level 3 - City
Level 4 - Country
Level 5 - The World
Level 6 - The Solar System
Level 7 - The Milky Way Galaxy
Level 8 - The Universe is the final boss.
What a virtual time to be run a city alive 🎉
did they take images of the ue5 matrix demo city for the city test
I want Euro Truck Simulator 2 to use this to make the cities ultrarealistic!
Yes! Why not an everything simulator? Planes, trucks, cars, boats, ships, walking, biking, shooting? Essentially GTA, but realistic.
@@jimj2683 Bruh at that stage just go outside and drive a truck 😂People are so focused on iteration and going even further and further that they forget that their real desire is to improve the journey.
That Cybertruck aberration is everywhere in those simulations...
this would be amazing for engineering structures quickly for prototyping
There are millions of pictures of cities in Google Street View... Can you imagine taking those pictures and using them to make a real-world model with this technique?
WoW!! What a time to be alive!!
Crazy to think that in the future we could have a game similar to GTA that would update in somewhat real time based on construction work or changes. So the next time you have a quest or something in that city, it has changed based on a real-time event. I know, it would be super complex, but it's fun to play with the idea.
amazing as always
Is there any way we could experiment with one of these technologies? I have an RTX 3080 Ti; maybe I could experiment with a small scene, but I don't know how to access this
Now this is more like it!
64 graphics cards for the city geometry. or have i got it wrong? 4:40
I don't think NeRF will replace game worlds. What I hope is that AI will be able to take NeRF data and convert it into modular polygon pieces, and that it will be able to either dream textures onto the 3D space or draw from a collection of textures and combine them in such a way as to avoid tiling and create micro details.
Using Street View imagery on the satellite 3D globe view so you can zoom down to the street and get so many more details. Or just a completely smooth 360° view, being able to move around freely in Street View and related things. Yes please.
How long would it take to make a worldwide NeRF from satellite imagery?
It would be amazing if NERFs could somehow be used to navigate representations of the latent space in a stable diffusion model.
Can this technique be used for distributed cloud training of LLMs and Diffusion models by communities running consumer machines?
Every day that goes by, I believe more and more that we are truly in a simulation
Oh my goodness, the papers are getting progressively capable, almost as if we soon will be able to simulate whole realities. This is probably just like a DALLE 1, wait till we get to DALLE 4
What a time to be alive!
Thanks Duckter. We need this for Google Earth Street view. It still won't prevent me from going on 40+km walks. 🚶
The graphics on the thumbnail looked more exciting.
Having human engineers develop this first is great for AI training data. I believe AI can further improve this technique!
Why not use stuff like octrees and some sort of 3D-mipmapping LOD scheme, streaming details in at big enough distances that the popping won't show when the camera moves, maybe with some crossfading to further mask the popping?
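The octree-plus-LOD idea above boils down to picking a coarser tree level as distance grows. A minimal sketch of such a depth-selection heuristic, with made-up parameters (`base_size`, `pixel_error`, and the function name are illustrative, not from the video):

```python
import math

def lod_for_distance(distance, base_size=1024.0, max_depth=8, pixel_error=2.0):
    """Pick an octree depth for a viewer at `distance`: depth 0 is the
    whole scene, and each level halves the node size, so we drop one
    level of detail every time the distance doubles."""
    if distance <= 0:
        return max_depth
    # node size at depth d is base_size / 2**d; choose d so that
    # node_size / distance is roughly pixel_error (in angular units)
    wanted = math.log2(base_size / (distance * pixel_error))
    return max(0, min(max_depth, int(wanted)))
```

Crossfading would then blend the two nearest depths instead of snapping between them, which is what masks the popping the comment mentions.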
I swear, the videos about papers you posted years ago birthed the age of AI we are having today
Hey guys, do you think it would be less GPU-intensive with Gaussian Splatting? Great vid BTW 👍
Mr. Anderson ... we meet again 😂
Not sure about running out of resources when a standard-size data center is now 100K GPUs, with cooling units the size of an apartment complex.
Is the rendered city a 3D model you could work with in, like, Blender or something, or can you chop it into a bunch of models?
Not without extra manual work. For now at least
It produces a cloud of points, not a mesh
But you can chop it, and you can import into UE
Upd: sorry I meant gaussian splats, not nerf. Not sure about nerf.
You could probably train an AI model to automatically turn it into a 3d mesh.
Mind is blown. Virtual worlds are coming so fast.
I would gladly see a game that recreates real cities in low fidelity. Even if that meant low poly, limited color. There's something about real-world placement of objects that just cannot be beat for some genres of games.
Real world simulation before GTA VI
How long does it take for a full render? Or is this interaction in real time why it needs to be on multiple GPUs? I'd be interested in something that's capable of taking in data, identifying objects, simplifying textures for multiple uses throughout the larger model. It could literally generate entire open world maps from google maps data.
Of course, this kind of tech could also easily enable stalkers with drones, or military action with drones, or many other potentially bad actors with drones. And two papers down, with just camera phones.
Reading the book Permutation City by Greg Egan right now, and in combination with the development of Ai and virtual worlds, and I feel more and more like that meme from Twitter is turning out to be real 1:1: "Tech Company: At long last, we have created the Torment Nexus from classic sci-fi novel Don't Create The Torment Nexus"
For gaming you still need a lot more, namely collision detection :)
That's all fine and dandy but I need the blender meshes or the assets to actually make a product with it. 😅
Who else thinks that in a few decades from now we will have racing games with maps the size of entire continents? A real copy of our own world. Similar to Flight Simulator, but with real usable roads. Mind blown.
In the future, you will scan the battlefield in real time, run scenarios, and perform the most cost-effective offensive or defensive maneuver, based on all the chess pieces and whatever else you can simulate... Perhaps we will get the most cost-effective wars we've ever seen.
Also, play GTA or whatever in whichever city you wish... Plus, Flight Simulator will be even more accurate.
I hope I'm alive when we get the option to just live inside the Matrix....
The fact that nerfs can handle reflections blows my mind
"Could it be hell?" "Oh, yes it can... At least one." Are you sure you want to cross into this timeline?
What a surprise that their solution involves buying a bunch of Nvidia GPUs 😂😂
That city looks like the one in the Matrix demo they put out a few years back.
That is not a city digitized from photos but rather the Matrix City Sample from the Unreal demo.
Or it is digitized from photos of it.
But that sample is already so detailed that nerfs would just do a worse job anyway.
Awwwww... it's Laguna Seca as well, the best racetrack to experience in racing video games!
That city is the Matrix City thing from that Unreal Engine tech demo....
Imagine using inpainting to remove pedestrians and cars from Street View photos, then combining them with satellite data to generate a 3D model of our entire world
2:02 What's with the truck turning right on red, and snaking that blue car?
So low end RTX 5000 cards are finally going to have more VRAM at a lower price right?
NeRFs have major flaws if the goal is to create an environment you can explore freely. Most importantly, there's no 3D mesh. It's all just predictive changes based on nearby shots. So if you want to explore somewhere that isn't close to the perspective of a bunch of pictures, you're out of luck. It works if you're moving through the environment like a train following a track, but the freer you make the movement, the more attractive it becomes to just build a 3D mesh that doesn't require recalculating every possible position. Aside from that, NeRFs are completely static. You can't snip a hole out of one and put something interactive there, or even something that moves. I challenge anyone to come up with an idea for a game where you move through a perfectly frozen environment and don't interact with anything, that anyone would be excited to play. It's less flexible than Zork and more computationally complex than Microsoft Flight Simulator. The combination of those two problems makes this seem like a total dead end for gaming tech.
Good news for akt with distributed computing
I think it's kinda disingenuous when people suggest that this technology will soon revolutionize gaming; as far as I'm aware, you can't relight these scenes, ever. They can't respond to dynamic lighting.
Hm? I saw people using Nerfs from movies to relight the scene in Blender and such.
They can. What really bothers me is that they have no collision volume.
It obviously can't be dropped into a game, as is, as an asset. But it can absolutely lend technology and advances to real time rendering.
You appear to be mistaken. You can relight a NeRF; there are many papers on it. Search "nerf relighting". There is even an example in the video at 5:35
Technology improves over time
Károly, congratulations on your results so far. You've built something beautiful. 🎉
that sample size seems small honestly. let's get this scaled! =)
This is the time we hold our papers extra tight and call our neighbors to join in holding their papers as this is a wild leap, such a time to be alive!
It's starting to look like we live in a simulation.
Why does it sound like he recorded each word individually and remixed them to the video script?
What a time to be alive in RTX 9090 + GTA 7 era
This could eventually enable a mix between GTA and MS Flight Simulator! Maybe GTA7?
I thought Neural Radiance fields were too slow compared to Gaussian Splatting
Rendering developers using the same techniques they always do. Splitting the problem into parts and multi-threading.
The results are spectacular!
I can't wait until AI is smart enough to compete with researchers in intelligence and competence. Imagine how fast an AI could draw conclusions without needing to rest.
AI will eventually find cures to all our worries and diseases.
Let's build a new gaming PC. I need:
1 super massive mainboard, 64 RTX 4090 graphics cards, 16 TB of RAM, an Intel 18900K i11 CPU, and full access to Elon Musk's bank account.
It is not particularly surprising that Nvidia comes out with a paper to be able to use a lot of GPUs to make NeRFs, but a cool advancement nonetheless.
It kinda seems like a Matrix type situation is almost inevitable doesn’t it?
Got a garage full of RTXs from mining; time to switch things up?
I find it astounding that, with all that complex programming, rendering, and simulation, we can still use some analytical equations to describe the problem and what has been done. The equations, looking very academic, can only describe the problem and the solution at a basic level, but not really how the data was handled or what special methods were applied to reach this level of the model described in the paper ;-) This brings me to the question of whether we will still need papers in the future to make humans understand what is going on, or whether we'll find another format that non-academic people can understand.
That 25km squared is the UE5 matrix city.
Whole Planet simulations are next! 😎🤖
Digital twins are very cool to play with.... I would be happy with part of a beach, but there seem to be absolutely no real-world apps for Oculus. Hard to know what is serious.
Send this paper to the Google Maps or Yahoo Japan team, and see what the next generation of games becomes.
Is it the Ready Player One dystopia?
Ehm... Area scales like the square, so 6 km squared is 36 times bigger than 1 km squared, and 25 km squared is (25×25)/(6×6) times bigger than 6 km squared 😉
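Spelling out the arithmetic in the comment above (which reads "6 km²" as a 6 km × 6 km region, i.e. as a side length; `area_ratio` is just an illustrative helper):

```python
def area_ratio(side_a_km, side_b_km):
    """Ratio of the areas of two square regions, given their side lengths.
    Area grows with the square of the side, hence the squared terms."""
    return (side_a_km ** 2) / (side_b_km ** 2)

print(area_ratio(6, 1))    # 36.0: a 6 km square is 36x a 1 km square
print(area_ratio(25, 6))   # 625/36 ≈ 17.4: a 25 km square vs a 6 km square
```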
I'm still waiting for an open world racing game where the whole Earth is present. MSFS2020 but for cars.
Same. My guess is 2040, so still a while to go.
How long before AI writes the next Two Minute Papers?
Good technology for Special Forces training
Dope!
People having an existential crisis about this tech, meanwhile me thinking about how cool GTA 7 would be
So are we sure we're not simulations running on something's gizmo?
The city looks like it's from the Unreal Matrix demo?