NeRF: Neural Radiance Fields

  • Published on Jun 19, 2024
  • NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
    Ben Mildenhall*, Pratul P. Srinivasan*, Matthew Tancik*, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
    *denotes equal contribution
    Project Page: www.matthewtancik.com/nerf
    Paper: arxiv.org/abs/2003.08934
    Code: github.com/bmild/nerf
  • Film & Animation

Comments • 124

  • @suharsh96
    @suharsh96 4 years ago +152

    What a time to be alive !

    • @blackbird8837
      @blackbird8837 4 years ago +2

      Everyone would think that about the time they live in, since only what has been released so far can be drawn on as a reference for making that statement.

    • @IRONREBELLION
      @IRONREBELLION 4 years ago +14

      Dr Károly Zsolnai-Fehér
      ?!?!

    • @EnriquePage91
      @EnriquePage91 4 years ago +13

      Dear Fellow Scholars !

    • @EnriquePage91
      @EnriquePage91 4 years ago +1

      IRON REBELLION literally same lol

    • @IRONREBELLION
      @IRONREBELLION 4 years ago +1

      @@EnriquePage91 I love him so much haha

  • @AndreAckermann
    @AndreAckermann 4 years ago +16

    This is amazing! The implications for 3d and compositing workflows alone are mind-boggling. Can't wait to see this filter through to mainstream products.

  • @DanielOakfield
    @DanielOakfield 4 years ago +4

    Absolutely exciting time ahead, thanks for sharing Matthew!

  • @bilawal
    @bilawal 4 years ago +3

    2:45 🔥 elegant approach and game-changing results - handles fine scene detail and view-dependent effects exceedingly well

  • @Paul-qu4kl
    @Paul-qu4kl 4 years ago +49

    Progress in photogrammetry has been very slow in recent years; hopefully this approach will yield some new developments. Especially better reflection and translucency capture, which has so far been a huge problem, looks like a possibility based on the scenes shown. Very exciting, especially with VR equipment becoming more prevalent. Can't wait to capture real 3D scenes (albeit without motion), rather than stereoscopic video.

    • @technoshaman001
      @technoshaman001 1 year ago +1

      th-cam.com/video/Pb-opEi_M6k/w-d-xo.html this is the latest update 3 years later! pretty amazing IMO lol

  • @blender_wiki
    @blender_wiki 4 years ago

    Outstanding work. Congratulations

  • @user-tt2qn1cj1x
    @user-tt2qn1cj1x 1 year ago

    Thanks for sharing and also mentioning the other contributors to NeRF creation and development.

  • @5MadMovieMakers
    @5MadMovieMakers 2 years ago +1

    Looks neat!

  • @wo262
    @wo262 4 years ago

    Super cool. This will be super useful for synthesizing light fields. A lot of people are saying this is not practical right now because it takes seconds to infer one POV, but real-time visualization of light field data already exists, so you could store precalculated inference that way.

  • @johnford902
    @johnford902 4 years ago +5

    It’s awesome to live at this time and be participating in history.

  • @directorscut4707
    @directorscut4707 1 year ago

    Mind-blowing! Can't wait to have this implemented in Google Maps or VR and explore the world!

  • @chucktrier
    @chucktrier 4 years ago +1

    This is insane, really nice work

  • @trollenz
    @trollenz 4 years ago

    Brilliant work ! ❤️

  • @parkflyerindonesia
    @parkflyerindonesia 4 years ago +2

    Matt, this is gorgeous! You guys are genius!!! Stay safe and stay healthy guys 👍

  • @marijnstollenga1601
    @marijnstollenga1601 4 years ago

    Amazing stuff!

  • @gusthema
    @gusthema 4 years ago +2

    This is amazing!!! congrats!! are you publishing this model on TF Hub?

  • @ScienceAppliedForGood
    @ScienceAppliedForGood 3 years ago

    This looks very impressive; the progress here seems on the same level as when GANs were introduced.

  • @huanqingliu9634
    @huanqingliu9634 2 years ago

    A seminal work!

  • @suricrasia
    @suricrasia 4 years ago

    this is astounding

  • @jwyliecullick8976
    @jwyliecullick8976 2 years ago

    Wow. The utility is constrained by the images used to feed the neural network, which may not reflect the varied environmental factors of the modeled scene. If you have images of a flower on a sunny day and render them in a cloudy-day scene, they will look realistic -- for a sunny day. Anything short of raytracing is cartoons on a Cartesian canvas. This is an amazing technique -- a super creative application of neural nets to imagery data.

  • @Fodi_be
    @Fodi_be 4 years ago

    Astonishing.

  • @GdnationNY
    @GdnationNY 4 years ago +1

    Stunning! Are these vector points, volume-rendered point clouds? Are these fly-throughs image sequences?

    • @hecko-yes
      @hecko-yes 4 years ago

      from what i can tell it's kinda like a metaball except instead of a simple distance function you have a neural network

  • @DanFrederiksen
    @DanFrederiksen 3 years ago

    Nice. Are the two input angles screen-space x/y coords, and is the x, y, z the camera position during training? How do you extract the depth data from such simple topology, then?

  • @raycaputo9564
    @raycaputo9564 3 years ago

    Amazing!

  • @joshkar24
    @joshkar24 4 years ago

    Can this be used to add an interactive head-movement option to VR movies? It would require a bunch of video cameras and expensive computer crunching beforehand, and how would you store/stream the finished data? Or is that a use case where traditional real-time 3D engines are a better fit? Or some hybrid blend?

  • @martinusmagneson
    @martinusmagneson 4 years ago +13

    Great work! Could this also have an application in photogrammetry?

    • @wandersgion4989
      @wandersgion4989 4 years ago +6

      martinusmagneson From the look of it, this could be used as a substitute for photogrammetry.

    • @juicysoiriee7376
      @juicysoiriee7376 4 years ago +2

      I think this is photogrammetry?

    • @Jianju69
      @Jianju69 1 year ago +1

      @@juicysoiriee7376 It is in the sense that it converts a set of photographs to a 3D scene, yet it does *not* create a 3D model in the conventional (polygonal) sense.

  • @AdictiveGaming
    @AdictiveGaming 4 years ago +6

    Can't believe what I just saw. Is there a chance you will make some feature videos? Like, is it a real 3D model in there? Would it then be possible to import it into 3D software? How are the materials made? How can everything be so perfectly detailed and sharp? And so on and so on.

    • @hydroxoniumionplus
      @hydroxoniumionplus 4 years ago

      Bro, literally just read the paper if you are interested.

  • @letianyu981
    @letianyu981 3 years ago

    Dear Fellow Scholars !
    This is two minutes paper with Dr Károly Zsolnai-Fehér
    ?!?!
    What a time to be alive !

  • @BardCanning
    @BardCanning 4 years ago

    OUT
    STANDING

  • @antonbernad952
    @antonbernad952 1 year ago

    While the hot dogs were spinning at 1:58, I got really hungry and had an unconditional craving for hot dogs. Still nice video, thanks for your upload!!!11OneOneEleven

  • @ImpMarsSnickers
    @ImpMarsSnickers 4 years ago

    This thing will make it possible to create slow motion from a simple video, and SUPER SLOW MOTION from slow motion!
    And also to stabilize a shaky camera.

  • @alfvicente
    @alfvicente 4 years ago +60

    Sell it to Google so they can render 3d buildings properly

    • @nonameplsno8828
      @nonameplsno8828 4 years ago +6

      It's available for free; they might use it on Street View and create a real 3D Google planet thing.

    • @Romeo615Videos
      @Romeo615Videos 3 years ago +2

      @@nonameplsno8828 wish there was a step by step tutorial to try this with

    • @hherpdderp
      @hherpdderp 1 year ago

      @@Romeo615Videos there is now, but you have to compile it yourself.

    • @roshangeoroy
      @roshangeoroy 9 months ago

      Check the paper. It's under Google Research.

  • @erichawkinson
    @erichawkinson 3 years ago +1

    Can this method be applied to stereoscopic equirectangular images for use in VR headsets?

  • @musashidanmcgrath
    @musashidanmcgrath 4 years ago +6

    Incredible work! I'm sure this will be coming to Blender in the future - I spotted the Blender shader balls. :D I'm assuming your team has been using Blender to extract the geometry, especially considering this is all open-source Python.

  • @barleyscomputer
    @barleyscomputer 3 months ago

    amazing

  • @YTFPV
    @YTFPV 1 year ago

    Amazing stuff. I need to wrap my head around how the depth is generated at 3:22 with the Christmas tree. I am working on a movie where we had to generate depth from the plate, and we used every tool in the book, but it always flickers pretty badly and never looks this nice. How would I use this, if that's possible?

  • @azimalif266
    @azimalif266 4 years ago

    This will be Awesome for games.

  • @HonorNecris
    @HonorNecris 2 years ago +2

    So with NeRF, how does the novel view actually get synthesized? I think there is a lot of confusion lately with these showcases as everyone associates them with photogrammetry, where a 3D mesh is created as a result of the photo processing.
    Is each novel view in NeRF created per-pixel based on an algorithm and you are animating the resulting frames of these slight changes in perspective to show 3 dimensionality (the orbital motion you see), or is a mesh created that you are moving a virtual camera around to create these renders?

    •  2 years ago +3

      It's the first: no 3D model is created at any moment.
      You have a function of density w.r.t. X, Y, Z though, so even though everything is implicit, you can recreate the 3D model from it. Think of density as "somethingness" from which we can probably construct voxels. Getting a mesh out of it is highly non-trivial, though.
      This is kind of what they are doing when showing a depth map; they probably integrate distance with density along the viewing ray.
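
    A minimal sketch of that expected-depth idea in Python (this is not the authors' code; density_fn is a hypothetical stand-in for the trained network's sigma output, and the quadrature follows the paper's volume-rendering formulation):

        import numpy as np

        def expected_depth(ray_o, ray_d, density_fn, near=2.0, far=6.0, n_samples=64):
            # Sample points along the ray cast from camera origin ray_o in direction ray_d.
            t = np.linspace(near, far, n_samples)
            pts = ray_o[None, :] + t[:, None] * ray_d[None, :]             # (n_samples, 3)
            sigma = density_fn(pts)                                        # (n_samples,) volume density
            delta = np.diff(t, append=t[-1] + (t[-1] - t[-2]))             # spacing between samples
            alpha = 1.0 - np.exp(-sigma * delta)                           # opacity of each segment
            trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance to each sample
            weights = trans * alpha                                        # per-sample contribution
            return float(np.sum(weights * t))                              # expected ray termination depth

    The same weights, multiplied by per-sample colors instead of distances, give the rendered pixel color.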

  • @davecardwell3607
    @davecardwell3607 4 years ago

    Very cool

  • @jensonprabhu7768
    @jensonprabhu7768 4 years ago +1

    Wow, it's cool. Say, for example, you want to capture an average human just standing: how many pictures are required to capture enough detail?

  • @ArturoJReal
    @ArturoJReal 3 years ago +1

    Consider my mind blown.

  • @thesral96
    @thesral96 4 years ago +4

    Is there a way to try this with my own inputs?

    • @Jptoutant
      @Jptoutant 4 years ago +1

      archive.org/details/github.com-bmild-nerf_-_2020-04-10_18-49-32

  • @josipkova5402
    @josipkova5402 1 year ago

    Hi, this is really interesting. Can you maybe tell me how much one rendering of about 1000 photos costs? Which program is used for that? Thanks :)

  • @ONDANOTA
    @ONDANOTA 3 years ago

    Are radiance fields compatible with 3d editors like Blender?

  • @unithom
    @unithom 4 years ago

    If the new iPads have LIDAR and a sensitive enough gyroscopic sensor - how long before this method can be used to capture objects? (Within 15’ radius of scenes, anyway)

  • @Nickfies
    @Nickfies 2 months ago

    What exactly are theta and phi, respectively? Is one the rotation around the vertical axis and phi the tilt?

  • @damagedtalent
    @damagedtalent 4 years ago

    Incredible! Is there any way I can do this at home?

  • @TheAudioCGMan
    @TheAudioCGMan 4 years ago

    oh my!

  • @ZachHixsonTutorials
    @ZachHixsonTutorials 4 years ago +8

    So is this actual 3D geometry, or is it just the neural network "interpolating," for lack of a better word, between the given images?

    • @GdnationNY
      @GdnationNY 4 years ago

      These depth maps could be the next step to a mesh possibly?

    • @ZachHixsonTutorials
      @ZachHixsonTutorials 4 years ago

      @@GdnationNY The thing I'm curious about is if there are any ways to translate the material properties to a 3D object. The program seems to understand some sort of material properties, but I'm not sure if there is a way to translate that

    • @TomLieber
      @TomLieber 4 years ago +6

      It's the "neural network interpolating." Each point in 3-D space is assigned an opacity and view-specific color by a neural network function. It's rendered by integrating that function over each ray. So it doesn't model light sources, their effects are just baked into everything the light touches. You could get a mesh by marching over the function domain and attempting to disentangle the lighting effects, but it'd take some doing.

    • @ZachHixsonTutorials
      @ZachHixsonTutorials 4 years ago +1

      @@TomLieber That's what I was thinking, but there is that one section of the video where the reflections on the car are moving, but the camera is not. That part kind of made me wonder

    • @TomLieber
      @TomLieber 4 years ago +2

      ​@@ZachHixsonTutorials You can do that by projecting the rays from your actual camera position, but evaluating the neural network with the direction vector from a different camera viewpoint.

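    A rough sketch of that trick (assuming a hypothetical nerf_fn(points, dirs) -> (rgb, sigma) handle to a trained model, not the authors' actual API): rays are marched from the real camera, but the network is queried with a viewing direction borrowed from another viewpoint, so view-dependent effects such as reflections shift while the geometry stays put.

        import numpy as np

        def render_pixel(ray_o, ray_d, view_dir, nerf_fn, near=2.0, far=6.0, n_samples=64):
            # March along the ray cast from the *real* camera pose (ray_o, ray_d) ...
            t = np.linspace(near, far, n_samples)
            pts = ray_o[None, :] + t[:, None] * ray_d[None, :]             # (n_samples, 3) sample points
            dirs = np.broadcast_to(view_dir, pts.shape)                    # ... but a borrowed view direction
            rgb, sigma = nerf_fn(pts, dirs)                                # (n_samples, 3) colors, (n_samples,) densities
            delta = np.diff(t, append=t[-1] + (t[-1] - t[-2]))
            alpha = 1.0 - np.exp(-sigma * delta)
            trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
            weights = trans * alpha
            return (weights[:, None] * rgb).sum(axis=0)                    # composited pixel color
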
  • @Jigglypoof
    @Jigglypoof 4 years ago +19

    So I can finally make my 2D waifus into 3D ones?

    • @loofers
      @loofers 3 years ago +1

      @Hajar Babakhani i don't think you understand. this would be able to take a drawing, if done well enough with proper shading, and extract full 3d elements. just google "3d cartoon art" or "3d drawings" and this would theoretically be able to render or paint an alternate angle of an object based on the existing, hand-drawn view of that object. obviously with only 1 view, hidden features would be just that; hidden; but again here ai could be used to fill in "possible options" for optional details. AI is definitely advancing insanely fast, so fast that if an ai can piece together the pieces to gain understanding of those objects it is seeing, understand interrelationships between those objects, words, etc, we might actually see general AI within 5 years. i personally think that ai just needs a sort of secondary (or tertiary, i suppose) "overviewer ai" which would be watching the process of the ai adjusting its gan settings, and would tweak them to improve itself/match what it observes in the real world, and which could comment to itself on its own code, change its own code. i think we may need to be very very careful in these next few years in terms of restricting datacenter-level ai research. cause all it would take is an ai getting some money and access to a remote mineshaft with robots that can weld and assemble small parts, and you have some real potential issues. it ships some cpus to those mines, and bam, we're in terminator 2 territory :p (plz be kind to me ai, i'm chill with you being god of the universe!) we've already given ai the ability to see, we're starting to give it the ability to draw things it saw, next we just need to get it the ability to dream/imagine/adjust itself and we're in scifi territory. i think a really good example of where this kind of view synthesis will be maybe applied in the future, is in both video compression and upsampling of older or lower quality video, and 3d scene extraction from movies and the like.
      take a look at this: th-cam.com/video/AwmvwTopbas/w-d-xo.html . here you see a sample of some 8k video upsamples that show real qualitative detail improvement; but you can also see the dot/spec/low image quality of the lich king model/character (at the time index in the video and a few seconds after that time index). if an ai would be able to grasp the semantics of the scene, the possibility of object permanence etc, it could infer from the earlier content in the video that the dot there in the image is the character (the lich king, in this world of warcraft cinematic example) - through rules of object permanence, it could estimate that the spec/smudge in the scene is the lich king, simply viewed from another angle/distance, and thus convert the entire scene drawn herein into a fully 3d rendered construction, in higher resolution, or approximating the resolution of the source cad/blender file data.
      it could at least do that after seeing the clip in its entirety, or with human intervention - but i think ai will get to where it can do it completely unaided, and probably quite soon (sub 3 years).
      while this kinda sounds a bit scifi, 2-3 years ago i'd have said that the stuff we are looking at now in this NERF video is potentially doable by ai with extremely intelligent programmers/mathematicians steering the process, and look at where we are.
      Matthew, you guys are literally creating technological magic. amazing work.

    • @Binary_Omlet
      @Binary_Omlet 3 years ago +2

      Yes, but she still won't love you.

  • @piotr780
    @piotr780 1 year ago

    3:00 How is the animation on the right produced?

  • @omegaphoenix9414
    @omegaphoenix9414 4 years ago +2

    Can we get some kind of tutorial, more in-depth as to how to do this on our own? Can this be implemented into a game engine while still keeping reflections? Is this cheaper on performance than computer generated screen space reflections? I actually shivered watching this from how insane it looks. I've been into photogrammetry for quite some time now (I know it's not the same) and I would love to try and replicate this for myself as soon as possible

    • @omegaphoenix9414
      @omegaphoenix9414 4 years ago

      I may have misinterpreted what this does, but if so, can this be used in games?

  • @dewinmoonl
    @dewinmoonl 3 years ago

    cool research

  • @nhandexitflame8747
    @nhandexitflame8747 3 years ago +1

    How can I use this? I couldn't find anything so far. Please help!

  • @hherpdderp
    @hherpdderp 1 year ago

    Am I understanding correctly that what you are doing here is rendering the nodes of a neural network in 3D? If so, I wonder if it could have non-CG uses?

  • @BOLL7708
    @BOLL7708 4 years ago +3

    Now I want to see this in VR 😅 Is it performant enough for real-time synthesis at high resolution and frame rate? Sure puts prior light field techniques to shame 😗

    • @dkone165
      @dkone165 4 years ago +2

      "On an NVIDIA V100, this takes approximately 30 seconds per frame"

    • @BOLL7708
      @BOLL7708 4 years ago +1

      @@dkone165 Ah, so I guess while not real time, it could be used to generate an image set to be used in a real-time environment, although it could turn out to be just impractical. At some point we'll have an ASIC for this; I'd buy an expansion card for it 😅

    • @fidel_soto
      @fidel_soto 4 years ago

      You can create scenes and then use them in VR

    • @dkone165
      @dkone165 4 years ago

      "The optimization for a single scene typically take around 100-300k iterations to converge on a single NVIDIA V100 GPU (about 1-2 days)"

    • @Jianju69
      @Jianju69 1 year ago +1

      @@fidel_soto These "scenes" are not conventional 3D-model hierarchies. Rather, they are 4D-voxel sets from which arbitrary views can be derived within around 30 seconds per frame. Though some work has been done to "bake" these scenes for real time viewing, the performance still falls far short of being suitable for VR. Perhaps a robust means of converting these NeRF-scenes to high-quality 3D models will become available, yet we already have photogrammetry for that task.

  • @ImpMarsSnickers
    @ImpMarsSnickers 4 years ago

    Glass, car windows and interiors, reflections, only 20-50 photos... perfect scan result...
    ... On second thought, I think I get the idea. I realized the camera path walks only between frames shot by the camera, and it's not so much about making it 3D as about how light works and getting a fast render result. It would be great for movies to erase things in the foreground, leaving visible what is behind the things that pass in front of the camera! In that case it's a great project :)

  • @Den-zf4eg
    @Den-zf4eg 4 months ago +1

    What program can be used to do this?

  • @antonbernad952
    @antonbernad952 1 year ago

    Nice video, thanks for your upload!!11OneOneEleven

  • @spider853
    @spider853 1 year ago

    How was it trained?

  • @WhiteDragon103
    @WhiteDragon103 4 years ago +2

    DUDE WHAT

  • @russbg1827
    @russbg1827 3 years ago

    Wow! This means you can get parallax in a VR headset with a 360 video from a real environment. I was sad that wouldn't be possible.

  • @THEMATT222
    @THEMATT222 9 months ago

    Noice 👍

  • @romannavratilid
    @romannavratilid 1 year ago

    Hm... so it's basically something like photogrammetry...?
    This could also help photogrammetry, right...? Like, I capture only, let's say, 30 photos... but the resulting mesh and texture might look like it was made from, I don't know, 100+ photos...? Do I understand this correctly?

  • @ak_fx
    @ak_fx 3 years ago

    Can we export a 3D model?

  • @driscollentertainment9410
    @driscollentertainment9410 4 years ago

    I would love to speak with you about this!

  • @linshuang
    @linshuang 7 months ago

    Pretty fucking cool

  • @maged.william
    @maged.william 4 years ago +2

    @ 3:59 How the hell did it model glass!!

    • @ImpMarsSnickers
      @ImpMarsSnickers 4 years ago +1

      I was thinking the same: also car windows with interiors, reflections, and at 1:58 - where the hell did they scan real-life Blender material spheres?

    • @TomLieber
      @TomLieber 4 years ago +2

      I'd love to know! I wish we could see where in space the reflections are being placed. Ideally, glass would be modeled as transparent at most angles except those with specular reflections, but the paper says that they constrained density to be a function of position only, so does that mean that in the model, glass is opaque and has the view through the glass painted onto it?
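
    That constraint shows up directly in the network layout. A toy PyTorch sketch (the released code is TensorFlow, and the real model is much larger and uses positional encoding of its inputs): the density head sees only features derived from position, while the color head additionally receives the viewing direction, which is how reflections and the view "painted onto" glass can change with the camera angle even though opacity cannot.

        import torch
        import torch.nn as nn

        class TinyNeRF(nn.Module):
            # Toy layout: sigma depends on position only; RGB also sees the view direction.
            def __init__(self, hidden=64):
                super().__init__()
                self.trunk = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                           nn.Linear(hidden, hidden), nn.ReLU())
                self.sigma_head = nn.Linear(hidden, 1)                    # density from position alone
                self.rgb_head = nn.Sequential(nn.Linear(hidden + 3, hidden), nn.ReLU(),
                                              nn.Linear(hidden, 3), nn.Sigmoid())

            def forward(self, xyz, view_dir):
                feat = self.trunk(xyz)                                    # positional features
                sigma = torch.relu(self.sigma_head(feat))                 # opacity cannot vary with direction
                rgb = self.rgb_head(torch.cat([feat, view_dir], dim=-1))  # color can (reflections, glass highlights)
                return rgb, sigma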

  • @shortuts
    @shortuts 4 years ago

    holy s**t.

  • @ArnoldVeeman
    @ArnoldVeeman 3 years ago

    That's photogrammetry... 😐 (Edit) Except, it isn't... It's a thing I dreamt of for years

  • @IRONREBELLION
    @IRONREBELLION 4 years ago

    Hello. This is NOT Dr. Károly Zsolnai-Fehér

  • @unavidamas4864
    @unavidamas4864 4 years ago

    UPDT

  • @DalaiFelinto
    @DalaiFelinto 4 years ago

    I believe the link in the video description is wrong, but I found the page here: www.matthewtancik.com/nerf

  • @Jptoutant
    @Jptoutant 4 years ago

    Been trying for a month to run the example scenes; has anyone gotten through?

  • @vinesthemonkey
    @vinesthemonkey 4 years ago

    It's NeRF or Nothing

  • @alfcnz
    @alfcnz 4 years ago +3

    SPECTACULAR RESULTS and 3D animation!
    If interested, feedback follows.
    1. You start at 0:00 with a completely white screen. No good.
    2. The project title does not get any attention from the audience, given that EVERYTHING moves in the lower half.
    3. At 0:39 you are explaining what a hypernetwork is without using its name.
    4. At 1:58, six objects spin like tops. It's hard to focus on so many moving things at once.

  • @blackbird8837
    @blackbird8837 4 years ago +1

    Doesn't look like anything to me

    • @kobilica999
      @kobilica999 4 years ago

      Because it's so realistic that you just don't get how big a deal it is

    • @blackbird8837
      @blackbird8837 4 years ago +1

      @@kobilica999 I was referencing Westworld

    • @kobilica999
      @kobilica999 4 years ago

      @@blackbird8837 ooo, that is a different story xD

    • @holotwin7917
      @holotwin7917 3 years ago

      Bernard?

  • @adeliasilva409
    @adeliasilva409 3 years ago

    PS5 graphics
