NVIDIA Instant Neural Graphics Primitives (NeRF)

  • Published 10 May 2024
  • This is a demo of the NVIDIA NGP system, which converts video into 3D data (.obj). I talk through the steps needed to install it and some of the errors I ran into while installing NVIDIA Instant Neural Graphics Primitives (NeRF).
    Steps to compile and requirements
    github.com/NVlabs/instant-ngp
    Use your own video
    github.com/NVlabs/instant-ngp...
    Scripts
    - Generate the images and train the model -
    python colmap2nerf.py --video_in "..\data\nerf\icemonster\icemonster.mp4" --video_fps 2 --run_colmap --aabb_scale 16
    - You can run this command to get more info -
    python colmap2nerf.py -h
    - Generate the 3d mesh from the trained scene -
    "./build/testbed" --scene ./data/nerf/icemonster
    Use CMake 3.22.3 (other versions may not work)
    -Environment variables-
    CUDA_VISIBLE_DEVICES = 0
    NVIDIA_VISIBLE_DEVICES = 0
    -Environment variables (path)-
    C:\Program Files\CMake\bin
    C:\COLMAP-3.7-windows-cuda\bin
    C:\COLMAP-3.7-windows-cuda\lib
    Make sure that all of the requirements that need to be accessible from the command prompt are working by running the following commands; you should get a response, not errors (a small check script is included at the end of this description).
    colmap
    ffmpeg -h
    python
    If you found this useful and want to show some appreciation you can buy me a coffee - www.buymeacoffee.com/dirkteucher
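    A quick way to script that check (the tool list comes from the steps above; cmake is added here as an assumption because the build needs it too):
    # Report whether each command-line tool used above is reachable from the prompt.
    import shutil

    for tool in ("colmap", "ffmpeg", "python", "cmake"):
        path = shutil.which(tool)
        print(f"{tool:7s} -> {path or 'NOT FOUND - check your PATH'}")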
  • Film & Animation

Comments • 79

  • @DirkTeucher
    @DirkTeucher  2 years ago

    You can watch a more detailed installation video here - th-cam.com/video/aVZO0r16S5U/w-d-xo.html

    • @DirkTeucher
      @DirkTeucher  1 year ago +1

      @You Tube You can export .obj but the quality is pretty terrible.

  • @jimj2683
    @jimj2683 2 years ago +13

    In the near future this tech will be good enough to crowdsource 3d scans from around the globe to recreate a digital twin of the entire planet. People could just walk around with their phones (or drones and dashcams) and scan bit by bit until it could be used in a global driving simulator or GTA Earth!
    Edit: The tech could also sweep the entire internet and use all the existing photos that have been uploaded to websites. I am sure most places already have hundreds of photos lying around on the internet. This tech will be able to extrapolate and make the 3d world from these existing photos and satellite images.

    • @vblackrender
      @vblackrender 2 years ago +1

      That is the absolute dream of everybody!

    • @brexitgreens
      @brexitgreens 2 years ago +1

      That's just Google Street View, Google Earth, and Pokémon GO taken further. Not original by a long shot.

    • @jimj2683
      @jimj2683 2 years ago

      @@brexitgreens What a strange reply.

    • @vicmaestro
      @vicmaestro 2 years ago +1

      @@brexitgreens And cars are just added horsepower on top of chariots. What's your point? Calling something not original isn't very original either, so maybe stop doing that.

  • @SkeleTonHammer
    @SkeleTonHammer 2 years ago +15

    Point clouds could be usable... Blender can perform volume to mesh operations.

    • @DirkTeucher
      @DirkTeucher  2 years ago +2

      Yeah, I read something recently about how super efficient and fast point clouds are in the viewport now in Blender 3.1. Though I think the most popular application for this would be with polygons or volume. Point clouds look pretty low resolution, and if you convert a point cloud to a mesh in Blender it is a bunch of vertices not joined by an edge. I just tried to convert that to a volume, and Blender's built-in mesh to volume does not appear to be able to do that. Let me know if you find a way :D

    • @uirwi9142
      @uirwi9142 2 years ago

      If you guys are aware of how to achieve this, you should share that as well, if you've got the time to spare.

    • @DirkTeucher
      @DirkTeucher  2 years ago +1

      @@uirwi9142 I have not been able to figure out a way to export point clouds from NVIDIA NGP/NeRF. I was talking about point clouds in general. The only thing you can currently export from NGP/NeRF is .obj 3d meshes. (A rough point-cloud-to-mesh sketch follows this thread.)

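      For point clouds in general, one route outside Blender is Poisson surface reconstruction with the Open3D library. A rough sketch, with purely illustrative file names (as noted above, this is not something instant-ngp currently exports):

      # Rough sketch: triangulate a generic point cloud via Poisson reconstruction.
      # "points.ply" and "mesh.ply" are placeholder file names.
      import open3d as o3d

      pcd = o3d.io.read_point_cloud("points.ply")
      # Poisson reconstruction needs normals; estimate them from local neighbourhoods.
      pcd.estimate_normals(
          search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
      mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)
      o3d.io.write_triangle_mesh("mesh.ply", mesh)
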
  • @Kram1032
    @Kram1032 2 years ago +6

    The thing with attempting to extract a texture from this is that NeRFs are plenoptic - meaning each location in the scene has colors dependent on the direction you look from. 2D textures are actually not enough to keep this information.
    The other thing is that it'd probably also require some UV unwrapping, which may not be so trivial, especially for really complicated, lumpy meshes.

    • @DirkTeucher
      @DirkTeucher  2 years ago

      Yeah, I agree that this is not a trivial problem to solve. However, the authors have figured out how to generate a mesh 3:46 and I would have thought that the vertex colors 4:07 already use projection from the camera (but this is just a guess). And if they don't, it should be trivial to add, since they have figured out how to generate a mesh (even though it is lumpy) and should be able to project the texture onto the mesh from the camera. I bet I could do image projection on the exported obj in Blender using the camera data in the transforms.json example, assuming the values there can translate to camera location and direction (a minimal camera-pose sketch follows this thread).
      I can think of another solution too: depth maps using the new iPhone .heic image format, which includes both the diffuse and the depth channels when you take a photo. I am having some success with this myself using the NVIDIA NeRF built-in mask functionality. Though again, this is still too early to be usable, as it does not look like a blended gradient mask value can be recognized or used by NeRF. It is only 0 or 1 for pixel visibility (not 0.1 - 0.9) when training and applying a greyscale depth map as a mask.
      So the authors should be able to use a depth map to blend textures from multiple image-projection cameras to improve the image quality, and I am sure there must be a way to use this to improve the mesh quality too. The lower or darker the mask value, the lower the importance of the texture, and the inverse of the depth map could be used for the generation of the 3d mesh.
      There must be a way to do this :D

    • @Kram1032
      @Kram1032 2 years ago +1

      @@DirkTeucher Which camera do you want to project from? It's view-dependent, so every single camera choice will give different results. The height information is already covered by the mesh itself.
      The details vary from version to version, but the way NeRFs typically work is, you raymarch a volume and, at each sample point, assign a transparency and a color. You do this for many rays from many directions and eventually refine to a pretty coherent scene.
      The issues come from things like specular vs. diffuse light, subsurface scattering and more general volumetrics, anisotropy, the lighting the scene was taken in, and any sort of discrepancy between views (because time passed between them, so stuff might have changed).
      The most naive NeRFs don't compensate for changing scenes at all. They will just encode this in the directional aspect. Like, there was this idea of Nerfies where people would just take selfies from many directions but keep looking at the camera. The NeRF would encode this eye movement into the directional dependence instead of recognizing "wait, actually the eye moved".
      There are ways to rectify this, allowing for time or various extra degrees of freedom. But even so, you get lots of different data that would at best have to be exported into an entire zoo of textures, and would likely contain details shaders can't quite capture because they merely approximate reality.
      It's possible, but honestly it's not as straightforward as spitting out a single diffuse texture and that's that. And I suspect splitting out all of the relevant textures is less straightforward. Ideally it'd give you an entire shader tree that will do the right thing, but that would have to assume a specific set of shaders and whatnot.

    • @DirkTeucher
      @DirkTeucher  2 years ago

      @@Kram1032 *"which camera do you want to project from?..."* - All of them, or perhaps 6 of them (x, -x, y, -y, z, -z), exactly as explained above. You raycast from the camera to the faces to select those with normals facing the camera, then project the texture of that camera onto those faces and store the uv map for that region for export later to a single uv map. Colmap looks at all the images, infers common markers and then creates a camera position, from where I assume the raymarching magic begins. Now, what happens between colmap and when the 3d mesh is generated is well beyond my understanding, but nonetheless a camera position and a mesh with polys are generated, and a texture can be projected from that camera onto the mesh.
      The UV that would be spit out by techniques like this in photogrammetry is typically horrible, but you can fix this quite easily in Blender or other DCC apps by projecting the texture of the mesh onto a cleaned-up quad version of the mesh with its own clean UV copy. That way you get clean UVs. It does not matter if there are 40 UV islands or 40 UDIM tiles. And I agree that the tricky bit would be blending the textures in a way that worked well and did not create noticeable or blurry UV seams.
      However, as I already mentioned about the vertex colors not being terribly accurate, perhaps this is one of the many challenges the authors had and could not solve at the time. I completely understand it is not as simple as it might seem, and maybe that is just a problem inherent to this method.
      *"The issues come from things like ..."* - Yeah, I can understand how that is difficult to account for, as well as things like reflections. I would not expect Nerfies to work without the subject remaining completely still, in the same way I would not expect selfies to work with photogrammetry, though I bet you could capture movement with cameras placed around the subject all recording from different angles and then process and store the data in an alembic cache or animated volume files.
      And it does still seem to me that a good portion of this problem can be solved using depth maps to get better accuracy of where the lightfield projects color and creates meshes in the 3d space.

    • @ZacDonald
      @ZacDonald 2 years ago +1

      Directional textures might mean it's possible to derive some roughness/specular information from the data, which would be amazing. But I doubt there's any obvious ways to do it.

    • @Kram1032
      @Kram1032 2 years ago +1

      @@ZacDonald It's probably possible with the right model. Regular NeRF is awful at learning reflections. There is a reparametrized version that gives much better reflections. I suspect that version of a NeRF could be used to derive this stuff quite well. Complete with anisotropy and all.
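
      As a starting point for the projection idea discussed in this thread, the camera poses are already sitting in the transforms.json that colmap2nerf.py writes. A minimal sketch, assuming the usual convention of 4x4 camera-to-world matrices with the view direction along the camera's local -Z axis (worth verifying against your own export):

      # Sketch: read per-frame camera positions and view directions from transforms.json.
      # Assumes camera-to-world matrices and a camera that looks down its local -Z axis.
      import json
      import numpy as np

      with open("transforms.json") as f:
          meta = json.load(f)

      for frame in meta["frames"]:
          m = np.array(frame["transform_matrix"])   # 4x4 camera-to-world
          position = m[:3, 3]                       # camera location in world space
          view_dir = -m[:3, 2]                      # viewing direction (assumed -Z)
          print(frame["file_path"], position, view_dir)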

  • @uirwi9142
    @uirwi9142 2 years ago +4

    Over and above this tech being super cool!
    Can I just say that I love the fact you just said, "in case this helps someone out in the future".

    • @DirkTeucher
      @DirkTeucher  2 years ago +1

      Aaah appreciate that thought squirly. Cheers :D

  • @dylandavis4539
    @dylandavis4539 2 years ago +10

    Doing photogrammetry with clay models: even that takes 50-60 photos, a perfect environment, and probably an hour of work before even getting to polishing the model. If they include texturing this could save some major time on background assets. Might look even better on objects that aren't semi-transparent.

    • @uirwi9142
      @uirwi9142 2 years ago +2

      I was about to start typing something pretty similar to what you've just shared.
      It's pretty freakin insane how well this works already.
      Loads and loads of time trimmed.

    • @dinoscheidt
      @dinoscheidt 2 years ago +5

      Don't forget that this isn't 3d geometry. The neural network skips everything and "dreams up" the pixels directly, then walks backward to create some voxels / depth maps. So there isn't really anything you can texture 😅 Oddly, in the future we might have NeRF implementations that dream up such photorealistic impressions that you could use them to generate a million pictures and then apply traditional photogrammetry methods to derive the traditional textures and 3d geometry we are used to. Neural Radiance Fields are quite literally like asking someone "imagine a figure on a table and you walking around it". He will give you the image, but not build a model - the walk back to triangles and textures is pretty hard (quite literally... why do we want triangles anyway? Well, only because of 30 years of tooling we have around them, really).

    • @dylandavis4539
      @dylandavis4539 2 years ago +1

      @@dinoscheidt Ah, understood. I've seen the NeRF demo Nvidia gave - looks pretty dang awesome.

    • @wozniakowski1217
      @wozniakowski1217 2 years ago

      this already seems like a great tool to add to the workflow, especially when it comes to background elements. The current lack of texture export could be somewhat substituted with manual UV projection mapping from a few photos, or even a single one.

    • @emmanueloluga9770
      @emmanueloluga9770 2 years ago

      @@dinoscheidt So what do you think is next? Do light fields and lighting matter here? Thank you.

  • @dereklamberti6804
    @dereklamberti6804 2 years ago +8

    I think there is a misconception here about what NeRFs are. This is unlikely to generate "textures" and "polygons" in the traditional rendering sense. This tech should probably be better thought of as a neural network hallucinating a light field which can be viewed from different angles. It's not likely to be a good path for photogrammetry or 3D model capture... at least in its current form without a bunch of extra new tech.

    • @DirkTeucher
      @DirkTeucher  2 years ago +1

      I'm not sure what is likely and of course have no idea what the developers are thinking. But the authors do already have polygons and camera positions set up, as well as vertex colors, which has got to be a considerable chunk of the problem solved. So it seems logical to me that it should be possible to project textures from the camera if their system is already able to project vertex colors into a volumetric space and blend them together layer by layer, while also creating a negative space where no objects exist. Sure, the results are currently a bit rough, but the software knows where the cameras are located and knows where the 3d mesh is generated, so it should be able to project the texture from the camera. Especially when combined with the new depth map capabilities in the latest cameras, which can generate a diffuse + depth map in each image; this NeRF system includes a mask option where I have been using depth maps to mask out areas for the training to ignore, and that helps a lot. NeRF should be able to create very realistic polygonal results, and it already does with the lego example, which is most impressive. And sure, perhaps I am being overly optimistic about it, but I tend to think anything is possible. Even if these authors are not the ones to do it, this should still get some brains ticking and will without a doubt help someone figure out a way to make it happen eventually :D .... also "all glory to the hypnotoad".

    • @brexitgreens
      @brexitgreens 2 years ago

      @@DirkTeucher As soon as you get a poly mesh from the point cloud by whatever means (whether by manual editing or an algorithm), just project a frame image from the camera position onto your poly mesh. You can also project such images from several camera positions and then combine the resulting textures into one texture in GIMP, using a median filter for super-resolution and noise removal (a small median-combine sketch follows this thread).

    • @a.aspden
      @a.aspden 2 years ago

      @@brexitgreens This sounds interesting. Do you know what this method is called or have any youtube examples/explanations?
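
      The median-filter step mentioned above can be as simple as a per-pixel median across several projected bakes that share one UV layout. A tiny sketch with hypothetical file names:

      # Sketch: per-pixel median of several texture bakes of the same UV layout,
      # as a cheap way to suppress projection noise. File names are hypothetical.
      import numpy as np
      from PIL import Image

      bakes = ["bake_cam0.png", "bake_cam1.png", "bake_cam2.png"]
      stack = np.stack([np.asarray(Image.open(p).convert("RGB"), dtype=np.float32) for p in bakes])
      combined = np.median(stack, axis=0).astype(np.uint8)
      Image.fromarray(combined).save("combined_texture.png")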

  • @MeinVideoStudio
    @MeinVideoStudio 2 years ago

    Wow that was pretty nice

  • @AClarke2007
    @AClarke2007 2 years ago

    You could try making a second fly-by video of the generated scene and run that through some more traditional photogrammetry software.

  • @lucaspedrajas5622
    @lucaspedrajas5622 2 years ago +4

    I think this is not intended to export 3d models as usual ... but it needs to be rendered with their volumetric rendering engine

    • @DirkTeucher
      @DirkTeucher  2 years ago +1

      "NVIDIA Instant NeRF" does look like it is heading towards 3d polygonal models to me, so we will find out which direction it goes in the future. At the moment volume rendering is very slow in most applications, so it's not broadly helpful to the CG community to provide tools to export these out to, say, OpenVDB files. I hope that NVIDIA figures out how to get better polygonal exports with textures from this tech somehow. That would be super useful to everyone. But volumes would be great too, especially if they can figure out a way for them to render faster in other DCC applications.

  • @maxsiebenschlafer5054
    @maxsiebenschlafer5054 2 years ago +1

    What kind of graphics card are you using? I also tried it 2 weeks ago, but sadly I couldn't get it working because of my bad GPU (GTX 970, only 4 GB).

    • @DirkTeucher
      @DirkTeucher  2 years ago +1

      I am using an RTX 3090. Someone else got it to work on a GTX 1060 with 4 GB of VRAM, so it may work if you are able to upgrade to the 10-series, but they said it was very slow. So perhaps the 9-series is too old to run it; that GPU tech is 8 years old, so I would be surprised if it ran well. I still have two GTX 980s myself that I use for rendering animations - they are great cards!
      The fox example uses up 2 GB of VRAM, or 3.3 GB in system total, so 4 GB is the bare minimum to be able to use NVIDIA NGP/NeRF. I would not worry about it though. This is still very experimental. You may want to try Meshroom instead and wait for this tech to mature.

    • @maxsiebenschlafer5054
      @maxsiebenschlafer5054 2 years ago +1

      @@DirkTeucher The problem is that because of the old GPU the neural network has to switch to a more memory-intensive but less demanding structure, so the memory you need is around 5-7 GB for the 🦊

  • @zaferec2354
    @zaferec2354 2 years ago +1

    Meshroom is an open-source photogrammetry program, and it looks similar to this. Meshroom is much more successful with photos; Nvidia's is like the video version of it. It will develop more in the future, but until then I'll keep scanning from photos.

    • @DirkTeucher
      @DirkTeucher  2 years ago +2

      Yes, I agree. Meshroom is a superior and free software that produces better and cleaner 3d mesh results than NVIDIA NGP, and RealityCapture is even more accurate if you want to pay for it. But NVIDIA NGP can create a visually almost identical volume in 60 seconds, which would take Meshroom up to 1 hour to do. I really hope the engineers keep working on this. It could be amazing.

    • @zaferec2354
      @zaferec2354 2 years ago +1

      @@DirkTeucher Yes, I can't wait for that.

  • @Cool-wh6ov
    @Cool-wh6ov 1 year ago

    Did you try using more images to make it more 'useful' as a 3d model? If so, how was it?

    • @Cool-wh6ov
      @Cool-wh6ov 1 year ago

      Like, to mesh with more images.

    • @DirkTeucher
      @DirkTeucher  1 year ago +1

      @@Cool-wh6ov If I use more images it uses up a lot more VRAM, and the improvement is not that significant when it comes to creating a 3d mesh and exporting it as obj. So photogrammetry is still currently best for that purpose. However, there are other tools NVIDIA are developing that might make this a lot easier in the future. If I find anything that works I will be sure to post about it here on YT.

  • @florianclaaen7535
    @florianclaaen7535 2 years ago +1

    I'm constantly getting a Fatal error from colmap, even after adding everything into my PATH and making sure I'm using the correct version - does anyone else have this problem or maybe a fix?

    • @DirkTeucher
      @DirkTeucher  2 years ago

      Did you add the colmap lib and bin folders to the path? And are you using 3.7 ?

  • @Garycarlyle
    @Garycarlyle 2 years ago +1

    Instead of always using 3D artists, you could get a sculptor to make models in a T-pose. That would be great.

  • @Zeak6464
    @Zeak6464 2 years ago

    please post more about this

    • @DirkTeucher
      @DirkTeucher  2 years ago

      Done :D I went into more detail in this video on how to get NeRF running - th-cam.com/video/aVZO0r16S5U/w-d-xo.html - let me know if you were wondering anything else about this.

  • @wolfofdubai
    @wolfofdubai 2 years ago

    Hey, where can I download this and test it?

    • @DirkTeucher
      @DirkTeucher  2 years ago

      Link is in the description 👍

  • @lifemarketing9876
    @lifemarketing9876 2 years ago +1

    You don't even need video; you can use photos.

  • @stuffy.design
    @stuffy.design 1 year ago +2

    Doesn't it export vertex color already?

    • @DirkTeucher
      @DirkTeucher  1 year ago

      Yes, it does. I found that out after making this video. But it exports them in a way that is not compatible with all software. Blender, for example, did not seem to be able to import the vertex colors when I last tried it a couple of weeks ago, but other software does load them correctly (a small parsing sketch follows this thread). However, the NeRF devs are updating fairly quickly/regularly, so maybe they will add better export options. I have my fingers crossed for volumetric point clouds. That would be pretty neat.

    • @dpredie
      @dpredie 1 year ago

      @@DirkTeucher Can you show how to export/import to another 3d software?

    • @DirkTeucher
      @DirkTeucher  1 year ago

      @@dpredie Sure thing ... I show how it is done in this video - th-cam.com/video/knB9t9PEOVY/w-d-xo.html
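
      If an importer drops the per-vertex colours, they may still be sitting in the .obj itself via the common extension where each "v" line carries x y z r g b. A small sketch that just reads them back out (whether a given export uses exactly this layout is an assumption; inspect the file first):

      # Sketch: read positions and optional vertex colours from an .obj that uses
      # the "v x y z r g b" extension. The file name is illustrative.
      positions, colors = [], []
      with open("icemonster.obj") as f:
          for line in f:
              if line.startswith("v "):
                  vals = [float(x) for x in line.split()[1:]]
                  positions.append(vals[:3])
                  if len(vals) >= 6:
                      colors.append(vals[3:6])   # RGB, typically in the 0..1 range
      print(len(positions), "vertices,", len(colors), "with colour")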

  • @pxrposewithnopurpose5801
    @pxrposewithnopurpose5801 2 years ago

    I really want to use this in my workflow.

    • @DirkTeucher
      @DirkTeucher  2 years ago

      What did you have in mind?

  • @timeTegus
    @timeTegus 2 years ago +1

    I have a 1070 and it's ultra slow there.

  • @nokiaaingel
    @nokiaaingel 2 years ago +3

    can you make a step by step tut? for idiots like me?

    • @DirkTeucher
      @DirkTeucher  2 years ago +1

      Hey Nada. Give it a go, it is easier than you think. If you get stuck, drop me a comment and let me know where you are stuck, at what stage of the installation, or what error message you got - happy to help. I plan to do another video this week, so I could include a section about installation in more depth if you or anyone else is having problems :D

    • @DirkTeucher
      @DirkTeucher  2 years ago

      If you are still stuck, I went into more detail in this video on how to get NeRF running - th-cam.com/video/aVZO0r16S5U/w-d-xo.html - hope it helps you out.

  • @SkateTube
    @SkateTube 2 years ago

    Which graphics card did you use? I have a GTX 1650; would it work?

    • @SkateTube
      @SkateTube 2 years ago +1

      Actually testing it right now. The training seems really slow, but the problem is finding the right settings for the best .obj output that would not be more than 500 MB large.

    • @SkateTube
      @SkateTube 2 years ago +1

      Doing images to 3d object, not videos though.

    • @DirkTeucher
      @DirkTeucher  2 years ago +1

      I am using an RTX 3090. And I am not sure, but I would expect your 1650 should work, along with any CUDA-enabled NVIDIA cards made in the last 5 years.
      The fox example uses 2.4 GB of VRAM (50 images), so the 4 GB of VRAM on a 1650 is probably going to be your biggest restriction, I would think, because when you generate a 3d mesh that amount can double at higher resolutions. The icemonster example I built here (96 images) uses 3.5 GB of VRAM to load into memory, and at the highest resolutions uses 24+ GB of VRAM, which can crash the app. A 20-series card with 8 GB of VRAM would probably be my minimum recommendation. Try it and let me know if it works - I would be curious to know 😁
      I will be sticking with photogrammetry for now, as RealityCapture produces much better 3d models that also include their textures. This instant NGP tech, while very impressive, is still just a nice toy to play with, but I will keep an eye on it because it might be the future of photogrammetry.

    • @SkateTube
      @SkateTube 2 years ago +1

      @@DirkTeucher My problem is that the UI is not responsive, I can't stop training that easily, and yeah, memory crashes it a lot. They would just need to take care of that by gating the buttons based on memory usage.

    • @DirkTeucher
      @DirkTeucher  2 years ago +3

      @@SkateTube Oh cool, so it works... I was writing the above comment while getting the VRAM info for ya, so I missed your first few comments :D. Glad it is working.
      The UI is tricky for me too. It often does not stop training unless I tap the button just as the button highlights; I had to edit out some of this video because it took like 8 clicks to get it to stop, so you have to time it just right... Alpha software, huh.
      Last night I found something really cool that I want to explore more, which is the Lego example on Google Drive linked from GitHub. That example uses a 3d rendered image + a depth map for the lego digger, and that 3d mesh is super clean. So if I can figure out a way to take images with depth, like the new iPhones have, then I might be able to generate super clean geometry for small objects (a small depth-to-mask sketch follows this thread). Just a thought. If I figure anything out I will no doubt post something about how to do it.
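
      Since the mask input behaves as 0 or 1 rather than a gradient (as noted earlier in the comments), a depth image has to be thresholded into a plain binary mask first. A rough sketch; the threshold value and file names are made up, and the exact mask naming convention instant-ngp expects is not confirmed here:

      # Sketch: threshold a greyscale depth map into a binary 0/255 mask image.
      import numpy as np
      from PIL import Image

      depth = np.asarray(Image.open("depth_0001.png").convert("L"), dtype=np.float32)
      mask = (depth > 60).astype(np.uint8) * 255   # keep pixels above an illustrative cutoff
      Image.fromarray(mask).save("mask_0001.png")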

  • @milagro2300
    @milagro2300 2 years ago +5

    In a nutshell: it's bad photogrammetry.

    • @DirkTeucher
      @DirkTeucher  2 years ago

      Kind of :D ... I would say it is the fastest way to do photogrammetry, but without the texture maps and accurate 3d meshes you get from traditional photogrammetry.

  • @thehulk0111
    @thehulk0111 2 years ago

    Could you make a video for noobs teaching them how to build it?

    • @DirkTeucher
      @DirkTeucher  2 years ago

      I went into more detail in this video on how to get NeRF running - th-cam.com/video/aVZO0r16S5U/w-d-xo.html - hope it helps you out.

  • @erichylland4809
    @erichylland4809 1 year ago

    Seems like everyone wants to use this like photogrammetry but that's not really what it's for or how it should be used. Corridor Crew did a great video on what it is and what it isn't - th-cam.com/video/YX5AoaWrowY/w-d-xo.html

    • @DirkTeucher
      @DirkTeucher  1 year ago

      Yeah, that's exactly why I made this video - th-cam.com/video/knB9t9PEOVY/w-d-xo.html . Photogrammetry is still the best choice for clean 3d topology and textures; NeRF is something else entirely. However, it might replace photogrammetry one day - in a few years at least, I would think, maybe even 5.

    • @erichylland4809
      @erichylland4809 1 year ago

      @@DirkTeucher I've been thinking that maybe it could be used to cut down the amount of work photogrammetry takes, by using a quick vid to create a NeRF and then using that to generate unique views for photogrammetry. Haven't found the time to play yet though. Built into a game engine this tech would be awesome: have a low-res model to represent an object in a 3d space and use its transform matrix relative to the camera to produce a NeRF overlay. Blazing fast photoreal games on (possibly) low-end hardware.