Does training 3D Gaussian Splats Longer Make a Difference?

  • Published on Jun 19, 2024
  • In this video, I compare 3 different scenes trained at 7k iterations and 30k iterations. I also show you how to launch different checkpoints in the viewer and resize the rendering window.
    What do you think? Can you see the differences?
    Check out the show notes to jump between the different scenes:
    00:00 Intro
    00:20 How to view different training checkpoints
    01:57 Screening Plant 7k iterations
    03:05 Screening Plant 30k iterations
    04:07 Ferrari 7k iterations
    04:47 Ferrari 30k iterations
    05:33 How to launch the viewer with a larger scene resolution
    09:01 Forest Scene 7k iterations
    09:37 Forest Scene 30k iterations
    Please follow my channel for advanced tips and more informational videos on computer vision!
    Follow me on LinkedIn: / jonathans. .
    Follow me on Twitter: / jonstephens85
  • Science & Technology

Comments • 126

  • @8bvg300 · 8 months ago · +10

    I notice a common trend in your videos... showing stuff at postage-stamp sizes. Remember to drag the viewer window to a size that people can see and appreciate for themselves if there are significant detail increases.

    • @thenerfguru · 8 months ago

      Yea. I get much better at this in the video I drop later today. I really do appreciate the feedback though.

  • @Moshugaani · 9 months ago · +22

    How do the reflections work with gaussian splatting in your renders? I've seen videos where reflections on surfaces like water are treated as if the surface were just a window, with mirrored geometry beneath it. But here the reflections actually look like they are reflections on a surface!

    • @thenerfguru · 8 months ago · +2

      It's usually more like looking into a mirror. This specific one is still mirror-like geometry.

  • @fpv_everyday · 9 months ago · +1

    '3D Gaussian' is cool. Thanks for the cool intro. I also watched your beginner's guide on it; I still have to try it out. Thanks for the nice video. Keep it up.

  • @PhilipNee · 1 month ago

    kudos for all these great videos!

  • @electrochipvoidsoul1219 · 9 months ago · +14

    When it comes to deep learning, the relationship is definitely more logarithmic (that is, needing an order of magnitude more training to get noticeably better results). That is to say, 7k vs. 70k might be a better comparison.

    • @thenerfguru · 9 months ago · +1

      I think the point of diminishing returns hits much sooner. At 70k, it's going to be incredibly rough to find a lot of improvement. Not sure about this project, but in some, the data can diverge after too much training.

    • @catfree · 9 months ago · +2

      I thought 3D Gaussian Splatting wasn't using any deep learning/AI?

    • @KyleCypher · 4 months ago

      @@catfree AI is being used for the training, but no AI is being used while viewing.

    • @catfree · 4 months ago

      @@KyleCypher Ok thanks for clarifying !

  • @wix001HD · 9 months ago · +7

    I still haven't figured out if it would be possible to use any other methods to prepare the data for training, like using Agisoft Metashape to align the cameras and create a point cloud (.ply) that is used during the training. COLMAP is extremely slow and not so accurate. Any thoughts?

    • @thenerfguru · 9 months ago · +3

      Currently, COLMAP is the only option. I’ll have to explore.

    • @narendramall85 · 9 months ago

      @wix001HD How much time did it take for you? It took more than 2 hours for me on more than 250 images. I used a GPU with 32GB VRAM and 100GB of RAM.

  • @endrevarga5111 · 8 months ago · +5

    Idea!
    1. Make a low-poly 3D scene in Blender. It's a 3D skeleton. Use colors as object IDs.
    2. Using a fast real-time OpenGL engine, quick-render a few hundred images, placing the camera at different locations as if photographing a real scene for 3DGS creation. Distributing the cameras should be easy using Geometry Nodes (see the sketch after this comment).
    3. Using these images, use Runway-ML or ControlNet etc. to re-skin them according to a prompt. If possible, use one image to ensure consistency.
    4. Give the re-skinned images to the 3DGS creation process to create a 3DGS image for the scene.
    Et voilà, a 3D AI-generated virtual reality is converted to 3DGS.
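
A minimal sketch of step 2 only, assuming it runs inside Blender (Scripting tab) on a scene that already has an active camera; the view count, radius, height, and output filenames are arbitrary placeholders, not values from the video.

```python
import math
import bpy

# Orbit the existing scene camera around the origin and render a still from
# each position, producing a 3DGS-style capture set from a synthetic scene.
scene = bpy.context.scene
cam = scene.camera                      # assumes the scene already has a camera
num_views, radius, height = 200, 8.0, 2.0

for i in range(num_views):
    angle = 2 * math.pi * i / num_views
    cam.location = (radius * math.cos(angle), radius * math.sin(angle), height)
    # aim the camera at the origin (cameras look down their local -Z axis)
    direction = -cam.location
    cam.rotation_euler = direction.to_track_quat('-Z', 'Y').to_euler()
    scene.render.filepath = f"//view_{i:04d}.png"
    bpy.ops.render.render(write_still=True)
```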

    • @thenerfguru · 8 months ago

      Cool idea!

  • @kozyboiiii1341 · 1 month ago

    I've been seeing PSNR and SSIM mentioned for 3DGS; how can I check those metrics?

  • @DGFig · 9 months ago · +10

    Hey, Jonathan! I noticed the same thing. 7k iterations is already very clear. Maybe creating a checkpoint at 14k iterations would be perfect, instead of 30k.

    • @thenerfguru · 9 months ago · +3

      Someone on Twitter who has been working on optimizing the project to run faster did a bunch of independent tests as well. He noticed at 15K, the quality improvements are very minimal. I agree, I will set it to run to 14-15k in the future.

  • @panonesia · 6 months ago

    Is there any method to process a large dataset when we only have 8GB of VRAM? I have a 3060 Ti and the maximum is only 60-70 images (3000x2000 pixels). Maybe slow down the training speed or split the process? I always get an error message when processing 30k iterations (7k iterations succeeds sometimes). Looking for advice.

  • @NEOnuts · 9 months ago · +4

    Thank you for the tutorials and info. I would love to know a little more about how you are capturing data; my tests with splatting always get smoke (ghosts) after I train. Keep the videos up. Also, I'm trying to hack a way to export a camera path from Blender.

    • @thenerfguru · 9 months ago · +2

      Please let me know if you figure out the blender hack. I will also be making a video on how to view this with the Nerfstudio viewer and create animations.

    • @thenerfguru · 9 months ago · +1

      I also need to make a video on capturing images. It all comes down to camera movement and consistency in lighting.

    • @narendramall85 · 9 months ago

      @@thenerfguru please release that video on capturing images/video

    • @murcje · 9 months ago

      I believe Luma AI has a thing for Blender camera to Nerfstudio and back. I tried it once but couldn't get it working; 99.99% sure that's just my lack of Blender skills.

  • @mik0lvj · 8 months ago · +1

    Forest Scene looks crispy as hell

    • @thenerfguru · 8 months ago · +1

      Surprisingly captured with just my iPhone 13 Pro. No gimbal or anything with it.

  • @murcje · 9 months ago

    Great video again! I wonder if the quality could increase even more when training with higher resolution pictures, since when training starts it will default to resizing them to below 1.6k. Will try that.

    • @thenerfguru · 9 months ago

      I do believe there are some diminishing returns. I would like to see the test. I will most likely give it a try. I don't think it's worth trying imagery above 4K though. You will be quite VRAM restricted with such high res input imagery. I would rather have more images from new viewpoints than higher resolution imagery.

    • @murcje · 9 months ago

      @@thenerfguru I did a few tests and can confirm that in these cases it's not worth the extra hours of training at 4K instead of 1.6k. I did also notice a big difference in definition when going from 15k to 30k iterations. I have a couple of CodePen sites to compare the results if you are interested. Thanks!
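
Since the trainer downscales large inputs anyway, one way to keep results predictable is to pre-shrink the photos yourself before running COLMAP and training. A rough sketch, assuming Pillow is installed; the folder names and the .jpg extension are placeholders, and 1600 px is just the commonly cited default cap.

```python
from pathlib import Path
from PIL import Image

# Pre-shrink input photos so the longest side is at most 1600 px before
# structure-from-motion and splat training; this saves VRAM and avoids
# surprises from the trainer's own automatic resizing.
src, dst = Path("input_full_res"), Path("input_1600")   # placeholder folders
dst.mkdir(exist_ok=True)

for p in sorted(src.glob("*.jpg")):                     # adjust extension as needed
    im = Image.open(p)
    scale = 1600 / max(im.size)
    if scale < 1.0:
        im = im.resize((round(im.width * scale), round(im.height * scale)), Image.LANCZOS)
    im.save(dst / p.name, quality=95)
```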

  • @pixxelpusher · 9 months ago

    How would you view these in VR? Does the viewer allow that?

  • @user-dg2tr1oh5l · 6 months ago

    Thanks for your guide on using 360 videos. The latest Meshroom works with a small change in the command.
    I used tourist footage for training. There were people walking in the video.
    The loss decreased to 0.03 very quickly, but went back up to 0.2 after many iterations:
    Iter: 140,000 Loss: 0.025812
    Iter: 3,129,000 Loss: 0.203647
    Should I delete some input pictures with too little scenery or adjust some training parameters?

  • @caleb5717 · 9 months ago · +1

    At 9:00 it looks like only one dash was used with iteration. Would that maybe make it default to 30k for that scene? Either way the detail this method creates is amazing.

    • @thenerfguru · 9 months ago

      I think you are right. On my social media accounts I have posted a few high quality render stills for comparison. It’s a great way to compare.

  • @tamiopaulalezon9573 · 1 month ago

    Which 3D Gaussian Splatting implementation did you use?

  • @jag24x · 9 months ago

    Can you please do a video on how to use the video recording in SIBR? Perhaps also on a way to create a camera path in SIBR and then render the camera path. Thanks, keep the videos coming! :)

    • @thenerfguru · 9 months ago

      The SIBR viewer is not great for animation flythroughs. Try using Nerfstudio. here is how: th-cam.com/video/A1Gbycj0bWw/w-d-xo.htmlsi=Oo5BM5KKIDHJAsbn

  • @HiHeat · 9 months ago · +3

    Hi! Can we somehow export this 3D data to professional software? Export the point cloud to 3D software such as 3ds Max, Cinema 4D and the like?
    (If possible, it would be great if you record a video tutorial about exporting)

    • @thenerfguru · 9 months ago · +2

      Not currently. Also, if you just want a dense point cloud you can get that with photogrammetry tools. This is creating splats, which are like spheres that stretch and morph to fill the scene.

    • @Instant_Nerf · 9 months ago · +1

      @@thenerfguru So we need a new type of file format to view, edit, etc. like we do with FBX, OBJ... and software to support it… that can also be combined with other model formats.

    • @coffeeeagle · 9 months ago

      Pretty sure you can do marching cubes to get a mesh, correct? I know you could with other NeRF tools.

    • @matemarschalko4768 · 9 months ago · +1

      @@Instant_Nerf Maybe 10 years from now, games and game engines won't be rendered the same way with polygons and textures ... it will be nerfs, light fields, gaussian splats or something else

  • @adriandmontero5780 · 9 months ago

    Hi buddy, do you know if it is possible to export the project or the model to Unreal Engine? Thanks for your tutorials and for sharing your knowledge.

    • @thenerfguru · 9 months ago

      I've seen proofs of concept, not official public projects though. My next video will be this in Unity.

  • @visualstoryteller6158 · 9 months ago · +1

    Do transparent surfaces make a difference? The car reflection is good, but I don't know about basic normal glass or subsurface-type materials.

    • @thenerfguru · 9 months ago · +1

      I haven’t tested many transparent surface scenes. I can try!

  • @Ranstone · 9 months ago

    I understand radiance field rendering, and I have no clue how it calculates the reflections...

  • @oskarwallin8715 · 13 hours ago

    Would've been awesome to see an updated video on setting up a conda environment from a fresh Windows install (all dependencies). I've been running through your tutorial on setting it up, but I keep hitting a lot of snags (such as CUDA/PyTorch issues; environment vars not working great; COLMAP.exe needing to be moved from bin to lib; ffmpeg.exe needing an absolute reference to the exe to be picked up; CUDA versions in the yml files not existing in the right channels; submodule pip dependencies not working, etc.). Following your tutorial doesn't work straight out of the box anymore.

  • @WhiteDragon103 · 9 months ago · +1

    How exactly are the gaussian splats (which are just stretched-out blobs of solid color with blurry edges, similar to 3D brush strokes) able to model viewpoint-dependent effects like reflections on curved surfaces?

    • @thenerfguru · 9 months ago · +3

      Spherical harmonics. The splats are not uniformly colored.
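
For intuition, here is a rough sketch (not the reference implementation) of how a splat's color can vary with viewing direction via spherical harmonics. The reference 3DGS code stores SH up to degree 3 (16 coefficients per color channel); only degree 1 is shown here, the coefficient values are made up, and sign conventions may differ from the official code.

```python
import numpy as np

SH_C0 = 0.28209479177387814           # degree-0 (constant) real SH basis
SH_C1 = 0.4886025119029199            # degree-1 real SH basis scale

def sh_color(coeffs, view_dir):
    """coeffs: (4, 3) array, one row per SH basis function, columns = RGB.
    view_dir: vector from the splat center toward the camera."""
    x, y, z = view_dir / np.linalg.norm(view_dir)
    basis = np.array([SH_C0, SH_C1 * y, SH_C1 * z, SH_C1 * x])
    rgb = basis @ coeffs                  # weighted sum of the basis functions
    return np.clip(rgb + 0.5, 0.0, 1.0)   # 3DGS-style 0.5 offset into [0, 1]

# the same splat seen from two directions gives two different colors
coeffs = np.array([[1.2, 0.4, 0.2],       # DC term: the splat's base color
                   [0.3, 0.0, 0.0],       # view-dependent terms
                   [0.0, 0.2, 0.0],
                   [0.0, 0.0, 0.4]])
print(sh_color(coeffs, np.array([0.0, 0.0, 1.0])))
print(sh_color(coeffs, np.array([1.0, 0.0, 0.0])))
```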

    • @WhiteDragon103 · 9 months ago

      @@thenerfguru how many harmonics are used? I know SH can be used to efficiently represent radiance appropriate for diffuse lighting, but for reflections with sharp edges you'd need a ton of parameters per gaussian.

  • @basspig · 9 months ago · +7

    Can these scenes be exported as 3D geometry with textures to blender?

    • @culpritdesign · 9 months ago · +6

      Not yet, but that is mentioned in the white paper as an interesting possible feature

    • @mattizzle81 · 8 months ago

      But the rendering this way is so much nicer, why would you want to? This is primarily a rendering technique not a scanning technique.

    • @basspig · 8 months ago

      @@mattizzle81 I thought it was a 3D capture technology.

    • @mattizzle81 · 8 months ago

      @@basspig Not really, not for geometry. COLMAP already does that and is part of the pipeline as the first step, but COLMAP has been around forever.

  • @vmafarah9473 · 9 months ago

    Why are you assuming we are on a 100-inch 4K TV? Why can't you stretch the viewport window?

  • @jorbedo · 4 months ago

    Is it possible to run it on Linux, or in the cloud by transferring photos/video to an H100 80GB GPU?

    • @thenerfguru · 4 months ago

      Today the best way to train gaussian splats is with Nerfstudio. They include instructions for Linux setup: docs.nerf.studio/quickstart/installation.html

  • @stefanveselinovic4777 · 9 months ago · +1

    Can you explain what the iterations are doing?
    Could these splats be splatted onto a 3D mesh constructed from these points for faster rendering?

    • @thenerfguru · 9 months ago

      Iterations are basically incremental steps to improve the output data. The algorithm is further refining the model to attempt to match the source input images. There are diminishing returns though. After a number of iterations, improvement is practically undetectable.
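
To make the "incremental refinement with diminishing returns" idea concrete, here is a toy sketch, not the actual 3DGS code: it fits a handful of 1D Gaussians to a target curve by gradient descent and prints the loss at a few checkpoints, where the later improvements are much smaller than the early ones. All numbers below are arbitrary.

```python
import torch

torch.manual_seed(0)
x = torch.linspace(0, 1, 256)
target = torch.sin(6.28 * x) * torch.exp(-3 * x)        # stand-in "ground truth"

n = 20                                                   # number of toy splats
mu = torch.rand(n, requires_grad=True)                   # centers
log_sigma = torch.full((n,), -2.0, requires_grad=True)   # widths (log space)
amp = torch.zeros(n, requires_grad=True)                 # amplitudes
opt = torch.optim.Adam([mu, log_sigma, amp], lr=1e-2)

def render():
    # each Gaussian contributes amp * exp(-(x - mu)^2 / (2 * sigma^2))
    sigma = log_sigma.exp()
    basis = torch.exp(-0.5 * ((x[None, :] - mu[:, None]) / sigma[:, None]) ** 2)
    return (amp[:, None] * basis).sum(dim=0)

for step in range(1, 30_001):
    loss = torch.nn.functional.l1_loss(render(), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step in (1_000, 7_000, 15_000, 30_000):           # compare checkpoints
        print(f"iter {step:>6}: loss {loss.item():.5f}")
```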

    • @Iostal · 8 months ago

      @@thenerfguru Is this independent of the scene size? My first dataset is of a large area with ~3000 high-res photos taken from a video of a laneway. I'm halving the resolution, and after 100k steps the result isn't recognizable. Do you think the number of iterations required scales with the size of the dataset?

  • @jamesriley5057 · 9 months ago

    I'm new here, so I'm having trouble with context. You're taking drone footage and creating a textured mesh using your own code? Can we import the mesh into blender yet?

    • @trickster721 · 7 months ago · +1

      It's not a textured mesh, it's a new method for rendering photogrammetry point clouds directly, like one massive 3D texture.

    • @jamesriley5057 · 7 months ago

      @@trickster721 OK, I'm interested. A 3D texture is the equivalent of a UV-unwrapped 3D model. When this tech hits the 3D modeling world, where I live, it's going to be huge.

    • @trickster721 · 7 months ago · +1

      @@jamesriley5057 Not a UV mapped texture, an actual 3D image, like a JPEG with a 3rd dimension. Similar to voxels.

  • @whata7570 · 9 months ago · +2

    I would say it would depend on your project, but I think 7k is good for me. On a side note, can this software use LiDAR e57 scan images to create the Gaussian Splats models?

    • @thenerfguru · 9 months ago

      e57 scan images? You would need source images and pose information. I guess you could use an e57 from a scanner in place of a sparse dataset. But not 100% sure

    • @oskarwallin8715 · 13 hours ago

      @@thenerfguru I'm trying to look into this now as well.

  • @mihalydozsa2254 · 9 months ago

    Interesting that in the Ferrari scene, when you go higher than the plane of the images, it does not know what to show. I don't remember something like that with NeRFs; at least when I tried, it could reconstruct the view from above from the side views it knew. I guess it's because it does not know what the reflection would be. Or maybe I just happened not to try something like that.

    • @thenerfguru · 9 months ago · +1

      Ideally, you would have more angles. By chance, I just got 3 image sets of a race car today. We’ll see how it looks!

  • @MrGTAmodsgerman · 9 months ago · +1

    Does it still need an RTX-series graphics card to run, like NeRFs? And what is the minimum VRAM?

    • @thenerfguru · 9 months ago · +2

      I don’t think so. You can run it on an A6000. What do you have? Speed of training usually is dependent on the number of CUDA cores

    • @MrGTAmodsgerman · 9 months ago

      @@thenerfguru I have a GTX 1080Ti 11GB 😂

  • @wasdfg662 · 8 months ago

    Is it possible to export Gaussian splats as geometry, like OBJ or FBX files?

    • @thenerfguru · 8 months ago

      Not yet. Soon I bet. It’s technically possible.

  • @AlphaSeagull · 9 months ago

    "This is something you'd commonly see on command prompts"
    Me eating pretzels, never/rarely using command prompts for anything: "Right, of course"

  • @Because_Reasons · 8 months ago

    I can't seem to rotate in FPS view, ever. My mouse does not respond, and in trackball mode rotations are odd.

    • @brianbell3028 · 8 months ago

      I/J/K/L/U/O do rotations. It's really weird. (Also, in case you didn't know, Q and E also move the camera, in addition to WASD.)

  • @realthing2158 · 9 months ago · +1

    Is there a way to convert the results to high res geometry with textures? I suppose one way would be to take screenshots from different angles and use those with photogrammetry to create a 3D model. Somebody could automate that process and make use of new AI techniques to improve the photogrammetry output. Could produce good results.
    Of course the best thing would be to render the nerfs in realtime directly, but I think we are some time off before that becomes mainstream, especially for animated objects. I need to use geometry for now in the project I'm working on.

    • @thenerfguru · 9 months ago

      Currently, there is not a great way to produce high quality textured meshes from 3D Gaussian Splats. The goal of this project is novel view synthesis. There has been follow on work to produce meshes, however, they are low poly and you would need to manually texture everything. Photogrammetry is still the SOTA method for textured meshes. Have you tried Luma AI’s mesh export?

    • @realthing2158 · 9 months ago · +1

      Thanks for the reply. Yes I have tried Luma AI's video to 3D feature. I got fairly good looking results for the nerf but the mesh was a bit too blotchy and lacking in details to be directly usable. Development is happening so fast now though, in a year we might be able to create anything in 3D using only a single image from Midjourney. :) And then if it all can be rendered in extreme detail using nerfs and Gaussian splatting it would be mindboggling. @@thenerfguru

    • @outlander234 · 8 months ago

      @@realthing2158 I wouldn't count on it. With developments like this, the rate of improvement is high in the beginning, but getting to that last 10-15% is a massive task, and 100% is probably impossible. Self-driving cars are the best example: sure, they have been capable for years now, but actually getting them on par with humans and covering all the edge cases is still a daunting task, and nobody has achieved it yet for a reason, despite years of claims that they would.

  • @BenEncounters · 9 months ago

    Can this be hosted in WebGL?

  • @cannotwest · 9 months ago

    Does 3D Gaussian splatting support object movement/deformations?

    • @trickster721 · 7 months ago · +1

      It's basically just a method for creating a 3D photograph from many photographs, so you could conceivably make a 3D video instead using many crowdsourced video angles of a sporting event, for example. It would take an entire crypto farm of GPUs to run, but it's possible.

  • @gridvid · 9 months ago · +1

    Is it possible to model, texture and light then render using this tech?

    • @thenerfguru · 9 months ago

      No, not with this project. What you are looking at is gaussian splats, not your typical triangle-based geometry. The splats come with their own baked-in textures. Perhaps with some future development this would be possible, especially lighting.

    • @gridvid · 9 months ago

      @@thenerfguru But you could build photorealistic worlds using Blender and use those CGI-rendered images to generate real-time scenes with this tech... or not?

    • @thenerfguru · 9 months ago · +1

      @@gridvid I think it is headed that way. Both NeRF and 3D Gaussian Splatting show promises for this.

  • @jordivallverdu2568 · 9 months ago

    Can we extract a mesh or point cloud out of this?

    • @thenerfguru · 9 months ago

      Possible. Not with this project though. I suggest checking out this project: leonidk.com/fmb-plus/

  • @jonnygrown22 · 9 months ago · +2

    Can you share the videos you used to train the data?

    • @thenerfguru · 9 months ago · +4

      I can share the forest scene if you are interested! I’ll get around to it tonight and host it on my GitHub fork.

    • @jonnygrown22 · 9 months ago

      @@thenerfguru thank you so much!

    • @jonnygrown22 · 9 months ago

      Can you leave a comment replying to this once you've done that?

  • @MistereXMachina · 9 months ago

    I'm new to this, and have a 1080ti...is it too weak to do this?

    • @thenerfguru · 9 months ago

      Yes, I don't think the viewer will work.

  • @jorisbonson386 · 9 months ago

    Glad I'm not the only one whose desktop is a clusterfuck of icons

  • @mrksdsgn · 9 months ago

    How big is the output file for each?

    • @thenerfguru · 9 months ago

      Easily 1 GB or more. Sometimes a bit smaller.

  • @sgproductions6336 · 9 months ago · +1

    Can you open that in Unreal Engine?

    • @thenerfguru · 9 months ago · +1

      I’ve seen someone share a proof of concept but no code.

  • @qshiqshi2958 · 8 months ago · +1

    Wouldn't one be able to output this in VR?

    • @thenerfguru · 8 months ago

      Yes! I have a tutorial for how to get this in Unity. From there, you can integrate with VR.

  • @ALexalex-ss4sb · 9 months ago · +1

    I'm sorry, but I don't understand what happens in this video. Are you AI-generating 3D scenes that are this good? I thought AI wasn't that advanced yet.

    • @thenerfguru · 9 months ago

      These are not AI generated. These are generated from a set of input photos. Then, a scene is recreated volumetrically.

  • @Thomason1005 · 9 months ago · +1

    Hmm, I wonder how low you can go without significant quality loss... 3000? 1000?

    • @thenerfguru · 9 months ago

      Do you mean how quickly we approach diminishing returns?

  • @RogueBeatsARG · 9 months ago

    Making maps will be so easy if this works with less RAM.

  • @CharlesVanNoland · 9 months ago

    Should be fullscreening the 3D render if you're going to put it on TH-cam. I'm literally looking at a few inches on my screen to discern visual detail differences - and I'm watching this on a PC monitor. Imagine how little your rendering window looks for someone on a phone/tablet. Fullscreen capture or don't bother because nobody else will be able to see anything worth seeing otherwise. We're just taking your word for it!

  • @luke2642 · 9 months ago

    Great video but you have to go full screen when showing the comparison.

    • @thenerfguru · 9 months ago

      Yea, oops!

  • @manda3dprojects966 · 8 months ago · +2

    I cannot believe that the AI can detect a smooth reflective surface. In the past, when AI didn't exist, such a thing was impossible because the only option was point tracking without AI. Now that AI is here, everything changes, even reflective surfaces...

    • @thenerfguru · 8 months ago

      So true! Go down the implicit neural representation rabbit hole. It’s a bright future for 3D reconstruction on scenes with featureless surfaces.

  • @meateaw · 9 months ago

    watching you grab the top of the window and resize the movement handle off the top *every time* was kind of frustrating :)

    • @thenerfguru · 9 months ago

      Yea. Now I just launch everything in fullscreen.

  • @Instant_Nerf · 9 months ago

    Why do people not look great at all, but stationary objects look amazing?

    • @thenerfguru · 9 months ago

      People don't stand still nearly as well as you think.

    • @Instant_Nerf · 9 months ago · +1

      @@thenerfguru ohh believe you me .. I tried it on me. It didn’t work well

    • @Danuxsy · 9 months ago

      wait what? but I just took a bunch of photos of myself naked so I could have sex with my digital younger self in the future and you're telling me that ain't going to work? what a bummer!! 😡

  • @morglod · 9 months ago

    Bruh, why do you shrink the viewport to 10x10 pixels?
    It's impossible to see the difference on TH-cam.

  • @samhodge847 · 9 months ago · +1

    You know, the PSNR is what you should look at rather than an arbitrary glance.

    • @thenerfguru · 9 months ago · +1

      In my opinion, yes and no. PSNR is a quantitative approach that in itself is not perfect. Plus, this is to help people decide if the difference is enough for them for extra training. If I told almost anyone PSNR is 25 for 7k iters and 29 for 30k but provided no visuals, they couldn’t tell you visually what the difference would be.
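
If you want to check PSNR/SSIM on your own scenes, here is a rough sketch of the usual comparison, assuming scikit-image is installed (0.19+ for channel_axis); the file names are placeholders for a rendered checkpoint frame and the matching ground-truth photo at the same resolution and camera pose.

```python
import numpy as np
from skimage.io import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder file names: a held-out input photo and the splat render from
# the same camera pose at a given checkpoint (e.g. 7k vs. 30k iterations).
gt = imread("ground_truth.png").astype(np.float64) / 255.0
render = imread("render_30k.png").astype(np.float64) / 255.0

psnr = peak_signal_noise_ratio(gt, render, data_range=1.0)
ssim = structural_similarity(gt, render, channel_axis=-1, data_range=1.0)
print(f"PSNR: {psnr:.2f} dB  SSIM: {ssim:.4f}")
```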

  • @RektemRectums · 9 months ago

    It's difficult to listen to these videos when the guy uptalks like his self-esteem is so low his crippling depression kicks in as soon as he's off camera.