I never would've thought style transfer possible like that! Nice video!
Very nice video, thank you! Can you also share the notebook code you use in the video?
Jonathan, thank you for the great walkthrough. The video structure is excellent, and the work you're doing to stylize the base results is really cool!!
Very nice video with a very good explanation of the key concepts! The spherical harmonics especially are nicely explained.
i like this guy's personality
This is crazy interesting but no one is talking about it
People just started talking about it. This is the second video on it I've watched today.
All GitHub videos are like that. No one cares until it's integrated into some kind of consumer app. AI stuff was the same way. GAN videos got no plays and no one was talking about AI until DALL-E and Midjourney came out.
We are busy working with it
One month later, my feed is literally filled with it 😁
It is starting to blow up
I love the pace, the comedic timing, the content and animations!
Ha, guess who forgot to add the visual aids during the intro..... hand gestures will have to suffice ;) Paper and code show up from 6 minutes in.
Very cool, very well explained, thank you
Guy with the worst image quality ever explains the technique that produces the best image quality ever. Just a geek joke; thank you for such a nice presentation.
Thanks for the video ❤! As a suggestion, I'd like to see something about PEFT (LoRA, quantization, ...).
Bro, your ChatGPT history looks a lot like mine, just tons of insane nerdy stuff.
How fast is the retraining process? Like, could you retrain a scene during gameplay with a good GPU? Especially if you had some trained Gaussians already in place.
I've been wanting to make a project like this for a while, but I can't figure out a way to do it with mesh objects. What I want to do is have an AI NPC on level 1 start talking to you about something, anything really, and let the player lead the conversation. Let's say you mention clouds; then it starts googling clouds and starts training based on that as a prompt. By the time you get to level 2 it's ready, and level 2 is based on some structure but everything has morphed into clouds.
That would be nuts when you realize you are just manifesting your own gameplay through what you're talking about with the NPC.
I was thinking about having some primitive sphere for most objects, then using image-to-depth from Stable Diffusion to help reshape the mesh, then wrapping a new texture on it. I think it could work on a high-end computer, but this method seems like it would be an almost dreamy, hazy way to do it. If you could buy some time in cutscenes and dialogue interactions it might be possible.
Anyway this whole process for rendering is just nuts. Trying to wrap my brain around it.
this will be the new live AR option.
Wonderful explanation! Thank you!
Would it be possible to post the notebook that you walk through in the video?
Hi, really good video. Could you share the Jupyter notebook you showed in this video? I would be so grateful!
Can you say the final size of the scene, especially compared to the size of the input images and the number of iterations? Would it be viable to embed the scene representation into a video game, for instance, or are they enormous? (Or would we need to use something like Wavelets to represent the scene, and transform back into these harmonic values before rendering?)
Great video and explanation, Jonathan! I really enjoyed your approach with CLIP to be able to modify Gaussians given a text prompt.
Do you have a link to the jupyter notebook you used in the video?
Well, spherical harmonics (thanks for the explanation, btw) are cool and all, but they are not an adequate way of modeling light caustics. Like, if you have something refractive in your scene, like a half-full glass of water, I wonder if splats can represent how objects behind it would warp. And re-lighting a splat scene (adding a light source post-capture) is tricky IMO.
I'm curious though; weren't differentiable renderers with spherical harmonics already a thing since like 2008?
Very interesting, thank you! Can you also share the notebook code you used in this video?
Thanks. Could you go over the CUDA code as well?
It would've been cool if you could visualize which point in the scene you are showing the spherical harmonics for
thanks, very clear!
Do the gaussians model view direction dependent effects? Or are they best suited to only represent lambertian materials?
E.g. can specular highlights, refractions, non-planar reflections, fresnel effects, etc. be modelled using these?
If so, how? As far as my understanding goes, these gaussians are basically single-colored blobs, similar to billboards (as used by particle effects for games).
They model directional effects using something called spherical harmonics, where the color of a gaussian depends on the viewing angle. It isn't perfect, but it lets you get the appearance of reflections and shine.
@@datasciencecastnet I see.
SH is efficient for representing lighting as used by diffuse (e.g. lambertian) surfaces, as you only need 4 parameters per color channel.
However, I've seen their technique produce sharp reflections (the red sports car has a sharp reflection on the hood). For this to be possible using vanilla SH, you'd need an impractical number of parameters per gaussian.
Are they perhaps passing the output color returned by the SH calculation through a clamp or sigmoid, so that sharp edges can form when using very high magnitude SH coefficients?
@@WhiteDragon103 they use 3rd degree SH (so 16 coefficients in total), as far as I know no extra clamping or activation functions.
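To make the SH discussion above concrete, here is a minimal NumPy sketch of how a degree-3 real spherical-harmonic colour could be evaluated for one Gaussian given a viewing direction. The basis constants are the standard real SH values and the coefficient layout (16 per colour channel) follows the reply above; the 0.5 offset and the clamp at the end are only my assumption to keep the output a displayable colour, and may not match the official code exactly.

```python
import numpy as np

# Standard real spherical-harmonic basis constants up to degree 3
SH_C0 = 0.28209479177387814
SH_C1 = 0.4886025119029199
SH_C2 = [1.0925484305920792, -1.0925484305920792, 0.31539156525252005,
         -1.0925484305920792, 0.5462742152960396]
SH_C3 = [-0.5900435899266435, 2.890611442640554, -0.4570457994644658,
         0.3731763325901154, -0.4570457994644658, 1.445305721320277,
         -0.5900435899266435]

def sh_to_rgb(sh, view_dir):
    """Evaluate a degree-3 SH colour for one Gaussian.

    sh:       (16, 3) array, 16 coefficients per RGB channel.
    view_dir: unit vector from the camera towards the Gaussian centre.
    """
    x, y, z = view_dir
    xx, yy, zz = x * x, y * y, z * z
    xy, yz, xz = x * y, y * z, x * z

    # Degree 0: the view-independent base colour
    rgb = SH_C0 * sh[0]
    # Degree 1
    rgb = rgb - SH_C1 * y * sh[1] + SH_C1 * z * sh[2] - SH_C1 * x * sh[3]
    # Degree 2
    rgb = rgb + (SH_C2[0] * xy * sh[4] + SH_C2[1] * yz * sh[5]
                 + SH_C2[2] * (2 * zz - xx - yy) * sh[6]
                 + SH_C2[3] * xz * sh[7] + SH_C2[4] * (xx - yy) * sh[8])
    # Degree 3: adds the sharper view-dependent lobes
    rgb = rgb + (SH_C3[0] * y * (3 * xx - yy) * sh[9]
                 + SH_C3[1] * xy * z * sh[10]
                 + SH_C3[2] * y * (4 * zz - xx - yy) * sh[11]
                 + SH_C3[3] * z * (2 * zz - 3 * xx - 3 * yy) * sh[12]
                 + SH_C3[4] * x * (4 * zz - xx - yy) * sh[13]
                 + SH_C3[5] * z * (xx - yy) * sh[14]
                 + SH_C3[6] * x * (xx - 3 * yy) * sh[15])
    # Offset and clamp: an assumption just to keep a valid displayable colour
    return np.clip(rgb + 0.5, 0.0, 1.0)

# The same Gaussian viewed from two directions gives two different colours
sh = np.random.default_rng(0).normal(scale=0.2, size=(16, 3))
print(sh_to_rgb(sh, np.array([0.0, 0.0, 1.0])))
print(sh_to_rgb(sh, np.array([1.0, 0.0, 0.0])))
```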
Can you please share the git repo for all your code? It would be great to follow along and see the results on my end.
Can splats emit light (instead of just reflecting)? If not, how difficult would that be to implement? I'd like to try modelling aurora, which would correspond to fully transparent splats emitting light.
Can you also model, texture, light, and animate, then render with this tech?
It's a voxel representation; the notion of texture gets difficult to talk about since there are no texels.
For lighting, the spherical harmonics could be of interest since they implement anisotropy, allowing for reflections. The issue is how to do that efficiently; at base you can make the underlying representation exist in linear color space, on top of which you can apply highlights and shading with traditional methods.
For animation, you need to define animations in terms of point clouds, instead of textured models.
@@Dan-gs3kg But you could create fictional, photorealistic worlds in Blender, render those CGI scenes with circles from different perspectives, and use those images to create real-time representations with this tech... or not?
Lighting is certainly interesting 🤔 Each Gaussian splat would need to have information about its roughness, I think.
@@Dan-gs3kg They're not voxels - they're an arbitrary number of objects in a given space at continuous positions. Voxels are defined along more of a discrete grid.
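Following on from that point about continuous positions, here is a rough sketch of what gets stored per splat. The field names are illustrative only (the real implementation keeps these as flat tensors rather than objects), but it shows why the representation behaves like a point cloud rather than a voxel grid.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Splat:
    """Illustrative per-Gaussian parameters; roughly 59 floats per splat."""
    position: np.ndarray   # (3,) continuous world-space centre, not snapped to any grid
    scale: np.ndarray      # (3,) per-axis extent of the ellipsoid
    rotation: np.ndarray   # (4,) quaternion orienting the ellipsoid
    opacity: float         # blending weight used when splats are alpha-composited
    sh_coeffs: np.ndarray  # (16, 3) spherical-harmonic colour coefficients (degree 3)

# A scene is just a big unordered collection of these at arbitrary positions.
scene: list[Splat] = []
```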
Can you share the code you made for this video?
Very Good.
the description is missing the link to the paper website 😥
nice
Can you share the code please?
Where is this? "GS website (with links to paper):"
The quality of your facecam looks like it's from 2002.
So it doesn't use any polygonal meshes, which means you can't use it with other techniques, or can you?
No meshes or triangles here! There is some work being done on how to extract meshes from these gaussian representations but it's tricky to do that well.
High Fidelity? Fidelity?
I wonder if this would work with a Laplacian distribution rather than a Gaussian 🤔
what's your reasoning for why to do that?
@@laurenpinschannels Well, Laplacian distributions are like really narrow Gaussians with long skinny tails, and they are a powerful tool used in source localization. They are also prevalent in how neurons organize themselves to support sparse coding, which can be applied in pruning the neural network.
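A quick numerical comparison (just a sketch, with the two densities matched to unit variance) shows what "narrow peak, long skinny tails" means in practice:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# Unit-variance Gaussian density
gaussian = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# Laplacian density with the same variance (variance = 2*b**2, so b = 1/sqrt(2))
b = 1 / np.sqrt(2)
laplacian = np.exp(-np.abs(x) / b) / (2 * b)

for xi, g, l in zip(x, gaussian, laplacian):
    print(f"x={xi:.0f}  gaussian={g:.6f}  laplacian={l:.6f}")
# Near x=0 the Laplacian is more sharply peaked; far out (x=5) its density
# is roughly 400x the Gaussian's - the long skinny tails.
```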
Is there a way to continue training from a previously unfinished training?
The video looks like it's from before 2010
So Prince Harry is now an IT nerd?? 🤔
😂
🙄