Artificial Intelligence For the Stereographer

  • Published Feb 2, 2025

Comments • 46

  • @dragonflyK110
    @dragonflyK110 1 year ago +1

    "Originally presented at the NSA Virtual 3D-Con 2021, 8/15/21 "
    I'm not gonna lie, for a minute or so I was very confused about why the National Security Agency would have a 3D Conference :)
    Anyway thank you for this video, it was quite educational for somebody getting back into 3D tech after not paying much attention to it over the last decade or so. And despite the video's age it seems to have held up quite well. Though if you know of any relevant AI models that have been released since this video I'd love to know, as I'm currently trying to learn as much as I can about this topic.
    Thank you again for the time you put into this video.

    • @WorldofDepth
      @WorldofDepth 11 months ago

      Thanks for the appreciation! In my tests, the best AI depth estimator is still MiDaS, but the new version 3.1, released after this video. There is a very new one called Marigold (huggingface.co/spaces/toshas/marigold), but in my first tests, it's not as good as MiDaS v3.1.

    • @dragonflyK110
      @dragonflyK110 11 months ago

      @@WorldofDepth Thank you for the response, I have done quite a bit of research since that comment so I have actually heard about Marigold.
      Have you tried out Depth Anything? It's even newer than Marigold and is in my testing much better than MiDaS. It has a HF space if you want to try it out.
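
A side note for readers trying these estimators: MiDaS-family models output relative (inverse) depth as a float array, so before viewing the result or feeding it to stereo tools it is usually normalized to an 8-bit grayscale map. A minimal NumPy sketch; the small array below stands in for a real model output:

```python
import numpy as np

def normalize_depth_to_8bit(depth):
    """Rescale a relative depth prediction to a 0-255 grayscale map."""
    d = depth.astype(np.float64)   # copy, so the original is untouched
    d -= d.min()                   # shift minimum to 0
    rng = d.max()
    if rng > 0:
        d /= rng                   # scale maximum to 1
    return np.round(d * 255).astype(np.uint8)

# Stand-in for a model's raw output; a real MiDaS result is one float per pixel
raw = np.array([[0.2, 1.7], [3.4, 0.2]])
gray = normalize_depth_to_8bit(raw)
```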

  • @WorldofDepth
    @WorldofDepth 3 years ago +1

    Note that as of 8/17/21, the 3D-Photo-Inpainting Colab Notebook is NOT working due to missing files. I’ve opened a new issue report with the researchers on GitHub and will update this comment with developments.

    • @WorldofDepth
      @WorldofDepth 3 years ago +1

      8/18/21: Files restored and working again :) Reference: github.com/vt-vl-lab/3d-photo-inpainting/issues/131

  • @clamojoat8543
    @clamojoat8543 4 months ago

    A stereo depth map is actually called a disparity map; it's not actually a depth map, due to its color. A depth map is grayscale, white to black. A disparity map is actually used to repair a stereo pair, like if it's not correctly aligned or maybe something else was wrong with it. You use disparity to fix the stereo.

  • @666-d5y
    @666-d5y 3 years ago +2

    When are you uploading about the AIs you talked about at the end? Instantly subbed

    • @WorldofDepth
      @WorldofDepth 3 years ago

      Thank you! This workshop was for the NSA 3D convention, so I may possibly revisit this topic with those other AIs and newer ones for next year's Con. In the meantime I recommend checking out Ugo Capeto's YT channel for reviews of additional AIs.

  • @BrawlStars-jd7jh
    @BrawlStars-jd7jh 2 years ago

    really cool stuff, thanks for sharing!

    • @BrawlStars-jd7jh
      @BrawlStars-jd7jh 2 years ago

      I have a problem: when I run the last process in the 3D Photo Inpainting notebook, it says
      "TypeError: load() missing 1 required positional argument: 'Loader'"
      I already uploaded the depth and the base image.
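
For later readers hitting the same error: this is a common symptom of newer PyYAML releases, where `yaml.load()` requires an explicit `Loader` argument (it became mandatory in PyYAML 6.0). Whether that is exactly what breaks inside this notebook is an assumption on my part, but the generic fix looks like this; the config text below is illustrative, not the notebook's actual file:

```python
import yaml

# Illustrative YAML config text (not the real notebook config)
config_text = "depth_edge_model_ckpt: checkpoints/edge-model.pth\nfp16: true\n"

# Old style that raises the TypeError on PyYAML >= 6:
#   cfg = yaml.load(config_text)
# Fixed: pass a Loader explicitly, or use yaml.safe_load(config_text)
cfg = yaml.load(config_text, Loader=yaml.SafeLoader)
```

Alternatively, pinning the older library in the notebook's install cell (`pip install pyyaml==5.4.1`) sidesteps the change without editing any code.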

  • @metamind095
    @metamind095 2 years ago +2

    Can I use MiDaS to convert 2D video to 3D somehow? What, in your opinion, is the best tool for 2D-to-3D video conversion? Thx for the video.

    • @WorldofDepth
      @WorldofDepth 2 years ago +1

      It's possible to do that with MiDaS frame by frame, but the resulting video will flicker, so I think it's best to use tools made specifically for video instead. “Consistent Video Depth Estimation” is near the bottom of my collected links (see video description), plus another, but I haven't used them myself. There are surely other similar AIs out there now as well.
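
To make the frame-by-frame idea concrete, here is a rough sketch with a stub in place of the real MiDaS call, plus a simple exponential moving average between consecutive depth maps to damp the flicker. The smoothing step and all function names are my own illustration, not part of any tool mentioned here:

```python
def estimate_depth(frame):
    """Stub: a real pipeline would run MiDaS (or similar) on the frame here."""
    return [[float(px) for px in row] for row in frame]

def depth_for_video(frames, alpha=0.8):
    """Run depth per frame, blending each map with the previous one (EMA)."""
    smoothed, prev = [], None
    for frame in frames:
        d = estimate_depth(frame)
        if prev is not None:
            # Carry most of the previous map forward to suppress flicker
            d = [[alpha * p + (1 - alpha) * c for p, c in zip(pr, cr)]
                 for pr, cr in zip(prev, d)]
        smoothed.append(d)
        prev = d
    return smoothed

# Two tiny 1x2 "frames" whose raw depths would jump between 0 and 10
maps = depth_for_video([[[0, 0]], [[10, 10]]], alpha=0.8)
```

Purpose-built video tools like Consistent Video Depth Estimation enforce temporal consistency far more robustly than a blend like this.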

    • @Rocketos
      @Rocketos 2 years ago

      @@WorldofDepth Can you upload a tutorial for using Consistent Video Depth Estimation? Please

    • @WorldofDepth
      @WorldofDepth 2 years ago +1

      @@Rocketos I don't have one to upload. Perhaps if I have time in the future.

    • @Rocketos
      @Rocketos 2 years ago

      @@WorldofDepth Thanks, I love your workshops

  • @jimpvr3d289
    @jimpvr3d289 3 years ago +1

    VERY interesting! But if the depth AI produces only depth maps… what produces the missing pixel information (like the clouds behind Mr. Rogers' head)? Is StereoPhoto Maker doing that, and why doesn't the AI do that also?
    Thank you

    • @WorldofDepth
      @WorldofDepth 3 years ago +3

      So the painting in of those kinds of background spaces is exactly what the 3D-Photo-Inpainting AI does: the zoom-in animation at 22:42 is an example. And it does that based on everything it has learned from massive amounts of training. The SPM animation at 23:25 does do some amount of inpainting as well, but I think it's based on straight calculations and copying existing pixels, rather than on AI, and I think it's not as smooth.

  • @CabrioDriving
    @CabrioDriving 3 years ago +1

    Do you know how to convert a 2D photo to 3D in a way that makes you feel like you're standing, say, 1 or 2 feet from a huge 3D world window, with proper, deep depth in the scene, without everything looking flat and too wide? I was thinking about some mathematical function to convert the image to a different "lens angle", but I'm not sure if this is the right direction of thinking. Also, there is a formula for 2D-to-3D conversion with a camera focus point variable, a near plane cut variable and a far plane cut variable. I wonder which parameter of this (or another formula?) you need to manipulate to get a 3D photo with ideal depth, with the scene as close to you as possible. Imagine the 3D photo being a huge window to your garden, reaching from floor level to the ceiling, with you standing right next to it. Typically I notice that videos/photos are best viewed at a 10-foot virtual (perceived in VR) distance, but that kills the feeling of presence in that world - it is just like seeing some window with 3D, too far away.

    • @WorldofDepth
      @WorldofDepth 3 years ago +1

      What you describe sounds more like VR 180º 3D images to me. To have a feeling of standing close to a 3D scene, I think that's the only option, unless you render in full 3D. I don't work in 180º or 360º, but check out the 3D-Con workshops and Special Interest Groups about VR, from both this year and last year.

    • @CabrioDriving
      @CabrioDriving 3 years ago +1

      @@WorldofDepth I wasn't thinking about the super wide angle of VR180, but let's say 100-110º. Just with deep scene depth. I will study whether it is possible to do what I described. Cheers

  • @MichaelBrownArtist
    @MichaelBrownArtist 3 years ago

    12/16/21 MiDaS v.3 - Failed at the second step (load a model): ModuleNotFoundError: No module named 'timm'

    • @WorldofDepth
      @WorldofDepth 3 years ago

      Hmm, I just tried the ‘upload version’ notebook and it worked fine. Did you miss the first code box, above “Uploading Your Image”? That is the step that installs timm. If you did run that, it may be that the bandwidth limit for a certain external file was reached, and it was temporarily unavailable, but it should notify you if that's the problem.

    • @MichaelBrownArtist
      @MichaelBrownArtist 3 years ago +1

      @@WorldofDepth , Maybe I did miss it. I'll try again. Thanks for preparing such a great presentation.

    • @MichaelBrownArtist
      @MichaelBrownArtist 3 years ago +1

      Your suspicion was correct. I missed the first step (install timm). I was able to run it, but the final depth map was very tiny: 284x217 px. Not sure why.

    • @WorldofDepth
      @WorldofDepth 3 years ago +1

      @@MichaelBrownArtist ah, good. But yes, MiDaS v.3 outputs very small depth maps. I would recommend trying 1) starting with a larger input image, 2) using BMD + MiDaS v2.1 as a possible alternative, which outputs at original size, and 3) upscaling the v3 depth map and using something like a symmetric nearest-neighbor interpolation method to smooth it. Ugo Capeto recommends the latter; I don't have a program which offers that method, so I've used ImageMagick and the "Kuwahara edge-preserving noise filter" with pretty good results.
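
A sketch of that upscale-then-smooth idea in plain Python. The nearest-neighbor upscale is shown as-is, while a 3x3 median filter stands in for the edge-preserving smoothing (real Kuwahara or SNN filters are more involved; ImageMagick's own operators are the practical route):

```python
from statistics import median

def upscale_nearest(depth, factor):
    """Nearest-neighbor upscale of a 2D depth map (list of rows)."""
    return [[depth[y // factor][x // factor]
             for x in range(len(depth[0]) * factor)]
            for y in range(len(depth) * factor)]

def median_smooth(depth):
    """3x3 median filter as a stand-in for an edge-preserving smoother;
    border pixels are left unchanged for simplicity."""
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = median(depth[yy][xx]
                               for yy in (y - 1, y, y + 1)
                               for xx in (x - 1, x, x + 1))
    return out

small = [[0, 255], [255, 0]]          # tiny stand-in for a v3 depth map
big = median_smooth(upscale_nearest(small, 4))
```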

    • @MichaelBrownArtist
      @MichaelBrownArtist 3 years ago

      @@WorldofDepth , thank you.

  • @CabrioDriving
    @CabrioDriving 3 years ago

    So you have 2D images + depth maps. Which software/GitHub project do you use to produce the two stereo images? StereoPhoto Maker is not good in my opinion.

    • @WorldofDepth
      @WorldofDepth 3 years ago

      I use SPM. Which, by the way, produces much smoother stereopairs if you appropriately upscale your 2D image + depth map first. For example, if you're using an SPM deviation value of 60 = 6% of image width, for a 256-level depth map, you need a 4267px-wide image.
      You could also take advantage of the 3PI AI and produce a panning video with it, then extract a stereopair. Not time efficient but would have the best inpainting.

    • @CabrioDriving
      @CabrioDriving 3 years ago +1

      @@WorldofDepth Hi. Thanks for your time and valuable answer. What I noticed is that SPM has problems with depth recognition even though the depth map seems to be OK visually. Also, it tears away surfaces like faces when you produce 3D, even with a deviation of 25 or 30 (default). Also, produced images look downscaled in quality. I have a depth map and I can see that the depth is represented correctly in it. Then SPM makes a flat foreground and a correct background, hmmm... or vertically sliced/layered depth, or some things are flat and others are correct. I have spent a lot of time on this software (even with Google AI working from my disk) and never produced a great 3D photo out of depth maps made with MiDaS or LeReS and some other AI software. So that is why I asked about some other project to produce correct images. Good point on 256-level depth maps and image size.

    • @WorldofDepth
      @WorldofDepth 3 years ago

      @@CabrioDriving Almost every AI-produced depth map needs manual corrections, I think, especially if it's an image with people's faces at any significant size. As I say in the video, it can be convenient if you're producing 2D animations that are more forgiving of the depth map, but otherwise it's always going to be work, at this stage of the tech…
      The sliced/layered depth problem you mention is exactly the image size issue I mentioned. Upsizing for stereopair generation and then downsizing back to original size should smooth those areas.

    • @CabrioDriving
      @CabrioDriving 3 years ago +1

      @@WorldofDepth Thank you for your priceless comments. 1) How did you calculate that needed resolution of 4267 pixels wide? 2) What should the image width be for 3% deviation? 3) What deviation % do you suggest for the best effect?

    • @WorldofDepth
      @WorldofDepth 3 years ago +1

      ​@@CabrioDriving 1) If you use an 8-bit grayscale depth map that is normalized to go all the way from pure black to pure white, it has 256 different depth levels. If you want to differentiate all of those in a stereopair, then those levels will correspond to horizontally shifting parts of the original image between 0 and 255 pixels. If the maximum shift you want ( = deviation) is 6%, that means 255 pixels must be 6% (or less) of your image width, so WIDTH * .06 = 255, and WIDTH = 4250px. (My original number was slightly off.)
      2) By the same method, you need an 8500px-wide image if deviation is 3%.
      3) It depends on the picture and the intended use. 3% was the old rule of thumb, and that's good for TV size, I think. For laptop or phone screen size, I probably use 4.5-6%, or rarely up to 8% for a very deep scene. For display via a projector at wall size, maybe you'd want to go down to 1.5%.
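
The arithmetic in 1) and 2) reduces to a one-liner: the maximum pixel shift (255 for a full-range 8-bit map) divided by the deviation as a fraction. A small sketch (the function name is mine):

```python
def min_image_width(deviation_pct, depth_levels=256):
    """Smallest image width (px) at which a maximum horizontal shift of
    depth_levels - 1 pixels stays within the given deviation percentage."""
    max_shift_px = depth_levels - 1   # 255 px for an 8-bit depth map
    return max_shift_px * 100 / deviation_pct

# 6% deviation needs a 4250px-wide image; 3% needs 8500px
```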

  • @importon
    @importon 2 years ago

    Have you lost interest in this stuff? Why no new content?

    • @WorldofDepth
      @WorldofDepth 2 years ago

      As you can see, this video workshop was part of NSA Virtual 3D-Con 2021, and other videos of mine were made for similar conferences and some smaller regional events. I presented again at the 2022 3D-Con last month, but unfortunately, the conference was not recorded.
      If you want to see the latest 3D I'm producing, follow @WorldOfDepth on Instagram, and/or check WorldOfDepth.com (though I need to update that more!).