How Stable Diffusion Works (AI Text To Image Explained)
- Published May 8, 2023
- ✨ Support my work on Patreon: / allyourtech
⚔️ Join the Discord server: / discord
🧠 AllYourTech 3D Printing: / @allyourtech3dp
👾 Follow Me on X: / blovereviews
💻My Stable Diffusion PC: kit.co/AllYourTech/stable-dif...
We've all seen Stable Diffusion generate some spectacular-looking AI-generated art, but how does the technology actually work behind the scenes? Buckle up as I show you how the technology behind this marvel works, and how we go from a text prompt and a static-filled image to a beautiful work of art. - Science & Technology
It's criminal that this only has 13k. Keep it up!!
Such a good video, surprised to see so few likes. Your explanation is great, since it works well for a wider audience with minimal engineering or technical skills. Please keep making the videos!
Amazing explanation in such a short video!! Keep up the good work!!
Amazing, thanks! Waiting for more ❤
This was so informative! Thank you, love your videos!
Thank you so much, I really appreciate it!
Very well explained, thank you. And man, I love your studio! (D'oh - just noticed it is a fake background. Rather goes to your point).
Haha! You nailed it
This was great. Thanks!
4:44 Midjourney does not use reactions to the images in production to train their model.
It's a good example to explain it as a hypothetical, but it's untrue.
I still don't understand how Stable Diffusion works, but now I know more. Maybe you can help me understand what's happening when I try to create some art: first, I upload an image to Stable Diffusion in the img2img tab, then I select Interrogate CLIP or Interrogate DeepBooru, and then I copy/paste the prompt into txt2img. Why don't I get an image that better resembles what I started with? How can I get a closer resemblance to my original image? You seem to understand this stuff better than me, so maybe you can explore this in a future video. Thanks!
I will do a video on the subject. There are definitely some tricks to making it work and getting a decent result.
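One reason the result drifts so far: a CLIP/DeepBooru prompt only captures a rough text description, and txt2img then starts from fresh random noise rather than from your picture. img2img itself keeps resemblance through its denoising-strength setting. The sketch below is a toy NumPy illustration (not Stable Diffusion's actual code; the schedule values are assumed DDPM-style) of how strength picks how much of the original survives:

```python
import numpy as np

# Toy sketch of img2img's "denoising strength" (assumed DDPM-style
# schedule, not the real Stable Diffusion implementation). img2img
# partially noises your input, then denoises from that point; strength
# picks how far along the noise schedule to start, so higher strength
# means more noise and less resemblance to the original.

def partially_noise(image, strength, num_steps=1000, seed=0):
    """Apply the forward-diffusion formula up to a step chosen by strength."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)   # toy linear schedule
    alpha_bar = np.cumprod(1.0 - betas)
    t = max(int(strength * num_steps) - 1, 0)    # starting timestep
    noise = rng.standard_normal(image.shape)
    # x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    return np.sqrt(alpha_bar[t]) * image + np.sqrt(1.0 - alpha_bar[t]) * noise

image = np.ones((8, 8))   # stand-in for the uploaded picture
low = partially_noise(image, strength=0.2)
high = partially_noise(image, strength=0.9)
# Lower strength keeps the starting point closer to the original image.
print(np.abs(low - image).mean() < np.abs(high - image).mean())  # True
```

So for closer resemblance, the usual knob to turn is a lower denoising strength in img2img, rather than round-tripping through a text prompt.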
One of the best videos I've seen in a while. Thank you for taking the time to make such awesome content. Much appreciated.
Wow, thank you! So glad you enjoyed it
One basic question: why do we need to introduce noise in the first place?
The noise is the starting point when you reverse the diffusion process. It also provides randomness to the resulting image.
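To make the reply above concrete, here is a tiny NumPy sketch (an assumed DDPM-style setup, not Stable Diffusion's actual code) of the forward process: noise is added during training precisely so the model has something well-defined to learn to remove, which is what lets it run the process in reverse:

```python
import numpy as np

# Forward diffusion, sketched: gradually replace an image with Gaussian
# noise. The model is trained to predict that noise from the noisy
# image, so that at generation time it can undo the process step by step.

rng = np.random.default_rng(42)
x0 = rng.uniform(-1.0, 1.0, size=(4, 4))   # a tiny "training image"

betas = np.linspace(1e-4, 0.02, 1000)       # toy noise schedule
alpha_bar = np.cumprod(1.0 - betas)

t = 500                                     # a timestep sampled during training
noise = rng.standard_normal(x0.shape)
x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * noise

# The training loss asks the network to recover `noise` from (x_t, t).
# If it predicted the noise perfectly, the formula inverts exactly:
x0_recovered = (x_t - np.sqrt(1 - alpha_bar[t]) * noise) / np.sqrt(alpha_bar[t])
print(np.allclose(x0_recovered, x0))  # True
```

In practice the network's noise prediction is imperfect, so generation removes noise over many small steps rather than in one jump.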
It is an Erlenmeyer flask, not a beaker. ;)
Thanks Walter White lol
Wow, such a great video, man. Finally found the video that clearly explains how exactly images are made from text prompts. And the things you said in the end... yeah, man... I agree with you. We should be careful about how we use these AI technologies.
You should change the title, this is definitely not a "detailed explanation". It's more akin to a "summarized intuitive explanation".
Great video 👍 Subbed
Thanks for the sub!
It just clicked at 4:38 why Midjourney and others are free to start: they need people to teach the system.
Great explanation!!
Thank you!!
Really awesome
Thank you!
Awesome video !
Thank you!
So how does it know which pure-noise image to start out with?
The software starts with a random number generator that is used as a seed to generate the noise.
@@allyourtechai Let's say the text prompt is "Rainbow unicorn". How does the process start out? Where does it get the noisy image that it works back from to reach the desired image?
does it pull stuff only from the checkpoints used or also online?
You can define the source. In my video about how to “ai yourself”, I provided my own photos to train the model.
You can have a completely offline install, where you download the checkpoint and other files, run the Stable Diffusion server on your own computer, and control it from the browser on that same computer. No one ever looks at what you generate or charges you for anything. And you can train your own checkpoints or embeddings locally, but that is really slow (several hours for like 10-50 images on an RTX 2060).
@@krzysztofczarnecki8238 I think that's what I've got right now; it's pretty cool running it locally. And yeah, I pulled my internet plug and it was still able to draw somewhat accurate drawings of famous anime characters, which is pretty awesome.
After watching the video, I still have questions. It turns out that we make Gaussian noise from the picture, and then we make the picture back from the noise. But couldn't we run into the situation where the noise is the same?
Pretty unlikely if you use a random seed to generate the noise, and you train it 1,000 times per image. The odds of getting the same noise that many times are vanishingly small.
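The seeded generation described above can be sketched in a few lines of NumPy (an assumed NumPy-style RNG for illustration; actual implementations vary). Each seed deterministically produces a distinct noise tensor, which is why collisions are a non-issue and why reusing a seed reproduces an image:

```python
import numpy as np

def noise_latent(seed, shape=(4, 64, 64)):
    """Generate the pure-noise tensor that the diffusion process starts from."""
    return np.random.default_rng(seed).standard_normal(shape)

a = noise_latent(seed=1)
b = noise_latent(seed=2)
c = noise_latent(seed=1)

print(np.array_equal(a, c))  # True: same seed, same noise (reproducible results)
print(np.array_equal(a, b))  # False: different seeds give different noise
```

With 4 × 64 × 64 = 16,384 independent Gaussian values per tensor, two different seeds producing identical noise is astronomically unlikely.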
You made a strong point on the confusion between reality and AI-generated reality. Not to be pessimistic, but this is a huge risk for humanity. I believe right from the start we should have regulatory institutions to force AI companies to put a disclaimer on any art or content that's produced. Tools should be developed and made available to people, maybe through their phones, laptops, and TVs as an extension, so they can clearly differentiate between the two.
With the consumption of content being already high for most people, these technologies can easily turn into tools of mass control if strong measures are not taken right from the start.
It’s something we need to all pay close attention to for sure.
People will just remove those things.
I'm sure something will happen; the government will probably step in and do something stupid, because they're all old, ill-informed, and don't understand how e-mail even works.
Whatever they do will either be over the top, or a waste of time.
I think most people already know fake images, video and audio are already circulating. People are already questioning anything they see, so I'd say awareness is already out there.
We just have to hope that "trusted" mediums don't mislead people with fake stuff, and actually do a little research. Fortunately (and unfortunately), I think most Americans already don't trust the media as it is right now, especially with all the lawsuits these companies have had to pay out over the last few years.
(9:08) Training the AI model with your custom data works better if you a) make sure each image is a square 512x512 pixels, and b) take the photos of your models specifically for this purpose in front of a solid color background. Also, I dare you to use "me from behind" in your prompts, as all of your photos appear to be selfies so it has no idea what the back of your head looks like.
Oh shit it's you brina 😂
Haha! How have you been?
"my hope is that it brings us all closer together..." yyyeeaaaa....that's a no from me dawg
I have no idea who Drake is
Well now you do hopefully!
You forget that this poses a huge problem for the legal system as well. Pictures or videos of you doing something are essentially worthless now given how easy it is to fake them.
Jesus Christ, please mix your voice with some EQ; you have a terrible amount of sub-bass (between 20 and 60 Hz). Please ask your musician friend to show you how to do it, because it's unlistenable on many types of speakers.
You're great