What exactly are you trying to do? If it's creating a new model/checkpoint from scratch, I have no idea how to do that. If it's changing some images that you have (say, the original has a suit and jacket on and you want to change that to a t-shirt and some shorts), you can try different models from civitai, especially those trained on real people rather than anime/drawn art.
Finally, after months, I was able to get SD working on my RX 6600. Thank you so much. I have a question: is it expected for the rendering to crash when the resolution is higher than 512x512, or when rendering a batch larger than a single render?
You're welcome! It depends, what error do you get in the console window? If it is 'memory exceeded', then yes, it's expected behavior; apparently webui is not very good at memory management. What you can do is add these arguments in the webui-user.bat file: --medvram (and if that doesn't help, change it to --lowvram). They'll be slower, but should help prevent it from crashing. But again, best practice is to generate at 512x512 and then upscale later.
I have tried to no avail. You can try these steps and see if they work for you: github.com/lshqqytiger/stable-diffusion-webui-directml/discussions/149#discussioncomment-8392257
got this The specified module could not be found. Error loading "C:\Users oman\code\stable-diffusion-webui-directml\venv\lib\site-packages\torch\lib\c10.dll" or one of its dependencies. Press any key to continue . .
What do you think about buying a new Ryzen 5 8600G without a GPU, can Stable Diffusion work? I would like to wait for the RTX 5xxx series, or what do you think about an AMD Radeon XFX RX 7600XT Speedster SWFT 210 / 16GB for this CPU? Thanks
I think 8600g iGPU will probably run this, but extremely slow (TBH don't know what will be slower, running it on CPU only or iGPU, maybe the former). If you're interested in mainly working with SD, I would go for nvidia, either RTX40 or RTX50 later this year (although first only RTX5090 will be released, which will cost an arm and a leg). From my limited experience, a 3060 12GB had better performance in SD than a 7800XT 16GB
I use this on the 8600G with DirectML and get around 2 s/it, around 1 min for a full pic with 28 sampling steps. TBH, for me this is already amazing given it's an iGPU, compared to not using DirectML, which takes around 10 mins for an image 💀
Man, my computer crashes every time I try to generate an image. I did everything like you, and Stable Diffusion opens, but every time I try to generate an image it crashes my PC and resets. Do you know how to solve it? Thanks
@Matheus-fb1hy What CPU/GPU do you have? If you try it with a HWinfo64 sensor window open while you generate with SD, what are the temperatures for the GPU? If nothing else, try the new ZLUDA video approach, uses less video memory, might be worth a try.
RX 6900 XT, I use: --autolaunch --theme dark --skip-version-check --use-directml --upcast-sampling --precision full --medvram
For an RX 6600, probably: --autolaunch --theme dark --skip-version-check --use-directml --upcast-sampling --precision full --lowvram --opt-sub-quad-attention
--upcast-sampling is preferable to --no-half; you don't use both.
Sadly, DirectML is horribly slow, especially with mid-level cards like the 8GB RX 7600. In my case it takes 120-150 seconds (over 2 minutes) to create one 700x1200 image. This is after I installed the ROCm HIP SDK and PRO driver; before that it was taking twice as long per image. By comparison, an 8GB RTX 3050 with the exact same settings creates images in 15-20 seconds. Granted, image creation would be much faster going down to the base 512x512, but that's just too low a resolution to get any decent images.
I tried and followed the procedure, but when launching webui-user.bat it shows "RuntimeError: Couldn't install torch" with "No module named pip" in the traceback. Any idea how to resolve this problem?
AMD is not supported yet by ForgeUI. You can hack files to make it work, but the end result is slower than with webui (in my case, from 4 it/s to 2.6 it/s). The files to change are listed here: github.com/lllyasviel/stable-diffusion-webui-forge/issues/58#issuecomment-1948689419
Got it to work after 10 different tutorials. BUT when using DreamShaper it's really slow. Yesterday I went from a GTX 1070 to an RX 7600 XT 16GB and the Nvidia was faster. Any tips?
Did you fix this issue? I have an RX 6600 XT and when I generate I get the error "RuntimeError: Could not allocate tensor with 471859200 bytes. There is not enough GPU video memory available!"
I even got this working on an AMD mini PC with shared VRAM! Works brilliantly. On CPU it took 10+ minutes to produce a picture; on the AMD GPU I got that down to 1 minute!!! So impressed.
Heyy, can you tell me how to do it? I am having a problem, I have an RX 6600.
Do you have Discord??? Pls help me
@@IshanJaiswal26 The RX 6600 won't work. You need an RX 6700 or better to run AI on AMD.
@@starmanindisguise5844 You can technically run it somehow, by working around it or forcing it to run, but should we? Nope; running and getting good performance are different things.
Why not fry the GPU while we're at it, bro, let's push it hard.
@@starmanindisguise5844 No, it works fine. My only wish is that this could run SDXL models.
Just subscribed. I had to do a reinstall because I was adding scripts via a CSV file and suddenly got the dreaded out-of-memory error; then it wouldn't load my GPU. Thanks for being clear and having all the resources right here. You made it easy.
Heyy, can you tell me how to do it? I am having a problem, I have an RX 6600. Discord?
Thanks for the tutorial! My Stable Diffusion recently shat itself, and following this finally got it to work again.
Heyy, can you tell me how to do it? I am having a problem, I have an RX 6600.
@@IshanJaiswal26 you have more problems than just a rx6600
@@core36 nvm bro, i bought a new nvidia graphics card, anyways thanks
You're the best!! This is the only video that helped me. I'm new to this, and you explained everything very well. I hope you continue uploading videos about stable diffusion for AMD graphics. Thank you so much
By the way, maybe you know a solution for this: I'm trying to load juggernautXL_v9Rdphoto2Lightning , but it gives me the following error "RuntimeError: Could not allocate tensor with 52428800 bytes. There is not enough GPU video memory available!" My graphics card is a Vega 56 with 8GB
@@Kybalion3.6.9 You have way less VRAM than it requires. Just make the image resolution smaller and it will work.
Adding this argument might help: ' --lowvram' to webui-user.bat (in the COMMANDLINE_ARGS line)
Thanks to both 🙌@@northbound6937 @Official_Memelus
@@northbound6937 do u have discord help me i cant do it
Great Tutorial thank you, works smooth with my RX580 with this ARGS : --autolaunch --theme dark --skip-version-check --use-directml --upcast-sampling --precision full --lowvram --opt-sub-quad-attention
I got the same card but got this error message (module 'torch' has no attribute 'dml'). How did you fix that?
Btw, how does it work? The card, I mean. Are you using VRAM? Does it take too long to make an image?
Is it fluid? You are the first person I've found who has the same card as me.
@@matiasaliste5883 Delete the venv folder and run it with --use-directml in the ARGS, or just copy my ARGS line and paste it into yours. It should look like this in your webui-user.bat:
set COMMANDLINE_ARGS= --autolaunch --theme dark --skip-version-check --use-directml --upcast-sampling --precision full --lowvram --opt-sub-quad-attention
--precision full is a must-have with AMD; might as well add --no-half and --no-half-vae too, as well as --disable-nan-check to work with tiles when upscaling.
Thanks I have an rx580 as well, but it has 8gb vram, do I still use lowvram mode?
@@Antonin1738 Yes, use --lowvram with the RX580. I also have an RX580 with 8GB. Have fun creating.
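If it helps anyone: the whole webui-user.bat is just the stock template with the COMMANDLINE_ARGS line filled in. A sketch, using the args quoted above (everything besides the args line is whatever your checkout of the repo shipped with):

```bat
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--autolaunch --theme dark --skip-version-check --use-directml --upcast-sampling --precision full --lowvram --opt-sub-quad-attention

call webui.bat
```

Double-clicking this bat (not webui.bat directly) is what picks up the args.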
This is the only one that managed to work; not even the official guide was doing its job!! Thank you very much!!
Heyy, can you tell me how to do it? I am having a problem, I have an RX 6600. Discord?
@@IshanJaiswal26 Mmm hi, I've just followed the instructions blindly. I have not installed python and git before this.
@@MarcoEscobarg what gpu u have ?
@@IshanJaiswal26 you used the wrong git repo, you need to use the directml amd repo, else it downloads files for nvidia. The link is in the description
I'm going to test it on my machine, equipped with an AMD GPU. I had an Nvidia GPU, and all of this appeared after I swapped GPUs. In my entire life I had always used Nvidia GPUs; this is the first time I use an AMD GPU, and I regret it so much I can't console myself. Thank you for providing such rich teaching. THANKS!
If you're using Windows with a card below the RX 6800, you will need to add these args: "--use-directml --opt-sub-quad-attention --precision full --no-half-vae --no-half --opt-split-attention --medvram --disable-nan-check", otherwise inpaint doesn't work. I suggest getting ControlNet too. If it's above the RX 6800, you can install the HIP SDK to use ZLUDA; just use --use-zluda instead of directml.
Sigh* Same, amigo. I saved 300 reais and it wasn't worth the headache; I'm never pulling that stunt again.
This worked. I loved your tutorial; you didn't just show how, you explained why. I love that, I learned something rather than just barely got something working.
I was trying to make this work and gave up 3 times, and this 4th time you saved me! Thanks!!!
Heyy, can you tell me how to do it? I am having a problem, I have an RX 6600. Discord?
How can I thank you? It really worked after two days of attempting, just from this line: --use-directml --opt-sub-quad-attention --no-half --disable-nan-check --autolaunch. My GPU is an RX580 with 8GB VRAM and it takes 5 seconds to generate a picture 😍😍😍😍😍
Have you tested SDXL models?
Man, I searched for like a month for how to do this, and finally I found the perfect tutorial. You have one more sub!
This video is amazing and great. Thanks so much! It was so easy to follow all the instructions and things JUST WORKED! May all your days be wonderful.
Thank you for this, at least I can run Stable Diffusion with my RX 6800. Clear explanations. Merci beaucoup.
De rien!
hi, how many seconds per iteration does it run with that GPU 6800?
I kept running into "No module named pip", so I learned Stable Diffusion ships with its own copy of Python (specifically version 3.10.6) but doesn't have pip installed. So when you run webui-user.bat it uses the local Python in this folder:
\venv\Scripts
Just open cmd in that folder and run:
python -m ensurepip
Fixed my problem, in case anyone else encounters this issue!
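The same fix can be expressed as a small stdlib-only Python sketch, run with the venv's own interpreter (pip_available is just a helper name for illustration):

```python
# If pip cannot be imported by this interpreter, bootstrap it with the
# stdlib ensurepip module - the same thing `python -m ensurepip` does.
import importlib.util
import ensurepip

def pip_available() -> bool:
    """Return True if 'pip' can be imported by this interpreter."""
    return importlib.util.find_spec("pip") is not None

if not pip_available():
    ensurepip.bootstrap()  # installs pip into the current environment

print("pip available:", pip_available())
```

Running it from \venv\Scripts matters: that is the interpreter webui-user.bat actually uses, not any system-wide Python.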
Thank you very much sir!
I was stuck with the "Torch is not able to use GPU" error until I watched your video. Works fine with the RX 6700XT.
windows or linux ?
@@herrcrazy4242 I'm on Windows 10, no WSL or anything, just Windows CMD
I have the same trouble.
Finally something that worked! BIG ty man!
I got this problem: "DLL load failed while importing onnxruntime_pybind11_state: The specified module could not be found". How do I fix it?
Same
Same
Same problem for me.
Guys, were you able to solve it?
@@heratv34 Did it, look up my other comment
Bro, legit, there is no way.. I was searching for the past 3 days, 10 hours per day, for a fix and I was about to drop out, but then I stumbled on your video. MAY THE GODS BLESS YOU AND YOUR FAMILY BROTHER, T H A N K Y O U
IT WORKED YAAAA ALLAAAH , I WASTED 2 DAYS FOR THIS, THANK YOU VERY MUCH
Thanks a lot! It really worked with me, with adding some other files, but after all, it worked! Thanks a lot for your guide!
Excellent, very helpful. Working for me with my AMD RX 580. New subscriber, thanks!
You got this to work with an RX580? That is my gpu too. How is it?
@@rwarren58 It's OK, takes me about 30 secs to do a 512x512 image.
@@robertgoldbornatyout Ohmigod, you answered! It never showed on my feed. Thank you! And 30 seconds is a small price to pay for independence, about the same speed as an online generator. Thanks again. Edit, two hours later: It works. With an RX580 GPU and a Ryzen 3600 CPU, it works.
I use Fooocus and it took 10 min for an image, RX580 too. I will try this one.
My man, thank you so much. Been trying to get this working for a while; thanks for holding my hand through this.
u also hold my hand , heyy , can you tell how to do i am having problem i have rx 6600
@@IshanJaiswal26 What seems to be the issue?
Your method worked first time! This really cut the render time from 3 minutes on a 5950X down to 3 seconds on a 7900 XTX!
Best video I've found so far thanks bro
Finally a really good tutorial. Thank you so much bro, it worked on my RX 6600
Heyy, can you tell me how to do it? I am having a problem, I have an RX 6600. Discord?
@@IshanJaiswal26 Did you fix this issue? I have an RX 6600 XT and when I generate I get the error "RuntimeError: Could not allocate tensor with 471859200 bytes. There is not enough GPU video memory available!"
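For perspective, it helps to convert those "Could not allocate tensor" byte counts into MiB. A quick stdlib-only calculation (the labels are just the cards mentioned in these comments):

```python
# Convert the byte counts from the allocation errors into MiB.
# The failed request itself is small; the card was simply already full,
# which matches the observation that A1111 hogs VRAM until restart.
def to_mib(n_bytes: int) -> float:
    return n_bytes / (1024 ** 2)

errors = {
    "Vega 56, loading JuggernautXL": 52_428_800,
    "RX 6750 XT, generation":        335_544_320,
    "RX 6600 XT, generation":        471_859_200,
}

for where, n in errors.items():
    print(f"{where}: {to_mib(n):.0f} MiB")
# 50, 320 and 450 MiB respectively - tiny next to 8-12 GB of VRAM,
# which is why --medvram/--lowvram (freeing memory between stages) helps.
```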
Epic, command line codes are what fixed it for me thanks!
Update 5 October 2024: it works. It takes 2-3 mins to create an image with my RX 590 at 1024x768 and 32 steps. No problems yet, thanks!
You may feel a bit sad having to answer all the questions, but know that I read the answers you gave to other users to fix the issues I had. Mine was caused by a missing "-" in "--use-directml", which downloaded a broken torch; I had to delete venv and fix the command args to fix it.
Your attempts to help others help more than you may notice :)
I finally made it work, thanks to you sir!
Working on a 6700 XT. Oh my god, thanks for your tutorial. Each picture takes an estimated 20 seconds to generate.
Bro, you don't even know how much I love you. I've been having problems and trying without a rest for 5 hours; you are my fucking hero.
Edit: It was 8 hours actually, I just saw. Your video was so relaxing and perfectly explained, and I don't even speak English.
For a non english speaker, you are clearly understood. Much Love from the USA. Bro has a good channel.
Thank you very much for your tutorial, it was very helpful. Greeting from Chile.
omg finally works, ty my man, you save me
Bro u r insane, Thanks a lot ❤
Thank you for the video! It helped me make it work! :)
You saved me, thank you! I was using Shark, this is an upgrade
worked for me, thank you so much!
For anyone interested in an alternative, I got Stable Diffusion up and running on an AMD Radeon RX 6700 XT in Windows using ZLUDA (as I understand it, ZLUDA is CUDA functionality for AMD). There are several good tutorials for this out there.
Yes please, I need your help.
Finally, it worked on my RX580. Thanks!
What arguments did you use?
Finally a tutorial for AMD that worked for me. I only have one problem: sometimes I get "RuntimeError: Could not allocate tensor with 335544320 bytes. There is not enough GPU video memory available!", and the faces generally look super bad. They improve as I increase the steps, but that makes generation much slower. I have an RX 6750 XT, 16GB of RAM and a Ryzen 5 5600.
Are you using the --medvram argument? Try that if you haven't (or --lowvram). Alternatively, try the ZLUDA video; that uses way less VRAM.
@@northbound6937 I changed from --medvram to --lowvram but it only made generation slower. After generating an image the program is using all the VRAM (12GB of my RX 6750 XT) and all my RAM (16GB), and it does not go down unless I restart the .bat. I'll try ZLUDA.
Thank you, it's finally working!
Everything worked great but would you happen to have another guide on fully using Webui and its features?
I have an iGPU with 15GB shared VRAM, but Stable Diffusion uses my CPU instead of the GPU.
thank you so much that was really helpful it worked
You're welcome!
@@northbound6937 Heyy, can you tell me how to do it? I am having a problem, I have an RX 6600.
thanks you very much, just thanks you ^^
Thanks, bro. You made my day
Is it still viable? On my computer, when I run the bat it just opens and closes. It doesn't even try to download what it's supposed to download after the webui-user.bat edit.
When I use your args, I see "No module named 'torch_directml'". What do I have to do now? Please show me the way, o wise Northbound.
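A quick diagnostic sketch for this error, assuming the fork's backend package is named torch_directml: probe for the module with the venv's own Python before launching the UI.

```python
# Probe for the torch_directml package without importing torch itself.
import importlib.util

spec = importlib.util.find_spec("torch_directml")
if spec is None:
    print("torch_directml missing: delete the venv folder and relaunch "
          "webui-user.bat with --use-directml in COMMANDLINE_ARGS")
else:
    print("torch_directml found at", spec.origin)
```

If the module is missing, the venv was built without the DirectML wheel, which is exactly the delete-venv-and-relaunch situation described in other comments here.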
FINALLY, THIS ONE WORKS!
I think it's named "stable-diffusion-webui-amdgpu" now, but thanks for the 2024 update.
Thank you very much. I tried everything before; now Stable Diffusion is working on my old AMD PC.
Heyy, can you tell me how to do it? I am having a problem, I have an RX 6600. Discord?
Another manual that did not work for me in the "as is" way. I followed the manual as usual and almost deleted all the files after the errors, but my curiosity led me to the author of this SD fork for AMD on his GitHub page. The discussion there, where errors like mine were reported, really helped. He says: after installing all this, you need to DELETE the 'venv' folder and re-launch webui-user.bat with the "--use-directml" arg. Looks like a joke, but it works!
Tried it and it worked. Thank you very much.
Thank you so much, my friend.
I appreciate your video! Short, informative and friendly for non-tech users. Big thanks for your work ❤ Keep it up!
Idk if you can help me with this, but if I generated a person with it and I want to make different pictures with the same person, is there a way to do that?
Works like a charm, thank you! Having some problems with XL models... VRAM not big enough (6900 XT, 16GB).
Yeah, XL models are tough. Add the argument --medvram and try again (if that still doesn't help, maybe last Hail Mary attempt and replace with --lowvram)
What exactly is the issue here? I guess it's the software, right? There are people running this on 8GB Nvidia cards... I also read that Automatic1111 is tanking performance/VRAM.
Hope that at some point it will be less of a struggle!
I'll give your advice a shot! Thanks
As far as I understand, A1111 is bad at memory management and will hog all the memory it can. The arguments are a workaround to delimit the amount of memory that can be used (as a result, it might be slightly slower to generate, but it's less likely to crash)
@@northbound6937 Thanks for the explanation! --medvram worked, by the way. I can now use 1.5 or XL models up to 1366x768.
yup... this worked for me... thx a lot.
It worked for me!
Ah, that's it, I remember now. When I first started SD on Windows, I used that DirectML fork, before I switched to Linux because of the working ROCm support there.
When I recently wanted to toy with it again on Windows, to see if anything had improved performance-wise, I got exactly that error upon installation.
Thumbs up!
Brother, can you show us the installation of the webui forge version which was recently released? I tried to install it but it shows an error. I heard it's much faster than the normal webui A1111.
Yeah, I'm having the GPU error, and then it says "nvidia driver cannot be found".
Same problem @@geraltofvengerberg3049
Couldn't make it work either. Keep an eye on the issues section of the GitHub under the amd tag; at some point someone will post a solution. (Currently the only "solution" I've seen offered is to use the --skip-torch-cuda-test flag to run on the CPU, not the GPU, which defeats the whole point of using a faster solution.) github.com/lllyasviel/stable-diffusion-webui-forge/labels/AMD @geraltofvengerberg3049 @souravroy8834
Just a suggestion: you probably should show Task Manager side by side. That way it will show what % of the GPU is being utilized when rendering.
thank you
I'm actually still getting the error that it can't use the GPU. I double-checked that I used the right version of Python and I used the fork you provided. Deleted the folder and retried; still giving me that error: "RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check"
Can you paste your COMMANDLINE_ARGS line from the webui bat file?
@@northbound6937 Hey I got it working! Not sure what it was but I just kept going back and triple checking stuff. I also mixed your guide with th-cam.com/video/n8RhNoAenvM/w-d-xo.html to use Zluda. Not sure if that helped. But cheers!
Hi!
Does DirectML work with the RX 6700 XT?
Hey bro, thank you so much for the video. You really explain everything in great detail and very well. I've seen many YouTube tutorials and honestly, I think you're the best at it. I was just reading the comments and, like many others, I would like to install Forge. But I saw you mentioned that Forge cannot yet be used with AMD if you want to leverage the GPU. So, I was wondering if you'll upload a tutorial on how to run Forge as soon as there's any news about it. Thank you very much.
No prob, glad it helped. No promises on the Forge tutorial. I assume what you're after with Forge is improved performance? If I manage to make ZLUDA work and get a substantial improvement in performance, then I'll probably make a tutorial about that. But there's not a lot of movement on Forge's AMD implementation atm.
@@northbound6937 Thanks for the reply. Yeah, that's the goal. It seems that Forge performs quite a bit better than Automatic1111, and I've been checking out a lot of tutorials. I've also been trying it out for myself, but so far it has been impossible to install. Anyway, I will keep an eye on your channel in case you find a solution. Hehe, thanks again!
@@northbound6937 Please do this tutorial if you figure it out, I'm also trying to use Forge with AMD :)
Works on a RX 6600M. Thanks a lot.
I have an RX 6600 XT and when I generate I get the error "RuntimeError: Could not allocate tensor with 471859200 bytes. There is not enough GPU video memory available!"
Interesting to see your AMD 7800 running at 4 it/s; I get 1 it/s on my 580 (8GB) using the lshqqytiger fork :). Have you tried the "LCM-LoRA Weights - Stable Diffusion Acceleration Module" on civitai? Load the LoRA into your prompt, set your CFG to 1-2 and your steps to 6-8. You get a good speed increase because the steps are so low. You can add more steps, but it starts to burn/artifact the image with lines unless you lower the LoRA weight from :1 to :0.5.
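Not from the comment above, just an illustration: if you launch webui with the --api flag, those LCM-LoRA settings map onto the A1111 txt2img endpoint. The LoRA filename and weight below are placeholders; use the file you actually downloaded from civitai.

```shell
# Hypothetical LCM-LoRA request payload for the A1111 API (webui must be
# launched with --api). LoRA name "lcm-lora-sdv1-5" and weight 0.5 are
# placeholders; low steps (6-8) and CFG 1-2 per the comment above.
PAYLOAD='{
  "prompt": "photo of a lighthouse at sunset <lora:lcm-lora-sdv1-5:0.5>",
  "steps": 8,
  "cfg_scale": 1.5,
  "width": 512,
  "height": 512
}'
# Submit with:
#   curl -s -X POST http://127.0.0.1:7860/sdapi/v1/txt2img \
#        -H "Content-Type: application/json" -d "$PAYLOAD"
echo "$PAYLOAD"
```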
Thanks for the suggestion! I tried it and it runs faster, but I got underwhelming results, here's my output and the settings I used imgur.com/a/oMhvSlt Which checkpoint/model did you use?
Finally got it to work on an RX 6600, but I'm getting GPU memory errors after a couple of generations. Not sure what I should be putting into my command-line args other than what you have in the description?
Hey there, I'm experiencing an error when trying to generate images: AttributeError: 'NoneType' object has no attribute 'lowvram'. If you know how to fix it, please help me out.
thanks it worked well
Hey mate, I have some images which I want to train my model on. Do I need the exact model those images were made with, or would any model work?
What exactly are you trying to do? If it's creating a new model/checkpoint from scratch, I have no idea how to do that. If it's changing some images that you have (say, if in the original you have a suit & jacket on and want to change that to tshirt and some shorts) you can try different models from civitai, especially those that were trained with real people and not anime/drawn art.
Finally, after months, I was able to get SD working on my RX 6600. Thank you so much.
I have a question: is it expected for the render to crash when the resolution is higher than 512x512, or when rendering a batch larger than a single image?
You're welcome! It depends, what error do you get in the console window? If it is 'memory exceeded', then yes, it's expected behavior; apparently webui is not very good at memory management. What you can do is add this argument in the webui-user.bat file: --medvram (and if that doesn't help, change it to --lowvram). They'll be slower, but should help prevent it from crashing. But again, best practice is to generate at 512x512 and then upscale later.
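For anyone unsure where the flag goes, a minimal webui-user.bat sketch (the exact flag combination is an assumption pulled from this thread; swap --medvram for --lowvram if you still run out of memory):

```bat
@echo off

set PYTHON=
set GIT=
set VENV_DIR=
rem Memory-saving flags for AMD/DirectML; adjust to your card
set COMMANDLINE_ARGS=--use-directml --medvram --opt-sub-quad-attention

call webui.bat
```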
gah damn bro this shi worked
Do you know how to setup and use olive to optimize sd model? Can't find any up to date guide.
I have tried to no avail. You can try these steps and see if they work for you: github.com/lshqqytiger/stable-diffusion-webui-directml/discussions/149#discussioncomment-8392257
thank u bro
Thanks, I had that Torch error about the GPU
ty broda! fucking worked!
Got this: The specified module could not be found. Error loading "C:\Users\noman\code\stable-diffusion-webui-directml\venv\lib\site-packages\torch\lib\c10.dll" or one of its dependencies.
Press any key to continue . . .
ty mate
Heyy, can you tell me how to do it? I'm having a problem, I have an RX 6600. Discord?
What do you think about buying a new Ryzen 5 8600G without a GPU, can Stable Diffusion work on it? Or I would like to wait for the RTX 5xxx series, or what do you think about an AMD Radeon XFX RX 7600 XT Speedster SWFT 210 / 16GB for this CPU? Thanks
I think the 8600G iGPU will probably run this, but extremely slowly (TBH I don't know which would be slower, running it on CPU only or on the iGPU, maybe the former). If you're interested in mainly working with SD, I would go for NVIDIA, either RTX 40 series or RTX 50 later this year (although at first only the RTX 5090 will be released, which will cost an arm and a leg). From my limited experience, a 3060 12GB had better performance in SD than a 7800 XT 16GB.
I use this on the 8600G with DirectML and get around 2 s/it, around 1 min for a full pic with 28 sampling steps. TBH for me this is amazing already given it's an iGPU, compared to not using DirectML, which takes around 10 mins for an image💀
Man, my computer crashes every time I try to generate an image. I did everything like you, and Stable Diffusion opens, but every time I try to generate an image it just crashes my PC and resets. Do you know how to solve it? Thanks
@Matheus-fb1hy
What CPU/GPU do you have? If you try it with a HWinfo64 sensor window open while you generate with SD, what are the temperatures for the GPU?
If nothing else, try the approach from the new ZLUDA video; it uses less video memory, so it might be worth a try.
So that means SD can run as a CPU process?? I have an RX 580 and SD can't run on the GPU
DirectML initialization failed: No module named 'torch_directml' on an RX 5500 XT
RX 6900 XT: I use --autolaunch --theme dark --skip-version-check --use-directml --upcast-sampling --precision full --medvram
RX 6600: probably --autolaunch --theme dark --skip-version-check --use-directml --upcast-sampling --precision full --lowvram --opt-sub-quad-attention
The --upcast-sampling option is preferable to --no-half; you don't use both.
thanks :)
Just dropping this comment cause I managed to get 3it/s on my 6650 on SD.next with zluda. Might be a better option?
How do you use SD.next with zluda? Are there arguments for that or how?
It's not using my GPU, it's using my RAM to generate the image. Pls help
I have an error: "RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same". Please help
Sadly DirectML is horribly slow, especially with mid-level cards like the 8GB RX 7600. In my case it takes 120-150 seconds (over 2 minutes) to create one 700x1200 image. This is after I installed the ROCm HIP SDK and the PRO driver; before I installed those, it was taking twice as long to create a single image.
By comparison, with an 8GB RTX 3050 and the exact same settings, it creates images in 15-20 seconds. Granted, image creation would be much faster going down to the base 512x512, but that's just too low a resolution to get any decent images.
True. Maybe try with ZLUDA, that sped up things significantly for me. Video here: th-cam.com/video/gsrhKosljgI/w-d-xo.html
@@northbound6937 Should you remove "--use-directml" from the command line if you use ZLUDA?
I tried and followed the procedure, but when launching webui-user.bat it shows "RuntimeError: Couldn't install torch" with "No module named pip" and "Traceback (most recent call last)". Any idea how to resolve this problem?
I would delete the \venv\ folder and try again. Other than that, google the error and try the solutions people suggest.
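In concrete terms, "delete the \venv\ folder" looks something like this (Git Bash / WSL syntax; on plain Windows cmd the equivalent is `rmdir /s /q venv`):

```shell
# Run from inside the stable-diffusion-webui-directml folder.
# Deleting venv forces the launcher to rebuild the virtual environment
# (including pip and torch) on the next run of webui-user.bat,
# which clears out a broken or half-finished install.
rm -rf venv
```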
I've got it working on a 7900 GRE, but I can't go as big as I could with the RTX 3060. It runs out of VRAM now.
OK, now how can we use ForgeUI with AMD?
AMD is not supported by ForgeUI yet. You can hack some files to make it work, but the end result is slower than with webui (in my case, from 4 it/s down to 2.6 it/s). The files to change are listed here: github.com/lllyasviel/stable-diffusion-webui-forge/issues/58#issuecomment-1948689419
Got it to work after 10 different tutorials. BUT when using DreamShaper it's really slow. Yesterday I went from a GTX 1070 to an RX 7600 XT 16GB, and the NVIDIA card was faster. Any tips?
NVIDIA is generally faster than AMD (even old GPUs). Try ZLUDA, it's faster on AMD; I have a tutorial here: th-cam.com/video/gsrhKosljgI/w-d-xo.html
After editing webui-user.bat, I get an error like this: "AttributeError: module 'torch' has no attribute 'dml'". Can you help me fix it?
stackoverflow.com/questions/77337158/module-torch-has-no-attribute-dml
Main reason I'm about to get an NVIDIA card after so long... also better RTX performance, and overall it uses less power
I have a 6800 XT, and I get a "cannot allocate enough memory" error when using the hires fix. Any help?
Did you fix this issue? I have an RX 6600 XT and when I generate I get the error "RuntimeError: Could not allocate tensor with 471859200 bytes. There is not enough GPU video memory available!"
Which Torch version does it run on?
Will it work with an iGPU (Vega 7)?