AMD ROCm under WINDOWS Status Update. ZLUDA with SD.next as the best alternative (Tutorial).

  • Published on 22 Jul 2024
  • What is the status of AMD ROCm on Windows - especially with regard to Stable Diffusion?
    We install SD.next with ZLUDA to accelerate Stable Diffusion and bridge the waiting time for ROCm on Windows.
    If you want to run ComfyUI with ZLUDA, too, watch this video:
    • ComfyUI with ZLUDA on ...
    If you want to run Automatic1111 with ZLUDA, too, watch this short: • #ZLUDA with #Automatic...
    PyTorch/ROCm status and GitHub Windows issues:
    pytorch.org/get-started/locally/
    github.com/pytorch/pytorch/is...
    rocm.docs.amd.com/projects/in...
    github.com/vladmandic/automat...
    Links regarding ZLUDA and SD.next installation:
    github.com/lshqqytiger/ZLUDA
    www.amd.com/en/developer/reso...
    strawberryperl.com/
    github.com/brknsoul/ROCmLibs/...
    rocm.docs.amd.com/projects/in...
    github.com/brknsoul/ROCmLibs/
    github.com/vladmandic/automatic
    Videos:
    AMD ROCm on Linux:
    • How to use Stable Diff...
    AMD ROCm on Windows Status Details and GIT & MiniConda-Installation:
    • AMD ROCm on WINDOWS fo...
    Chapters:
    0:00 About ROCm on Windows and ZLUDA
    1:34 Status Update ROCm
    2:46 Zluda Preparation
    4:13 HIP SDK
    7:14 SD.next Installation
    7:56 Example Generation
  • Science & Technology

Comments • 205

  • @NextTechandAI
    @NextTechandAI  3 months ago +10

    What will you do?
    1. Give ZLUDA/SD.next a try?
    2. Wait for AMD ROCm on Windows?
    3. Use Linux?
    4. Else...?

    • @Omen09
      @Omen09 3 months ago +2

      tried all 3, will wait for amd rocm :|
      zluda on sd next or on automatic1111 gives me 0 performance boost compared to what i had on directml, only some better memory management on rx 6700
      tried linux for a second and realised it's not gonna be so simple xD maybe if my gpu could run in acceptable terms in stock settings, otherwise too much investment to make it work
      amd needs to stop slacking, if rocm hits windows with important stuff for ai they may get actually a lot of market share

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      Interesting, on my machine I notice a big difference between Zluda and directML - but, agreed, ROCm is the way to go. Thanks a lot for your detailed feedback!

    • @michahojwa8132
      @michahojwa8132 3 months ago +1

      I've tried a stable diffusion example from AMD youtube channel but getting Zluda errors. Will try the repo in your video.

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      @michahojwa8132 Thanks for your feedback. Good luck, I'm optimistic for your attempt :)

    • @mikharju
      @mikharju 3 months ago +2

      I used to have directml A1111. Then I struggled with Linux + ROCm until your video helped me get it setup (many thanks). Now directml is not working due to some bug that Microsoft is taking a long time to fix and dual booting is a bit annoying even if Linux setup is so much faster. Going to try ZLUDA once I have a moment to tinker again.

  • @rudolfaeschlimann6959
    @rudolfaeschlimann6959 3 months ago +5

    Thank you for this informative video, you helped me a lot with the setup!

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      @rudolfaeschlimann6959 Thank you very much for your feedback, I'm glad that my video is useful.

  • @Bucky_Bailey
    @Bucky_Bailey 3 months ago +3

    Excellent video! I went from 8 s/it on normal to 6+ it/s on ZLUDA. There are, however, some features that don't work, or work inconsistently, but in general image generation is a breeze and upscaling works well.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Thank you very much for your feedback! Indeed there are some limitations, but since I have Zluda I don't boot into Linux very often.

  • @yo-kaiwatchfan-resukoh
    @yo-kaiwatchfan-resukoh 2 months ago

    Hi, do you know if ZLUDA can be used with RealESRGAN?

  • @nghilam2205
    @nghilam2205 2 months ago

    OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\Admin\ZLUDA\automatic\venv\lib\site-packages\torch\lib\caffe2_nvrtc.dll" or one of its dependencies.

  • @Sujal-ow7cj
    @Sujal-ow7cj 22 days ago

    I have installed it, but how can I launch it? Autolaunch failed.

  • @Sujal-ow7cj
    @Sujal-ow7cj 23 days ago

    Can I install 1111? It will work, right?

  • @user-lm5kt9ny4v
    @user-lm5kt9ny4v 3 months ago

    I would like to clarify, am I correct that with this build, every time we change the prompt the first image generation will be extremely long? Or is this true only for one, the first prompt? I'm just impressed by the speed of generation of subsequent prompts

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      It is only for the very first prompt, that's why I am so excited about Zluda :)
      After a reboot or restart of SD.next the first image generation will take a few seconds more, but that's only loading the checkpoint etc.
      Changing the prompt does not have any influence on the optimized code.

  • @stretchvillagebynight
    @stretchvillagebynight 1 month ago

    Got it started and working so good tutorial overall. One question though. Now when I close everything after I'm done using it what are the necessary steps in starting it again? Should I create the "zluda environment" again and run the webui.bat command like in the video?

    • @NextTechandAI
      @NextTechandAI  1 month ago +1

      Thanks for your feedback. No, don't create another environment. No need to change anything after you have successfully started it once; the start procedure (webui.bat with the shown parameters) stays the same.
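
      The restart procedure described above can be sketched as follows (assuming the conda environment from the video is named `zluda` and you run this from the SD.next folder):

      ```shell
      :: No new environment is created; just reactivate the existing one and start.
      conda activate zluda
      webui.bat --use-zluda --debug --autolaunch
      ```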

  • @taipeiperson6846
    @taipeiperson6846 3 months ago +1

    Thanks, you God!!!!!! It works for my RX 6700 XT.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Wow, thank you very much :) I'm happy that my video is useful.

    • @IshanJaiswal26
      @IshanJaiswal26 2 months ago

      @taipeiperson6846 I did the whole process three times, but I don't know what I'm missing. I know I'm doing something wrong, but I can't figure out what. I'm bad with commands and these things. Please help me, I can't install it. Could you help me install it properly? It would hardly take 15-20 min, via Discord, Meet, whatever you say. Your help would mean a lot to me 🥺🥺

    • @LeshaKhaletskiy
      @LeshaKhaletskiy 1 month ago

      same vrm, but doesn't work for me(

  • @manaphylv100
    @manaphylv100 3 months ago

    When you say the first generation takes a long time, do you mean only the very first run after installation, or the first image in every session?
    In other words, if I restart the program or my PC after the installation and initial run, do I have to wait it out again?

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      Don't worry, it only takes this long the very first run. After a reboot the speed increase is still there. Thanks for asking!

    • @jovabre
      @jovabre 3 months ago

      Like in some games: on the first run you have to wait for shader compilation, then it's all fast.

  • @poltergijstt
    @poltergijstt 2 months ago

    Would you by any chance know of a way to combine samplers? I want to use the DPM++ SDE sampler, SD Next only shows the 'regular' DPM SDE. Someone on Reddit recommended me to 'combine' samplers, so I would be able to use the specific one I wanted. I have no idea how, and after browsing through all the settings I come to you in shame :(. Would you know how to do that?

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @poltergijs9959 Watch my Short in order to have an installation of Automatic1111 in parallel. Then try again :)

    • @poltergijstt
      @poltergijstt 2 months ago

      @@NextTechandAI Thank you for your response! I will get back to you once i have tried it :)

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @@poltergijstt Once you have installed SD.next according to my vid, it's up to you whether you choose Automatic1111 (see my Short) or ComfyUI (see my other vid). In both cases there's not much additional effort to be spent in order to have 1 or 2 additional SD UIs :)

  • @Bigchumpsgaming
    @Bigchumpsgaming 3 months ago

    hey! ive had zluda for a while but ive noticed recently that when i restart my computer it generates pretty fast but over time it gets slower... do you have a fix for this?

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Hey, I haven't noticed that. You could try latest Zluda files (github.com/lshqqytiger/ZLUDA/releases) after making a backup. First generation might take very long, again.
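
      The update-with-backup step suggested above might look like this (C:\ZLUDA as the install folder is an assumed example; adjust to your own path):

      ```shell
      :: Back up the current ZLUDA folder before updating (assumed path).
      xcopy /E /I C:\ZLUDA C:\ZLUDA.bak
      :: Then extract the release downloaded from github.com/lshqqytiger/ZLUDA/releases
      :: over C:\ZLUDA. The very first generation afterwards may again take very long,
      :: because the optimized kernels are recompiled.
      ```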

    • @Bigchumpsgaming
      @Bigchumpsgaming 3 months ago +1

      @@NextTechandAI thanks!

  • @alanreynolds4262
    @alanreynolds4262 2 months ago +3

    Hello, when I try to generate an image I get this error: Building PyTorch extensions using ROCm and Windows is not supported. I have followed everything perfectly. Could it be because I have an RX 7900 GRE, which isn't listed at all?

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @alanreynolds4262 Hello, frankly speaking, I don't know. The architecture of your GPU is very similar to the RX 7900 XT, but e.g. for the RX 6xxx series below 6800 special libraries are required for ROCm HIP as described in my vid. Maybe this is the case for the GRE versions, too.
      Currently there is an open issue in the SD.next main branch leading to a "Diffusers failed loading" error. So just in case you get one step further, you might have to wait for an update or use an older commit of SD.next.

    • @alexesipenko3413
      @alexesipenko3413 1 month ago

      Same problem on the RX 6750 GRE.😞

  • @jasondsouza3555
    @jasondsouza3555 2 months ago

    Can I run LLMs locally using ROCm/ZLUDA? I was looking to make a RAG chatbot on custom data with something like Llama-2.

    • @NextTechandAI
      @NextTechandAI  2 months ago

      Why not use GPT4All locally with Llama-2 or Llama-3? It supports AMD GPUs and has a server mode and an API. See my vid: th-cam.com/video/TcXzyutfmOw/w-d-xo.html

    • @jasondsouza3555
      @jasondsouza3555 2 months ago

      @@NextTechandAI oh I didn't know about this. I was only making RAG chatbots on kaggle and wanted to shift to local. Do you think an RX6600 can work with GPT4All?

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @jasondsouza3555 GPT4All uses the Vulkan API for GPU support, so I think yes, it should work with an RX6600.

  • @jiuvk8393
    @jiuvk8393 2 months ago

    Hello, 2 or 3 questions:
    #1: I have a Radeon RX 5600M (will this work for it?) (I already made comfy work but only for 1.5, it doesn't work for sdxl or stable cascade etc.)
    #2: for brknsoul/ROCmLibs/ do I just download the one from the link or do I have to download a specific one for my gpu?, also there are 2 now instead of one: Optimised_ROCmLibs_gfx1031.7z (Optimised Libs for gfx1031) and Optimised_ROCmLibs_gfx1032.7z (Add files via upload).
    thank you very much.

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @jiuvk8393 Regarding #2: When I created the vid, there was only ROCmLibs.7z. Choose this if you don't have gfx1031 or gfx1032; choose Optimised_ROCmLibs_gfx1031.7z or Optimised_ROCmLibs_gfx1032.7z if you have one of these. If this doesn't work, I would try ROCmLibs.7z and additionally overwrite with Optimised_ROCmLibs_gfx1031.7z or Optimised_ROCmLibs_gfx1032.7z depending on your GPU. Alas, the readme doesn't say whether Optimised_ROCmLibs_gfx1031.7z and Optimised_ROCmLibs_gfx1032.7z include all required files or only the customized files.
      Regarding #1: I'm sorry, but so far no one has reported in the comments that they have successfully gotten it to work with a GPU below an RX 6xx0.
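
      As a rough illustration of the choice described above, here is a small lookup sketch. The card-to-architecture table is my assumption based on AMD's Navi 22/23 chip naming, not from the video; verify your card against the ROCmLibs readme before downloading.

      ```python
      # Assumed mapping of common RX 6xxx cards (below the officially supported
      # RX 6800) to the gfx architecture used to pick the right ROCmLibs archive.
      GFX_ARCH = {
          "RX 6700": "gfx1031",
          "RX 6700 XT": "gfx1031",
          "RX 6750 XT": "gfx1031",
          "RX 6600": "gfx1032",
          "RX 6600 XT": "gfx1032",
          "RX 6650 XT": "gfx1032",
      }

      def rocmlibs_archive(card: str) -> str:
          """Return the ROCmLibs archive name to try first for a given card."""
          arch = GFX_ARCH.get(card)
          # Cards not in the table (e.g. RX 6800 and above) use the generic archive.
          return f"Optimised_ROCmLibs_{arch}.7z" if arch else "ROCmLibs.7z"
      ```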

  • @semenderoranak2603
    @semenderoranak2603 1 month ago

    I have rx 6800 yet keep getting the error "Building PyTorch extensions using ROCm and Windows is not supported" and "Model not loaded"

    • @NextTechandAI
      @NextTechandAI  1 month ago

      Do you have in the command window an output entry similar to this one? "Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8 cudnn=8700 driver=" and you have used the --use-zluda option?
      Are there additional error entries when starting up/generating? Have you used a new Conda/Venv environment? Are you using different drives for SD.next and ZLUDA? Are you using this ZLUDA version github.com/lshqqytiger/ZLUDA (there have been updates)?
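
      The log check suggested in the reply above can be automated with a small sketch; the regex is an assumption based on the quoted line format:

      ```python
      import re

      # Look for SD.next's "[ZLUDA]" device line in the startup output;
      # if it is missing, generation is likely falling back to the CPU.
      DEVICE_RE = re.compile(r"Device:\s+device=(?P<name>.+?)\s+\[ZLUDA\]")

      def zluda_device(log_text: str):
          """Return the GPU name if a ZLUDA device line is present, else None."""
          m = DEVICE_RE.search(log_text)
          return m.group("name") if m else None
      ```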

  • @pinkpig7505
    @pinkpig7505 3 months ago +2

    I am getting the following error :
    Error(displayed on SD.Next site): model not loaded
    Time: 0.02s
    does this mean zluda is not working? this is my laptop model - AMD Ryzen 7 7730U with Radeon Graphics.
    error displayed on terminal: ERROR sd shared Reading failed: ui-config.json [Errno 2] No such file or directory: 'ui-config.json'

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Well, it seems you haven't loaded a checkpoint and I'm not sure about the support for internal GPUs.
      The missing ui-config.json hints to a wrong or missing path. Maybe a simple restart of the SD server in a new command window and a restart of the browser might help. You can load a checkpoint by clicking on it in SD.next or downloading it manually. Anyhow, only the GPUs listed on the Website mentioned in the vid are officially supported by AMD HIP.

  • @J2thaBgeerie
    @J2thaBgeerie 2 months ago +1

    SD.next does feel a bit buggy sometimes, definitely after changing some settings and restarting the UI/server. But man, is it a massive difference over directml. On my 6700 XT I used to get about 1.5 it/s; now I'm getting about 9.5-10 it/s, so that's about an 8x increase. Using deepcache as well and the optimized ROCm libs for gfx1031.

    • @NextTechandAI
      @NextTechandAI  2 months ago +1

      Thank you very much for your feedback, I'm happy to see such a speed increase. If you want to use Automatic1111 instead of SD.next, this can be done quite fast following this short: th-cam.com/users/shortsYzFRlsEYyEE

    • @J2thaBgeerie
      @J2thaBgeerie 2 months ago

      @@NextTechandAI couldn't get it to work, it doesn't recognize the --use-zluda command, not sure what's going on and I'm bored of debugging for a while :D, and SD.next does have some nice functions, but the UI is indeed a lot worse.

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @J2thaBgeerie What a pity. It must be a fairly recent installation of A1111, but I can understand that you want to enjoy the speed of SD.next first and don't want to configure it any further :)

  • @ajphilippineexpat
    @ajphilippineexpat 2 months ago

    Hi, I followed your instructions right to the end, but the Automatic requires Python 3.11. I've tried plenty of things to roll back from 3.12 but can't make it work. Ugh. Using AI on my CPU is hopeless when I've got an AMD 7900XTX to use instead. If I can't roll back Python and continue the last step, then I've got to wait for ROCm on Windows...

    • @NextTechandAI
      @NextTechandAI  2 months ago +1

      I'm sorry, but there was a reason that I've created a Conda environment with python 3.10 in the vid. You have to start over with a new e.g. Conda env based on python 3.10 or indeed wait for ROCm on Windows.

    • @Plutonium.239
      @Plutonium.239 2 months ago

      @@NextTechandAI I have this same problem and I did create the environment exactly as you did with the 3.10 option, but it still shows as 3.12 and "Incompatible version" when running the webui.bat --use-zluda --debug --autolaunch command. I even tried again and started over and ran "conda install python=3.10" and created a new environment, and nothing seems to change the Python version to prevent this incompatible version error.

    • @Plutonium.239
      @Plutonium.239 2 months ago +1

      Actually it looks like the problem may be the version of Miniconda that I installed. There are older versions with Python 3.10.

    • @NextTechandAI
      @NextTechandAI  2 months ago

      SD.next uses the first python version it finds in the path and creates a venv with this version. You could try to delete (or rename to have a backup) the venv folder of your SD.next.
      Make sure to have an active Python 3.10 in your path when starting SD.next after this modification.
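
      The recovery steps above might look like this on the command line (folder and environment names are assumed examples; run from the SD.next root):

      ```shell
      :: Rename the stale venv so SD.next recreates it (keeps a backup).
      ren venv venv.bak
      :: Make sure a Python 3.10 environment is active before restarting.
      conda create -n zluda python=3.10 -y
      conda activate zluda
      webui.bat --use-zluda --debug --autolaunch
      ```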

  • @user-ei8kk8vp7k
    @user-ei8kk8vp7k 2 months ago +1

    Updated SD.next today. It worked fine before on 7800 xt with ZLUDA. Now it says: "OSError: Building PyTorch extensions using ROCm and Windows is not supported." Do you know by any chance what the matter is, how to fix it?

    • @NextTechandAI
      @NextTechandAI  2 months ago +1

      I can only guess. Maybe you have to update your Zluda files. In any case the start and very first image generation will take very long again after an update. When starting SD.next, was there a line like this Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8 cudnn=8700 driver="? At which point did you get the error?

    • @user-ei8kk8vp7k
      @user-ei8kk8vp7k 2 months ago

      @@NextTechandAI The error appears when loading models. And now there is another error: "OSError: [WinError 126] The specified module could not be found. Error loading "F:\Stable Diffusion\SD.Next\venv\lib\site-packages\torch\lib\caffe2_nvrtc.dll" or one of its dependencies." Even on a fresh SD.next install. Something is wrong with the Python modules: "F:\Stable Diffusion\SD.Next\venv\lib\site-packages\torch\__init__.py:141 in │
      │ │
      │ 140 err.strerror += f' Error loading "{dll}" or one of its dependencies.' │
      │ > 141 raise err │
      │ 142"
      Initialization before looks like this:
      "Using VENV: F:\Stable Diffusion\SD.Next\venv
      10:41:56-762871 INFO Starting SD.Next
      10:41:56-765373 INFO Logger: file="F:\Stable Diffusion\SD.Next\sdnext.log" level=INFO size=29636 mode=append
      10:41:56-767377 INFO Python 3.10.9 on Windows
      10:41:56-846333 INFO Version: app=sd.next updated=2024-05-07 hash=e081f232 branch=master url=github.com/vladmandic/automatic/tree/master
      10:41:57-383979 INFO Platform: arch=AMD64 cpu=AMD64 Family 25 Model 97 Stepping 2, AuthenticAMD system=Windows release=Windows-10-10.0.22631-SP0 python=3.10.9
      10:41:57-389487 INFO AMD ROCm toolkit detected
      10:41:57-402589 WARNING ZLUDA support: experimental
      10:41:57-404095 INFO Using ZLUDA in F:\Stable Diffusion\ZLUDA
      10:41:57-457633 INFO Extensions: disabled=['sd-webui-controlnet']
      10:41:57-458635 INFO Extensions: enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler',
      'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg'] extensions-builtin
      10:41:57-461632 INFO Extensions: enabled=['prompt_translator', 'sd-webui-mosaic-outpaint'] extensions
      10:41:57-464036 INFO Startup: quick launch
      10:41:57-464036 INFO Verifying requirements
      10:41:57-470543 INFO Verifying packages
      10:41:57-471543 INFO Extensions: disabled=['sd-webui-controlnet']
      10:41:57-472673 INFO Extensions: enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler',
      'stable-diffusion-webui-images-browser', 'stable-diffusion-webui-rembg'] extensions-builtin
      10:41:57-475185 INFO Extensions: enabled=['prompt_translator', 'sd-webui-mosaic-outpaint'] extensions
      10:41:57-479188 INFO Command line args: ['--use-zluda', '--autolaunch'] autolaunch=True use_zluda=True
      ┌──────────────────────────────────────────────────────────────────── Traceback (most recent call last)"

    • @NextTechandAI
      @NextTechandAI  2 months ago +1

      @user-ei8kk8vp7k In one of the comments there was a hint that it doesn't work on external drives. In order to avoid any problems related to the path, I've put my installation of SD.next and Zluda etc. on the same drive; the best option is C:\.
      Other possibility: you have older installations of SD.next/A1111 or Zluda and they interfere with your current installation.

  • @Mozkiito
    @Mozkiito 3 months ago +1

    I use zluda with automatic1111, it works great for generating and regular inpainting, however, a lot of important extensions, such as controlnet, do not seem to work.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Right, standard operations seem to work well, but I'm still trying to find a way for controlnet. Thanks for your hint.

  • @SINEWEAVER-
    @SINEWEAVER- 2 months ago

    How do I add my models from 1111?

    • @NextTechandAI
      @NextTechandAI  2 months ago

      The directory structure is the same as in Automatic1111, so put them in models\Stable-diffusion.
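
      For example (run from the SD.next root; the source path and model filename are placeholders):

      ```shell
      :: Copy an existing checkpoint into SD.next's model folder.
      copy C:\Downloads\my-checkpoint.safetensors models\Stable-diffusion\
      ```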

  • @thedausthed
    @thedausthed 28 days ago

    To everyone having the "Building PyTorch extensions using ROCm and Windows is not supported" error: just delete lines 157 and 158 in automatic\venv\Lib\site-packages\torch\utils\cpp_extension.py.

  • @Thomas-xy6gy
    @Thomas-xy6gy 22 days ago

    Hello, for me there is an error saying "model not loaded". I downloaded a VAE and it seemed to work, but then the error "ROCm and PyTorch not supported by Windows" appears.

    • @NextTechandAI
      @NextTechandAI  22 days ago

      @Thomas-xy6gy Does your installation work with the base model like I have shown in the vid?

    • @Thomas-xy6gy
      @Thomas-xy6gy 21 days ago

      @@NextTechandAI I think something was wrong with my Windows; now it works fine. Thanks for your vid ;)

    • @NextTechandAI
      @NextTechandAI  21 days ago

      @@Thomas-xy6gy Thanks for your feedback :)

  • @poltergijstt
    @poltergijstt 3 months ago

    I've followed everything you've did here, but I get the error 'No HIP SDK found'. I have added it to my path, tried to reinstall and reboot as well. I have a 6750XT, so I followed the extra steps with the library folder. Looking online doesn't really make me any wiser on the issue.

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      At which step do you get this error? If you open a cmd window and enter "hipinfo", what's the output?
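
      If the HIP SDK is installed and on PATH, `hipinfo` should identify your GPU. A sketch of what to look for (the field names and example values below are assumptions based on the HIP SDK's hipinfo tool):

      ```shell
      hipinfo
      :: Expect output containing lines such as:
      ::   Name:        AMD Radeon RX 6750 XT
      ::   gcnArchName: gfx1031
      :: If the command is not found, the HIP SDK bin folder is missing from PATH.
      ```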

    • @poltergijstt
      @poltergijstt 3 months ago

      @@NextTechandAI Thank you for your helpful comment! I got it as one of the first lines of output after running 'webui.bat --use-zluda --debug --autolaunch'. I copied my Zluda files to a folder on my C drive and added this to my PATH. For some reason it didn't work on my external hard drive, the drive containing SD Next. After putting the files on the C drive I didn't have any issues anymore, so I got it working.
      Again, thank you for the comment; whether I solved the problem or not, the help is appreciated!

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      I'm glad the installation is working for you now. Thanks for sharing your insights.

    • @poltergijstt
      @poltergijstt 3 months ago

      @@NextTechandAI I should be thanking you, I've been trying for a couple of hours per week for the last 5/6 weeks to get this stuff to work without using directml. Your videos were the only ones among all of them that actually helped me!

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      Wow, then I'm even more pleased that my videos are helpful.

  • @kittykisses2257
    @kittykisses2257 2 months ago

    Thanks for the video. I went with AMD this go around not realizing that I would want to get in to AI image generation. One little critique on an otherwise excellent video: there were some steps where you skipped parts and also some command prompts that were not available in the description. Overall, an easy to follow video. However, I seem to get stuck on "text to image starting". I left it for over an hour and it never progressed. Have you tried using zluda with SDXL models? UPDATE: I swapped to same model as you and it says it will take 700 minutes... Any idea why?

    • @NextTechandAI
      @NextTechandAI  2 months ago

      Thanks for your feedback. Which steps were skipped in your opinion? I had to fast-forward sometimes in order to keep the video short, but it was my goal to include all necessary steps.
      Which AMD GPU are you using? Have you interrupted the very first image generation with Zluda? With my RX6800 it took nearly half an hour, but not longer. When starting SD.next, was there a line like this Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8 cudnn=8700 driver="?
      Yes, Zluda works with SDXL models. All images in my latest video about ControlNet with SDXL have been created by using my Zluda installation.

    • @kittykisses2257
      @kittykisses2257 2 months ago

      @@NextTechandAI Thank you for getting back to me. I will comb through your video and find the missing step/s. In the mean time, here is the answer to your other questions. I am using a 7900 XTX and the Devices line says this "Devices: " I checked in the system information tab of SD.Next and there is no GPU listed in the GPU section. So maybe it is defaulting to my CPU? I thought I followed instructions perfectly. Any thoughts?

    • @kittykisses2257
      @kittykisses2257 2 months ago

      @@NextTechandAI Upon rewatching, I found only one step that I remember having to pause and google. At 5:30 you talk about editing path variables but don't show how to get to the appropriate menus.

    • @NextTechandAI
      @NextTechandAI  2 months ago

      Well, but at 5:30 I'm hinting at a document which describes how to edit the path in case you don't know.
      Nevertheless, if your device is not listed then your Zluda installation is not complete and - yes - in this case it's using the CPU.
      Difficult to guess which step is missing, but according to one comment you should make sure to have Zluda and SD.next on the same drive (ideally C:\).

    • @kittykisses2257
      @kittykisses2257 2 months ago

      @@NextTechandAI I restarted from scratch, followed every step in video exactly, and my device is still not showing up. I guess I will have to wait for new solutions. Sad stuff.

  • @thetoicxdude2203
    @thetoicxdude2203 3 months ago

    Can AMD Radeon™ Graphics GPU be used? I want to deploy gpt-sovite locally

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Help me to understand your question. Are you asking for internal AMD GPUs or whether ZLUDA can be used for GPTs instead of Stable Diffusion?

    • @thetoicxdude2203
      @thetoicxdude2203 3 months ago

      @@NextTechandAI use to ROCm

    • @thetoicxdude2203
      @thetoicxdude2203 3 months ago

      @@NextTechandAI I want to know whether my gpu is a type supported by ROCm, because I still get CUDA errors after installation.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      @thetoicxdude2203 Are you using a notebook or a desktop PC? In the MS Windows settings under System -> Info -> Device Manager you should see under "GPU" or "Graphics" the type of your GPU, e.g. "AMD Radeon RX 6800".

    • @thetoicxdude2203
      @thetoicxdude2203 3 months ago

      @@NextTechandAI There are signs of installation on my laptop, but it doesn't seem to be working.

  • @joseffritzl8379
    @joseffritzl8379 3 months ago

    I'm suffering a similar fate to a few others here where their SD.Next installs are using their CPUs instead of their GPUs. The crazy thing is - I know I've got the ZLUDA path set up and working correctly because it's active on a stable diffusion install that I have as well - not only did I get the 20 minute or so initial image generation, but I also get prompts in the CMD when I run it telling me I ought to be using SD.Next instead for the optimum experience! I'm gonna poke around at it for a bit, but I'm not sure what else the issue could be at that point. I know ZLUDAs on PATH, I know it works in another program...it's just not referencing correctly in SD.Next's setup. Maybe it's not looking in the right place for it? But PATH should not allow for that...

    • @NextTechandAI
      @NextTechandAI  3 months ago

      So you don't have in the command window output entry similar to this one? "Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8 cudnn=8700 driver=" and you have used the --use-zluda option?
      Are there additional error entries when starting up/generating? Have you used a new Conda/Venv environment? Are you using different drives for SD.next and ZLUDA? Are you using this ZLUDA version github.com/lshqqytiger/ZLUDA (there have been updates)?

    • @MHD-ck4mi9qi9p
      @MHD-ck4mi9qi9p 2 months ago

      @@NextTechandAI hello , sorry for the interruption
      I'm also suffering from the same issue, im sure i added zluda to path and this (Device: device=AMD Radeon RX 7700 XT [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8) showed up. and followed all your steps (except for conda create zluda because it really didn't work . Unable to create process using 'C:\..\miniconda3\python.exe "C:\..\miniconda3\Scripts\conda-script.py" create -n zluda python=3.10 -y')

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @MHD-ck4mi9qi9p Well, if SD.next starts and doesn't complain about using Python > 3.11, that's not a problem. SD.next uses its own venv if it was installed correctly, hence it will work without conda.
      Although SD.next with Zluda is very stable once it's running, there are a ton of reasons for it using the CPU instead of the GPU.
      Do you have an older installation of SD.next from a previous try? Does it share the same environment with an Automatic1111 or ComfyUI installation? Is your installation on a different drive than C:? Which parameters have you passed? Which additional errors or hints occur in your command window?

  • @tushkan4ik111
    @tushkan4ik111 3 months ago

    Sketch img2img is not working on zluda :(

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Thanks for your feedback. This is a pity. Have you noticed any other limitations in functionality?

  • @unitrixbase5221
    @unitrixbase5221 3 months ago

    Hi, I did everything the same as you, but the CPU still handles the generation of images. the speed is extremely low, the GPU is not used (in my pc rx6800)

    • @NextTechandAI
      @NextTechandAI  3 months ago

      How long did your very first run with ZLUDA take?
      In the command window output, do you have an entry similar to this one? "Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8 cudnn=8700 driver="

    • @unitrixbase5221
      @unitrixbase5221 3 months ago

      @@NextTechandAI That line was not there; it simply generated the picture using the processor in a few minutes.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Then I'm afraid that the installation of ZLUDA is not correct on your PC. The very first image generation took me about half an hour and the above line must look similar to mine. You're actually using the CPU, which of course takes time.

    • @SimonLange
      @SimonLange 2 months ago

      @@NextTechandAI Which leads to the question: where? I've got the very same problem and it just doesn't work. It's installed, the system path is set, HIP is installed and the path is set the very same way as you did it. I've got a 6900 XT and the GPU is not used in the first run. I get no errors or warnings, which I would expect.
      So did you miss anything in your video? Downloading Zluda, extracting it and putting its path in the system path. Well, what's the point? ;) Maybe we have to register the DLLs by hand? Or do we need to copy them to the system32 directory?

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @SimonLange Regarding your last sentences, no, none of these. No secret registration of DLLs, no copying into system32. All I've done is in the video.
      Nevertheless, the most important point to continue: In the command window output, do you have an entry similar to this one? "Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8 cudnn=8700 driver="
      And if there is no error output: How do you know it's not working?

  • @IshanJaiswal26
    @IshanJaiswal26 2 months ago +1

    (base) C:\Users\ishan>cd \ki
    The system cannot find the path specified.

    • @NextTechandAI
      @NextTechandAI  2 months ago +1

      As I said in the vid: You have to create an appropriate directory and enter it. I put my SD installations in KI, you can put yours anywhere on your PC.

  • @noisedark1
    @noisedark1 3 months ago

    ZLUDA starts with 18GB RAM usage, and when I do a 960x540 it goes to 31.8GB RAM and 16GB VRAM.

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      @noisedark1 An image generation with same size on my machine uses 2GB RAM and 12.5GB VRAM, but I haven't used --medvram/lowvram so far.

    • @noisedark1
      @noisedark1 3 months ago

      @@NextTechandAI Can you give me some advice? My specs:
      Ryzen 5 5600X
      PG rx6800XT
      32GB ddr4 3600
      2TB nvme

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      @noisedark1 Sure, but what's your question? :) You can reduce VRAM usage with the parameters --medvram or --lowvram. Additionally you can try ComfyUI with ZLUDA, see my new vid.

    • @noisedark1
      @noisedark1 3 months ago

      @@NextTechandAI When Stable Diffusion starts it consumes 18GB RAM, and when it generates it goes to 31.9GB RAM and freezes my PC for a little while, and then the image is generated.
      Stable Diffusion doesn't generate anything with --lowvram.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Which settings and which checkpoint are you using? Which size for the image?

  • @leonelguitarra
    @leonelguitarra 3 months ago

    on RX 580?

    • @NextTechandAI
      @NextTechandAI  3 months ago

      GCN 4.0? I would be surprised.

  • @raven1439
    @raven1439 3 months ago

    ZLUDA works great, version 3.7
    juggernautXL_v9Rdphoto2Lightning at 1024x1024 on default workflow, steps: 5, cfg: 2, sampler: dpmpp_2m_sde, scheduler: sgm_uniform, generates an image in 6.14 sec, 1.30 it/s on 7800XT

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      Great, you even used the XL version of Juggernaut. Thanks a lot for sharing this!

  • @SalamiMommie
    @SalamiMommie 3 months ago

    Can you do a tutorial for how to run ZLUDA on linux?

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Thanks for asking, but I'm not sure about the demand. On Linux there is the faster ROCm and ZLUDA/HIP covers only a subset of the CUDA API. Which applications do you want to run with ZLUDA under Linux?

    • @SalamiMommie
      @SalamiMommie 3 months ago

      @@NextTechandAI I'm trying to train a ML model using audiocraft, but I'm running into errors even though I have ROCm. Not sure if ZLUDA will help with it or not, but wanted to try to make sure.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      @SalamiMommie Thanks for your feedback, I haven't expected this demand for ZLUDA on Linux :)

    • @SalamiMommie
      @SalamiMommie 3 months ago

      @@NextTechandAI Thanks for your time!

  • @shindesigner8051
    @shindesigner8051 3 months ago +1

    I'm using an RX 6600.
    I installed everything correctly, I didn't receive any errors.
    But ZLUDA is taking 2 minutes to generate an image,
    and DirectML takes 10 to 30 seconds.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      That's strange. The only differences I can spot are the special DLLs for gfx1031 and gfx1032 based AMD GPUs and the lower VRAM.
      What kind/size of image have you generated?

    • @shindesigner8051
      @shindesigner8051 3 months ago +1

      @@NextTechandAI
      OK, I need to fix my answer.
      I compared SD.next with SD.directml previously...
      lshqqytiger SD directml - takes 10 to 30 seconds.
      vladmandic SD.next -
      using ZLUDA takes 2 to 3 minutes and
      using DirectML the same thing, 2 to 3 minutes.
      I'm using the default config, 512x512.
      I tried some sampling methods and models,
      but I couldn't reduce the time.

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      Looks a bit like it is not using the ZLUDA optimization. How long did your very first run take?
      In the command window, do you have an entry similar to this one? "Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8 cudnn=8700 driver="

    • @shindesigner8051
      @shindesigner8051 3 months ago +1

      ​@@NextTechandAI all runs take 2 to 3 minutes...
      and my device is empty
      20:56:26-409670 INFO Device: ...

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      I think Zluda has not been initialized on your machine and what you are seeing is always the directML performance. I guess you have started with "webui.bat --use-zluda --debug --autolaunch" like in the vid, right? I'm afraid you may have to start all over again, something went wrong somewhere in the installation.
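A quick way to check the point raised in this thread, without reading the whole startup log, is to scan the captured output for the Device line quoted above. A minimal sketch; the `zluda_active` helper is hypothetical and not part of SD.next:

```python
# Hypothetical helper (not part of SD.next): scan captured startup output for
# the Device line and confirm ZLUDA is the active backend rather than the CPU
# or DirectML fallback.
def zluda_active(log_text: str) -> bool:
    return any("Device:" in line and "[ZLUDA]" in line
               for line in log_text.splitlines())

good = "INFO Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8"
bad = "INFO Device: "
print(zluda_active(good))  # True
print(zluda_active(bad))   # False
```

If this returns False for your log, ZLUDA was never initialized and the generation times you measure are CPU or DirectML times.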

  • @MrSociofobs
    @MrSociofobs 3 months ago +1

    Another nice bonus with ROCm+ZLUDA on Windows is that the whole installation takes up a tiny fraction of the space ROCm alone takes up on Linux, which is more than 20GB, most of it taken up by components never needed for many specific cases.

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      Thanks for pointing that out. If only ZLUDA/HIP would now support the entire CUDA API and not just a subset... :)

  • @zach.spencer
    @zach.spencer 6 days ago

    I couldn’t get it to work until I added --use-directml, but either way it’s not using the GPU. Womp

  • @xLovelty
    @xLovelty 2 months ago

    Hi, I have an RX 580, I think this is not for me, right?

    • @NextTechandAI
      @NextTechandAI  2 months ago

      Hi, I'm sorry, but so far no one has reported in the comments that they have successfully gotten it to work with a GPU below an RX 6xx0.

  • @jatinnagar5257
    @jatinnagar5257 2 months ago

    USING CPU INSTEAD OF GPU, HELP PLS

    • @NextTechandAI
      @NextTechandAI  2 months ago

      It's not helpful if you post your requests multiple times or with different accounts.
      Which GPU and which options do you run?
      You should have an entry similar to this one, how does it look like? "Device: device=AMD Radeon RX 6800 [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8 cudnn=8700 driver="
      Which additional errors occur in your command window?

    • @jatinnagar5257
      @jatinnagar5257 2 months ago

      @@NextTechandAI sorry for disturbing u

  • @pinep1
    @pinep1 3 months ago +1

    It is strongly recommended to use A1111 + ZLUDA because SD.next is very unstable.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Thanks for your feedback. Do you have a source for this statement?

    • @Aptronymist
      @Aptronymist 3 months ago +1

      I second the request for a source. It's not a complaint I often hear.

  • @DaehaKim
    @DaehaKim 2 months ago

    10:37:32-465225 INFO Device: device=AMD Radeon RX 5700 XT [ZLUDA] n=1 arch=compute_37 cap=(8, 8) cuda=11.8
    cudnn=8700 driver=
    10:37:32-469226 DEBUG Migrated styles: file=styles.csv folder=models\styles
    10:37:32-497233 DEBUG Load styles: folder="models\styles" items=288 time=0.03
    10:37:32-499233 DEBUG Read: file="html\reference.json" json=36 bytes=21493 time=0.000
    rocBLAS error: Cannot read C:\Program Files\AMD\ROCm\5.7\bin\/rocblas/library/TensileLibrary.dat: No such file or directory for GPU arch : gfx1010
    rocBLAS error: Could not initialize Tensile host:
    regex_error(error_backref): The expression contained an invalid back reference.
    How can I fix it?

    • @NextTechandAI
      @NextTechandAI  2 months ago

      I don't have good news. You remember the part in the video where I list the supported GPUs? For the 6600 to 6750 XT, which are not directly supported, there are adapted libraries on the linked website. You could see if something like this also exists for the 5700 XT, I haven't heard of it yet.
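For readers hitting the same rocBLAS/Tensile error: the HIP SDK's rocBLAS library folder contains per-architecture files, so a quick scan shows which gfx targets are actually covered. A minimal sketch under the assumption that the architecture name is embedded in the filenames, as in the error message above; the helper and the default path are illustrative, not an official AMD tool:

```python
from pathlib import Path

# Illustrative diagnostic, not an official AMD tool: list the gfx architectures
# covered by the rocBLAS library folder of the HIP SDK, e.g.
# C:\Program Files\AMD\ROCm\5.7\bin\rocblas\library. Replacement libraries for
# GPUs without official support are dropped into this same folder.
def rocblas_archs(library_dir):
    archs = set()
    for p in Path(library_dir).iterdir():
        # filenames typically embed the arch, e.g. "TensileLibrary_gfx1030.dat"
        for part in p.stem.replace("-", "_").split("_"):
            if part.startswith("gfx"):
                archs.add(part)
    return sorted(archs)
```

If your GPU's architecture (gfx1010 for the RX 5700 XT) is missing from the result, rocBLAS will fail exactly as in the log above.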

  • @guillaumeguitarian9642
    @guillaumeguitarian9642 4 days ago

    Any news?

  • @yy84869
    @yy84869 1 month ago

    It might not work on my Ryzen 5700G, but still, thanks a lot.

    • @NextTechandAI
      @NextTechandAI  1 month ago +1

      Thank you very much for your feedback. I'm sorry that there's currently no solution for the 5xxx.

  • @Rishi-ch6jo
    @Rishi-ch6jo 2 months ago

    USING CPU INSTEAD OF GPU

  • @PSYCHOPATHiO
    @PSYCHOPATHiO 3 months ago

    Actually ZLUDA is much faster than Linux, tested on the 7900 XT.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Technically this is not so easy to imagine; what was your test setup? Thanks for sharing.

    • @PSYCHOPATHiO
      @PSYCHOPATHiO 3 months ago

      @@NextTechandAI CPU 5900X with PBO 4.6 based on temp, GPU stock clocks, RAM 4000MHz 64GB G.Skill, Win 10 latest. It's ridiculously fast; I've tried demanding generations, 1024x1024 SDXL at 50 @ 2.7it/s, the generation is done in less than 22 seconds.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      @PSYCHOPATHiO I have no doubt that Zluda is good and fast :-) I just wondered about your test setup for Zluda under Windows and ROCm under Linux. When using my RX6800 with exactly same software and image generation process, ROCm is around 15% faster.

    • @PSYCHOPATHiO
      @PSYCHOPATHiO 3 months ago

      @@NextTechandAI I'm not a huge Linux fan, been using it on and off since the early 2000s, but when it comes to functionality I find Windows much more suitable and user-friendly, and with ZLUDA, even if it's on par, I would still go for Windows. ZLUDA is a game changer for me. Although recently I heard some bad news about NVIDIA copyrighting ZLUDA or something similar.

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      @@PSYCHOPATHiO Okay, I understand, your comparison was more empirical. I'm very excited about Zluda, too, I think I even said that in the vid. Regarding the copyright issue it might target Chinese GPU makers allowing running CUDA code on their GPUs. If they target Zluda, too, I hope it can stay until we have ROCm under Windows.

  • @Hazardteam
    @Hazardteam 20 days ago +1

    So... nVidia=Works fine out of the box vs AMD=Suck with command lines and hunting tutorials and do complicated tasks then pray.

    • @souptemba3092
      @souptemba3092 15 days ago +1

      Yes, wish I had an NVIDIA GPU right now

    • @Hazardteam
      @Hazardteam 14 days ago

      @@souptemba3092 Unfortunately I need to use AMD cards because Apple dropped NVIDIA card support. Very sad.

  • @jovabre
    @jovabre 3 months ago

    I'm switching to Nvidia. I'm so much into SD that selling my AMD and adding extra money to go to the Nvidia world is OK. From my point of view it's acceptable. I'm not just testing how fast it works on my system, I'm spending 2~4 making illustrations for my further design work.
    Simply, if you are really into SD, go for Nvidia.
    ROCm on Linux is faster than ZLUDA and DirectML, but Nvidia is still faster overall. AMD has a cheaper solution for 24GB VRAM, but if you really need it you are probably earning from SD and it won't be a problem to buy a 4090.

    • @NextTechandAI
      @NextTechandAI  3 months ago

      Thanks for your feedback. No question. NVIDIA still has a big lead over AMD.

  • @__-fi6xg
    @__-fi6xg 2 months ago

    just too complicated, AMD just dropped the ball, switching to Nvidia.

    • @NextTechandAI
      @NextTechandAI  2 months ago

      I can understand that.

    • @__-fi6xg
      @__-fi6xg 2 months ago

      @@NextTechandAI I'm sorry, I tried it for a week; I'm just not good at this... got frustrated.

    • @NextTechandAI
      @NextTechandAI  2 months ago

      @@__-fi6xg Honestly, I can understand that. When I bought my RX 6800, I didn't know that I would get so involved with AI. Inevitably I'm making the best of it now, but with an NVIDIA GPU everything would be a lot easier :)

  • @spencerfunk6697
    @spencerfunk6697 3 months ago

    you look just as annoyed as i am trying to get this stupid shit working

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      That sums it up quite well.

    • @spencerfunk6697
      @spencerfunk6697 3 months ago

      @@NextTechandAI I’m trying to use this with PyTorch so I can use the bitsandbytes library to fine-tune my LLMs locally, but nothing seems to work for me. I’m at the point where it would be easier to keep sinking money into cloud GPUs or just get a new computer :(

    • @NextTechandAI
      @NextTechandAI  3 months ago

      @spencerfunk6697 I keep my fingers crossed, but from what I've learned so far, fine-tuning LLMs requires lots of VRAM. The best option for fine-tuning locally is QLoRA, but usually this requires an NVIDIA GPU. So, right, using cloud GPUs is the way to go. Or pass the data in your prompt in case you have special use cases for your LLM with a small data set.

    • @spencerfunk6697
      @spencerfunk6697 3 months ago

      @@NextTechandAI Yeah, I'm able to do literally everything I need except convert to QLoRA. QLoRA is so good and you can train it with a CPU, but you need CUDA to quantize the model to convert it to LoRA. It's really unfortunate; I can do everything else just fine off the CPU.
      The purpose is developing and training a 1-bit LLM I've built from scratch lol

    • @NextTechandAI
      @NextTechandAI  3 months ago +1

      Interesting. In case you find a way different from cloud GPUs... :)

  • @SimonLange
    @SimonLange 2 months ago

    I got everything up and running. SD even recognizes ROCm AND the zluda installation.
    BUT
    pyTorch still uses cpu. i got no idea why.
    here the startup partly:
    20:23:33-506037 DEBUG Torch overrides: cuda=False rocm=False ipex=False diml=False openvino=False
    20:23:33-509538 DEBUG Torch allowed: cuda=True rocm=True ipex=True diml=True openvino=True
    20:23:33-523036 DEBUG Package not found: torch-directml
    20:23:33-525537 INFO AMD ROCm toolkit detected
    20:23:34-014298 DEBUG ROCm agents detected: ['gfx1030']
    20:23:34-016797 DEBUG ROCm agent used by default: idx=0 gpu=gfx1030 arch=navi2x
    20:23:34-224965 DEBUG ROCm version detected: 5.7
    20:23:34-226965 WARNING ZLUDA support: experimental
    20:23:34-230047 INFO Using ZLUDA in D:\AI\zluda
    20:23:34-231967 DEBUG Installing torch: torch==2.2.1 torchvision --index-url download.pytorch.org/whl/cu118
    20:23:34-342515 DEBUG Repository update time: Sun Apr 21 14:25:50 2024
    As you can see, it looks fine. BUT then this:
    20:23:46-687652 INFO Load packages: {'torch': '2.3.0+cpu', 'diffusers': '0.27.0', 'gradio': '3.43.2'}
    20:23:47-866616 DEBUG Read: file="config.json" json=30 bytes=1350 time=0.001
    20:23:47-874117 INFO Engine: backend=Backend.DIFFUSERS compute=cpu device=cpu attention="Scaled-Dot-Product"
    mode=no_grad
    20:23:47-878616 INFO Device:
    20:23:47-880615 DEBUG Read: file="html\reference.json" json=36 bytes=21493 time=0.000
    20:23:48-465794 DEBUG ONNX: version=1.17.3 provider=CPUExecutionProvider, available=['AzureExecutionProvider',
    'CPUExecutionProvider']
    As you can see, PyTorch uses the CPU version. No f**n idea why.
    So what is missing?! I have meanwhile octa-checked your instructions and my installation. It makes no difference if run via PowerShell or cmd. It makes no difference if run as admin or normal user. It makes no difference with or without a restart. All drivers are updated, so why is torch not using it? Why the fallback to CPU?
    Ideas?!

    • @SimonLange
      @SimonLange 2 months ago

      starting with exactly your phrase gets me this:
      20:32:28-812973 INFO Base: class=StableDiffusionPipeline
      20:33:01-831983 DEBUG Diffuser pipeline: StableDiffusionPipeline task=DiffusersTaskType.TEXT_2_IMAGE
      set={'prompt_embeds': torch.Size([1, 77, 768]), 'negative_prompt_embeds': torch.Size([1, 77,
      768]), 'guidance_scale': 6, 'generator': device(type='cpu'), 'num_inference_steps': 20, 'eta':
      1.0, 'guidance_rescale': 0.7, 'output_type': 'latent', 'width': 512, 'height': 512, 'parser':
      'Full parser'}
      As you can see, it starts with device type cpu. Weird. Where is the secret switch so it utilizes my GPU?

    • @CASTmpChannel
      @CASTmpChannel 2 months ago

      Same issue here. I have a 6900 XT, no errors in the logs; I copy only the relevant lines here:
      >>
      Platform: arch=AMD64 cpu=AMD64 Family 25 Model 97 Stepping 2, AuthenticAMD system=Windows release=Windows-10-10.0.22631-SP0 python=3.10.14
      ROCm agent used by default: idx=0 gpu=gfx1030 arch=navi2x
      >
      Diffuser pipeline: StableDiffusionPipeline task=DiffusersTaskType.TEXT_2_IMAGE
      set={'prompt_embeds': torch.Size([1, 77, 768]), 'negative_prompt_embeds': torch.Size([1, 77,
      768]), 'guidance_scale': 6, 'generator': device(type='cpu'), 'num_inference_steps': 20, 'eta':
      1.0, 'guidance_rescale': 0.7, 'output_type': 'latent', 'width': 512, 'height': 512, 'parser':
      'Full parser'}
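The decisive line in the log quoted above is `'torch': '2.3.0+cpu'`: the local-version suffix of the installed PyTorch wheel tells you whether a GPU backend can ever be used, regardless of what the ZLUDA detection reports earlier in the startup. A small sketch of that check; the `wheel_kind` helper is made up for illustration (in the real venv you would feed it `torch.__version__` via `venv\Scripts\python.exe`):

```python
# Illustrative check: the "+cpu" suffix in torch's version string means the
# CPU-only wheel is installed and ZLUDA will never be used; the ZLUDA setup
# installs a "+cu118" build instead. Helper name is made up.
def wheel_kind(version: str) -> str:
    return version.split("+", 1)[1] if "+" in version else "unknown"

print(wheel_kind("2.3.0+cpu"))    # cpu
print(wheel_kind("2.2.1+cu118"))  # cu118
```

If the suffix is `cpu`, reinstalling torch inside the venv from the cu118 index (as the installer log above attempts) is the direction to investigate.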

  • @ssteez67
    @ssteez67 10 days ago

    Following the video step by step, I get to about 7:50 when I get a WARNING about ZLUDA not loading. Here is the message:
    08:20:41-769675 WARNING Failed to load ZLUDA: Could not find module 'C:\Program Files\AMD\ROCm\6.1\bin\hiprtc0507.dll'
    (or one of its dependencies). Try using the full path with constructor syntax.
    In the folder, the file I have is: "C:\Program Files\AMD\ROCm\6.1\bin\hiprtc0601.dll"
    It appears to be because AMD ROCm is now on version 6.1, and the video does specify to install an older version and which one is needed.
    Are there update instructions anywhere?

    • @NextTechandAI
      @NextTechandAI  9 days ago

      You have installed the latest ROCm 6.1 and your ZLUDA is expecting ROCm 5.7. lshqqytiger updated his ZLUDA fork yesterday, it's linked in my vid. Please try that one or give him a couple of days to bring ROCm and ZLUDA in sync.
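The version clash described in this thread (a ZLUDA build compiled against ROCm 5.7 looking for hiprtc0507.dll while HIP SDK 6.1 ships hiprtc0601.dll) can be spotted by listing the hiprtc DLLs in the ROCm bin folder named in the warning. A minimal sketch; the helper is hypothetical, and the version-to-filename mapping is inferred from the error messages quoted above:

```python
from pathlib import Path

# Sketch: list the hiprtc runtime DLLs in a ROCm bin folder. The digits appear
# to encode the HIP SDK release (hiprtc0507.dll -> 5.7, hiprtc0601.dll -> 6.1);
# a ZLUDA build expecting 0507 will not load against a 6.1 installation.
def hiprtc_dlls(rocm_bin):
    return sorted(p.name for p in Path(rocm_bin).glob("hiprtc*.dll"))
```

Compare the result against the DLL named in the "Failed to load ZLUDA" warning: if they differ, the installed HIP SDK and the ZLUDA build are out of sync.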