Do you know if ComfyUI has the same problem with inpainting? Automatic1111's DirectML fork couldn't inpaint, and it was only solved by using the flags: "--no-half --precision full --no-half-vae --opt-sub-quad-attention --opt-split-attention-v1". ComfyUI doesn't have these exact flags, and without them — like the extension "sd-webui-inpaint-anything" — the Face Detailer and other inpainting nodes from "ComfyUI-Impact-Pack" throw the error: "The size of tensor a (0) must match the size of tensor b (256) at non-singleton dimension 1"
Honestly. I am not sure. I thought since it basically has a different overall architecture that it might not have the same fallbacks as using the directML fork of automatic1111.
@@HeinleinShinobu I found Face Detailer from Impact Pack to be the best automated inpainting since it doesn't really search for CUDA (although if used with SAM, they must be loaded with CPU). For manual masking, using ControlNet as auxiliary or using dedicated inpainting models was the only way to not get bad results
@@jameshenry347 I use SAM too but for some reason, when I click the area and click detect, it doesn't do its thing. I look at the cmd prompt and it has lots of errors which I don't understand at all. Haven't tried Face Detailer yet.
@@HeinleinShinobu Well, when I used it with Face Detailer, the SAM Loader node would always throw fatal errors if SAM was processed with the "auto" or "GPU" option and would only work with the "CPU" one, but I never tried manually, so I don't know if the processing works the same way. I imagine it has to be forced to CPU somehow.
I'm getting this warning: torch.load doesn't support weights_only on this pytorch version, loading unsafely. (among other things) and then a big error in comfyUI when it hits the ksampler. Maybe I need a newer version of python or something from the video? (I have no idea about python, I just followed all the instructions :D )
also not working for me, i get an error message "shape '[77, -1, 77, 77]' is invalid for input of size 5929" with a huge stack trace, no idea what it means
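For anyone puzzled by that message, no fix to offer, but the arithmetic behind it is simple: a reshape to [77, -1, 77, 77] can only infer the -1 dimension when the element count divides evenly by 77*77*77, and 5929 (which is just 77*77) doesn't. A quick sketch:

```python
# The reshape target [77, -1, 77, 77] needs the element count to be a
# multiple of 77 * 77 * 77 so the -1 dimension can be inferred.
required = 77 * 77 * 77   # elements per unit of the inferred dimension
numel = 5929              # what the tensor actually holds (= 77 * 77)

print(numel % required)   # non-zero remainder -> the reshape must fail
print(numel == 77 * 77)
```

So something upstream (usually the text encoder / CLIP tokens, given the 77s) is producing a tensor far smaller than the attention code expects — typically a model or node mismatch rather than an install problem.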
For me. On windows. Using directml and NOT using ROCm (like I use in Linux). It is about 40% of my normal it/s speed under similar circumstances. So it is a big decrease in performance. But. It works with everything I think. It is not limited to ONNX, and it should support all the fun stuff like inpainting controlnet etc as best I can tell. I have not played with those things on it in windows though so I will have to actually test to be sure. If someone absolutely wants all functionality and refuses to go to Linux. This is probably the best bet right now.
@@FE-Engineer i don't care about speed on windows, rather i need stability. i always run into vram issues on windows when using the fork of a1111. the only thing i want is a seamless experience, hence i dual boot ubuntu
not sure what you did differently from me, I followed verbatim and tried to run Flux with this set up. I keep getting a gpu device has been suspended. I also have 7900 xtx, 32 gbs of ram, ryzen 5900x. I've looked at drivers. There must be a better way?
@@FE-Engineer I see, thank you. 32GB of RAM, W10, 7900 XTX Nitro+. I've been experimenting with many CivitAI models. This stuff is painful. Edit: Oh yeah I do generate 4 images at a time, 512x512 each.
Sadly not working for me - it always gives out an error including: "UserWarning: The operator 'aten::count_nonzero.dim_IntList' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at C:\__w\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)" Do you maybe have a hint for me?
After activating comfyui I write git clone and the url and I get this: fatal: destination path 'ComfyUI' already exists and is not an empty directory... any solution?
I actually made a start.bat inside the comfyUI folder with the following code:

@echo off
call conda activate comfyUI
call python main.py --directml

The issue is that if you don't have a prompt open and try to run it without the double call, it does not work. Adding the second call ensures that the python command runs inside the prompt that opens when clicking on the file. Hope this helps!
Hand-to-hand combat to get this to work. I had it working a few times, but it crashed a lot. But today I have no idea how to start it again. I may not have used the correct name?
Everything works great except I can not get Reactor Face swap working. Every time I load a face_restore_model and run it I get this error message "RuntimeError: Input type (torch.FloatTensor) and weight type (PrivateUse1FloatType) should be the same or input should be a MKLDNN tensor and weight is a dense tensor"
Hi :) My setup is a Ryzen 5 7600X and an AMD RX 6800 GPU. I've followed your tutorial and no matter which model I'm running, I get the error "Could not allocate xxxxx bytes. There's not enough GPU video memory available". It is weird because I've got 16gb of vram. Another thing: when I launch comfyui with your command (python main.py --directml), the terminal shows:

total vram : 1024 mb, total ram 31963 mb
Set vram state to : normal_vram

Is there a way to make comfyui use the ram instead of the vram (which is ridiculously small in my case)? Thanks again for your video! It helped me install all the setup!
@@FE-Engineer Could you let me know if it works? I am really wondering. Comfyui does work with CPU only but that is taking way too long and it stresses my PC to the max.
Hi, I do not know how you can install ComfyUI in another path, but you can redirect the model folders used by ComfyUI by editing the "extra_model_paths.yaml.example" in the main directory.
i dont know what i am doing wrong, I have tried 4 times, the last time I even fresh installed windows first. these tutorials never work for me and I cant figure out what I am doing wrong.
Hello FE-Engineer! I ran into this problem: UserWarning: The operator 'aten::count_nonzero.dim_IntList' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. How do I add the support? Or is it my model's error?
I just cannot get this running on my 7800XT "Error occurred when executing KSampler: The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action." And then a bunch of script errors. Any ideas?
Yikes. I apologize. I have not seen that error come up. Were you maxing out your vram? Try running it with something up to monitor your vram. My guess is you might have tried to generate an image that was too big for it to handle. This is a big guess though. Monitor system resources and redo and see how it looks. Might give you some clues. Also. Do something like barebones stock. Stock prompt. Stock model. Stock pipeline setup. Stock image dimensions. Make sure you can run it with everything being as controlled as possible to see if you are changing something somewhere that is causing these weird errors. That's my suggestion.
is this possible without adding miniconda to path? i dont wanna screw my computer up lol UPDATE! I did get it running w/o having to add to path...had to downgrade numpy tho...
@@Sereath i don't remember exactly, i honestly got it working, played around with it a bit and dropped it, but i wanna say 7.x or 2.x, whichever sounds more relevant...
Hello. I have a question. Would there theoretically be any benefits in running comfyUI + Zluda + SDXL as an alternative to A1111 + Zluda + SDXL? And is it possible to run Zluda and ComfyUI?
@@FE-Engineer Ye i think you may be right because I tried SD-next yesterday and it was similar to A1111 performance-wise. Anyway thx for your answer and keep it up👍 your content is very helpful and tutorials well-described.
should also be able to use python/cmd instead of conda. but is there a way to just create a bat file to launch, instead of constantly needing to do all the steps yourself? and with ''low/med'' VRAM etc.? when I try a 768x768 SDXL generation on my 6750xt it says not enough VRAM.
@@mizuhahato6215 I don't use Comfy anymore, went back to the A1111 directml fork with the Lobe theme, and don't have as many issues as before. Of course sometimes, depending on prompt length, I get a not enough VRAM error, but it works.
I know I'm late to the party, as usual. Lol. I have a question: why an AMD GPU and not NVIDIA? Everything I have read about LLMs says Nvidia is the way to go. I'm considering a GIGABYTE AORUS GeForce RTX 3060 Elite 12G (REV2.0); trust me, I would prefer an AMD GPU over an Nvidia. What else should I consider to run this model?
Nvidia cards mostly have less vram than AMD and cost more. When I built my computer, the only Nvidia cards with 24GB of vram were the 3090s and the 4090. The 3090s, while still very fast, were 3 years old and cost ~$1000. And 4090s might randomly melt the power connector and cost twice that of a brand new 7900xtx.
Supposedly the new AMD CPUs might have AI capabilities baked into the CPU. I'm very interested to see how those do. Also, compared to the 3060, which is now maybe 4-6 years old, you could consider an AMD 7800 perhaps? I haven't been looking at gpus much recently. Plus newer AMD gpus might not be too far off. Same for Nvidia 5000 series gpus. Depending on your timing needs and budget it might be worth waiting potentially.
I have a 7900xtx too, and would prefer sticking to Windows for various reasons, but I'm not a stranger to Linux. What have you found to be the best way to set up AI image generation for this specific card? I think you mentioned some other stuff.
So notes: SDXL will not work to my knowledge. Shark from nod.ai is easy but is miserable to use. SD.Next is pretty straightforward but is somewhat limited overall. I think this might be the best overall right now. Automatic1111 directML is good but converting to ONNX is irritating. You give up samplers in ONNX but get a big speed boost. Comfyui is comfy ui. Complicated to use. DirectML will net you about ~6 it/s and it's a little finicky. I think that's the most common current windows rundown.
Ah thanks on the advice. Been trying shark to little success. Been having 1111 DirectML generating a few images but I am 90% sure it is actually using the CPU to do the generations instead of the GPU... practically zero GPU activity and my CPU stats and fan speed is high. So try and get SD Next working instead sounds best? What about under Linux? 1111 with the ROCM it can just work out of the gate, right? @@FE-Engineer
@@Knightedskull On Linux usually everything works. SDXL, ControlNet, LoRA, Dreambooth training, LLMs, Whisper, Coqui TTS. Also the performance is usually pretty good. I get about 18 it/s on a 512px image, up to 23 it/s if I turn off medvram. With LLMs it does over 60 T/s with a 7B q4 model, on a 7900xt. It can also run everything at once while playing a game and recording at the same time. I made a video playing Minecraft with shaders while trying to follow instructions from an AI that has a voice and a 3D VTuber avatar. I managed to get everything to run at once on a 7900xt. Though seriously, an XTX would be better because it was tight on the vram side.
So no, it is not faster. With this on comfyUI, it is running DirectML, but NOT running ONNX. Which on Automatic1111 you can do the same thing by not using the -onnx command when you start it. You do take a massive performance hit though. So comfyui overall is similar, massive performance hit because it is not using ONNX.
I tested controlnet and it was working for me. I did not test inpainting, honestly my comfyui skills are pretty terrible, I have not taken the time to learn the spaghetti mess. :-/
I have this running on Nobara Linux and it works fine. I got this set up in Windows with this tutorial and everything installs fine, but when I try to generate an image I get a "Not enough VRAM/Memory" error. I am not sure what I am doing wrong. I am running a 6700XT and a 5700XT. I am not sure if it is referencing the wrong GPU or what. Anyway, this was a good tutorial and easy to follow.
Try using the medvram flag. See if that helps. Although if you are running dual video cards you might run into weird issues. I have never tried with dual video cards and unfortunately in that type of instance I don’t know how much help I can honestly provide :-/
@@FE-Engineer Hi, I'm totally new to this thing and I ran into this low vram thing too. Everything installs fine, the UI is on but it says I have only 1GB and I can't render even a tiny (128x128) image. I have an RX570 with 4GB of Vram and 32GB of RAM on MB, that should be enough. Where and how should I use that "medvram" flag? Thanks.
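For anyone else hunting for where that flag goes: --medvram is an Automatic1111 flag. Assuming a reasonably current ComfyUI, the closest equivalents are --lowvram and --novram, and they go on the same launch line you already use:

```shell
# A1111-style --medvram does not exist in ComfyUI; these are the nearest options.
python main.py --directml --lowvram   # aggressively offloads model weights
# or, if that still runs out of memory:
python main.py --directml --novram    # treats the GPU as having effectively no VRAM
```

Note this only changes how weights are offloaded to system RAM; it won't fix DirectML misreporting the card as 1GB.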
This is the first time anyone has mentioned this. My guess would immediately be your python version perhaps. Or the install did not finish or had a problem.
Thanks for the video! Everything installed ok and I have loaded a checkpoint however when I queue prompt everything seems to work but the image generated is just blank black screen? Any ideas what is wrong? Thanks for your help!
So I have updated my GPU and installed a new model. If I load comfyui up fresh from the mini conda terminal I can generate an image and it works. However, if I try to swap to a different model in comfyui I will get a blank image again.
It might be the model. Some get a bit weird. Try the standard 1.5 model and make sure that works reliably. Some of it is a bit of trial and error. Some need a vae. Some need clip skip. Unfortunately it’s hard to say why this is happening exactly.
Half the price for a better GPU in many ways (vram specifically). 4090s were melting power connectors when I bought it. And 3090s were 3 years old and the same price as a top of the line amd card. I wanted to run AI stuff so vram was pretty critical. And dropping off of the 3090 or 4090 meant big drops in vram on Nvidia cards unless you got an A series card.
You are now the only channel I watch when it comes to stable diffusion tutorials and similar. I can finally generate awesome pictures of frogs in suits on my amd gpu.
😂😂 yessss. I love it. I am hoping to get some stuff in my website before too long for people to either generate or post SD images. And I would absolutely love to see frogs in suits that you generated.
Thank you so much for watching!
I second this.. many suited frogs are being made. Been trying to get this to work for ages and this by far seems to be the easiest set up. Thanks!@@FE-Engineer
You are very welcome! I’m really glad you got it working and are having fun with it!
Still working on the website and another site. But will be getting the ability for folks to send up photos or images from AI sometime in the not too distant future!
Exactly what i was going to post!
Thank you, I have followed about 20 different tutorials and none of them have worked. Your video was very easy to follow and worked perfectly on my 7900xtx. Again thank you, you have put to bed hours of turmoil from me trying to get this to work. Excellent video.
You are very welcome! I’m glad it worked and fixed your problems! Thank you for watching!
How to install the ComfyUI Manager?
Thank you. The fact that I don't have to convert the models is a bonus. I know it's slower in image generation but fewer headaches for sure. Great job.
You are welcome. Thank you so much for watching! I am waiting for ROCm on windows and then everyone can basically do everything with all of the different tools out there without really any compromises. Good speed and good support. One of these days….
Lost me at 2:23 - ModuleNotFoundError: No module named 'safetensors'
same error
solved it with the 'pip install -r requirements.txt' command
@@Remzicaliskan Thanks for this, was stuck here, too, and this fixed it.
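Piecing together the steps people mention throughout this thread (a conda env on Python 3.10, git installed via conda, torch-directml, then the requirements file), the whole setup is roughly the following — this is a reconstruction from the comments, not a transcript of the video, and the repo URL is the standard ComfyUI one:

```shell
conda create -n comfyui python=3.10 -y
conda activate comfyui
conda install -c anaconda git -y        # git isn't bundled with conda by default
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install torch-directml              # Windows + AMD: torch with the DML backend
pip install -r requirements.txt         # pulls in safetensors and friends
python main.py --directml
```

The "No module named 'safetensors'" error above is what you get when the requirements.txt step is skipped.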
Hey man, I want to thank you from the bottom of my heart, I was having trouble with this waiting time, easy and clear tutorial to follow, I complete the process now in 10 seconds, it was around 1m30, thank you for this Christmas gift ! Merry Christmas
Merry Christmas and happy holidays to you and your family as well. Thank you so much I am glad it helped! :)
Thank you! FE-Engineer, I can run it now with my RX580 on win10,Good for you and Happy new year!
I’m glad it helped! Thank you for watching and the kind words! Happy new years to you and your family as well!
Does someone know where i can tell comfy to use more than 1024 ? i have 12gb
Same here, but I have 16gb.
same
Wow that really helped! Not as fast as yours since I had an error that shifted some of the work back to the CPU. But 3 mins is much better than 50! The error was
The operator 'aten::count_nonzero.dim_IntList' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications.
And DirectML is saying I've only got 1GB VRAM, but I'm also using a 7900XTX
@@ajphilippineexpat Same error
@@ajphilippineexpat same for a friend of mine as well. Anyone knows how to fix this?
insanely straightforward, straight to the point and no questions left remaining. Thank you very much for this tutorial
You are very welcome thank you for watching!
Haven't seen any updates on the comfyui github page about running comfy on windows like this. 😮 But happy to know this finally works. AMD is also cheaper, and AMD holders don't need to switch to Nvidia.
It worked for me and was mostly straight forward :)
git doesn't come standard with Anaconda, so you have to remember to install it: conda install -c anaconda git
How to fix comfyui detecting only 1gb vram
Same problem here.
Does this run quicker than stable-diffusion-webui-directml on Windows using a 7900 XTX with zluda? I'm trying to avoid using shark because it takes up too much space and takes too long.
No it does not.
Auto1111 with zluda
Or
Sd.next with zluda will be the fastest most likely.
@@FE-Engineer the auto1111 works for me, but sd.next gives me some issues
rx6650xt (in a SDXL workflow) ~22s/iteration with dreamshaperXL turbo v2, ~12s/iteration with SDXLRefiner v1.0_0.9vae and ~ 11s/iteration with SDXLRefiner v1.0
After hours of various tutorials this was the only one that worked for me. Thanks!
You are welcome! Thanks for watching!
Anyone know fixes for
RuntimeError: Device type privateuseone is not supported for torch.Generator() api.
?
I have never seen that. What were you doing when you got that error?
Same issue, when it hits ksampler
Edit line 31:
C:\Users\[USER]\miniconda3\envs\comfyui\Lib\site-packages\torchsde\_brownian\brownian_interval.py
From:
generator = torch.Generator(device).manual_seed(int(seed))
To:
generator = torch.Generator().manual_seed(int(seed))
Restart comfyui from scratch
@@double.parker same here Error occurred when executing KSampler:
getting error on queue. Could not allocate tensor with 10485760 bytes. There is not enough GPU video memory available!
Size down your image gen. Try doing like a 512 by 512 to start and make sure stuff is running right first.
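The advice to drop to 512x512 matters because memory scales with pixel count — doubling each side quadruples every intermediate tensor. A rough back-of-the-envelope sketch (illustrative numbers only, not ComfyUI's actual allocator behavior):

```python
def image_tensor_mib(height, width, channels=3, dtype_bytes=4):
    """Rough size of one float32 image tensor, in MiB."""
    return height * width * channels * dtype_bytes / 2**20

# Going from 512x512 to 1024x1024 quadruples the pixel count, and every
# intermediate tensor in the pipeline (latents, attention maps) grows with it.
print(round(image_tensor_mib(512, 512), 1))    # 3.0
print(round(image_tensor_mib(1024, 1024), 1))  # 12.0
```

Attention layers can grow even faster than linearly with pixel count, which is why a modest bump in resolution can blow past available VRAM.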
Thank you for your video FE-Engineer! It works so perfectly!
You are very welcome! Thank you for watching!
Thanks a lot! I've been eagerly awaiting this video. I wish you happy holidays!
You are very welcome! Sorry it took a while I’ve been debating about what and how much to include and do for comfyui. And I wanted to make sure it could work on windows without ROCm as a requirement.
Legend status, thank you. Linux was like pulling teeth and didn't want to work properly. Thank you kindly sir
does not work in my case. i was able to install everything according to your very clear and nice instruction. no problems. i even can start comfyui. i did however copied some of the checkpoints i already had for my cpu version. i can select the checkpoint and as soon as i hit the generate button i have a bluescreen when it walks to the last node....
i have no idea what i am doing wrong
I followed your tutorial and there were no errors in install but when I "Queue prompt" I get:
The operator 'aten::count_nonzero.dim_IntList' is not currently supported on the DML backend and will fall back to run on the CPU.
I have AMD XT 7900 XTX.
how do we add the manger?
error import safetensors.torch
ModuleNotFoundError: No module named 'safetensors'
bro, thank you so much. i saw many tutorials, but this is the best
Oh sweet baby Jesus, that is so much faster than running off my processor. Thank you so much.
After pip install torch-directml i got ERROR:
(base) C:\Users\user\ComfyUI>pip install torch-directml
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement torch-directml (from versions: none)
ERROR: No matching distribution found for torch-directml
How to fix who know? Thanks!))
Sounds like a python version issue, or potentially a permissions issue.
I would make sure you have things setup appropriately and using the correct python version.
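One concrete thing worth checking: "No matching distribution found" is exactly what pip prints when no wheel matches the running interpreter, and torch-directml only publishes wheels for 64-bit CPython in a limited version range — 3.8 to 3.10 around the time of these comments; treat the exact bounds as an assumption. A small self-check along those lines:

```python
import sys

def torch_directml_compatible(version_info=sys.version_info, maxsize=sys.maxsize):
    """Heuristic check: torch-directml wheels assumed to target 64-bit 3.8-3.10."""
    is_64bit = maxsize > 2**32           # 32-bit builds have no wheels at all
    return is_64bit and (3, 8) <= tuple(version_info[:2]) <= (3, 10)

print(torch_directml_compatible((3, 10, 12), 2**63 - 1))  # True
print(torch_directml_compatible((3, 12, 0), 2**63 - 1))   # False
```

The "(base)" in the prompt above also suggests the conda env with the pinned Python wasn't activated, which would produce the same failure.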
Thanks for this video.. now I can run comfyui on my amd gpu, your video is the easiest tutorial to follow 👍👍👍
Thank you for the tutorial. For some reason it won't let me cd like at 1:40 — "The system cannot find the path specified." is the error.
it's sad, every time i try to generate something i receive the message ''Could not allocate tensor with 1221853184 bytes. There is not enough GPU video memory available!'' amd cpu + amd Radeon 6550 xt
Can I just use python 3.10.6? I need it for reforum
hello mate. so all the steps went well but it's not working. when i press queue prompt, it says reconnecting and then gives me an unknown error. also, the archive i downloaded gives me errors and it's not extracting. any advice?
I am using StabilityMatrix to run ComfyUI, and it only detects 1GB of VRAM and i have 12GB. How can i solve it? GPU: RX 6700 XT XFX
well deserved sub and like. tried a while back for days on end with no luck and this straight up worked
also can run from bat file if conda is on path, i followed this tutorial:
th-cam.com/video/zFKD2Q9m_nQ/w-d-xo.html
Yea. For some reason my conda was added to path but was being finicky. I ended up more recently largely ditching conda because for videos I end up installing stuff a lot and conda overall was becoming a bit more hurtful than helpful overall in my specific scenario. :-/
Thank you so much! :) I’m glad this helped you!
Hello, I'm using the 7900xt and my speed is only 2it/s, can you show me how to increase the speed? Because your 7900xtx is twice as fast as mine
Ive managed to get it up and running, but every time i try generate an image, my PC crashes and reboots.
What could be causing this?
i seem to be getting [error executing CheckpointLoaderSimple] Torch not compiled with CUDA. followed all the steps, made sure im using (python main.py --directml). Is downloading the latest miniconda an issue? even though i typed python 3.10.12. And i made sure i clicked on path when installing miniconda
*tried it with miniconda 10 and still same error
Code changed. I’m looking into it. If you roll back the git code it will work.
@@FE-Engineer ah, that explains my problems.
Doesn't work. It uses normal ram instead of GPU. Can you help?
They must have changed something in the code. I’ll take a look.
and how to update comfyUI this way? ty
it works - Thank you so much for saving my AMD card
Great video! Happy Xmas!
Thank you so much. Merry Christmas and happy holidays to you and your family as well!
Spend time with your loved ones!
Lost me at "add directory to path"
same here, I still tried it and now im stuck at "The system cannot find the path specified." when i tried "cd comfyui"
@@spacetart add .exe at the end of the "miniconda" name in the directory path at installation.
should solve it
@@ed1k37 I managed to backtrack it, it was in a different file folder, but now I'm stuck at something else: when I run "python main.py --directml" it shows me "[Errno 2] No such file or directory"
Hey! Please help, I have this error (after python main.py --directml): ImportError: DLL load failed while importing torch_directml_native
Like several others, I am having an issue where comfyui is falling back to using my CPU instead of my GPU.
The error i get is "The operator 'aten::count_nonzero.dim_IntList' is not currently supported on the DML backend and will fall back to run on the CPU."
Any chance you know how to fix this?
Sounds like it’s using direct ml. I’m trying to see if I can get it to use zluda instead
How to install Miniconda and add directory to path?
Can you make a video to Stable Video diffusion with ComfyUI on AMD GPU?
Always get this:
Error occurred when executing KSampler:
input must be 4-dimensional
Thank you! Great Video
hello, i have this problem : CondaError: Run 'conda init' before 'conda activate'
And what can be done to make comfyui work with stability matrix?
Okay bro, we need an image-to-video or text-to-video instructional. Pretty please.
Can it work with a 4GB AMD GPU?
I don’t know offhand. Probably?
Doesn't work! I get an error in KSampler.
They seem to have updated ComfyUI and this is no longer working :/
Works like a charm... I just did it myself and I'm rendering pictures while I write this.
Love the tutorial, but I followed all the steps and when executing "python main.py --directml" I get the error "module 'torch' has no attribute 'Tensor'".
Tried searching for the issue but found nothing...
hi, I am encountering a problem.
The installation went well and I managed to launch ComfyUI, but once in front of the panels it is not possible to generate any image, because the panel that loads the checkpoint models does not work.
It shows "ckpt_name null", and when I interact with it, it does not open a pop-up with the list like in the video but goes to "ckpt_name undefined", and then it is no longer possible to interact with the model selection line, even though I have two models in my models folder.
I don't understand what I did wrong. Thank you for answering me.
According to the instructions, ComfyUI gets installed on drive C. Check drive C and it will be there. The checkpoints must be saved on drive C as well.
How do I turn your conda commands into a .bat file so I don't have to type them every time I start ComfyUI?
'' ModuleNotFoundError: No module named 'safetensors' '' error after entering 'python main.py --directml'
Uninstall torch and reinstall it, that will fix it... I had the same problem ✌ (uninstall with: pip uninstall torch), then use the line from the video to reinstall.
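Spelled out, that reinstall sequence would look something like this. This is just a sketch: it assumes the DirectML build from the video is the torch-directml package, so double-check the exact install line against the video.

```shell
# Remove the torch build that is causing the missing-module errors
pip uninstall -y torch

# Reinstall the DirectML-enabled build (assumed here to be the
# torch-directml package; confirm against the line shown in the video)
pip install torch-directml
```

Run both commands inside the same conda environment you launch ComfyUI from, otherwise the fix lands in the wrong Python install.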
When I try to clone the repo it says git is not a recognized command, even though I installed it.
Hey, thanks for the video, but I get an error saying AssertionError: Torch not compiled with CUDA enabled. Know how to fix it?
Sounds like you aren’t using directml torch to me?
Hey try this: python main.py --directml
Otherwise I had the same error!
Comfy tried to call torch.cuda.current_device(), and it could not of course:
[...]
File "G:\AI\ComfyUI\comfy\model_management.py", line 83, in get_torch_device
return torch.device(torch.cuda.current_device())
File "G:\AI\VirtualEnvs\win_comfy\lib\site-packages\torch\cuda\__init__.py", line 674, in current_device
_lazy_init()
File "G:\AI\VirtualEnvs\win_comfy\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
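For anyone hitting this assert, a quick sanity check is to list which torch packages the active environment actually has. A sketch for Windows cmd, assuming a standard pip setup: if only a plain CPU or CUDA wheel shows up and torch-directml is missing, the --directml flag has nothing to use and ComfyUI falls back to looking for CUDA.

```shell
# List installed torch-related packages in the active environment;
# torch-directml should appear for the --directml launch flag to work
pip list | findstr /i "torch"
```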
Now when I run it like this, the checkpoints are undefined. Anyone else have this problem?
Because running on the CPU works fine.
Do you know if ComfyUI has the same problem with inpainting? The Automatic1111's couldn't inpaint with directml and it was only solved by using the commands:
"--no-half --precision full --no-half-vae --opt-sub-quad-attention --opt-split-attention-v1".
ComfyUI doesn't have these exact commands and like the extension "sd-webui-inpaint-anything" without them, the Face Detailer and other inpaint segments from "ComfyUI-Impact-Pack" throws the error:
"The size of tensor a (0) must match the size of tensor b (256) at non-singleton dimension 1"
Honestly. I am not sure. I thought since it basically has a different overall architecture that it might not have the same fallbacks as using the directML fork of automatic1111.
have you found the solution for comfyui inpainting? Just find it really buggy with the result
@@HeinleinShinobu I found that Face Detailer from Impact Pack to be the best automated inpainting since it doesn't really search for CUDA (although if used with SAM, they must be loaded with CPU). While for manual masking, using ControlNet as auxiliary or using dedicated inpainting models was the only way to not get bad results
@@jameshenry347 I use SAM too, but for some reason when I click the area and click detect, it doesn't do its thing. I look at the cmd prompt and it has lots of errors which I don't understand at all. Haven't tried Face Detailer yet.
@@HeinleinShinobu Well, when I used with Face Detailer, the SAM Loader node would always throw fatal errors if SAM was processed with the "auto" or "GPU" option and would only work with the "CPU" one, but never tried manually, so I don't know if the processing is the same way. I imagine it has to be CPU forced someway.
I'm getting this
Warning torch.load doesn't support weights_only on this pytorch version, loading unsafely. (among other things) and then a big error in comfyUI when it hits the ksampler. Maybe I need a newer version of python or something from the video? (I have no idea about python, I just followed all the instruction :D )
also not working for me, i get an error message "shape '[77, -1, 77, 77]' is invalid for input of size 5929" with a huge stack trace, no idea what it means
Yea. Looks like I need to make a new video it seems
@@FE-Engineer can you show which version you use? so it's possible to reproduce the exact things. i only have 7900 xt
thanks for the video. hows comfyui compared to a1111 for amd gpus?
For me. On windows. Using directml and NOT using ROCm (like I use in Linux). It is about 40% of my normal it/s speed under similar circumstances. So it is a big decrease in performance.
But. It works with everything I think. It is not limited to ONNX, and it should support all the fun stuff like inpainting controlnet etc as best I can tell.
I have not played with those things on it in windows though so I will have to actually test to be sure.
If someone absolutely wants all functionality and refuses to go to Linux. This is probably the best bet right now.
@@FE-Engineer i dont care speed in windows rather i need stability. i always run into vram issues on windows when using fork of a1111. only thing i want is seamless experience hence i have dual boot ubuntu
amazing video! thank you for this straightforward guide :)
You are welcome! Thank you for watching and for the kind words. :)
Not sure what you did differently from me. I followed it verbatim and tried to run Flux with this setup. I keep getting "a GPU device has been suspended". I also have a 7900 XTX, 32 GB of RAM, Ryzen 5900X. I've looked at drivers. There must be a better way?
Is it normal for it to take all your system RAM + the page file as well? On top of using all the VRAM from my 7900 XTX; 24GB of vram. 💀
From my testing no. That is not normal. But it will depend on how much you have and the gpu and the model and your specific settings…
@@FE-Engineer I see, thank you.
32GB of RAM, W10, 7900 XTX Nitro+. I've been experimenting with many CivitAI models. This stuff is painful.
Edit: Oh yeah, I do generate 4 images at a time, 512x512 each.
Sadly not working for me - always gives out an Error including
___________________
UserWarning: The operator 'aten::count_nonzero.dim_IntList' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at C:\__w\1\s\pytorch-directml-plugin\torch_directml\csrc\dml\dml_cpu_fallback.cpp:17.)
___________________
Do you maybe have a hint for me?
Strange path... uninstall torch and properly reinstall it into Python.
After activating comfyui I run git clone with the URL and I get this:
fatal: destination path 'ComfyUI' already exists and is not an empty directory... any solution?
Delete your existing directory and everything in it. You already have one cloned.
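On Windows cmd that would look roughly like this. A sketch: the URL is the standard ComfyUI repository, and rmdir /S /Q deletes the folder and everything in it, so make sure nothing you need (like downloaded models) is still inside first.

```shell
# Remove the old, non-empty clone entirely
rmdir /S /Q ComfyUI

# Clone a fresh copy and move into it
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
```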
Is it normal for ComfyUI to allocate only 1GB of total VRAM? I tried using Flux dev from ComfyUI but always end up with a "GPU ran out of memory" error.
Hi, great vid, but is there a way to change the install location? Because for me it installed in System32 😅
I actually made a start.bat inside the comfyUI folder with the following code:
@echo off
call conda activate comfyUI
call python main.py --directml
The issue is that if you don't have a prompt open and try to run it without the double call, it does not work. The second call ensures that the python command runs inside the prompt that opens when you click the file. Hope this helps!
Hand-to-hand combat to get this to work. I had it working a few times, but it crashed a lot. And today I have no idea how to start it again. Maybe I didn't use the correct name?
Activate Conda environment.
Run the start script.
You can run conda list to see all conda environments that you have. Maybe that will help?
Got an RX 6600 and can't get it running. It keeps telling me I don't have enough VRAM. Any ideas that could help?
Is it when you try to load a model?
No module named: 'torch_directml'
You are probably not using the right python version.
Something in your setup is different…
Everything works great except I can not get Reactor Face swap working. Every time I load a face_restore_model and run it I get this error message "RuntimeError: Input type (torch.FloatTensor) and weight type (PrivateUse1FloatType) should be the same or input should be a MKLDNN tensor and weight is a dense tensor"
Might be a problem with torch and directml would be my guess.
Hi :)
My setup is Ryzen 5 7600x and GPU AMD RX 6800. I've followed your tutorial and no matter which model im running, I get the error "Could not allocate xxxxx bytes. There's not enough GPU video memory available"
It is weird cause I got 16gb of vram..
Another thing: when I launch comfyui I use your command : python main.py --directml and in the terminal I see : total vram : 1024 mb, total ram 31963 mb
Set vram state to : normal_vram
Is there a way that comfyui uses the ram instead of the vram (which is ridiculously small in my case)
Thanks again for your video ! It helped me install all the setup!
How do we get the Manager working on AMD?
My models are not showing up, even though everything is set up the same as yours. How can I fix that? The model I'm using is SDXL and the LoRA is Pixel Art XL.
Or should ComfyUI be opened before using Miniconda?
I haven’t tried using Lora’s and sdxl on windows with comfyui. I’ll see if I can get it running.
@@FE-Engineer Could you let me know if it works? I am really wondering. Comfyui does work with cpu only but that is taking way to long and it stresses my pc to the max.
Is there any way to install the Comfyui folder in an external hard drive doing this method?
Hi, I do not know how you can install ComfyUI in another path, but you can redirect the model folders used by ComfyUI by editing the "extra_model_paths.yaml.example" file in the main directory.
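For reference, the usual approach is to rename extra_model_paths.yaml.example to extra_model_paths.yaml and point base_path at the drive where the models live. A minimal sketch with a placeholder drive letter and folder layout; the section and key names should follow the template already inside the .example file:

```yaml
comfyui:
    base_path: D:/AI/models/
    checkpoints: checkpoints/
    loras: loras/
    vae: vae/
```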
I don't know what I am doing wrong. I have tried 4 times; the last time I even fresh-installed Windows first. These tutorials never work for me and I can't figure out what I am doing wrong.
Hello FE-Engineer! I ran into this problem:
UserWarning: The operator 'aten::count_nonzero.dim_IntList' is not currently supported on the DML backend and will fall back to run on the CPU. This may have performance implications.
How do I add support for it? Or is it an error with my model?
Torch settings:
Using directml with device:
Total VRAM 1024 MB, total RAM 32695 MB
pytorch version: 2.3.1+cpu
Set vram state to: NORMAL_VRAM
I just cannot get this running on my 7800XT
"Error occurred when executing KSampler:
The GPU device instance has been suspended. Use GetDeviceRemovedReason to determine the appropriate action."
And then a bunch of script errors. Any ideas?
Yikes. I apologize. I have not seen that error come up. Were you maxing out your vram? Try running it with something up to monitor your vram.
My guess is you might have tried to generate an image that was too big for it to handle. This is a big guess though.
Monitor system resources and redo and see how it looks. Might give you some clues.
Also. Do something like barebones stock.
Stock prompt. Stock model. Stock pipeline setup. Stock image dimensions.
Make sure you can run it with everything being as controlled as possible to see if you are changing something somewhere that is causing these weird errors. That's my suggestion.
Can we use zluda with comfy ui like in your automatic1111 zluda video?
Is this possible without adding Miniconda to PATH? I don't wanna screw my computer up lol
UPDATE! I did get it running w/o having to add to path...had to downgrade numpy tho...
Hello,
do you remember which version you downgraded to?
@@Sereath I don't remember exactly. I honestly got it working, played around with it a bit, and dropped it, but I want to say 7.x or 2.x, whichever sounds more relevant...
@@willismiller7035 Thank you.
Why does the terminal say total VRAM 1024 MB when we obviously have more? I'm sorry, I'm new.
It’s ok. It is a misprint. Mine says that as well.
Hello. I have a question. Would there theoretically be any benefits in running comfyUI + Zluda + SDXL as an alternative to A1111 + Zluda + SDXL? And is it possible to run Zluda and ComfyUI?
I don’t imagine you would see any performance change overall.
At the moment I don’t know if comfyui can use zluda at all. Maybe?
@@FE-Engineer Ye i think you may be right because I tried SD-next yesterday and it was similar to A1111 performance-wise. Anyway thx for your answer and keep it up👍 your content is very helpful and tutorials well-described.
You should also be able to use python/cmd instead of conda.
But is there a way to just create a .bat file to launch it instead of constantly doing all the steps manually? And with low/med VRAM flags etc.? When I try a 768x768 SDXL generation on my 6750 XT it says not enough VRAM.
have you found a solution for this problem ?
i'm actually in the same situation
@@mizuhahato6215 I dont use Comfy anymore, went back to A1111 directml fork, with Lobe theme, and dont have as many issues as before, ofcourse sometimes, depending on prompt length I get net enough VRAM error, but it works.
@@ZeroNyte Ok ok, I just found that if you launch Comfy with --lowvram, it works fine.
(RuntimeError: Numpy is not available) Help pls
Check python version. Which python
@@FE-Engineer Same here, Python 3.12.
Nvm. The logs say 3.10.12 and it still says that Numpy is not available.
I know I'm late to the party, as usual. Lol. I have a question: why an AMD GPU and not NVIDIA? Everything that I have read about LLM's Nvidia is the way to go. I'm considering a GIGABYTE AORUS GeForce RTX 3060 Elite 12G (REV2.0); trust me, I would prefer an AMD GPU over a Nvidia. What would I also consider to run this model?
Nvidia cards mostly have less VRAM than AMD and cost more. When I built my computer, the only Nvidia cards with 24GB of VRAM were the 3090s and the 4090. The 3090s, while still very fast, were 3 years old and cost ~$1000. And 4090s might randomly melt the power connector and cost twice as much as a brand new 7900 XTX.
Supposedly the new AMD CPUs might have AI capabilities baked into the CPU. I'm very interested to see how those do. Also, compared to the 3060, which is now maybe 4-6 years old, you could consider an AMD 7800 perhaps? I haven't been looking at GPUs much recently. Plus newer AMD GPUs might not be too far off, same for Nvidia 5000-series GPUs. Depending on your timing needs and budget, it might be worth waiting.
@@FE-Engineer For me, its a cost thing, and also, my rig is considered a casual gaming rig. I'm running an I5 11400 CPU with 16 GB of RAM.
@@FE-Engineer With what you said about AMD GPU's, I'm now considering a Gigabyte Radeon RX 7600 XT with 16GB of ram. Thanks for the advice.
When I run python main.py --directml, it doesn't choose my GPU for some reason. I'm using a Vega 64, and I always run into not enough memory after that.
might want to try the low or med vram options?
@@FE-Engineer I'm pretty sure it says somewhere, but how do I do it?
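The VRAM options are just extra flags on the normal launch line. A sketch, using the flag names as ComfyUI defines them:

```shell
# Reduced-VRAM mode, stacked onto the usual DirectML launch command
python main.py --directml --lowvram

# Even more aggressive: keep as little as possible in VRAM
python main.py --directml --novram
```

Try --lowvram first; --novram trades a lot of speed for the smallest possible VRAM footprint.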
My RX 580 8GB shows 1GB?
I have an 7900xtx too, and would prefer sticking to windows for various reasons but not a stranger to Linux.
What have you found to be best way to go about setting up AI image generation for this specific card? I think you mentioned some other stuff.
So notes:
SDXL will not work to my knowledge.
Shark from Nod.ai is easy but miserable to use.
SD.Next is pretty straightforward but somewhat limited overall. I think this might be the best overall right now.
Automatic1111 DirectML is good, but converting to ONNX is irritating. You give up samplers in ONNX but get a big speed boost.
ComfyUI is ComfyUI. Complicated to use. DirectML will net you about ~6 it/s, and it's a little finicky.
I think that’s the most common current windows run down.
Ah, thanks for the advice. Been trying Shark to little success. I've had A1111 DirectML generating a few images, but I am 90% sure it is actually using the CPU for the generations instead of the GPU... practically zero GPU activity, and my CPU stats and fan speed are high.
So trying to get SD.Next working instead sounds best?
What about under Linux? A1111 with ROCm just works out of the gate, right?
@@FE-Engineer
@@Knightedskull On Linux, usually everything works: SDXL, ControlNet, LoRA, DreamBooth training, LLMs, Whisper, Coqui TTS. The performance is usually pretty good too. I get about 18 it/s on a 512px image, up to 23 it/s if I turn off medvram. With LLMs it does over 60 T/s with a 7B q4 model on a 7900 XT.
It can also run everything at once while playing a game and recording at the same time. I made a video playing Minecraft with shaders while trying to follow instructions from an AI that has a voice and a 3D VTuber avatar. I managed to get everything running at once on a 7900 XT, though seriously an XTX would be better because it was tight on the VRAM side.
@@Knightedskull Did you use --use-directml on the command line?
Yeah, I believe so, but I've switched back to Linux @@ubuu90
Is it faster than directml A1111? Inpaint, Controlnet, Adetailer etc. works?
So no, it is not faster. With this, ComfyUI is running DirectML but NOT ONNX. On Automatic1111 you can do the same thing by not using the -onnx command when you start it. You do take a massive performance hit though.
So ComfyUI overall is similar: a massive performance hit because it is not using ONNX.
I tested controlnet and it was working for me. I did not test inpainting, honestly my comfyui skills are pretty terrible, I have not taken the time to learn the spaghetti mess. :-/
@@FE-Engineer Thank you for your answer 🤝
Can you make stable diffusion videos with AMD cards on comfyui?
For windows or Linux?
@@FE-Engineer windows
I have this running on Nobara Linux and it works fine. I got this set up in Windows with this tutorial and everything installs fine, but when I try to generate an image I get a "Not enough VRAM/Memory" error. I am not sure what I am doing wrong. I am running a 6700 XT and a 5700 XT; I am not sure if it is referencing the wrong GPU or what. Anyway, this was a good tutorial and easy to follow.
Try using the medvram flag. See if that helps. Although if you are running dual video cards you might run into weird issues. I have never tried with dual video cards and unfortunately in that type of instance I don’t know how much help I can honestly provide :-/
@@FE-Engineer Hi, I'm totally new to this thing and I ran into this low vram thing too. Everything installs fine, the UI is on but it says I have only 1GB and I can't render even a tiny (128x128) image. I have an RX570 with 4GB of Vram and 32GB of RAM on MB, that should be enough. Where and how should I use that "medvram" flag? Thanks.
Hi, ERROR: Could not install packages due to an OSError: Missing dependencies for SOCKS support. Is there a good solution to this problem?
This is the first time anyone has mentioned this. My guess would immediately be your python version perhaps. Or the install did not finish or had a problem.
@@FE-Engineer Mine is a laptop with integrated graphics; that's what's preventing it from running on the GPU.
Thanks for the video! Everything installed ok and I have loaded a checkpoint however when I queue prompt everything seems to work but the image generated is just blank black screen? Any ideas what is wrong? Thanks for your help!
So I have updated my GPU and installed a new model. If I load comfyui up fresh from the mini conda terminal I can generate an image and it works. However, if I try to swap to a different model in comfyui I will get a blank image again.
It might be the model. Some get a bit weird.
Try the standard 1.5 model and make sure that works reliably. Some of it is a bit of trial and error. Some need a vae. Some need clip skip. Unfortunately it’s hard to say why this is happening exactly.
@@FE-Engineer thanks, i tried a different model and it worked :)
how does this work together with your AMD Zluda video from two months ago?
can i just use ComfyUI with that setup as well without redoing everything?
Just curious, why did you chose an AMD card?
Half the price for a better GPU in many ways (vram specifically). 4090s were melting power connectors when I bought it. And 3090s were 3 years old and the same price as a top of the line amd card. I wanted to run AI stuff so vram was pretty critical. And dropping off of the 3090 or 4090 meant big drops in vram on Nvidia cards unless you got an A series card.