I use the Flux.1 model on my MacBook Air 15" M2 with 8GB RAM, slowly but with gorgeous output, using DiffusionBee. Really impressive.
Could you please tell me how long it takes to generate a full HD Flux 1.1 Pro image on your Mac?
@@OpulentDreams-s2k One to two minutes...
@@marcostuppini7688 Even one minute is very long) It's better to create multiple glif accounts and generate there for free every 5-10 seconds.
Wow! I cannot thank you enough. I have been trying to install Flux since it came out with zero success. I even started paying to use it through Replicate. But then I saw this tutorial, deleted my whole ComfyUI setup, reinstalled, followed this tutorial, and it worked. Thank you so much for this tutorial, man.
My output is always a garbled, noisy mess, like hyper-JPEG compression; nothing is distinguishable from anything else.
Everything seems to work until I output something.
Do other models (Stable Diffusion based) work, or are they also noisy?
@@tech-practice9805 I also got the same noisy pic; it took 1700 seconds on my M1 Max 32GB.
+
Same problem here (on my M1 Mac Studio).
@@jaykaslo The problem also exists on a MacBook M2 Max with 98GB RAM.
After following your process, an error was prompted, like this: {TypeError: Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype.} What's happening? Can you help me?
Which FLUX model are you using? And what's the version of Torch?
@@tech-practice9805 torch version: 2.5.0.dev20240819
flux: flux1-dev-fp8.safetensors
@@tech-practice9805 Flux was used exactly as in your video tutorial, following your instructions step by step.
Same issue here.
Same :)
I'm on an M1 Ultra. The outputs are all dotted noise. Using Schnell.
What model file are you using?
@@tech-practice9805 Thanks for asking. Schnell fp8. I solved the problem by installing different versions of torch, torchaudio, and torchvision. I can't remember which versions at the moment.
@@tech-practice9805 He said Schnell. I have the same issue, so I tried the NF4 V2 model, but that didn't even work. I just found this video and am going to watch it.
I've resolved it! Downgrade ComfyUI's packages: torch to 2.3.1, torchaudio to 2.3.1, torchvision to 0.18.1.
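For anyone wanting to try the same fix, a minimal sketch of that downgrade with pip, run inside the Python environment ComfyUI uses (the versions are the ones reported above):
# pin torch/torchaudio/torchvision to the versions reported to work on MPS
pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1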
For Flux, it only took 7 seconds to generate a 1024x1024 image? On an M3 Pro? Please tell me that's true! 😮
Lol! More than 10 minutes! It's amazing!
Another question: have you tried the FLUX ControlNet and IPAdapter? Not sure whether the XLabs nodes can run on a Mac.
I have a MacBook Pro M2 Max with 32GB RAM. Could I ask why Flux rendering is SO SLOW on my computer? It can take 15 minutes to render a 512x512 image.
What version of PyTorch are you using? My outputs are not being decoded properly, and I've read somewhere that's possibly to do with PyTorch.
The torch version on my system is 2.4.0.dev20240326. PyTorch should work on Apple Silicon.
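If it helps with debugging, a quick one-liner (assuming you run it in the same environment ComfyUI uses) prints the installed torch version and whether the MPS backend is available:
# prints something like "2.4.0.dev20240326 True"
python -c "import torch; print(torch.__version__, torch.backends.mps.is_available())"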
Hi 👋 How about the M3 Max's unified memory (64GB, 128GB)? Does it really work with big generative AI image/video models? Big images? Longer video sequences? Is it more interesting than Nvidia RTX solutions? More efficient?
Yes, I heard the 128GB version is a blast for big models. But it's expensive; I don't have one to test.
So far, RAM will in no case be more efficient than VRAM; unless things change with DDR7+ RAM or something else new, this is how it is. GPU VRAM is better than CPU RAM and swap.
@@tech-practice9805 I dragged the example image from the Git page and ran it on my 128GB M3; it took 4:20 minutes. Quite long still. I'm wondering if there's something with either the Mac or ComfyUI that's holding back faster results. We shall see :D I don't know what an Nvidia solution would be for that generation time, but it's definitely going to be cheaper than a MacBook lol.
@@697_mac It can use its RAM (at least 75% of it) as VRAM, which makes it really nice for local AI and a surprisingly cheap solution. The problem is the bandwidth. A 4090 has a bandwidth of roughly 1000 GB/s, the M3 Max a bandwidth of 400 GB/s, so you can expect it to be about 40% the speed of a 4090 (which is still very okay).
I keep getting this error; what does it mean?
Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype.
What's your PyTorch version? It needs to be the latest version.
@@tech-practice9805 I think I fixed it? But now it says out of memory (I have 16GB), and DiffusionBee works fine, so I'll just use that.
Can we use a GPU for this?
For a MacBook? Apple Silicon uses the GPU automatically.
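For what it's worth, on Apple Silicon ComfyUI picks the MPS (GPU) device by itself when PyTorch supports it; there is also a flag to force CPU, which is much slower but handy for ruling out MPS bugs. A sketch, assuming the usual main.py launch:
# default launch: uses the Apple GPU via MPS when available
python main.py
# force CPU instead, only useful for debugging
python main.py --cpu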
Please help. Why do I get a "Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype." notification? How can I fix it?
Try to update to and use the latest PyTorch.
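A minimal sketch of what updating looks like with pip, assuming ComfyUI runs in its own Python environment (it's worth double-checking the nightly command against pytorch.org, since the index URL can change):
# latest stable release
pip install --upgrade torch torchvision torchaudio
# or the nightly/preview build, which often gets new MPS dtype support first
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu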
Thank you asia 🙏
I always get a "TypeError: BFloat16 is not supported on MPS" error, whichever model I use, on my MacBook Pro M3 Max 48GB. I am using the simple-to-use FP8 checkpoint version of Flux dev.
What's the PyTorch version? Try to use the latest one.
Thanks for your video, simple and to the point.
Could you let me know which app you're using to monitor CPU, GPU, and RAM usage in the video?
Many thanks! 🙏🏻
Thank you! The tool is called Stats: th-cam.com/video/USpvp5Uk1e4/w-d-xo.html
Thank you. I get this error, can you help me?
MPS backend out of memory (MPS allocated: 9.06 GB, other allocations: 384.00 KB, max allowed: 9.07 GB). Tried to allocate 27.00 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
Can you try adding 'PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0' before the 'python ...' command?
Could you explain a bit more? I didn't quite understand what you mean. Do you mean I should write this before the python command when I launch it?
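To make the reply above concrete: the variable goes on the same line, directly in front of the command you use to start ComfyUI (assuming the usual 'python main.py' launch):
# disables the MPS memory cap; use with care, it can make the whole system unstable
PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 python main.py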
I'm getting "KSampler - BFloat16 is not supported on MPS", how can I fix that?
Please try to update the PyTorch package.
Can it work on a Mac mini M2 with 16GB RAM?
Yes, I think it works.
Thanks for your video, that's what I need. Do you know whether a Mac Pro can run Kijai's fp8 model? It's only 11GB, but when I run the model, the error reports that Mac MPS cannot support the dtype.
Same here, have you solved this? Is it because Macs can only run the non-fp8 models?
Same here...
I tried Kijai's fp8 model and got the same error. I guess it's not supported for now. But the larger fp8 file from the ComfyUI page runs without issues.
I have a Mac Studio M2, but it's not nearly as fast as yours with Flux. What changes did you make for it to be so fast?
So how about the M3 Max's unified memory (64GB, 128GB)? Does it really work with big generative AI image/video models? Big images? Longer video sequences? Is it more interesting than Nvidia RTX solutions? More efficient?
@@medboussouni7289 I have an M3 Max 128GB. There are still output errors if you are running Flux on ComfyUI. I tested various torch and Python versions.
@@medboussouni7289 I have an M3 Max 128GB and it does not work properly.
Wow, it's slow. I thought 2 min 30 sec on my cheap $680 laptop was slow.
I have an M3 Max 48GB and the output is very noisy. What am I doing wrong? I'm using flux1-dev-fp8.safetensors.
What's your PyTorch version? You can try using the latest version.
@@tech-practice9805 I've downgraded to 2.3.1 and everything is fine now.
@@ЦыхраЦыхра Python 2 performs even better than 3?!
I gave up on Flux for now. I have an M3 Max 128GB. I don't like having to install and uninstall PyTorch versions all the time.