24GB of VRAM isn't needed anymore. It's even possible with 12GB if it's shared with the CPU (but you need 32GB of RAM).
Most people don't know, but the Florence2 model is very versatile and can be used for OCR. And OCR of handwritten text too, which is pretty hard to do.
It’s pretty versatile - especially for the size!
@@NerdyRodent I'm curious, have you played with the optimizer? On CivitAI, in the settings for training we have Adafactor and Prodigy (I believe I saw it in the past in Kohya).
@@TransformXRED So far AdamW has been fine for me, but I'm still creating a variety of datasets to test with 🫤
Any good tutorials for OCR?
@@ayrengreber5738 I don't know.
But you can pretty much use the same workflow as this video; you just change the settings of the Florence2Run node to OCR. There are two OCR options, try both.
I didn't try to OCR a whole folder. I just use a load image node, the load and run Florence2 nodes, and a node to display the text.
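For anyone who wants the same OCR outside of ComfyUI, here is a minimal sketch using Florence-2 directly through Hugging Face transformers; the model ID and the two task prompts follow the public Florence-2 model card, and the input file name is hypothetical.

```python
# Minimal sketch of Florence-2 OCR via Hugging Face transformers, outside ComfyUI.
# Assumes the microsoft/Florence-2-large checkpoint and its documented task prompts;
# "handwritten_note.png" is a hypothetical input file.
from transformers import AutoModelForCausalLM, AutoProcessor
from PIL import Image

model_id = "microsoft/Florence-2-large"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("handwritten_note.png").convert("RGB")
task = "<OCR>"  # the other OCR mode is "<OCR_WITH_REGION>"

inputs = processor(text=task, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=512,
    num_beams=3,
)
raw_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
result = processor.post_process_generation(raw_text, task=task, image_size=image.size)
print(result[task])  # plain extracted text for the <OCR> task
```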
It’s exciting to see how Flux support is developing so rapidly! Between Lora training and ControlNet, we’re just about to the point where we can bid farewell to SD entirely.
True, just like what happened to me. My SSD, where all my SDXL and SD1.5 checkpoints and LoRAs were stored, became corrupted. But instead of feeling too bad about it, I thought: it's okay, I can do much better using Flux soon anyway.
Flux is still the underlying SD engine. We won't be saying goodbye to SD any time soon; Flux is just a UNet.
true
But the 24GB VRAM requirement to train a LoRA is still giving us a reason to stay with SD.
OMG. That thumbnail is on fleek. That’s beautiful.
Nice! Just stepping into the Lora chasm and this will really help
Thanks for the video, but at 5:16 it would be nice if you could expand on how to do this entire section: the "run it the first time" part (run what?), and how to make that .env file (file or folder?). I'm using WSL Ubuntu in Windows 11 (installing the Linux version).
Under accepting the model license it says, "Make a file named .env in the root on this folder".
What root? On Hugging Face itself; is that "THIS FOLDER", or the ai-toolkit root folder we just made using the instructions above? The instructions on the GitHub for this part need to be less vague, or am I just that thick?
To create a new text file, you will need to be able to use a text editor, much like editing the config file
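As I read the ai-toolkit instructions, "this folder" refers to the ai-toolkit directory you cloned (not anything on Hugging Face), and the .env file is just a one-line text file holding your Hugging Face token. A minimal sketch; the variable name follows my reading of the ai-toolkit README, so verify it against the current repo:

```text
# ai-toolkit/.env  (a plain text file in the root of the cloned ai-toolkit folder)
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx   # placeholder; paste your own read-access token
```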
Followed the steps and I'm only getting one .txt file even though I have many images. In the CMD prompt it describes all my images, but it only saves one .txt file describing one random image... any idea?
I've got a 16GB VRAM RTX 4080; is it possible for me to run the Flux LoRA training locally?
There's a package called "ComfyUI-prompt-reader-node" (self-explanatory). It gives tons of outputs, along with the extracted FILENAME (without the extension), which you could use for naming or put into metadata, why not.
I had trouble with ukiyo-e art as well. Flux is great at specific things, but I think a lot of LoRAs are going to be needed with it.
Thanks for this, always love your videos! The hardest part looks to be the dataset, so I really appreciate the auto-tagging workflow part. Do you have any tips for datasets for subjects rather than styles? Should I mix close-up face shots and full-body shots? Have you tried any other training tools like kohya-ss?
Try OneTrainer for auto-tagging, since it also does masking.
That would depend whether you want the subject’s face alone, or their body as well…
I'm not sure if this is right, but for those reading the comments: in the config YAML file you can uncomment line 15 and use a trigger word. I imagine that's if we want to train a subject/person, right?
Yup!
@@NerdyRodent Thanks
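For reference, the uncommented line in the ai-toolkit example config looks roughly like this; the exact line number and surrounding keys vary between versions, so treat it as a sketch:

```yaml
config:
  name: my_first_flux_lora        # hypothetical run name
  process:
    - type: 'sd_trainer'
      # around line 15 in the example config: uncomment and set your own token
      trigger_word: "p3r5on"
```

If your caption .txt files contain [trigger], ai-toolkit should substitute this word for it, which is also how the sample prompts in the same config can pick up the subject; again, verify against the README for your version.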
Nerdy, would you say that ai-toolkit is better than SimpleTuner for Flux LoRA training?
It depends what you mean by better 😉
A few criteria would be ease of training, speed, VRAM requirements, and visual fidelity/quality. I would also like to know.
Great video, I've a couple of questions though: What is the huggingface token used for, is it only used to download the flux-dev model? Is any data sent back to HF? And lastly, if I already have the flux-dev model, can I skip this step somehow by placing the model in a folder somewhere?
Thanks for the tutorial!
Yes, the token is just to download the model. If you already have the files (not just the single .safetensors one) you can specify the directory in the configuration.
@@NerdyRodent Thanks for the clarification!
Awesome video! Is there a way to use multiple GPUs though? I have 3x 3090s.
Yup, apparently that is possible
@@NerdyRodent Great! I found stuff about it on SimpleTuner but didn't notice anything under ai-toolkit. Glad it is possible though; I'll try it with SimpleTuner.
You're rich, man, I only have an old 2070.
"For the images, you can just use the relative path" -- relative to what? Sorry for the newbish question--it looks like your images and text files both showed up in the same folder, but I don't understand where to point the paths for the images and the text files.
EDIT: relative, evidently, to the comfyui output folder.
I dropped a like because I love you man. As for the content, I can barely work out a sky remote. My son literally showed me a button I can press on my Sky remote so I can ask for what I want to watch. I wish I kept up with tech as you have. I started out good because I was a proud owner of a ZX81 and a 16K ZX Spectrum. I bought a chip from Radio Shack to change it to a 128K Spectrum. It broke. Gave up after my Commodore 64.
😉
Gave up 40 years ago. It's like those movies where people go into comas and wake up in the future; too bad for you robot butlers aren't out yet, you should have waited 10 more years.
@@southcoastinventors6583 Creature from the Black Lagoon on Channel 4. I remember coming home to it when I was about 18. Got me totally into B movies.
@Nerdy Rodent Maybe a stupid question: can Loras trained for SD1.5/XL be applied to Flux models?
I have an issue with \venv\lib\site-packages\torch\lib\fbgemm.dll: module not found.
I downloaded a DLL called libomp140.x86_64.dll and put it in C:\Windows\system32\ and it did the trick.
Great stuff! Could you share the workflow for Florence2?
Hi, thanks for the video. Can you tell us how many images you used to train the style LoRA for Flux? And whether you used any augmentation?
20-50 is typically good!
@@NerdyRodent thank you sir
You are doing a splendid Job!
Hi there, can I generate images with my LoRA model without the use of ComfyUI?
I just want to be able to load the checkpoint and generate images.
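For reference, the Hugging Face diffusers library can also load Flux LoRAs directly, so ComfyUI isn't strictly required. A minimal sketch, assuming a recent diffusers release with Flux support; the LoRA path, file name and trigger word are hypothetical:

```python
# Minimal sketch: generate with a trained Flux LoRA using diffusers instead of ComfyUI.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
# Hypothetical output directory / file name from training.
pipe.load_lora_weights("output/my_first_flux_lora", weight_name="my_first_flux_lora.safetensors")
pipe.enable_model_cpu_offload()  # helps fit on smaller cards, at the cost of speed

image = pipe(
    "photo of p3r5on riding a bicycle",  # include your trigger word if you trained with one
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=28,
).images[0]
image.save("lora_test.png")
```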
Is it mergeable? With the model?
Is there a CPU mode for any LoRA training with Flux out there?
I can’t even guess how many days it would take on a CPU… best to use the website for a quick lora
Haha okay, I didn't think it would be that slow actually, but luckily I can run it on my GPU. It was more a curious question ☺️☺️ @@NerdyRodent
Looking forward to when I can make these with a 3080, but I'm still using SD 1.5 since SDXL and Flux still haven't given me results as good as SD 1.5.
Hello, whenever I try to launch the AI script I get the following: ModuleNotFoundError: No module named 'dotenv'
Any ideas? (I have Python and Git already installed.)
This error went away after running pip3 install -r requirements.txt from within the ai-toolkit folder.
The .env file way did not work at all for me for some reason, but the CLI login was literally paste command, paste key, done :P
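Putting those two fixes together, the setup sequence described above looks roughly like this; a sketch based on the ai-toolkit README as I recall it, so verify the exact steps against the current repo:

```bash
git clone https://github.com/ostris/ai-toolkit
cd ai-toolkit
git submodule update --init --recursive   # the repo uses submodules (assumption: still required)
python3 -m venv venv
source venv/bin/activate                  # Windows: venv\Scripts\activate
pip3 install -r requirements.txt          # installs python-dotenv, fixing the error above
huggingface-cli login                     # alternative to the .env file: paste your HF token
```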
The captioning works if I run it without the save image node. If I run them both it never captions any images; it just loops through the same image node indefinitely, saving multiple copies of the images until I cancel the queue :(
By any chance, do you save your images to the same directory you load them from?
Hey, can I train it with 16GB of RAM on an M1 Pro Mac?
What alternatives do I have?
Is there any way to do this with something like Google Colab or a GPU-on-demand type service?
Like the fal website?
He mentions the Fal service near the start of the video. Search for 'Fal ai'.
There's also Replicate, which I found a little cheaper than Fal. I trained a Flux LoRA with that and it turned out really well; it took about 40 minutes with the default settings. Also, if you head over to Matt Wolfe's tutorial on it (read the comments, because he misses some important bits) you can get $10 of free credits, which should be enough for 2 or 3 Flux LoRAs.
I find it impossible to download the models. It starts downloading the model at 30MB/s, then drops to just a few KB/s and stays at 99%. I have tried with different Hugging Face tokens (write, read, fine-grained...). I also leave the .yaml at the defaults, except for the path where I point to the directory of my dataset. By the way, I have a 14900K, a 4090, 128GB RAM and Windows 11.
Is there a way to do this without all the Hugging Face connection? I already have the model downloaded to a separate location on my PC... don't wanna waste time downloading it again.
If you've already got the files (i.e. not just the dev safetensors file for Comfy), specify the directory path in the config file.
It's really weird. I managed to train a LoRA and it works great, but only for a few generations; then the generations get extremely noisy and unusable, and I have to reload the whole model again and it's fine. I don't know what's messed up.
Hi there! I was wondering if it's possible to train a single LoRA model to recognize and generate multiple specific faces or bodies of specific people. For example, could one LoRA model be used to generate both my own face and the faces of others based on their names? How do I manage this with the trigger words?
I have a single dataset with each person tagged by their name and a short caption in the .txt files.
Yup, you can do it like that!
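As a purely hypothetical illustration of that kind of dataset: each image gets a .txt caption with the same base filename, and each person's name acts as their identifier in the captions.

```text
alice_001.txt:  photo of Alice, a woman with short dark hair, smiling at the camera
alice_002.txt:  photo of Alice wearing a red coat, standing in a park
bob_001.txt:    photo of Bob, a bearded man in a grey jacket, sitting at a desk
```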
Where can we get reference images that aren't copyrighted and free to use for training?
Lots of museums have open access images
Is this better or worse than Kohya? I want to train a lineart style for human body poses as reference, and I'm really having issues.
Give it a go and see! 😉
No luck here. During training, even at step 500, the samples looked amazing, but I can't load the LoRA in Comfy, with either the LoRA loader or the Flux LoRA loader.
Could be an old version of ComfyUI, as the LoRA support was only added a few hours ago 😉
@@NerdyRodent I was on a different branch (facepalm)
Have they made it any quicker? My 3060 takes about 20 minutes for a basic image using Flux in ComfyUI.
20s on a 3090, so that sounds pretty slow!
Why do people only talk about step count? It used to be epochs, since once your whole dataset has been seen by the model that's one epoch. That's why the step count should be very different for a dataset of 5 images vs 25 images. Has something changed?
Is there a way to use Flux with AnimateDiff for creating videos?
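The relationship hasn't changed: a step is one optimizer update on one batch, so the same step count does mean very different epoch counts for different dataset sizes. A quick sketch of the arithmetic, with hypothetical numbers:

```python
import math

# Hypothetical numbers, purely to illustrate the step/epoch relationship.
num_images = 25      # images in the training set
batch_size = 1       # the Flux LoRA example configs default to 1 (assumption)
total_steps = 2000   # the step count set in the training config

steps_per_epoch = math.ceil(num_images / batch_size)   # one epoch = dataset seen once
epochs = total_steps / steps_per_epoch
print(f"{total_steps} steps = {epochs:.0f} epochs over {num_images} images")
# The same 2000 steps over a 5-image set would be 400 epochs, hence the confusion.
```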
A very good tutorial. But I'm not sure if ai-toolkit really works on my computer. How long does it take until something happens here?
Generating baseline samples before training
Generating Images: 0%|
Flux usually takes about 20 seconds per image
where is the workflow?
You can find links for everything in the video description 😃
How about sophisticated 3D style renderings? Is it possible?
Go for it!
Please make a video on training with lower VRAM when that comes out.
It's already available, but only with paid services, not locally.
You can train locally, but with 16GB of VRAM it will take around a day.
@@quercus3290 I only have 12 😔
@@quercus3290 With this particular setup? Because on GitHub it says it's only possible with at least 24GB.
What if I add one more 12GB GPU to my PC, would it be detected in ComfyUI? Because with SD1.5 it wasn't recognised, but I'm pretty sure my GPU is good.
You can't add more VRAM without buying a new card. It's not overall RAM, it's GPU RAM.
Those LoRAs don't work with WebUI Forge :/
I'm trying your workflow with other Flux LoRA models from Civitai, but rendering still takes too long: 50s per iteration.
50s is about right for the lower end cards with Flux. It’s around 20s on a 3090
Yaaaaay going Flux is the only way!! 🎉😊
I'm very sad, because I got a nice new GPU just last year, but it's an AMD, and now I've become very interested in AI and I can't do almost anything with it on my local machine. I've found that at least on Linux, you can use ROCm for Pytorch and get some things working that way, so that's my plan now, to install Linux alongside my Windows installation.
However, the requirements in this video suggest that you just straight up need NVIDIA, it doesn't even mention the option of AMD+Linux. So am I basically SOL for this one?
Whilst AMD does indeed have the best support on Linux, a lot of things will still require Nvidia software. And though I know it says you need Nvidia, one can only be sure if it gives an error 😉 There is always the fal site too!
@@NerdyRodent I'll probably try it out on Linux, then, and see if it might work still. I wasn't able to get Pytorch to work before; apparently it's now supported on Windows for some AMD GPUs, but not the one I have (RX 7800 XT), but supposedly that one does work on Linux. Sounds like it might be worth a try, at least! Thank you for your reply, and for the video :)
I paid for the Patreon, can you share this LoRA there? It looks great.
As it's Flux dev I can't put the LoRA itself up on Patreon, but I'll maybe look at putting it up on GitHub if there is interest in that test file!
@@NerdyRodent I don't understand why you can't put it up on Patreon. Otherwise, can you just upload it to WeTransfer or something and send it?
So... 16GB VRAM won't cut it?
bummer
you should have more subs!
Hi, great tutorial, thank you. I have 11GB of RAM (GPU); is it impossible to train with these specs?
You can use fal 😀
Can you kindly please define what "fal" is?
You can find the link to the fal website shown in the video in the video description!
Thank you🎉
I have a workstation with 2 GPUs: A4500 20 GB in cuda0 and RTX 3060 12 GB in cuda1. Is training possible in this condition? Can I train in 20 GB using A4500? Or multi GPU using both? Or do I need 24 GB in a single GPU?
You may be able to squeeze it into the 20GB in low VRAM mode.
@@NerdyRodent Thanks. I will try that tonight and will let you know the results
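The low VRAM mode mentioned here appears to be a single switch in the model section of the config; a sketch with the key name as I recall it from the ai-toolkit docs, so verify before relying on it:

```yaml
# inside the process entry of the training config (placement may vary by version)
model:
  name_or_path: "black-forest-labs/FLUX.1-dev"
  is_flux: true
  quantize: true
  low_vram: true   # assumption: offloads/quantizes more aggressively for cards under 24GB
```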
God damn, I finally mastered Kohya and now have to use another trainer.
(That's the one thing that's so frustrating with AI: with each update half of the shit breaks.) Hope it will come to Kohya as well.
Can I train on my RTX 3060 12GB?
It literally says you need 24GB.
Hello! 16GB VRAM won't be enough? 😢
You can use fal 😀
What is fal?
@@mssalomander links are in the video description!
Is there a way you can do this with 16GB VRAM?
You can through paid online services. That's the only way right now.
@@Elwaves2925 I'm just gonna buy a new GPU, another RTX A4000, then I'll have 32GB of VRAM.
@@RiiahTV Nice, I hope you get it. The higher end cards are way out of my range and I have other things I want too. I'll stick with my RTX 3060 12Gb VRAM for now.
How do you get 24GB of VRAM?
Buy an Nvidia RTX 3090 or RTX 4090.
@@jibcot8541 in other words, be less poor.
Download the remaining GB 😂
Would you make a tutorial for Civitai's LoRA training feature? I have no idea what the best settings are looool
Have a go on the fal one!
I watched for quite a while; so you're training this on Linux?
Yup! Anything AI is best supported on Linux 😀
Thank you!
So I guess there's no way to make this work on half the recommended VRAM (like 12 or 16GB)?
This is incredible! I've learned more in 15 minutes than I did in a month of watching YouTube videos.
This is a YouTube video?
@@taucalm LMFAO! 🤣🤣🤣
Yes, I think what you said is great. I am using MimicPC, which can also achieve this effect. You can try it for free. In comparison, I think the MimicPC workflow is more streamlined and friendly.
@@Huang-uj9rt bruv, no. say less
24GB of VRAM lol
Might as well hire a freelance artist to draw things for you at that price :o)
You would probably get 2 commissioned images for that price, I have made over 500K for the £700 I paid for my 3090 2 years ago.
@@jibcot8541 Congrats... What did you do with your card to make so much?
@@jibcot8541 How?
@@jibcot8541 lol liar
Here I have 40MB of VRAM XD
Flux is a great model; the bad thing is we don't have (affordable) consumer-class GPUs for it yet.
We have things like the 3090 & 4090, which is great! Fal is nice and cheap too - especially if you know you’re never going to need a GPU again
@@NerdyRodent Which have 16GB and 24GB of VRAM, and we would need at least 36GB, maybe 48GB. We need the Chinese-modded 48GB 4090s.
@@taucalm they have 24gb VRAM 😉
This guy always has the hottest stuff.
24GB of VRAM... Oh man, in other words only the lucky owners of an RTX 4090 can train LoRAs locally.
Us peasants will have to wait, I guess.
Or use the website shown 😉
@@NerdyRodent ...unless you don't want some trace of your own face all over the internet, which is my case. Hence the fact that I've always prioritized local training over online services.
Whether it is RunPod or another web service to train Flux LoRAs, how can I be 100% sure they don't keep track of my datasets?
or a 3090
@@jonathaningram8157 3090 is 16Gb of Vram so no. Video says you need 24Gb of Vram.
Oh, Nerdy Rodent, 🐭💻
he really makes my day; 😊✨
showing us AI, 🤖📊
in a really British way. ☕🎩
Yeah, I get annoyed that every tutorial on the net for AI is mostly a website interface with a form and a click-run button. The person is teaching almost nothing beyond "go here". I am looking for real info about installing it and running it myself, or even coding it from scratch. Actually it's like this for everything beyond AI. People these days just want a button for a technical skill and then a certificate that says they are something lol 😂
Additionally, I don't like ComfyUI because it's too hands-on with the GUI. I mean, I code automated programs, so there wouldn't be a GUI. However, ComfyUI has an export feature for straight code, which I don't think a lot of people are aware of.
The websites are great for people who don’t have the GPU power of course. Glad you got this one installed locally using my tutorial!