LORA Training - for HYPER Realistic Results

  • Published Oct 3, 2024

Comments • 139

  • @numbnut098
    @numbnut098 11 months ago +7

    I think this is the best, most detailed tutorial on the subject of training a character LoRA that I have seen. The information you have given has changed my LoRAs from training nightmare juice to training actual people. Thank you so much for this.

    • @OlivioSarikas
      @OlivioSarikas  11 months ago

      Thank you very much :)

  • @JavierPortillo1
    @JavierPortillo1 1 year ago +12

    Realistic vision just updated a couple days ago and it looks fantastic!

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +4

      Cool, I will look into that and maybe make a video about it

    • @yudhamarhatn3006
      @yudhamarhatn3006 1 year ago +1

      Agreed, already training a LoRA with RV4.0 and it gives me goosebumps

  • @lennylein
    @lennylein 1 year ago +7

    I cannot stress enough how important the quality of the source images is for training LoRAs. This is one of the few tutorials which actually gives useful advice on how to create and prepare a high-quality training data set.
    Thank you for this outstanding video ❤

  • @pixeladdikt
    @pixeladdikt 1 year ago +21

    Thank you Olivio! I've had to train new faces, and since Dreambooth isn't the "go-to" these days, I've been looking for a new LoRA tutorial. Those last 2 mins where you explain putting the LoRA into the ADetailer really hit home - such an amazing workflow 👊💯

    • @OlivioSarikas
      @OlivioSarikas  1 year ago

      Thank you, yes, that helped me a lot too :)

    • @BunnySwallows-qx7ek
      @BunnySwallows-qx7ek 1 year ago +1

      Do you know what the "go-to" method is these days? I am having trouble running Dreambooth on Colab, so I'm looking for a new method to make a LoRA or model.

    • @chrisrosch4731
      @chrisrosch4731 8 months ago

      Wouldn't you say Dreambooth is still king?

  • @WifeWantsAWizard
    @WifeWantsAWizard 4 months ago +5

    (16:40) First, he meant "ALT+F4". Second, you can "ALT+TAB" to swap to the alert popup. Also, BooruDatasetTagManager has hotkeys you can set yourself (under "settings" => "hotkeys"). The default hotkey for hiding the preview window is "CTRL+P".
    (25:30) He forgot to mention that you can't set "max resolution" to 768x768 if your input images are less than that--say 512x512. A lot of times we'll create LoRAs specifically for use in image-to-image. That means we want those LoRAs to output at a low resolution so that it is quick and then you can "upscale" in img2img using the low-res as a base. You can also use 128x128 for pixel art.
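
    A quick way to sanity-check that constraint before training - a minimal Python sketch, assuming Pillow is installed; the folder name and the 768x768 target are placeholders:

        from pathlib import Path
        from PIL import Image

        DATASET_DIR = Path("training_images")   # hypothetical dataset folder
        MAX_W, MAX_H = 768, 768                 # the "max resolution" set in Kohya
        IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}

        for img_path in sorted(DATASET_DIR.iterdir()):
            if img_path.suffix.lower() not in IMAGE_EXTS:
                continue  # skip caption .txt files etc.
            with Image.open(img_path) as img:
                if img.width < MAX_W or img.height < MAX_H:
                    print(f"{img_path.name}: {img.size} is below the target; "
                          "lower the training resolution or upscale this image first")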

    • @germanfa1770
      @germanfa1770 4 months ago +2

      Could you please explain one thing to me? I'm not sure if I understand this correctly. Can I use a resolution of 2040x2040 or 1600x1600 pixels to train SD 1.5 LoRAs, and set 768x768 pixels in Kohya's settings without turning the "Enable Buckets" option off, since all the photos in the dataset have the same resolution of 2040x2040? If so, is it advisable to do this? After all, the SD 1.5 model only understands 512x512 pixels. Will the model understand my 2040px dataset? But if I prepare my dataset at 512x512 or 768x768, the quality of the original photos will be noticeably reduced. Thank you.

    • @WifeWantsAWizard
      @WifeWantsAWizard 4 months ago

      @@germanfa1770 So, two things. To your question, "buckets" are for sorting input images into different groups. The system presorts, let's say, all the 1080x480s into one group and does them together, then presorts all your 768x512s into another group, and so on. Checking the "buckets" toggle means you are telling the machine, "look through this pile and sort it before you get started". If you only have one size, when it "buckets" everything it will only find the one size and group them all together. Technically that's a waste of 20 seconds, hence you can turn it off if you know everything you're ever putting in will be 1:1 or whatever.
      Second, stable diffusion "remembers" powers of two. So, you said "understands 512x512". True, but use the word "remembers". SD and SDXL also remember 1024x1024, 2048x2048 (if you want your graphics card to catch fire), and even 128x128 (for pixel art). It stores the training data at a 1:1 aspect that is a power of 2 but can "see" (train from) any resolution/aspect ratio. It may seem counter-intuitive that a 1:3 aspect ratio can somehow result in 1:1 training data, but that's how the math works.
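
      The presorting described above, in miniature - a Python sketch, not Kohya's actual implementation (real bucketing snaps sizes to multiples of 64 under a pixel-area budget; the sizes here are made up):

          from collections import defaultdict

          # Hypothetical dataset: (width, height) of each training image
          image_sizes = [(768, 512), (770, 514), (512, 512), (1080, 480)]

          def bucket_key(size, step=64):
              # Round each dimension down to the nearest multiple of 64,
              # a simplification of how bucket resolutions are chosen
              w, h = size
              return (w // step * step, h // step * step)

          buckets = defaultdict(int)
          for size in image_sizes:
              buckets[bucket_key(size)] += 1

          for resolution, count in sorted(buckets.items()):
              print(f"bucket {resolution}: {count} image(s) grouped and trained together")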

  • @ArnoSelhorst
    @ArnoSelhorst 1 year ago +6

    Olivio, absolute professional advice here that is really appreciated. I've followed you for quite some time now, and it really shows the earnestness with which you follow your passion and teach it to others. Bravo! Keep it up!

  • @ozerune
    @ozerune 1 year ago +31

    I made a LoRA of my late grandma last night to create images of her for my mom, and she was very happy with it, but it was so blurry, and unfortunately it isn't really possible to give it a better dataset anymore

    • @testales
      @testales 1 year ago +12

      You could try upscaling your training dataset images with GigaPixel AI beforehand. This will also fix quite a bit of noise. There are also AI sharpeners which can yield impressive results at times, so it depends how much time you want to invest in your training set.

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +12

      Try to upscale and sharpen the images, then try to use the very best images from your AI creation in the training as well and see if that improves anything. Also try to render the images with different models, or render them first with the model that works best for you and then use img2img with a different model that creates more realistic results

    • @spoonikle
      @spoonikle 1 year ago +4

      I am with the comments, lots of AIs specialize in upscaling old photos. If you're using a model only trained on modern digital data then you won't get much help, but if you start with high-res scans of photographs in a model trained on upscaling photo scans, you will get amazing results.
      Maybe we can fill a database to contribute to a LoRA by taking the same photos with different cameras and scans of film.

    • @Chilangosta
      @Chilangosta 1 year ago

      Agree with the other comments - I'd just echo the advice to not give up if you really want this! The first few times are almost never representative of the results you can get! Keep tweaking and trying new things, and research it more, and you'll probably end up with a much better result than you thought possible!

    • @Lexie-bq1kk
      @Lexie-bq1kk 1 month ago

      People are recommending you upscale and sharpen, but there is a lot of work you can do before getting to that point. I would encourage you to take the photos you have, go through and remove any unwanted objects, and simplify the background as much as possible. I use Topaz Photo AI and Photoshop to remove objects and create new backgrounds. The images might be low quality, but if you can simplify the image into a very simple background and just the subject, it will help. Also, you can take your training photos into a CLIP interrogator to see how the AI will recognize certain things about the image; you may ultimately be able to use that information for captioning or for future use in negative prompts.
      Also, I would see if you can accomplish the "sharpening" with actual very low-scale denoise, like 0.1 in Topaz. Since what you are hoping to achieve is more clarity, denoise might be a better alternative to sharpening. I find that as long as the image isn't overly noisy, sharpness doesn't matter as much.
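
      A minimal preprocessing pass in that spirit - a Python sketch using only Pillow (GigaPixel and Topaz are commercial tools; the filename, target size and filter strengths here are placeholder assumptions, so tune them per image):

          from PIL import Image, ImageFilter

          TARGET = 768  # short-side training resolution

          img = Image.open("old_photo.jpg").convert("RGB")

          # Upscale the short side to the training resolution with a high-quality filter
          scale = TARGET / min(img.size)
          img = img.resize((round(img.width * scale), round(img.height * scale)),
                           Image.LANCZOS)

          # Very light denoise first, then a gentle sharpen - keep both subtle,
          # since the LoRA will learn any artifacts you introduce
          img = img.filter(ImageFilter.MedianFilter(size=3))
          img = img.filter(ImageFilter.UnsharpMask(radius=2, percent=60, threshold=3))

          img.save("old_photo_prepped.png")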

  • @JohnSmith-vk4vq
    @JohnSmith-vk4vq 8 months ago

    Wow thank you for explaining the right way to set up samples… you are correct 👍 Sir!

  • @maxfahl
    @maxfahl 9 months ago

    BIG thank you! This was exactly the video I was missing in my LORA expeditions.

  • @LeonvanBokhorst
    @LeonvanBokhorst 1 year ago +3

    Thanks again. Very helpful, like always 🙏🚀

  • @jrfoto981
    @jrfoto981 1 year ago

    Thank you Olivio, this is a good process for getting a desired result. I used a similar process of image preparation for making custom embeddings.

  • @0AThijs
    @0AThijs 1 year ago

    Thank you for this very informative guide. Definitely one of, if not the, best out there :)

  • @OlivioSarikas
    @OlivioSarikas  1 year ago +4

    #### Links from the Video ####
    Install Kohya ss Guide: th-cam.com/video/9MT1n97ITaE/w-d-xo.html
    Photon Model Video: th-cam.com/video/0tDFCZr5cA8/w-d-xo.html
    Photon Download: civitai.com/models/84728/photon
    v1-5-pruned.safetensors: huggingface.co/runwayml/stable-diffusion-v1-5/tree/main
    BooruDatasetTagManager: github.com/starik222/BooruDatasetTagManager

    • @op12studio
      @op12studio 1 year ago

      You can just press Enter to get past the save popup if you accidentally forgot to move the preview image. Great video btw

    • @OlivioSarikas
      @OlivioSarikas  1 year ago

      @@op12studio oh, cool, thank you!

    • @gohan2091
      @gohan2091 1 year ago

      Typo in your description
      it's Photon not photo :D

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +1

      @@gohan2091 thank you

  • @AeroviewMenorca
    @AeroviewMenorca 1 year ago +1

    As always, excellent work Olivio. I've been following you from Spain. My English is a bit limited, so I use an AI to translate your voice into Spanish based on the subtitles. It might be funny, but it's a lifesaver for me. You provide very detailed explanations in your videos. Greetings and thank you very much! 👏👏

  • @Inugamiz
    @Inugamiz 1 year ago +6

    Olivio, mind doing an updated tutorial on making those dancing AI videos? I've been trying, but either the face is messed up or it just stops doing the poses.

  • @axelesch9271
    @axelesch9271 1 year ago +3

    The new Kohya SS master release uses different tabs from your video: Dreambooth, LoRA, Textual Inversion... nothing like Dreambooth TI or Dreambooth LoRA. How do you figure out what the tabs do, since the Dreambooth tab doesn't include anything like network rank? Also, the LoRA tab includes nothing related to the Dreambooth/LoRA technique.
    Nobody is talking about it, but the dev of the GUI just changed the whole UI without providing any documentation on how to interpret all the changes he has made.

  • @mr_pip_
    @mr_pip_ 1 year ago

    Wow .. really very well explained, thank you!

  • @TheTornado73
    @TheTornado73 1 year ago +1

    Hello! There is also a psychology factor: women do not like overly detailed photos :) That is, it is not necessary to see all the wrinkles, acne, pigmentation, etc. Detailing matters for the large features - the shape of the eyes, eyebrows, eyelashes, lips - and if you focus on super-detailing every wrinkle, they will tell you "it doesn't look like me!" No wonder the beauty industry works :)
    You can reduce the number of epochs by increasing the dataset: a dataset of 50-60 photos at 80-90 steps per photo and one epoch gives quite normal results, with the LoRA's weight in the prompt at 0.7-0.8, plus a variety of clothes and backgrounds. If the set is all on the same background, that background will pop up in the most unexpected places; if the subject is only in a white t-shirt, that t-shirt will be everywhere. It's better to cut out the background altogether - from a set of 10 photos with the same type of background, I cut out the background on 8. A variety of clothes won't let SD get hung up on a certain color or style.

  • @HanSolocambo
    @HanSolocambo 6 months ago

    21:03 "That number defines the steps or repetition [...]"
    This number represents repetitions only (repeats).
    Steps are something else: steps = number of images x repeats.
    21:09 "I mostly use 10 for my LoRA but others use 5 [...]"
    Nothing's random in training a LoRA ;) The number of repeats should be more about "how many images do I have for that specific LoRA" than about "how many epochs am I going to need" or "I am used to that number".
    Images found (let's say 100) x repeats (8) = 800 steps.
    Steps x gradient accumulation steps x epochs x regularization factor (if one uses properly made reg. images + reg captions for each trained image) = max train steps.
    800 x 1 x 2 x 2 = 3200 steps (which is often enough).
    This being said, I'm still confused about why or how one should balance repeats and/or epochs to reach the sweet spot of about 3K-4K max train steps, especially since we can save checkpoints and samples every N steps, run more epochs, or resume from a previously trained weight.
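
    That arithmetic, written out - a Python sketch mirroring the comment's formula (it ignores batch size, which in a real Kohya run divides the step count):

        num_images = 100        # images found
        repeats = 8             # repeats per image
        grad_accum_steps = 1    # gradient accumulation steps
        epochs = 2
        reg_factor = 2          # 2 only when regularization images are used, else 1

        steps = num_images * repeats                                      # 800
        max_train_steps = steps * grad_accum_steps * epochs * reg_factor
        print(max_train_steps)                                            # 3200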

  • @RolandT
    @RolandT 1 year ago

    And once again many thanks for your great guide and tips on Kohya and other models you could train on! I had trouble with the installation at first (wrong Python version and manual installation of accelerate), but after a few pip installs it finally worked. I went straight for a photo series of over 400 photos (with 5 epochs of 20 each) and am blown away by the result! It had also never occurred to me before not to train on SD 1.5. Until now I had only trained Hypernetworks with my own formula, which was relatively laborious, and the results were hit and miss. With Photon, the photos from my photographer now come out much more realistic. The tip about the ADetailer is great too! Learned so much again - thanks for that! Now all that's missing is a tutorial on how to create LoRA models for SDXL (I already tried, but still get errors everywhere). 🙂

  • @haydenmartin5866
    @haydenmartin5866 1 year ago +2

    Hey man we need to see an SDXL Lora tut 🙏

  • @TaylorWay-e8z
    @TaylorWay-e8z 9 months ago

    great video! thank you!

  • @gohan2091
    @gohan2091 1 year ago +3

    I'm currently using low-resolution, low-quality images from Facebook of work colleagues (with their permission, of course) and making a LoRA. The results are pretty poor, but I'm using Roop with the 4x ultra upscaler and getting OK results - nothing amazing like these with high-quality, high-resolution photos, though. A tip: you can see the metadata of a LoRA inside A1111, which tells you the settings used during training.

    • @OlivioSarikas
      @OlivioSarikas  1 year ago

      Really? How do you see the metadata of the LoRA in A1111?

    • @gohan2091
      @gohan2091 1 year ago

      @@OlivioSarikas I'm using Vlad's A1111. When you view your list of LoRA thumbnails (where you click them to insert into the prompt box), there are various buttons at the bottom of each thumbnail, such as adding the LoRA to your favourites. One of the buttons reads the metadata, which shows you the training settings you used. I was looking at this last night. It may just be exclusive to Vlad's. I'm not home at the moment to give you exact instructions, but it's definitely possible.

    • @Dalroc
      @Dalroc 1 year ago

      @@OlivioSarikas Just click the ( i ) in the top right corner of the LoRA in the A1111 GUI.

    • @Strangepaper
      @Strangepaper 1 year ago

      Use the roop-ed photos as additions to the training data!

  • @Lenpreston2
    @Lenpreston2 1 year ago

    Fascinating video

  • @tomschuelke7955
    @tomschuelke7955 1 year ago +2

    Many thanks for this. Two questions. No, three.
    What about those extra images some other YouTubers suggest for the class of object? Don't remember the name... calibration images?
    When should you make a LoRA and when should you use Dreambooth?
    Third:
    When I want to train, for example, the typical style an architectural company has for, let's say, office facades seen from the street - which certainly often differ, but where I still want to find the essence of the style - LoRA or Dreambooth? How many images? How to caption?

  • @Aviator-ce1hl
    @Aviator-ce1hl 1 year ago

    Instead of creating the training folders manually, you can do it automatically using the Tools tab in the Kohya Dreambooth LoRA.
    About using "Restore Faces": if I remember well, in one of your videos you suggested not to use it with a LoRA model because it may modify the actual face. I found that it may indeed be true. When you use the Tools in Dreambooth, you set the keyword for the LoRA and you also give the model a category, which I believe is important for the training.

  • @JieTie
    @JieTie 9 months ago +1

    Maybe a vid for training LORA for XL?

  • @camar078
    @camar078 10 months ago +1

    All well and good, but you didn't cover the genuinely relevant points and problems you run into during training. Points that would actually have been informative: What did you change after a LoRA failed to reproduce the head shape or hair correctly? What difference do the resolution and aspect ratio of the source images make, both in training time and in the results? How do you set the buckets correctly, and how do they relate to the training resolution and the source images? How do you train both portrait and full-body shots? How many of each configured perspective worked for you - one part full-body, three parts close-up? Something else? How can I improve the consistency of results across differently configured aspect ratios, and is there anything to watch for in the source images so the LoRAs work well here? The "explanations" of mixed precision, network rank dims (file size, consistency of results) and LoRA resolution are at best dangerous half-knowledge and urgently need to be flagged as such. Statements like "I have seen many good results with LoRAs trained with fp16, but also some with bf16" help nobody and carry no weight for anyone if the underlying properties aren't at least briefly touched on. My suggestion is therefore to either label these points as "personal impression" or to say outright that nothing objectively verifiable is known here and that no further research was done. The net is by now full of "tutorials" that are 90% identical in content and lead to half-baked results at best. More honesty and/or real research would be refreshingly helpful.

  • @Dalroc
    @Dalroc 1 year ago +1

    Just hit Enter if the pop-up in the Booru tagger is blocked by the preview.
    You should've shown the full process of tagging one or two images - like what tags you removed and what tags you added.

  • @Foloex
    @Foloex 1 year ago +1

    I wish there was a tutorial out there on training things other than people, for example: different sports (martial arts, ping pong, dance moves, gymnastics, pole vaulting...), hugging, massage, meditation poses, playing cards... I tried to train a simple concept like holding: "boy dressed as a magician, holding a rabbit", "woman holding a baby", "girl holding a cat"... So far I can't get consistent results, and the relation " " doesn't seem to be understood. If anyone could give me pointers on that matter, I would appreciate it.

    • @hcfgaming401
      @hcfgaming401 9 months ago

      Leaving a comment on the off chance this gets a reply eventually.

  • @ChrlzMaraz
    @ChrlzMaraz 1 year ago

    Something that is never talked about regarding quality images is focal length. Vary your focal length! Some wide-angle close-ups, standard, and some telephoto portraits. In addition, vary your f-stop. Wide-angle images usually won't have bokeh; most telephoto photos will.

  • @krystiankrysti1396
    @krystiankrysti1396 1 year ago +2

    I would advise against using a DSLR or big-sensor camera; it will introduce bokeh, and you want the whole head in focus (shallow DOF ruined my training because ears and hair were out of focus and it learned that). Phone photos are better, especially if you have multiple lenses, not just one wide angle. Try batch 4, epochs 4, repeats 27, images 35, network at 200/200. Inference in the webui at 0.8, then inpaint the face at 1.0 or 0.9 to bring out the likeness even more. With a LoRA it's hard to get likeness from one inference; you have to inpaint just the face at full power but without distortion from overtraining - it's better if 1.0 is overtrained and starts to distort.

    • @mattmarket5642
      @mattmarket5642 1 year ago +2

      Very bad advice. Larger sensor means more detail and higher quality. You’re right that you want the whole face in focus. Just don’t shoot the photos with too low of an f-stop. Smartphone cameras often distort faces.

    • @krystiankrysti1396
      @krystiankrysti1396 1 year ago

      @@Teo-iq4gi It's for the Kohya SS webui; 27 is the number of repeats. When you train a face, you should see overtraining at 1.0; at 0.6 there should be great stylization but weak likeness. So you generate at 0.6 and inpaint the face at 0.9 or 0.8 if you can - this way you get the best results. Get the ADetailer extension to inpaint the face automatically.

    • @PhilipRikoZen
      @PhilipRikoZen 1 year ago

      @@krystiankrysti1396 Sorry to ask you again: where is the "repeats" value in the Kohya SS webui? Can't find it. Thank you

    • @krystiankrysti1396
      @krystiankrysti1396 1 year ago

      @@PhilipRikoZen Dude, are you that lazy? So many days have passed and you still can't find it? It's in the Tools panel. Maybe spend like 10 minutes going over all the panels in the webui and reading what they say - do you want to learn it or be led by the hand?

    • @PhilipRikoZen
      @PhilipRikoZen 1 year ago

      @@krystiankrysti1396 So much anger. Two days ago was the first time I ever read your comment, so no idea what "so many days" you're talking about. Anyway, thank you - it was under "Deprecated", which my brain was ignoring because, well, deprecated

  • @CrystalBreakfast
    @CrystalBreakfast 1 year ago +2

    With the tagging, people say to only tag the things that aren't intrinsic to the subject. So a lot of those (rather judgy) anime tags aren't appropriate because things like "thin lips" are intrinsic to what a "Betka" is. So by tagging "thin lips" what you're saying to the AI is it's a picture of "a Betka with thin lips," implying that a "Betka" normally doesn't have those lips. So you mainly tag things that aren't a permanent part of what makes a "Betka" a "Betka," then by process of elimination it learns what is a part of Betka and what isn't. Or at least, that's what I've come to understand from the advice I've read.

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +3

      If that was the case, then it would not render anything that was tagged afterwards in the image, but that is not how it works. When you tag "short hair", the character does have short hair in the images unless you write a different hair style in the prompt; but if you don't write short hair, it will always have short hair. So a keyword or tag does not exclude things - it makes them a variable, as I said in the video. Tag things you want to be able to change.
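
      To make that concrete - hypothetical caption files for two images of the same character, with "betka" as an invented trigger word: the changeable attributes (hair style, clothing, background, expression) are tagged, while the face is left untagged so it gets absorbed into the trigger word. A Python sketch that just writes the files:

          # Hypothetical captions; each .txt filename matches a training image
          captions = {
              "img_001.txt": "betka, short hair, white t-shirt, outdoors, smiling",
              "img_002.txt": "betka, ponytail, red dress, indoors, looking at viewer",
          }

          for filename, caption in captions.items():
              with open(filename, "w", encoding="utf-8") as f:
                  f.write(caption)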

  • @ProjectOfTheWeek
    @ProjectOfTheWeek 1 year ago

    Great video! We need a tutorial on training a brand logo, and then being able to create illustrations with the logo. Thanks!

  • @sigma_osprey
    @sigma_osprey 1 year ago +6

    Hi, Olivio. I'm new to this AI image generation thing and I want to learn how to do it. Do you have a structured tutorial or a set of videos that teaches how to go about this? Like from installing Stable Diffusion (not on a PC) to model installation and the best image prompts to produce realistic-looking human images. I hope I made sense there. Thanks.

  • @simonevalle8369
    @simonevalle8369 1 year ago +1

    My Kohya GUI is totally different and I can't understand what I have to do to train my LoRA... every time I try to train it, I only get errors on the cmd page

  • @markreiser4080
    @markreiser4080 1 year ago +1

    What about regularization images? Some say they are important; some don't even mention them.

  • @lordkhunlord9210
    @lordkhunlord9210 1 year ago +2

    Unless I'm missing something, how are we supposed to use BooruDatasetTagManager? Even the GitHub page doesn't say how we're supposed to install it

  • @nayandhabarde
    @nayandhabarde 1 year ago +2

    @olivioSarikas Can you please create one detailed style-training guide with LoRA and Dreambooth? Which one is more suitable?

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +2

      I would maybe do that as an online course :)

  • @Divamage
    @Divamage 1 year ago +1

    My Kohya SS has a missing module/library issue

  • @alectriciti
    @alectriciti 1 year ago

    It would be neat if you did a video on the Dreambooth extension for A1111. Personally, I've had several issues getting Kohya working even after different types of installs; it ended up just being a big headache, whereas the Dreambooth extension just works. Though they share similar concepts, I think it would still be helpful for a lot of people. Anyway, thanks for the video m8!

    • @greatjensen
      @greatjensen 1 year ago +1

      Agree with this. I can't get Kohya to work either.

  • @4dee103
    @4dee103 1 year ago

    Thanks for a great video. New to SD but wanted to give this a go... can I please ask what sizes all your photos for training are? Your square photos are 768x768? What about the full-body shots?

  • @randymonteith1660
    @randymonteith1660 1 year ago +1

    Is the "Booru Dataset Manager" a standalone program or an extension for A1111 or Kohya?

  • @PatchCornAdams723
    @PatchCornAdams723 11 months ago

    All I want is smutty images of Diana Burnwood from the 'Hitman' series. I have a high end PC and I am computer savvy, I just trip up on the github stuff.

  • @sikliztailbunch
    @sikliztailbunch 10 months ago

    Sadly, every tutorial is either about training on a specific person or a general art style. I want to train on scorpions, and they turn out bad.
    I also have trouble with Kohya. It won't train; it gives me an error in the console, and I haven't found any fix for that. So I train with Dreambooth, but my LoRAs don't even change the image - they do literally nothing. I've watched countless tutorials and read through all the docs, but I think I am doing it wrong anyway.
    I found a small tool called NMKD which lets me train on SD 2.0. It works, kinda, but it only gets me dead, disfigured scorpions. Also, I want scorpions in SDXL; SD 2.0 is too weak overall.

  • @pb3d
    @pb3d 1 year ago +1

    16:49 If you're stuck with this, just hit Enter; that will close the pop-up

  • @matthewmounsey-wood5299
    @matthewmounsey-wood5299 1 year ago

    🙌❤

  • @ArjenJongeling
    @ArjenJongeling 8 months ago

    15:26 I can't figure out the name of the tool you mention. Is there a link to it, or how do you spell it?

  • @testales
    @testales 1 year ago

    You put so much effort into the details, but in the end you are training at 512px or 768px, which means these images will be downscaled accordingly right before the actual processing occurs. So it doesn't matter if you provide super nice 4K images or just 768px right from the start. In fact, it might be better to downscale the images to the exact training resolution yourself; that way you can at least choose the downscaling method and see what the LoRA will actually see and learn. Btw, for the 1.5 models, 1024px is no problem if your VRAM can handle it, though it may not work with just a few images.

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +1

      I would have to try that, but you will still get a better 768 image from a very sharp image than from an image that lacks detail. Also, you will see in upscaling that LoRAs and checkpoints trained on larger images give a much better, more highly detailed result

    • @testales
      @testales 1 year ago

      @@Teo-iq4gi Recent implementations use buckets, where a bucket is a container for images of a specific size. So in the first step, your data set will be analyzed and a number of buckets will be created to suit your dataset, with the biggest possible bucket being size x size - hence 1024x1024 if you set that as your training size. If you don't have square images, a smaller bucket will be used, such as 1024x768. All your images will be put in the bucket they fit best. How the bucket sizes are calculated, and therefore how many buckets there will be, depends on the algorithm. But either way, every image that exceeds the training resolution will be scaled down to fit in one of the biggest buckets. So to my understanding, there is no advantage in providing images above the training resolution.

  • @fatenathan
    @fatenathan 7 months ago

    Thanks Olivio! Can you tell me one thing? I want to train a LoRA for SD 1.5 but also XL. When I take a few pictures, how big should the canvas size be? Is it okay to use very high quality like 2K resolution and have the LoRA make the details out of it, or is it worse to use a higher resolution like 2140x1600 px?

  • @Dmitrii-q6p
    @Dmitrii-q6p 1 year ago

    Is it better to remove the background or not?
    Paint 3D can do it successfully in one click.

  • @fernando749845
    @fernando749845 1 year ago +1

    Is your audio out of sync, or is it my system?

  • @temporaldeicide9558
    @temporaldeicide9558 11 months ago

    What about the regularisation folder? I hear that it sometimes helps a lot, but I don't actually know what it is or what it does.

  • @pb3d
    @pb3d 1 year ago +1

    What are your thoughts on regularisation images?

    • @OlivioSarikas
      @OlivioSarikas  1 year ago

      That can certainly help, but I haven't experimented with it too much. It's good for putting keywords on things that don't work, so you can then put these keywords into your negative prompt

  • @leolaxes
    @leolaxes 9 months ago

    How do I select from 4 epochs that look technically the same? They are so similar that distinguishing the difference is difficult. ADetailer essentially makes them all perfect

  • @camilovallejo5024
    @camilovallejo5024 10 months ago

    Somehow I don't get the white button you used to select the model... Like really? 😑

    • @Jaysunn
      @Jaysunn 9 months ago

      Which point in the video are you referring to?

  • @nolimit7582
    @nolimit7582 1 year ago +1

    Why LORA? What about LyCORIS?

  • @traida111
    @traida111 11 months ago

    Can you use a LoRA to train things like body positions? So if I had a few different people in the same position, would it work?

  • @cleverestx
    @cleverestx 1 year ago

    Can someone help? I'm getting this when I click to train after following this video: "No data found. Please verify arguments (train_data_dir must be the parent of folders with images)"

  • @TheGarugc
    @TheGarugc 1 year ago

    Don't you use regularization images for training?

  • @walidflux
    @walidflux 1 year ago

    In Kohya SS there is an option to convert a model to a LoRA - can you dive into that please?

  • @weatoris
    @weatoris 1 year ago

    Is hyper realistic more realistic than realistic? 🤔

  • @kayinsho2558
    @kayinsho2558 7 months ago

    Can you use this on a Mac M1?

  • @bilybob-c4p
    @bilybob-c4p 1 year ago

    Oh, I thought the opposite; I thought you wanted to get every angle, every lighting condition etc...

  • @ZombieAben
    @ZombieAben 10 months ago

    Training will take multiple days on my Nvidia 1660 if done with the proposed repetitions and epochs. Is there anything I can do, or do I need a better graphics card like a 2080 Ti or above? Maybe my source images have too high a resolution and I need to crop, resize, or both. Will that speed up the process?

    • @HollySwanson
      @HollySwanson 9 months ago

      I have a 4060 Ti 16GB and it takes 30-60 minutes per model if you want it hyper-realistic. I think there are some Christmas deals on Amazon for 3060s and 4060 Tis

  • @TanvirsTechTalk
    @TanvirsTechTalk 7 months ago

    What is your Discord channel?

  • @RenoRivsan
    @RenoRivsan 10 months ago

    This guy is a pro, BUT he makes everything so complicated! Dang!!

    • @RenoRivsan
      @RenoRivsan 10 months ago

      *meaning he doesn't get to the point and keeps adding more topics

  • @gohan2091
    @gohan2091 1 year ago

    The image folder is called "bf16, network 256 alpha 128", but in your training settings you are using fp16, network at 8 and alpha at 1, so I am very confused about why you named your folder like that

    • @PhilipRikoZen
      @PhilipRikoZen 1 year ago

      He chose the name of the folder by manually creating it in Windows and typing the text you see; in the tutorial he doesn't actually create that folder, nor does it come from Kohya. In the examples he runs at the end, generating pictures, he's actually using the LoRA he created before the video, which was trained with those values: bf16 and a higher network and alpha. Long story short, if your computer can, try generating with a much higher network and alpha than the default 8 and 1.

    • @gohan2091
      @gohan2091 1 year ago +1

      @@PhilipRikoZen I have a 4090 and used alpha and network at like 64/64 and 128/64 with Kohya, and when it generates I get only black images even when I lower the weight, but at 1 and 8 it's fine. Any idea?

    • @thebrokenglasskids5196
      @thebrokenglasskids5196 1 year ago +1

      @@gohan2091 Using settings that high for Network Alpha will often trigger the NaN error while training your LoRA. I would suggest lowering it to something like 16 and testing. If your LoRA works in SD, then increase it in Kohya and keep doing so until you get the NaN error. You'll know the error is happening by watching the "loss=x.x" value during your training. If at any point it changes to "loss=nan", then your setting is too high and you've errored your LoRA into one that will only render NaN in SD (that's why you get nothing but a black image).
      The Network Rank is fine to keep at 128. In fact you should, as lowering that lowers the file size and the quality of the resulting LoRA will diminish. The Network Alpha is what triggers the NaN error, so that's the one to lower to fix it.
      For reference, I have an RTX 3060 12GB and usually train with Network Rank 128 and Network Alpha 16 for character LoRAs, using a dataset of 50 images @ 24 steps per image and 3 epochs.
      Hope that helps.
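
      For reference, those settings expressed as a kohya-ss/sd-scripts invocation - a hedged Python sketch, not a copy-paste recipe (the model and folder paths are placeholders, a real run needs more arguments, and flag names should be checked against your sd-scripts version):

          import subprocess

          # Rank 128 / alpha 16, 3 epochs, as suggested above
          cmd = [
              "accelerate", "launch", "train_network.py",
              "--pretrained_model_name_or_path", "v1-5-pruned.safetensors",
              "--train_data_dir", "./img",          # placeholder dataset folder
              "--output_dir", "./output",           # placeholder output folder
              "--network_module", "networks.lora",
              "--network_dim", "128",
              "--network_alpha", "16",
              "--max_train_epochs", "3",
              "--mixed_precision", "fp16",
          ]
          subprocess.run(cmd, check=True)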

    • @gohan2091
      @gohan2091 1 year ago +1

      @@thebrokenglasskids5196 Thanks. I think I used values 1 and 8 in my last LoRA. The results aren't great, but it kind of works. This seems like a guessing game: picking numbers at random and seeing what works without any real understanding of what's going on, lol

  • @Sbill92085
    @Sbill92085 1 year ago

    Does this process work the same with SDXL?

  • @seifergunblade9857
    @seifergunblade9857 1 year ago

    Can a laptop RTX 3060 with 6GB VRAM be used for training a LoRA?

  • @Aks15314
    @Aks15314 1 year ago

    Can anyone help? Can I train a LoRA on Colab Stable Diffusion?

  • @metamon2704
    @metamon2704 1 year ago

    Unfortunately, there is a new version of the UI already and several things have changed - for example, the icon you click to select another model is no longer there.

  • @sin2pie432
    @sin2pie432 7 months ago

    Why are you using LoRA if you don't need an 8-bit transformation? What pipeline? LoRA is for training scenarios that are too resource-intensive for your training environment. LoRA will not improve results in any scenario; rather the opposite - in many use cases it is not lossless. I spend too much time writing code and forget people are actually using this in the real world.

  • @sinayagubi8805
    @sinayagubi8805 1 year ago

    If this wasn't the right tutorial for you, watch my tutorial on how to make a LoRA with just one image, from your back, with your camera off 👍

  • @dannous
    @dannous 1 year ago

    Basically, you need to use the same pictures you use for your passport or green card lottery :D

  • @willpulier
    @willpulier 1 year ago +1

    How about SDXL?

    • @OlivioSarikas
      @OlivioSarikas  1 year ago +1

      There isn't even a good A1111 implementation for it yet to render images ;)

  • @mohegyux4072
    @mohegyux4072 7 months ago

    23:30 I think you're wrong here

  • @nwkproductions4424
    @nwkproductions4424 1 year ago +1

    Maybe, for the first time in my life, I'm first? 🥹

  • @360travels9
    @360travels9 1 year ago

    Is there something better than Roop for faceswap with higher resolution?

    • @ramn_
      @ramn_ 1 year ago

      you need to pay

  • @adelalfusail1821
    @adelalfusail1821 1 year ago +1

    Man, I got disappointed hearing you say "experiment with that" throughout the video. Since you already did, why didn't you share it with us?

  • @SaxophoneChihuahua
    @SaxophoneChihuahua 1 year ago

    LyCORIS is better than LoRA

  • @pastuh
    @pastuh 1 year ago +2

    Currently, the trend is to create Lora "Sliders."
    It would improve your visitor numbers if you could create a tutorial on how to make them.

  • @gulfblue
    @gulfblue 1 year ago

    How do you find Stable Diffusion's trained faces based on your original input photos? (28:41 - where are you getting this comparison?)

  • @xxxhihihixxx
    @xxxhihihixxx 1 year ago

    Awesome, thank you!
    But I'm getting out of memory crashes even if I use