Understanding Prompting for Stable Diffusion in ComfyUI

  • Published on Dec 1, 2024

Comments • 105

  • @J-ld9cl
    @J-ld9cl several months ago

    WOW, I wasn't expecting to understand anything, but you made it easy to follow even for someone with limited Python experience.

  • @centurionstrengthandfitnes3694
    @centurionstrengthandfitnes3694 4 months ago +2

    At first, I thought 'This is way above my tech ability to understand!' But you were so logical and clear in how you put it all across, I actually understood most of it. Thanks for a great lesson. Subbed!

    • @CodeCraftersCorner
      @CodeCraftersCorner  4 months ago

      Glad it was helpful! Thanks for the sub!

  • @HurdRandy
    @HurdRandy 11 months ago +1

    This channel is perfect for me, I could watch for hours.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @HurdRandy! I appreciate your kind words.

  • @WalidDingsdale
    @WalidDingsdale 11 months ago +1

    Thank you for sharing this in-depth technical exploration, which unveils the secrets behind the UI.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @WalidDingsdale! I'm glad you found the technical details helpful.

  • @salmanmunawar1
    @salmanmunawar1 11 months ago +2

    This was awesome! Watched to the last second🎉 Great pace and explanation of the CLIP node. Goes beyond ComfyUI to really show what prompts are and how they work. Looking forward to more in-depth videos like this.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +1

      Thank you, @salmanmunawar1! I'm glad the pace was right and the explanation helpful. I'll definitely continue to create similar content in the future. Thank you for the positive feedback.

    • @salmanmunawar1
      @salmanmunawar1 11 months ago +1

      I got into AI imaging about three weeks back with Fooocus, A1111, Kohya SS, then InvokeAI and ComfyUI. They all have similar concepts under the hood. There is so much terminology to come across that it leads to a lot of trial and error, mystery, and unexpected results; it's frustrating. From this video I now know how prompts are parsed, what causes color bleeding, and what tokens are. There is a lot of information here without any filler or repetitive content. Worth multiple re-watches :) I hope you continue to demystify the image generation process. Looking forward to your next video lesson.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Yes, exploring and experiencing these Stable Diffusion user interfaces is great, as each provides its own benefits. I appreciate the support.

  • @ArielTavori
    @ArielTavori 11 months ago +1

    This information is pure gold, and I very much appreciate your clear and concise style! 🙏🏻

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @ArielTavori! I'm glad to hear that you found the information valuable.

  • @pixelhusten
    @pixelhusten 11 months ago +2

    I thought it was great, especially because you also explain how the AI deals with the prompts. I'd love to see more

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @SaschaFuchs! I appreciate your positive feedback. I'll definitely aim to create similar content in the future.

  • @Utoko
    @Utoko 11 months ago +2

    Keep it up, always interesting/educational videos.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @Utoko! I'm glad this video was helpful. I was worried it was too technical.

    • @Utoko
      @Utoko 11 months ago +2

      @@CodeCraftersCorner I am sure it is for many people. That being said, that is why I enjoy your channel a lot: you get real understanding and not just the effect.
      Hope your channel can grow and reach enough people who value it. gl

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you so much, @Utoko! Your kind words and support mean a lot. Appreciate your well wishes for the channel's growth.

  • @banchew3983
    @banchew3983 11 months ago +2

    I can see you've doubled your subscriber count since my last visit. Well done! And a Happy New Year.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @banchew3983! Your continued support is greatly appreciated! Thank you and Happy New Year!

  • @randymonteith1660
    @randymonteith1660 11 months ago +1

    Somewhere along the way I learned to use the word "BREAK", so my prompt looks like this: "1girl, 26 years old, white beach hat, BREAK blonde, BREAK green eyes", and this works. Great explanation!!

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +1

      Thank you, @randymonteith1660! I'm glad you found the explanation helpful and appreciate your input on the word "BREAK". I'll look into it and do some testing.

    • @finald1316
      @finald1316 11 months ago +1

      Isn't BREAK an A1111 thing to separate conds into isolated chunks? There is probably a node for it in Comfy, just like Combine is a replacement for AND.

    • @finald1316
      @finald1316 11 months ago +1

      This explanation got me curious on what would happen if I average hat with the color, etc, and then concat all. Gotta test that 1.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +1

      Update: the word BREAK did not work in ComfyUI for me. Thanks, @finald1316, for clarifying that it's for A1111. I tried a few times and thought I was using the word the wrong way. I am not aware of such a node. If I come across one, I will update.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +2

      The ConditioningAverage can be used to merge two or more concepts like making a hybrid horse + crocodile. Definitely give it a try.
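
      A minimal PyTorch sketch of the difference between concatenating and averaging two prompt embeddings, for readers following this thread; the tensor shapes are illustrative stand-ins, not ComfyUI's actual node code:

          import torch

          hat = torch.randn(1, 77, 768)   # stand-in for the "white beach hat" embedding
          eyes = torch.randn(1, 77, 768)  # stand-in for the "green eyes" embedding

          # Concatenation keeps both token sets side by side (more tokens, both concepts intact):
          combined = torch.cat([hat, eyes], dim=1)   # shape (1, 154, 768)

          # Averaging blends them into one hybrid embedding (same length, mixed concept):
          hybrid = 0.5 * hat + 0.5 * eyes            # shape (1, 77, 768)

          print(combined.shape, hybrid.shape)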

  • @glibsonoran
    @glibsonoran 11 months ago +1

    Very nice primer on prompting, thank you

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @glibsonoran, for your continued support!

  • @sidewinder979
    @sidewinder979 11 months ago +1

    Exactly what I was looking for, thank you.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @sidewinder979! I'm glad the content matched what you were looking for.

  • @lhovav
    @lhovav 10 months ago +1

    That is an amazing tutorial and approach to learn SD, well done!

    • @CodeCraftersCorner
      @CodeCraftersCorner  10 months ago

      Thank you very much, @lhovav! Appreciate the support.

  •  11 months ago +1

    You got my attention; now you have my curiosity. Thank you

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @SezginAhmetKaratas! Glad you found the content interesting!

  • @ferniclestix
    @ferniclestix 11 months ago +2

    Very nice deepdive on clip text, thank you very much, I think I get clip text a little better now :D

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @ferniclestix! I'm glad the video was helpful. Appreciate all your hard work bringing rich knowledge to the community.

  • @gabrieldaley5881
    @gabrieldaley5881 10 months ago +1

    Thank you for this video. This is great information.

    • @CodeCraftersCorner
      @CodeCraftersCorner  10 months ago

      Thank you for watching, @gabrieldaley5881! Glad the video was helpful.

  • @henrischomacker6097
    @henrischomacker6097 10 months ago +1

    Excellent video! Thanks very much.

    • @CodeCraftersCorner
      @CodeCraftersCorner  10 months ago

      Glad you liked it, @henrischomacker6097!

  • @PallaviChauhan91
    @PallaviChauhan91 10 months ago +1

    Great explanation! Although it got very technical at some points, in conclusion I came to understand how prompting works for Stable Diffusion, why on Civitai there are different prompt guidelines given beneath different models, and why I never get good results using just prompts :D

    • @CodeCraftersCorner
      @CodeCraftersCorner  10 months ago +1

      Thank you, @PallaviChauhan91! I'm glad you found the explanation helpful, even with the technical details.

  • @astr010
    @astr010 9 months ago

    Earned a subscriber; this video is super clean, thank you.

  • @kpr2
    @kpr2 11 months ago +1

    A highly enlightening look under the ComfyUI hood. Much appreciated! Now I'm off to see if there happen to be any "token count" nodes out there or if I'm going to have to write one myself, just to make life easier.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +1

      Thank you, @kpr2! I'm glad you found the video interesting. Good luck with the token counter.
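
      For anyone looking for a quick way to count tokens outside of ComfyUI, a sketch using the Hugging Face CLIP tokenizer (the model name is only an example, not something the video prescribes):

          from transformers import CLIPTokenizer

          tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

          prompt = "1girl, 26 years old, white beach hat, blonde, green eyes"
          ids = tokenizer(prompt)["input_ids"]

          # The count includes the start-of-text and end-of-text special tokens;
          # CLIP's full context window is 77 tokens.
          print(len(ids), "tokens (including the two special tokens)")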

  • @yvann.mp4
    @yvann.mp4 11 months ago +1

    Exactly what i needed, thanks a lot

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @yvann_mp4! Happy to help.

  • @David-id6jw
    @David-id6jw 11 months ago +1

    @5:30 chapter: From my understanding, setting the last clip layer to -1 is _not_ removing the last layer. The issue is that there are like three ways to represent the issue that this is trying to resolve, and it's different between ComfyUI and A1111, which are themselves different from what someone sitting down and looking at the problem from a pure UI perspective would probably do. ComfyUI is using the value as a direct index.
    With the caveat that this is from memory, what happens is that there's an array of layers - we'll say an array of [A, B, C, D] as a simple illustration - and we want to make it so that the system disregards a certain number of layers at the end.
    In most programming languages (including Python), arrays are indexed starting at 0. That is, array index 0 will get you A, then array index 1 will get you B, index 2 will get you C, and index 3 will get you D. We want to drop D, but we don't want to require that the user know how many layers there are. In Python, you can do this by using negative numbers to index from the _end_ of the array instead. So -1 is D, -2 is C, -3 is B, and -4 is A.
    All well and good, but the node is specifying what the last layer is, not the first layer getting dropped (or the number of layers getting dropped). So if you set it to -1, you're saying that the last layer that the system will use is -1 (D), which means you're not actually dropping any layers. What you need to do is set it to -2 (C), so that C is the last layer used, and D is discarded, if you want to drop the last layer.
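
    A small Python illustration of the indexing described above, using a stand-in list instead of real CLIP layers; keep_through is a hypothetical helper, not ComfyUI code:

        layers = ["A", "B", "C", "D"]

        def keep_through(layers, last_layer):
            # Keep everything up to and including last_layer (a negative index).
            end = last_layer + 1
            return list(layers) if end == 0 else layers[:end]

        print(keep_through(layers, -1))  # ['A', 'B', 'C', 'D'] -> nothing dropped
        print(keep_through(layers, -2))  # ['A', 'B', 'C']      -> D discarded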

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @David-id6jw, for providing a detailed explanation and sharing it with the community.

  • @Ken1171Designs
    @Ken1171Designs 11 months ago +1

    Very useful and well explained! I now feel like I have a MUCH better understanding of how prompting works, though I also understood it depends a lot on how individual models were trained. Makes me wish there was a way to query a model to know what kind of prompts it prefers.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @Ken1171Designs! You're right, individual models can have their own preferences. I like the idea of querying a model for its preferred prompts. I appreciate your feedback. Thank you for watching.

    • @Ken1171Designs
      @Ken1171Designs 11 months ago +1

      @@CodeCraftersCorner The way I handle this is to keep a copy of the example prompts for the model, so I use them as a starting point. However, that alone is not enough. I also copy the recommended settings to get results similar to the example ones. This is a lot of work, so I automated the task with a Python scraper, but they changed the site to use dynamic pages, making the process harder. That's why I wish there was a way to simply query the models to know the kind of prompt and the recommended params. I wonder if that could be added to the models as metadata?

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      It might be possible, but I am not sure how that would affect the training process. The way I handle it is to manually set the settings/prompts the first time and save the workflow in JSON format. The next time, I simply load the workflow and continue from there. It is not perfect, but at least I am only entering the settings once.

    • @Ken1171Designs
      @Ken1171Designs 11 months ago +1

      @@CodeCraftersCorner The problem with JSON is that we can only save one set of prompts and params that fits one model. The moment we change models, we may need a completely different set.
      When I mentioned adding this info to the model as metadata, I meant adding it after training as extra info, similar to how PNGs have reserved space for metadata that is used for all sorts of things. I know models may not have such a thing, but I leave it as a suggestion to add a metadata section to AI models for this purpose. I am sure people will find many uses for it. :)

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Good suggestion! For now, I am using the Note node to hold any additional prompts or information before saving the workflow.

  • @JWsDPP
    @JWsDPP 6 months ago

    Awesome tutorial, thanks for this!

  • @francaleu7777
    @francaleu7777 11 months ago +1

    Thank you!!

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      You're welcome, @francaleu7777! Thank you for watching!

  •  5 months ago +1

    Thank you

  • @romanbulatnikov7157
    @romanbulatnikov7157 11 months ago +2

    Super useful bravo

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @romanbulatnikov7157! Glad the video was helpful.

  • @Painfroi
    @Painfroi 8 months ago

    damn that video was tight. bless you man.

  • @AIAngelGallery
    @AIAngelGallery 7 months ago

    really good explanation, thx!

  • @MoraFermi
    @MoraFermi 11 months ago +8

    float32 does NOT hold 32 significant digits! The number signifies the amount of *bits* used to store the whole float: the sign, mantissa *and* exponent. Since mantissa is defined as 23 bits, this gives only 6 to 9 significant digits to go around.
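
    A quick NumPy check of the layout described above (1 sign bit + 8 exponent bits + 23 mantissa bits = 32 bits):

        import numpy as np

        info = np.finfo(np.float32)
        print(info.bits)       # 32 bits total
        print(info.nmant)      # 23 mantissa bits
        print(info.precision)  # ~6 reliable decimal digits

        # Digits beyond that precision are rounded away:
        print(np.float32(0.123456789))  # 0.12345679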

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +2

      Thank you for the clarification, @MoraFermi! I appreciate your input on clarifying the precision of float32. There's a correction in the video description to ensure accurate information.

  • @sevenbells5347
    @sevenbells5347 11 months ago +1

    great!! i loved it

  • @KodandocomFaria
    @KodandocomFaria 11 months ago +1

    This is very useful. I had a lot of trouble with clothes: I was asking for black, white, or other colors, and they kept coming out green. Now, with this combine, maybe I can fix it.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you, @KodandocomFaria! I'm glad to hear that the video was helpful for you. When experimenting with different conditioning nodes, consider using weights in your prompts. It can make a significant difference in addressing color consistency issues. From my testing, it is not a 100% guarantee, as each model is trained differently. Best of luck!

  • @muc2810
    @muc2810 10 months ago +1

    Thank you for your great effort to explain it, amazing. I just tried your prompt and added "blue jeans, black shirt, red sneaker". It randomly ignored one or two of the elements. I don't understand why.

    • @CodeCraftersCorner
      @CodeCraftersCorner  10 months ago

      Thank you for trying out the prompt, @muc2810! Different models can react differently to prompts. Also, the same model can be trained more on a particular term. For example, if during training the model was exposed to blue jeans more than red jeans, the model will react more to blue jeans than red ones. It's best to experiment by adjusting or removing certain terms and seeing the influence on the final output. Play with the prompt weights and try various conditioning nodes. You can also try inpainting, image manipulation, or post-processing to optimize the results.

  • @alencg2518
    @alencg2518 11 months ago +1

    very good

  • @murphylanga
    @murphylanga 11 months ago +1

    super, thanks🥰

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Thank you very much, @murphylanga147! Appreciate the support!

  • @BrokoFankone
    @BrokoFankone 5 months ago

    This is one of the best videos on the subject, I actually understand what's going on - thanks a ton for making this and explaining >everything< so well!

  • @fernandomasotto
    @fernandomasotto 9 months ago

    This is a great video with very well explained concepts. I finally understood a little of what's behind prompting. Thank you! Now I see old Automatic1111 had a limit of 75 tokens per prompt, so in that case it never got to generate a batch of 77 tokens, right?

    • @CodeCraftersCorner
      @CodeCraftersCorner  9 months ago +1

      Thank you, @fernandomasotto! I am not sure exactly how Automatic1111 handles it. It just means that batches of 77 tokens are passed.
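
      As general background rather than anything A1111-specific: a 75-token prompt limit and 77-token batches describe the same context window, because the tokenizer adds start-of-text and end-of-text tokens and pads every prompt to 77 positions. A sketch with the Hugging Face tokenizer (example model name):

          from transformers import CLIPTokenizer

          tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
          batch = tokenizer("a lady wearing a white beach hat",
                            padding="max_length", max_length=77, return_tensors="pt")
          print(batch["input_ids"].shape)  # torch.Size([1, 77])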

  • @krio_gen
    @krio_gen 6 months ago

    Thanks!

  • @crystian
    @crystian 11 months ago +1

    thanks!

  • @mehradbayat9665
    @mehradbayat9665 10 months ago +1

    Can you explain how the 'clip set last layer' node works?

    • @CodeCraftersCorner
      @CodeCraftersCorner  10 months ago

      Yes, @mehradbayat9665! In neural networks, there are input and hidden layers, which are responsible for receiving input data and converting it into vector representations the machine can understand. Let's say the input data is text. The text gets passed to the input layer, which outputs a representation of the data. This is then passed on to the hidden layers (there can be many). Let's say we have 3 hidden layers. If we do not change CLIP Set Last Layer, the data will go from the input layer through all 3 hidden layers and then get outputted. If we set CLIP Set Last Layer to -2, the last hidden layer will be skipped (input to hidden layers 1 and 2 only). Hidden layers perform computations using weights and biases. In general, fewer hidden layers means less intricate detail about the input data is captured. It's best to experiment, as each model will give different results.
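
      A rough sketch of what skipping the last hidden layer looks like with the Hugging Face CLIP text model; this illustrates the idea only and is not ComfyUI's own implementation (the model name is an example):

          import torch
          from transformers import CLIPTextModel, CLIPTokenizer

          tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
          text_model = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

          tokens = tokenizer("1girl, white beach hat, green eyes", return_tensors="pt")
          with torch.no_grad():
              out = text_model(**tokens, output_hidden_states=True)

          # hidden_states[-1] is the final layer's output; hidden_states[-2] is the
          # layer just before it, roughly what "set last layer = -2" reads from.
          print(out.hidden_states[-1].shape, out.hidden_states[-2].shape)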

    • @mehradbayat9665
      @mehradbayat9665 10 months ago +1

      @@CodeCraftersCorner Does CLIP training perform back-propagation? If the -2 layer is set, is this assumed?

    • @CodeCraftersCorner
      @CodeCraftersCorner  10 months ago

      From what I can see in the code base, it does not explicitly show backpropagation. The clip_layer method specifically sets the layer index. If you want to take a look at the code base, you can go to the ComfyUI portable folder > ComfyUI > comfy > sd.py > lines 88 to 151. There's the CLIP class and its clip_layer method.
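
      A simplified toy illustration of that point (not the actual comfy/sd.py source): a method like clip_layer only records which layer index to read from later; it does not retrain or modify any weights:

          class ClipWrapper:
              def __init__(self):
                  self.layer_idx = None  # None -> use the model's default final layer

              def clip_layer(self, layer_idx):
                  # e.g. -2 means "stop at the second-to-last hidden layer"
                  self.layer_idx = layer_idx

          clip = ClipWrapper()
          clip.clip_layer(-2)
          print(clip.layer_idx)  # -2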

  • @edwardwilliams2564
    @edwardwilliams2564 11 months ago +1

    What is it that you use to mark and highlight everything on your screen?

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +1

      Thank you for watching, @edwardwilliams2564! I use ZoomIt from Microsoft to annotate on the screen. It's a small utility which allows zooming and drawing on screen. It does not have a highlighter mode but you can use the free draw option to get a similar effect.

    • @edwardwilliams2564
      @edwardwilliams2564 11 months ago +1

      Thank you

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago

      Happy to help!

  • @offmybach
    @offmybach 8 months ago

    Is there a way to use pdb to prevent the "mat1 and mat2" mismatches, tensor sizes not being the same, "expected Half but got Float" errors, etc.? Somehow correct the size errors before they error out. I also keep seeing the clip missing: ['clip_l.logit_scale', 'clip_l.transformer.text_projection.weight'] message while loading ComfyUI.

    • @CodeCraftersCorner
      @CodeCraftersCorner  8 months ago

      Hello @offmybach! The error is more like a warning. The contents of some checkpoints are different, which is why the warning shows up, but using them should work just fine. There's an issue log here: bit.ly/43yQnhk
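
      As general PyTorch debugging rather than a ComfyUI-specific fix, the "mat1 and mat2" and "expected Half but got Float" errors come down to mismatched shapes or dtypes, which can be inspected before the failing call (for example from a pdb breakpoint):

          import torch

          a = torch.randn(2, 768, dtype=torch.float32)
          b = torch.randn(512, 10, dtype=torch.float16)

          # torch.matmul needs a.shape[-1] == b.shape[-2] and matching dtypes:
          print(a.shape, a.dtype, b.shape, b.dtype)
          if a.shape[-1] != b.shape[-2] or a.dtype != b.dtype:
              print("incompatible shapes/dtypes: check that the checkpoint, VAE and CLIP match")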

  • @lalayblog
    @lalayblog 11 months ago +1

    By using Conditioning (Average) with a weight of 1.0, you just gave a weight of nearly zero to the other part of the prompt. This is why the green eyes don't influence the picture: no green eyes, so no green color bleeding.
    To me this video doesn't look correct, but I will recheck on my own. I'm still sceptical.

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +1

      Thank you, @lalayblog, for catching that mistake! You're absolutely right about the weight of 1.0 affecting only one prompt. I was so focused on the "white beach hat" that I did not realize I forgot to change the weight. I have rerun the experiment with the same checkpoint, prompts, seed, CFG and sampler with weights of 0.0, 0.25, 0.5, 0.75 and 1.0. I have posted the results here: bit.ly/3H0doPy
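
      For context, a rough sketch of the linear blend an average-style conditioning node performs, assuming a simple interpolation; it shows why a weight of 1.0 leaves the second prompt with no influence (an illustration, not the node's actual source):

          import torch

          def average_conditioning(cond_to, cond_from, strength):
              # strength = 1.0 returns cond_to unchanged, so cond_from contributes nothing
              return cond_to * strength + cond_from * (1.0 - strength)

          a = torch.randn(1, 77, 768)  # e.g. "white beach hat" embedding
          b = torch.randn(1, 77, 768)  # e.g. "green eyes" embedding
          print(torch.allclose(average_conditioning(a, b, 1.0), a))  # True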

  • @othoapproto9603
    @othoapproto9603 11 months ago +1

    I learned so much; thank you for all your skill and hard work making this. I've got to say, in the SD 1.5 images she looks more like 40 years old. I wonder if that is because you used the word "Lady". As an American, I would only use "lady" with older women. I don't know what biases the models have, but conversationally, "girl" would be up to 25 or so, "woman" is older, and "lady" or "old lady" even older. I doubt many on this list would work: en.wikipedia.org/wiki/Category:Slang_terms_for_women

    • @CodeCraftersCorner
      @CodeCraftersCorner  11 months ago +1

      Thank you, @othoapproto9603! I appreciate your kind words and feedback. You raise an interesting point and your observations on the words influencing the models are valid. I must give this a try. It's fascinating how the language choices can impact AI's interpretation. Thank you for the link!

    • @gameplayfirst-ger
      @gameplayfirst-ger 10 months ago +1

      "girl" in most SD/SDXL models means a teenager, so you should avoid it if you don't really want a very young woman.

    • @CodeCraftersCorner
      @CodeCraftersCorner  10 months ago

      Thanks for the input, @gameplayfirst6548! I appreciate the advice on how the models interpret age terms.

  • @sochan2ktham518
    @sochan2ktham518 2 months ago

    Thank you