Kasucast #22 - Stable Diffusion: High-resolution advanced inpainting ComfyUI (rgthree, IP-Adapter)

  • Published Sep 6, 2024

Comments • 31

  • @kasukanra
    @kasukanra  6 months ago +6

    As I mentioned in the video, I'm joining StabilityAI in April 2024. Thanks for all the channel support. I've received permission to continue making YT videos as long as I don't violate the NDA :)
    Workflows are here: github.com/kudou-reira/SDXL_training_settings/tree/main/comfyUI_workflow/high_resolution_inpainting
    Full timestamps:
    00:00 Introduction
    03:24 Resizing the image
    05:04 Checking image size
    07:00 Default mask
    08:12 Preview bridge
    09:23 Canvas tab
    11:35 Naive inpainting
    13:48 Issues with naive inpainting
    15:06 Mask to region
    16:34 Cut by mask
    19:04 Resizing the crop image
    19:47 Resizing the crop mask
    20:45 Encoding the masked crop
    21:48 Comparing naive and high-resolution inpainting
    22:44 Compositing ksampler crop back onto the original image
    25:40 Checking the robustness of the high-resolution inpainting workflow
    27:32 Image blend by mask
    27:56 Image downsampling on the composite
    29:30 Acly's ComfyUI inpaint nodes (overview)
    30:56 Acly's pre-process workflow
    33:20 Fill masked options
    34:38 Blur masked area
    36:16 Fast inpaint
    36:50 Outpainting with SDXL
    41:44 Fill masked area (setup) with high-resolution inpainting
    43:04 Rgthree nodes introduction
    46:04 Fill masked area (visualize)
    46:52 Integrating Fooocus patch
    49:00 Tensor size mismatch error
    50:16 Adding ControlNet depth
    51:58 Rgthree bypass node on ControlNet depth
    53:00 Adding ControlNet depth to the Fooocus patch workflow
    54:06 Fill masked area integration
    55:06 Comparing results with Acly pre-process
    56:10 Image to mask error + fix
    59:02 Fill masked area (blur)
    59:36 Alternative blur method
    01:00:34 Image composite masked
    01:02:08 Fast inpaint
    01:03:12 IP-Adapter overview
    01:04:12 Removing redundant nodes
    01:06:14 Series workflow (fill masked area)
    01:07:40 Series workflow (blur masked area)
    01:08:22 Group bypass/muter
    01:09:12 Series workflow (fast inpaint model)
    01:09:42 IP-Adapter crash course
    01:11:20 Bypassing the IP-Adapter
    01:12:20 Using high-resolution fast inpaint to remove objects
    01:13:16 Applying reference image to high-resolution inpainting workflow
    01:15:48 Integrating attention masking to the IP-Adapter inpainting
    01:16:36 Why do we need double masking (attention + normal)?
    01:18:06 Using IP-Adapter with multiple reference images
    01:19:32 Addressing the opacity blend issue
    01:20:48 Applying ControlNet to IP-Adapter
    01:21:22 Adding pre-processing methods to IP-Adapter
    01:22:29 Using multiple reference images for IP-Adapter
    01:23:44 Compositing images inside ComfyUI (Canvas Tab)
    01:26:48 Switch nodes (Comfy Impact)
    01:29:16 Rgthree bookmarks
    01:30:28 Pad image for outpainting
    01:31:12 Integrating image padding into high-resolution inpainting workflow pt.1
    01:31:44 pt.2
    01:32:56 Outpainting from a bust-up image
    01:34:20 Context nodes (rgthree)
    01:35:52 Context switch (rgthree)
    01:38:32 Replacing switch any with context switch
    01:39:52 Toggle control system basics
    01:41:40 Toggle between txt-2-img and high-resolution inpainting
    01:43:12 Linking state across groups with relay node
    01:44:20 Fixing the one-way relay issue
    01:45:00 Linking the state of txt-2-img to image save node
    01:46:00 Debugging the outpainting workflow
    01:49:12 Debugging why both txt-2-img and inpainting run at the same time
    01:51:00 Giving txt-2-img its own full scale/non-masked reference image
    01:54:00 Conclusion

    • @Sheevlord
      @Sheevlord 6 months ago

      Congrats!

    • @fulldivemedia
      @fulldivemedia several months ago

      You deserve it, hun. I wish you success and happiness.

  • @bentontramell
    @bentontramell several months ago

    I never thought I would be taken back to fluid mechanics while watching a ComfyUI video 😅

  • @fulldivemedia
    @fulldivemedia several months ago

    thanks

  • @8561
    @8561 4 months ago

    Great vid! I tend to find myself wanting to conserve my productivity bandwidth (decrease mental latency), so thanks for the heads-up at 1:24:11, haha. It must've taken you a while to edit; I appreciate all the effort. I learned a couple of tricks!

  • @polygonjuggler8163
    @polygonjuggler8163 28 days ago

    Pretty nice to review most inpainting workflows in one video and compare similar processes in one go. I always wanted to test them side by side myself but never had the time, and learning each procedure one by one only gives you a vague idea of the pros and cons of each option, which one is the overall "winner", or at least the best for a given situation. Thanks a lot, very fun to watch as well.
    Nothing personal, but watching so many non-native English speakers with unfamiliar accents makes me nervous. I'm a CG/VFX artist with both ADHD and ASD, which help me a lot to focus on the things I love and get obsessed with, like 3D/CG/VFX and now generative AI, but anything that perturbs my attention, like a non-standard accent, drives me crazy. Sorry to all the non-native speakers out there 🫣; it has nothing to do with segregation, racism, or anything like that, just phonemes and sounds my brain is not expecting in English, which make me unable to focus on anything else 🤦🏻‍♂️.
    I'm sure other fellow ADHDers and ASDers can relate.
    So finding new, natural-sounding English-speaking and technically acquainted AI demoers is quite a challenge, and I'm really happy to have stumbled upon your channel; subscribing and liking right away.
    There are lots and lots of "newcomers" to the world of digital "art" who have never worked in anything related to drawing, painting, sculpting, 2D, 3D, VFX, or CG in general.
    Sometimes these newcomers and self-proclaimed AI experts, who worked in who knows what before they decided to create YouTube channels, give advice that often makes no sense whatsoever.
    Just because something works in one specific test doesn't mean it's technically and artistically correct, nor even what the people behind the idea and code intended it to be used for.
    Most of them don't take the time to read papers and understand what's behind a certain idea or technique, nor do they know anything about Navier-Stokes, Euler, Karras, probabilistic algorithms, image sampling, noise sampling, progressive refinement, Gaussian/Perlin/Brownian noise, or anything math-related in the image-creation field, and so on and so forth.
    And as you said: good for them. For most people learning AI to have fun, make cool wallpapers for their PC or phone, some funny memes, and the like, knowing anything more than which button to click and which checkpoint to use is more than enough.
    But for those of us who are not just having fun with a new tech that lets you create cool images with little to no talent, knowledge, or expertise, but who are trying to learn, understand, and know all we can about this tech, starting to apply it to our daily workflows, oiling up our gears, and getting ready for what's coming sooner or later to our industry, finding channels like yours is really like finding a needle in a haystack.
    The future of CGI is inevitably linked to the future of AI. Even if some predict we're all on the brink of extinction/unemployment, like it or not, AI is here to stay!
    Human decisions, talent, aesthetic taste, and directed problem solving are not things machines will be able to do anytime soon.
    So whether as plugins, native options, or entirely new software for designing, creating, manipulating, animating, editing, and compositing with AI tools, we ought to be ready today. Tomorrow might already be too late.
    And in that regard, it's pretty frustrating to watch 10 or more minutes of wordy intros, excessive hype, and lots of nonsense or incorrect workflows that "just happen to work" but give you little control or are slower to generate.
    Whether you are a native speaker or not doesn't mean a thing; your English and pronunciation are flawless. I suppose you are of Asian descent but raised in the US, because of your username, the kind of art you show, and your calm way of speaking. Just guessing here, of course, but whoever you are, THANKS A LOT for sharing your knowledge, tests, and findings.
    Sorry for the super long comment (it's part of the ADHD side of my personality; I'm very talkative and completely unable to summarize my ideas when I write or type 😅).
    Kudos, congrats on your start at Stability AI, and please keep up the great videos!

    • @kasukanra
      @kasukanra  28 days ago

      Thanks for your kind words. As you correctly guessed, I was born in the United States and lived and studied there my entire life until deciding to move to Japan.
      There are a few reasons why I made a YouTube channel in the first place.
      First, I felt that there were plenty of underqualified creators with poor-quality channels trying to distort/shortcut design, CGI, and machine learning/generative AI. It's probably a large reason why my models/techniques/workflows are very different from the generic gen-AI influencer's.
      Second, I like to get very technical and I also have traditional industry pipeline training, so I wanted to show how I would use these new tools to help me in my creative process.
      Thanks for stopping by!

  • @maximood-tired
    @maximood-tired 4 months ago

    Congratulations! It sounds really exciting!

    • @kasukanra
      @kasukanra  4 months ago +1

      Thanks for the kind words! It's been fun so far.

  • @LecrazyMaffe
    @LecrazyMaffe 6 months ago

    Congrats on the exciting new position!

  • @aleksandrivanov973
    @aleksandrivanov973 4 months ago +1

    The best! I looked through a bunch of tutorials, and everywhere it's just "connect this here, then here" with no explanation of what it all does or how it works. God bless you, man, you saved my day.

    • @kasukanra
      @kasukanra  4 months ago +1

      Thanks for the kind words. Two hours is a long time, so I understand that it's not for everyone. But I personally thought it was a resource worth making. Hopefully this helps everyone create or improve their own workflows.

  • @onewiththefreaks3664
    @onewiththefreaks3664 4 months ago

    Thanks for your video. This is a truckload of very good material. And congrats on your new job!
    I might have stumbled upon another bug here. The blur masked node simply crashes my SD server whenever I use a blur > 5. I also had those other bugs you had, although the weird checkered alpha mask occurred AFTER I used your "hack". Before that, I "just" had problems with getting the mask blurred at all. Something here seems very buggy. I've never run into such a buggy situation with Comfy before!

    • @kasukanra
      @kasukanra  4 months ago

      Thanks for the kind words. For the blur crash, if it doesn't work, I suggest trying other nodes from other node creators that are supposed to do the same thing. Blurring at high strength might be a computationally expensive task (not sure). As for the checkered alpha mask, it's either a bug with a newer version of Acly's inpaint nodes, or you'll just have to manually change the grow mask each time. Alternatively, you can blur the mask, then solidify it again with a certain intensity threshold (127 is gray, 0 is black, so pixels around 10 or lower would need to be converted to black).
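      A minimal sketch of that blur-then-solidify step (using Pillow and NumPy; the function name and default values here are illustrative, not from the video):

          import numpy as np
          from PIL import Image, ImageFilter

          def solidify_mask(mask: Image.Image, blur_radius: float = 8.0, threshold: int = 10) -> Image.Image:
              # Blur the binary mask, then re-threshold it: pixels at or below
              # `threshold` snap back to black (0), everything else to white (255).
              blurred = mask.convert("L").filter(ImageFilter.GaussianBlur(blur_radius))
              arr = np.array(blurred)
              solid = np.where(arr <= threshold, 0, 255).astype(np.uint8)
              return Image.fromarray(solid, mode="L")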

  • @bxl2012
    @bxl2012 4 months ago

    Do I understand the workflow correctly that the nodes will always have to be manually adjusted for each new picture and its cropped part? That's a lot of work for something that A1111 does automatically. I wish there were an easier workflow than this, but thanks for pointing out a way to do it. I haven't seen anyone else do a tutorial on high-resolution inpainting in ComfyUI.

    • @kasukanra
      @kasukanra  4 months ago +1

      What do you mean? You can change the crop resolution if you want, but I usually just leave it at 1024 x 1024 pixels. The workflow will automatically resize the inpainted area (assuming it's a single continuous entity and not separate blobs) to your desired crop resolution (i.e., 1024 x 1024 pixels), do the inpainting, and then resize it back down at the end.
      That's the core logic. Everything else I added, such as the pre-processing of the masked area (blurs, etc.) and the ControlNets/IP-Adapter, is just an extra layer of customizability.
      Maybe I'm not understanding your question correctly. However, this is the "simplest" way I could think of to replicate the A1111 method in ComfyUI.
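      In code terms, the core logic looks roughly like this (a hypothetical Pillow sketch; `inpaint` stands in for the ComfyUI sampler pass and is not a real API):

          from PIL import Image

          CROP_SIZE = (1024, 1024)

          def highres_inpaint(image: Image.Image, mask: Image.Image, inpaint) -> Image.Image:
              # Bounding box of the single continuous masked blob.
              box = mask.getbbox()
              # Upscale the crop and its mask to the working resolution.
              crop = image.crop(box).resize(CROP_SIZE, Image.LANCZOS)
              crop_mask = mask.crop(box).resize(CROP_SIZE, Image.LANCZOS)
              # Inpaint at full crop resolution (placeholder for the sampler).
              result = inpaint(crop, crop_mask)
              # Downscale the patch and composite it back through the mask.
              w, h = box[2] - box[0], box[3] - box[1]
              out = image.copy()
              out.paste(result.resize((w, h), Image.LANCZOS), box[:2],
                        crop_mask.resize((w, h), Image.LANCZOS).convert("L"))
              return out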

  • @scopeyin3629
    @scopeyin3629 5 months ago

    Congratulations

  • @tck42
    @tck42 5 months ago

    Congrats on the position! Also, I think the reason the resize width and height don't do anything at 4:40 and 6:15 is the mode on the image resize node: you use the mode dropdown to choose between the rescale factor and the resize width/height to specify the operation.

    • @kasukanra
      @kasukanra  5 months ago +2

      Thanks! I think what you said is probably right, and I'll try it when I have time. It would be useful for situations that require a precise resolution.

  • @stablefaker
    @stablefaker 6 months ago

    I wish this video had a tl;dr-esque section so that if you just wanted to test the workflow, you could jump in and try it more easily.

    • @kasukanra
      @kasukanra  6 months ago +1

      I did that at first and showed some friends just the workflows and explanations, but they couldn't follow them at all. That's why I made this video.

  • @ceesh5311
    @ceesh5311 3 months ago

    Congrats. Is Stability AI a fully remote company?

    • @kasukanra
      @kasukanra  3 months ago

      Thanks! There are offices in some countries, but if Stability AI is really interested in recruiting you, they will let you work remotely from anywhere.

    • @ceesh5311
      @ceesh5311 3 months ago

      @kasukanra Nice, man. Congrats, dream job.

  • @ceesh5311
    @ceesh5311 2 months ago

    Noob question: how do I see the corresponding custom node library name at the top of each individual node?

    • @kasukanra
      @kasukanra  2 months ago +1

      If you have the ComfyUI Manager add-on, you can open it up, go to the fourth option on the left (Badge), and change it to "#ID Nickname".

    • @ceesh5311
      @ceesh5311 2 months ago

      @kasukanra Thanks!

  • @salmanmunawar1
    @salmanmunawar1 6 months ago

    Congrats on the new role…

  • @1lllllllll1
    @1lllllllll1 17 days ago

    Great example of a very knowledgeable person who is sadly a terrible teacher. I wish you understood didactics and how to convey knowledge in a manner suited to those who don't already know 90 percent of this.
    You evidently attempt structure with those section titles, but each segment is all over the place, and all you actually do is walk through the nodes, which any student of this topic can do on their own time.
    It would be a thumbs-up if you actually explained what is going on and spent less time frantically zooming and panning all over this overly convoluted workflow.
    I grant you that you know this topic well, but as a tutorial it really fails, which is unfortunate given the effort put into it.

    • @kasukanra
      @kasukanra  15 days ago

      Thanks for the feedback. I've already started pulling back on creating detailed videos.
      Also, you are mistaken about this being a tutorial. I shared the workflow before creating this video, and even experienced people could not understand the logic.
      This is why I made this video. I agree with you that I'm not a teacher, but I don't claim to be one, either.