Animatediff perfect scenes. Any background with conditional masking. ComfyUI Animation

  • Published 22 Jul 2024
  • Consistent animations with perfect blending of foreground and background in ComfyUI and AnimateDiff. Do you want to know how?
    Discord: / discord
    #animatediff #comfyui #stablediffusion
    ============================================================
    💪 Support this channel with a Super Thanks or a ko-fi! ko-fi.com/koalanation
    ☕ Amazing ComfyUI workflows: tinyurl.com/y9v2776r
    🚨 Use Runpod and access powerful GPUs for best ComfyUI experience at a fraction of the price. tinyurl.com/58x2bpp5 🤗
    ☁️ Starting in ComfyUI? Run it on the cloud without installation, very easy! ☁️
    👉 RunDiffusion: tinyurl.com/ypp84xjp 👉15% off first month with code 'koala15'
    👉 ThinkDiffusion: tinyurl.com/4nh2yyen
    🤑🤑🤑 FREE! Check my runnable workflows in OpenArt.ai: tinyurl.com/2twcmvya
    ============================================================
    This method blends a foreground character with any background you want while avoiding the edge artifacts that appear when you mask already-rendered frames. The masking is done BEFORE rendering, and conditional masking is used to control the rendering of the foreground and the background, together with ControlNet.
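    To make the idea concrete, here is a minimal, hedged sketch (plain PyTorch, not the actual ComfyUI nodes; all names are illustrative) of what conditional masking amounts to: the character prompt only guides the region inside a soft character mask, the background prompt guides the rest, so there is no hard edge left to composite afterwards.
    ```python
    import torch

    # Illustrative only: blend two guidance signals with a soft 0..1 mask so the
    # foreground prompt acts inside the character region and the background
    # prompt acts outside it (the concept, not ComfyUI's implementation).
    def masked_guidance(pred_fg: torch.Tensor, pred_bg: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        mask = mask.to(pred_fg.dtype)
        return pred_fg * mask + pred_bg * (1.0 - mask)

    pred_fg = torch.randn(1, 4, 64, 64)       # guidance driven by the character prompt (latent space)
    pred_bg = torch.randn(1, 4, 64, 64)       # guidance driven by the background prompt
    soft_mask = torch.rand(1, 1, 64, 64)      # blurred character mask, downscaled to latent size
    blended = masked_guidance(pred_fg, pred_bg, soft_mask)
    ```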
    Please, support this channel with a ko-fi!
    ko-fi.com/koalanation
    The workflow and frames to test this workflow are found in this Civit.AI article: tinyurl.com/w2xr3hj4
    The tutorial where I show the basic workflow using the InstantLora method + AnimateDiff:
    • AnimateDiff + Instant ...
    Basic requirements:
    ComfyUI: tinyurl.com/24srsvb3
    ComfyUI Manager: tinyurl.com/ycvm4e29
    Vast.ai: tinyurl.com/5n972ran
    Runpod: tinyurl.com/mvbh46hk
    Custom nodes:
    AnimateDiff Evolved: tinyurl.com/yrwz576p
    Advanced ControlNet custom node: tinyurl.com/yc3szuuf
    VideoHelper Suite: tinyurl.com/47hka2nn
    ControlNet Auxiliary preprocessors: tinyurl.com/3j3p6bjw
    IP Adapter: tinyurl.com/3x3f2rfw
    ComfyUI Impact pack: tinyurl.com/4jsmf8va
    ComfyUI Inspire pack: tinyurl.com/2wkzezxm
    KJ Nodes: github.com/kijai/ComfyUI-KJNodes
    WAS node Suite: tinyurl.com/2ajuh2mx
    Models:
    DreamShaper v8 (SD1.5): tinyurl.com/3rka67pa
    ControlNet v1.1: tinyurl.com/je85785u
    vae-ft-mse-840000-ema-pruned VAE: tinyurl.com/c9t6wntc
    ClipVision model for IP-Adapter: tinyurl.com/2wrtvnx4
    IP Adapter plus SD1.5: tinyurl.com/2p8ykxf6
    Motion Lora mm-stabilized-mid: tinyurl.com/mr42m5hp
    Upscale RealESRGANx2: tinyurl.com/2frvcyca
    Tracklist:
    00:00 Intro
    00:13 Method: approach and explanation
    00:32 Basic requirements
    00:41 Downloading and copying assets from Civit AI
    01:16 Base AnimateDiff and Instant Lora: installing missing custom nodes and updating ComfyUI with Manager
    01:59 Models used in the workflow
    02:17 Testing the basic workflow
    02:40 Creating Lora reference image with a new different background; workflow test
    05:13 ControlNet for the background
    06:10 Creating masks for the foreground (hero) and background
    07:38 Conditional masking (blending masks and controlnets of foreground + background)
    09:34 Running the workflow for all frames to create your full animation - including face detailing and frame interpolation
    10:18 Outro
    My other tutorials:
    AnimateDiff and Instant Lora: • AnimateDiff + Instant ...
    ComfyUI animation tutorial: • Stable Diffusion Comfy...
    Vast.ai: • ComfyUI - Vast.ai: tut...
    TrackAnything: • ComfyUI animation with...
    Videos: Pexels
    Music: YouTube Music Library
    Edited with Canva, Runway.ml and ClipChamp
    Subscribe to Koala Nation Channel: cutt.ly/OZF0UhT
    © 2023 Koala Nation
    #comfyui #animatediff #stablediffusion
  • Film & Animation

Comments • 42

  • @Foolsjoker
    @Foolsjoker 7 months ago +1

    I had been trying to do this workflow for almost a month, but I could never get the foreground and background to merge correctly. Obviously, mine had some major missing components compared to this. So glad you posted. Thank you!

    • @koalanation
      @koalanation 7 months ago

      Glad I could help! I tried several tricks: masking, inpainting, trying to add correction layers in the video editor... so it also took me a while to find a way to do it the way I wanted.

  • @skaramicke
    @skaramicke 4 months ago +2

    Couldn't you just reuse the mask from the compositing step when isolating background from foreground in the later stages?

    • @koalanation
      @koalanation 4 months ago +1

      The mask in the compositing step is only applied to one image. For the background/foreground split, we are creating an individual mask for each of the video frames. The first is static, the second is 'dynamic', so to say. I hope this resolves your doubts!
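      A minimal sketch of that difference (plain PyTorch with placeholder data and a stand-in segmenter, not the actual workflow nodes):
      ```python
      import torch

      # One static mask reused for every frame (the compositing case)...
      frames = torch.rand(16, 3, 512, 512)            # 16 RGB video frames (placeholder data)
      static_mask = torch.zeros(512, 512)
      static_mask[128:384, 128:384] = 1.0
      composited = frames * static_mask               # the same mask multiplies all 16 frames

      # ...versus one mask per frame (the "dynamic" foreground/background case).
      # A brightness threshold stands in here for a real per-frame segmenter.
      dynamic_masks = (frames.mean(dim=1, keepdim=True) > 0.5).float()   # shape (16, 1, 512, 512)
      foreground_only = frames * dynamic_masks        # each frame uses its own mask
      ```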

    • @skaramicke
      @skaramicke 4 months ago +1

      @@koalanation yes of course! Didn’t think of that.

  • @dkamhaji
    @dkamhaji 8 months ago

    Hey yo! Super great video and many interesting techniques going on here. I will definitely be integrating this into my workflow. I do have a question, though: I get that you are moving the character with the OpenPose animation, but how is the background (and camera) moving? Are you using some video input to drive that, or something else like a motion LoRA?

    • @koalanation
      @koalanation 8 months ago

      For the background, I have used this pexels video as a base: tinyurl.com/yn4y8bdf
      I reversed the video in ezgif first.
      In the workflow, I tested a few preprocessors to see which ones work best and adjusted how many frames per second match the foreground better. In this case, Zoe depth maps and MLSD work well. I set the frame frequency to one frame every 3, starting from frame 90 (in a VHS Load Video node).
      To avoid running the preprocessors all the time during my tests, I just extracted the same number of frames as in the OpenPose sequence, saved them as images and used them in the final workflow.
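      A hedged sketch of that frame-extraction step, done outside ComfyUI with OpenCV (the file name, output names and frame count are assumptions, not the exact values from the workflow):
      ```python
      import cv2

      cap = cv2.VideoCapture("background.mp4")        # hypothetical background clip
      cap.set(cv2.CAP_PROP_POS_FRAMES, 90)            # start at frame 90

      saved, index = 0, 90
      while saved < 48:                               # match the number of OpenPose frames (assumed 48)
          ok, frame = cap.read()
          if not ok:
              break
          if (index - 90) % 3 == 0:                   # keep one frame out of every 3
              cv2.imwrite(f"bg_{saved:04d}.png", frame)
              saved += 1
          index += 1
      cap.release()
      ```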

    • @dkamhaji
      @dkamhaji 6 months ago

      Hello @koalanation! I'm building a workflow with similar intentions to yours here, but with a different slant. I'm just using seg masks to separate the BG from the character and applying separate masks to each IP adapter to influence the character and the background separately. Everything works great, except that I'm trying to apply the motion from the original input video to the new background created by the attention-masked IP adapter. Is there a way we can discuss this further to try to find some possible solutions? I would love to share this with you.

  • @lukaso2258
    @lukaso2258 5 months ago

    Hey Koala, thank you very much for this guide, exactly what I needed. I have one question: this works very well for merging two conditions into one scene, but what if I'm using a separate IPAdapter for the background and the character? I found a way to merge the two IPAdapter outputs, but I can't find a way to mask each IPA model for the character and the background. Do you see any solution for this? (In my workflow I'm doing a low-res output of character and background first, then upscaling both, and now I'm figuring out how to run it through another sampler and properly merge them together.)
    Thanks again for your work

    • @koalanation
      @koalanation 5 months ago

      Nowadays, in the Apply IPAdapter node, there is the possibility to use 'attn_mask', so you can use the two separate masks (foreground/background). This gives you more flexibility regarding the type of IP adapter, strength, use of batches...
      When I was preparing the video, that was still not possible. You can also use ControlNet with masks. So having different layers is possible in several ways.
      Results will be slightly different, though, depending on how you do it.
      Good luck

    • @lukaso2258
      @lukaso2258 5 months ago

      @@koalanation You are legend, it works :) Thank you!

  • @eyesta
    @eyesta 4 months ago

    Good video.
    Slightly different question: I made vid2vid in ComfyUI and my background changes, but I have a static background I want to replace. How do I render the model/character on a green background like you have in this video?

    • @koalanation
      @koalanation 4 months ago +1

      There are several custom nodes that do that. Check rembg, for example: github.com/Jcd1230/rembg-comfyui-node, or the WAS Node Suite. However, these go frame by frame and you will need to review the results.
      I made a video using segmentation with Track Anything, but no one has developed a ComfyUI node/tool for it. It used to work very nicely, but I have not used it for a while: th-cam.com/video/HoTnTxlwdEw/w-d-xo.htmlsi=Pnlr-YUo-YmRz8UL
      In the end, I think it is easier and faster to use video editing software with rotoscope features: Adobe Premiere, DaVinci Resolve or Runway.ml. I personally use Runway.ml, but choose what you prefer.
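      For the rembg route, a minimal sketch of putting each rendered frame on a green background with the rembg Python package (folder names are hypothetical, and every frame still needs a visual check):
      ```python
      from pathlib import Path
      from PIL import Image
      from rembg import remove   # pip install rembg

      Path("green").mkdir(exist_ok=True)
      for frame_path in sorted(Path("frames").glob("*.png")):
          cutout = remove(Image.open(frame_path))                      # RGBA frame with transparent background
          canvas = Image.new("RGBA", cutout.size, (0, 255, 0, 255))    # flat green canvas
          canvas.paste(cutout, mask=cutout.split()[-1])                # alpha channel as the paste mask
          canvas.convert("RGB").save(Path("green") / frame_path.name)
      ```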

    • @eyesta
      @eyesta 4 months ago

      ty!@@koalanation

  • @matthewma7886
    @matthewma7886 6 months ago

    Great workflow! That's what I'm looking for.
    But I run the workflow and get this error:
    Error occurred when executing ConditioningSetMaskAndCombine:
    too many values to unpack (expected 3)
    Does anyone know how to fix it? Thanks a lot :)

    • @matthewma7886
      @matthewma7886 6 months ago +1

      Tried several ways and finally found the reason: the bug comes from the GrowMaskWithBlur node and its blur_radius. If blur_radius is not 0, the error happens. I think there is a bug in this version of GrowMaskWithBlur. You can use a Gaussian Blur Mask or Mask Blur node instead of the blur function until the next version.

    • @koalanation
      @koalanation 6 months ago

      Hi! Thanks for checking it out! I had some time to look at it. As you say, it seems the error comes from the GrowMaskWithBlur node.
      I checked the workflow and it seems this node has changed. The numbers are swapped: the blur radius of 20 appears in the lerp_alpha field, and there is no sigma parameter anymore.
      I have changed the values according to what is shown in the video (blur_radius 20, lerp_alpha 1 and decay_factor 1, no sigma), and the workflow works for me.
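      If the node keeps misbehaving, a rough stand-in for the grow-and-blur step can be done outside ComfyUI, for example with OpenCV (file names and kernel size are assumptions; only the blur radius of 20 comes from the video):
      ```python
      import cv2
      import numpy as np

      mask = cv2.imread("mask_0001.png", cv2.IMREAD_GRAYSCALE)             # one frame's mask (hypothetical file)
      grown = cv2.dilate(mask, np.ones((9, 9), np.uint8), iterations=1)    # "grow" the mask a little
      blurred = cv2.GaussianBlur(grown, (0, 0), sigmaX=20 / 3.0)           # soften the edge, ~20 px radius
      cv2.imwrite("mask_0001_blurred.png", blurred)
      ```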

    • @matthewma7886
      @matthewma7886 5 months ago

      @koalanation All right, bro. Greatly appreciate your work :-)

  • @Disco_Tek
    @Disco_Tek 7 months ago +1

    Any idea how to keep consistent color for items like clothing in vid2vid? Also... you can rotoscope in ComfyUI now?

    • @koalanation
      @koalanation 7 months ago

      For clothing consistency, I think it should be possible by playing with masks and a SAM detector (with, for example, deepfashion2). But personally I have struggled to get the masks right for all frames (with other animations). I did a video using Track Anything, which I think can track clothes nicely. I believe with the right workflow it should be possible to do nicer things, but playing with the masks is not straightforward, so I did not elaborate further.
      Regarding rotoscoping: yes, with SAM it is possible, but I find it easier and faster to use video editors (Premiere, DaVinci...). When rotoscoping, you may eventually need to correct some frames, and in ComfyUI that becomes a very tedious task. Track Anything is more user friendly for adjustments, but it is a pity it is not really maintained or integrated into ComfyUI (that I am aware of).
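      For reference, a minimal sketch of a point-prompted SAM mask on a single frame with the segment-anything package (the checkpoint path and click coordinates are hypothetical; keeping the mask consistent across all frames is the hard part mentioned above):
      ```python
      import cv2
      import numpy as np
      from segment_anything import sam_model_registry, SamPredictor   # pip install segment-anything

      sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")    # hypothetical checkpoint file
      predictor = SamPredictor(sam)

      frame = cv2.cvtColor(cv2.imread("frame_0001.png"), cv2.COLOR_BGR2RGB)
      predictor.set_image(frame)
      masks, scores, _ = predictor.predict(
          point_coords=np.array([[256, 300]]),    # a click roughly on the clothing item
          point_labels=np.array([1]),             # 1 = foreground point
          multimask_output=False,
      )
      cv2.imwrite("clothing_mask.png", (masks[0] * 255).astype(np.uint8))
      ```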

    • @Disco_Tek
      @Disco_Tek 7 months ago

      @koalanation Thanks for the reply. Yeah, there has to be a way to get consistent clothing, and some LoRAs have helped, but things like colors constantly want to shift. As for rotoscoping, I didn't know that was possible; I normally just stick with RunwayML when I need to do it.

    • @aivideos322
      @aivideos322 7 months ago +1

      @Disco_Tek With AnimateDiff, use a 24-frame context length and a context stride of 8; that works for 48 frames. Keep any text prompt short and don't repeat yourself: (wool scarf, red scarf) is not good, 'wool red scarf' works, and do not mention the scarf again in the prompt. If you want to describe it better, reword the original, e.g. 'wool textured red long scarf'. Prompting is very important, as is the model you choose.

    • @Disco_Tek
      @Disco_Tek 7 months ago

      @aivideos322 I've been using a context length of 16 and an overlap of 12 lately with pretty good results. I will mess with prompts, though, the next time I try running without a LoRA. I'm usually then just using the upscaler to get me home. Any suggestions for color bleed when I add color to a clothing item, to prevent it from polluting the rest of the image?

    • @aivideos322
      @aivideos322 7 months ago +1

      @Disco_Tek Colour bleed is a problem even for still images, and I have found no real solution to it.
      For upscaling, you can use tile/lineart/temporalnet ControlNets to upscale reliably and fully denoise the video at double or triple size. You can even colour the video with a different model at this step, and you can give more details in this prompt, which tends to work better for colouring things. This step does not use the AnimateDiff model; it uses whatever model you want plus ControlNet, so it has more freedom to colour what your prompt says. I use Impact Pack nodes to turn batches into lists before the upscale to lower the memory used and allow larger upscales. This does each frame one by one.
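      The batch-to-list idea, as a minimal PyTorch sketch (the upscaler is an arbitrary callable, purely illustrative): processing one frame at a time keeps peak VRAM low at the cost of a longer run.
      ```python
      import torch

      def upscale_frame_by_frame(frames: torch.Tensor, upscaler) -> torch.Tensor:
          """frames: (N, C, H, W) batch; upscaler: any frame-level model (hypothetical)."""
          out = []
          with torch.no_grad():
              for frame in frames:                                   # treat the batch as a list
                  out.append(upscaler(frame.unsqueeze(0)).cpu())     # one frame at a time
                  if torch.cuda.is_available():
                      torch.cuda.empty_cache()                       # release VRAM between frames
          return torch.cat(out, dim=0)
      ```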

  • @Cioccolata-m7l
    @Cioccolata-m7l 7 months ago

    Do you think this workflow will work with 8 GB of VRAM?

    • @koalanation
      @koalanation 7 months ago

      I understand that with AnimateDiff you need 10 GB, but I have read that people can also do it with less. In this workflow, though, we are in reality doing one render for the foreground and another for the background, so it actually takes longer... However, with LCM you can decrease the render time quite a lot. I just made a video about it and I am really happy with how LCM works (with the right settings).

    • @Cioccolata-m7l
      @Cioccolata-m7l 7 months ago +1

      @koalanation Cool, right now I am using AnimateDiff with 2 ControlNets on only 8 GB 😅 I will try your workflow though.

  • @user-ts2fq1gp8b
    @user-ts2fq1gp8b 7 months ago

    I run the workflow and get this error: Error occurred when executing IPAdapterApply:
    Error(s) in loading state_dict for Resampler:
    size mismatch for proj_in.weight: copying a param with shape torch.Size([768, 1280]) from checkpoint, the shape in current model is torch.Size([768, 1664]).
    File "/root/ComfyUI/execution.py", line 153, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
    File "/root/ComfyUI/execution.py", line 83, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
    File "/root/ComfyUI/execution.py", line 76, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
    File "/root/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/IPAdapterPlus.py", line 426, in apply_ipadapter
    self.ipadapter = IPAdapter(
    File "/root/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/IPAdapterPlus.py", line 175, in __init__
    self.image_proj_model.load_state_dict(ipadapter_model["image_proj"])
    File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:
    \t{}'.format(

    • @user-ts2fq1gp8b
      @user-ts2fq1gp8b 7 months ago

      Thanks if you have time to help solve this problem

    • @koalanation
      @koalanation 7 months ago

      Check which model version and which CLIP Vision and IP-Adapter models you are using. I think this error happens because you may be using an SDXL model. Change the checkpoint or the IP-Adapter model and/or the CLIP Vision model.
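      A hedged way to confirm the mismatch is to look at the IP-Adapter weights themselves (file name is hypothetical): the checkpoint in the traceback expects a 1280-wide CLIP Vision embedding (the ViT-H model linked in the description), while 1664 is the width of the SDXL ViT-bigG CLIP Vision encoder, which suggests the wrong CLIP Vision model is being loaded.
      ```python
      import torch

      # Inspect the image-projection weights the IP-Adapter checkpoint was trained with.
      state = torch.load("ip-adapter-plus_sd15.bin", map_location="cpu")   # hypothetical path
      print(state["image_proj"]["proj_in.weight"].shape)   # (768, 1280) -> expects the SD1.5 / ViT-H CLIP Vision
      ```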

  • @StyleofPI
    @StyleofPI 2 months ago

    If I change Load Image to Load Video, does it work? For video to video.

    • @koalanation
      @koalanation 2 months ago

      You can use the Load Video node too for the controlnet reference images.

  • @TheNewOption
    @TheNewOption 8 months ago

    Damn, I'm behind on AI stuff. I haven't seen this UI; is this a new version of SD?

    • @koalanation
      @koalanation 8 months ago +1

      Yep, everything goes quick lately...but you will catch up, no worries.

    • @happytoilet1
      @happytoilet1 8 months ago

      Good stuff, many thanks. If the scene is not generated by SD, say it's a real photo taken by a camera, can SD still merge the character and the scene? Thank you. @koalanation

    • @koalanation
      @koalanation 7 months ago

      Hi, thanks to you! I think so... but take into account that the output is also affected by the model and the prompt you use. I am more of a fan of cartoon and anime animations, but if you use realistic models (such as Realistic Vision), I think you will get what you are aiming for.
      In the end, there is quite a bit of experimentation here. Also change and play with the weights of the adapters.

    • @happytoilet1
      @happytoilet1 7 months ago

      thank you for your advice. Really appreciate it. @@koalanation

  • @SparkFlowAAA
    @SparkFlowAAA 6 months ago

    Great tutorial and method!! I have an issue with ConditioningSetMask: Error occurred when executing ConditioningSetMask: too many values to unpack (expected 3). If you can help, that would be awesome. Thank you.
    Error log:
    `Error occurred when executing ConditioningSetMask:
    too many values to unpack (expected 3)
    File "/workspace/ComfyUI/execution.py", line 154, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
    File "/workspace/ComfyUI/execution.py", line 84, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
    File "/workspace/ComfyUI/execution.py", line 77, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
    File "/workspace/ComfyUI/nodes.py", line 209, in append
    _, h, w = mask.shape`

    • @koalanation
      @koalanation 6 months ago

      Hi! It seems there were some changes in the GrowMaskWithBlur node. In that node (at the bottom, in the Mask Foreground group), can you make sure the values are: blur_radius = 20, lerp_alpha = 1.0 and decay_factor = 1?
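      For context, a minimal sketch of why this error appears (purely illustrative): the node does `_, h, w = mask.shape`, so it expects a 3-D (batch, height, width) mask, and a mask carrying an extra channel dimension breaks that unpacking.
      ```python
      import torch

      mask = torch.rand(16, 1, 512, 512)   # 4-D, image-style mask (batch, channel, height, width)
      if mask.dim() == 4:
          mask = mask.squeeze(1)           # drop the channel dimension -> (16, 512, 512)
      _, h, w = mask.shape                 # now unpacks without "too many values to unpack"
      ```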