ComfyUI: RAVE for video transformation (vid2vid)

  • Published 21 Jul 2024
  • Total transformation of your videos with the new RAVE method combined with AnimateDiff. In this video, we explore the endless possibilities of RAVE (Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models). We combine it with AnimateDiff to edit and convert any video into an incredible animation.
    #animatediff #comfyui #stablediffusion
    ============================================================
    💪 Support this channel with a Super Thanks or a ko-fi! ko-fi.com/koalanation
    ☕ Amazing ComfyUI workflows: tinyurl.com/y9v2776r
    🚨 Use RunPod and access powerful GPUs for the best ComfyUI experience at a fraction of the price. tinyurl.com/58x2bpp5 🤗
    ☁️ Starting in ComfyUI? Run it on the cloud without installation, very easy! ☁️
    👉 RunDiffusion: tinyurl.com/ypp84xjp 👉15% off first month with code 'koala15'
    👉 ThinkDiffusion: tinyurl.com/4nh2yyen
    🕸️ Discord: / discord
    ============================================================
    Chapters:
    00:00 Intro
    00:16 RAVE and tutorial approach
    01:04 Part 1 - Base workflow and installation of Custom Nodes
    01:56 Part 1 - Models used in the workflow
    03:02 Part 2 - RAVE workflow development and testing
    05:14 Part 2 - Extending the workflow with AnimateDiff and ControlGIF
    09:08 EXTRA - Second example: convert a car into a warship with LooseControl
    11:06 Outro
    Final workflow (in OpenArt): tinyurl.com/46w4achr
    Base workflow (Github): tinyurl.com/hkfme93v
    RAVE paper: rave-video.github.io/
    Custom Nodes:
    Can all be installed with ComfyUI Manager (tinyurl.com/ms3jkk4m)
    ComfyUI-Rave
    ComfyUI Noise
    ComfyUI's ControlNet Auxiliary Preprocessors
    ComfyUI-VideoHelperSuite
    WAS Node Suite
    ComfyUI-Advanced-ControlNet
    AnimateDiff Evolved
    rgthree's ComfyUI nodes
    ComfyUI Essentials
    KJNodes for ComfyUI
    Checkpoints (copy into models/checkpoints; both can be downloaded from Civit.ai):
    Realistic Vision
    Juggernaut
    ControlNet (copy into the models/controlnet folder):
    Depth: can be downloaded via ComfyUI Manager
    ControlGIF (Hugging Face): tinyurl.com/28y9jkkr
    LooseControl (Civit.ai): tinyurl.com/2dpdnxce
    AnimateDiff version 3 (tinyurl.com/y52cx825):
    Adapter: tinyurl.com/3ctk78xa (copy into models/checkpoints)
    Motion module: tinyurl.com/bdfdj8xv (copy into custom_nodes/ComfyUI-AnimateDiff-Evolved/models)
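    For reference, a rough sketch of where the files above end up (filenames are illustrative; use whatever versions you download):
    ComfyUI/
    ├── models/
    │   ├── checkpoints/   (Realistic Vision, Juggernaut, AnimateDiff v3 adapter)
    │   └── controlnet/    (depth, ControlGIF, LooseControl)
    └── custom_nodes/
        └── ComfyUI-AnimateDiff-Evolved/
            └── models/    (AnimateDiff v3 motion module)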
    Videos
    Pexels: tinyurl.com/46m8zxpk
    Pixabay: tinyurl.com/yc46w8fh
    Music
    Song: It's Our Time
    Music by: CreatorMix.com
    Edited with Canva and Clipchamp. I record the material in PowerPoint.
    © 2024 Koala Nation
    #comfyui #animatediff #stablediffusion
  • Science & Technology

Comments • 61

  • @maxfxgr • 6 months ago

    Amazing video! keep them coming mate, Greetings from Greece!

    • @koalanation • 6 months ago

      Ela! Thanks for the support!

  • @charnel3786 • 5 months ago

    Thanks for the tutorial

  • @zerox9646 • 6 months ago +1

    great work

  • @ryanontheinside • 5 months ago

    Amazing job

    • @koalanation • 5 months ago

      Thank you! Cheers!

  • @skycladsquirrel • 6 months ago +3

    I'm dropping my new AI music video today. Then I see this, lol. Awesome video. The future's looking bright!

    • @koalanation • 6 months ago +2

      Very cool videos you have on IG! Love how artists like you embrace these tools to make great stuff!

    • @skycladsquirrel • 6 months ago +1

      🥰🙏@@koalanation

  • @epelfeld • 4 months ago

    Best complex tutorial I have seen; I managed to reproduce everything even though some nodes have already changed. Subscribed, appreciate your work, hope you will make more like this for noobs like me (there are a lot of us). Thank you.
    PS. When I use the Juggernaut model I get an out-of-memory message on the Unsampler.

    • @koalanation • 4 months ago +1

      Thanks! I try to get to the point and make it as simple as possible, considering these are not beginner tutorials. We are all noobs... especially because things move very fast and we all need to learn new things all the time...
      The RAVE Unsampler uses a lot of memory, unfortunately. You can reduce the number of frames to be processed or reduce the image resolution. You can also try evolved sampling (from the AnimateDiff Evolved nodes) and see if it works, but I have not tried it myself yet.

    • @epelfeld • 4 months ago

      @@koalanation thanks a lot, lower resolution works

  • @xr3kTx • 24 days ago

    This did wonders

    • @koalanation • 24 days ago

      @@xr3kTx it is fun!

    • @xr3kTx • 23 days ago

      @@koalanation I took great inspiration from your workflow because I need to understand the tools at play; I actually did this with SDXL. I am using a frame cap of 100, but the face seems to glitch. Can you suggest anything for the face glitching? I did use IPAdapter with style and composition transfer, but every few frames it seems to redo the context.

    • @koalanation • 23 days ago

      @@xr3kTx I did not dare to use SDXL because of the GPU and VRAM requirements... besides, AnimateDiff for SDXL is also difficult... with Hotshot it is ok, but then you are limited to a context window of 8 frames... not sure if testing with SD 1.5 is an option for you. You can always upscale and refine the output.

    • @xr3kTx • 23 days ago

      @@koalanation I have had better results with SDXL personally (I am using a LoRA, and SDXL respects it more for my character, plus an IPAdapter for style). I am using an RTX A6000 on RunPod, so resources are less of a concern; it's the workflow that I need to improve.

    • @koalanation • 23 days ago

      @@xr3kTx Good to know. I may then give it another try... have you tried FreeInit in AnimateDiff? Not sure how it will work with this setup, though. But it is a lot of trial and error, you know...

  • @hamedsadeghizadeh6660 • 4 months ago

    thanks

  • @D3coify • 4 months ago

    Thanks

    • @D3coify • 4 months ago

      Oh so Depth makes the video more realistic?

  • @drviolet396 • 5 months ago +1

    If I were to add an IPAdapter, where would you recommend connecting the model? At the Unsampler level, or after AnimateDiff, connecting to the last KSampler?

    • @koalanation • 5 months ago

      With IPAdapter, since you want to have control over the output, I would use it on the last KSampler, but for no other reason than that...

  • @drviolet396 • 5 months ago +1

    Can you elaborate on what the unsampling/noise-generation part actually does? There is an empty text prompt connected to a ControlNet, and the cfg is 1.

    • @koalanation • 5 months ago

      Hi! As I understand it, the Unsampler does the reverse of what the KSampler does, so there is no need to guide it with a prompt; that is why it is blank. The GitHub repo indicates best results with a cfg of 1, but there is no reason not to play with other values: github.com/BlenderNeko/ComfyUI_Noise?tab=readme-ov-file
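
      To make "the reverse process" concrete, here is a toy DDIM round trip in plain PyTorch: invert a latent to noise, then sample back down. A fixed function stands in for the UNet and the schedule is made up, so this is only a sketch of the idea, not the node's actual code; the point is that the same deterministic update, run in both directions, nearly recovers the input, which is what the Unsampler + KSampler pair does at a cfg of 1.

      import torch

      T = 50
      alpha_bar = torch.linspace(0.9999, 0.02, T)   # toy noise schedule (made up)

      def eps(x, t):                                # toy stand-in for the UNet's noise prediction
          return 0.1 * torch.tanh(x) * (t + 1) / T

      def ddim_step(x, t_from, t_to):               # deterministic (eta=0) DDIM update
          a_from, a_to = alpha_bar[t_from], alpha_bar[t_to]
          e = eps(x, t_from)
          x0 = (x - (1 - a_from).sqrt() * e) / a_from.sqrt()
          return a_to.sqrt() * x0 + (1 - a_to).sqrt() * e

      x = torch.randn(4)                            # the "clean" latent
      z = x.clone()
      for t in range(T - 1):                        # unsample: clean -> noise
          z = ddim_step(z, t, t + 1)
      for t in range(T - 1, 0, -1):                 # sample: noise -> clean
          z = ddim_step(z, t, t - 1)
      print((z - x).abs().max())                    # small residual: inversion is approximate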

  • @JefHarrisnation • 5 months ago

    I noticed the model versions of Realistic Vision and Juggernaut are SD 1.5. For this to work, do I have to use the 1.5 versions, or can I use the new SDXL versions of the models?

    • @koalanation • 4 months ago +1

      According to the ComfyUI implementation on GitHub, it should work (with limitations), so I guess you can. If you do, use the matching ControlNet model versions (for SDXL): github.com/spacepxl/ComfyUI-RAVE

    • @JefHarrisnation • 4 months ago +1

      @@koalanation Thanks, will try.

  • @Stopsign002 • 6 months ago +1

    Is there any reason to run the RAVE part of this process at more than 12 steps? Also, it seems like I run out of VRAM if I run too many frames through the process (meaning I have to skip every n frames). I would imagine this is expected?

    • @koalanation • 6 months ago +1

      The RAVE example uses 25. Decreasing it to 12 worked for me; in the end you want to find the sweet spot between speed and quality.
      The RAVE KSampler uses quite a lot of VRAM, and depending on your machine you may need to reduce the number of frames. That seems to be one of the limitations of the implementation. Hopefully the developers find some trick to allow more frames...

    • @spacepxl • 5 months ago +1

      If you're running a second pass through AnimateDiff, it's probably not necessary to go higher than 15 with DPM samplers. As for VRAM, the default is a grid_size of 3, which means you're diffusing a 3x3 grid. For example, if you're working at 512x512, it will actually use a 1536x1536 image internally, which is just slower and more memory intensive than a batch of nine 512x512 images; no way around it. You can drop grid_size to 2 for more speed and less memory usage, but less consistency.
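
      To put numbers on that: self-attention cost grows with the square of the token count, and gridding multiplies the token count by grid_size^2. A back-of-the-envelope sketch (the tokens-per-image factor is illustrative; the ratio comes out the same for any fixed factor):

      # Attention work for a 3x3 RAVE grid vs. a plain batch of 9 images.
      side, grid = 512, 3
      def tokens(px):                           # tokens per image at a fixed downscale (illustrative)
          return (px // 16) ** 2
      batch_work = grid**2 * tokens(side) ** 2  # nine separate 512x512 images
      grid_work = tokens(side * grid) ** 2      # one 1536x1536 grid image
      print(grid_work / batch_work)             # -> 9.0: the grid does 9x the attention work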

  • @Nibot2023 • 4 months ago

    Edit: So you get an error when using XL models. Not sure which ControlNets/LoRAs to use to make this workflow work with an XL model. I went with a non-XL model and it worked, but it craps out on the upscale: it says it needs to reconnect and stalls out. I am hoping to find a way to make XL models work. This tutorial is cool, but I am so new that I do not understand the nodes. I also crash out on the Unsampler portion; it says reconnecting, with a close button. Is there a way to reboot it without closing the window to get it back online?
    The file crashes when using 1280x720 footage. I set my resize to the size of the footage but leave the factor at 1. I do not really understand the upscale math where you resize smaller to blow it up. Is there a way to have it use the aspect ratio you want and then upscale to 1920x1080? I get this error when trying to queue the prompt:
    Error occurred when executing BNK_Unsampler:
    mat1 and mat2 shapes cannot be multiplied (4235x2048 and 768x320)

    • @koalanation • 4 months ago +1

      I see you found the issue behind the mat1 and mat2 message: ControlNets and checkpoints need to be the same model version. Check out huggingface.co/ckpt/controlnet-sdxl-1.0/tree/main or search Hugging Face for the specific ControlNet.
      I have not tried the workflow with SDXL myself, so I am not sure I can help you... RAVE is a nice tool but uses a huge amount of VRAM; for that reason I did not try SDXL. If I find time I will try to update the workflow for SDXL.
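
      For anyone else who hits this: the two widths in the traceback are the giveaway. 2048 is SDXL's text-embedding width and 768 is SD1.5's, so SDXL conditioning is being pushed through an SD1.5 cross-attention layer. A minimal reproduction with toy tensors, not the actual workflow:

      import torch
      cond = torch.randn(4235, 2048)    # SDXL-style conditioning: 2048 channels
      to_k = torch.nn.Linear(768, 320)  # SD1.5-style cross-attention projection
      to_k(cond)                        # RuntimeError: mat1 and mat2 shapes cannot
                                        # be multiplied (4235x2048 and 768x320)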

    • @Nibot2023 • 4 months ago

      @@koalanation Rad! Thank you for taking the time to answer and for pointing to a location for the ControlNets! I will let you know if I am successful in that area.
      Last question: I am curious what to do when ComfyUI says "reconnecting" and the pop-up says close. The system crashed on the upscale part. Is there a way to keep your work but reboot it to continue from that portion? Or do I just have to re-open ComfyUI like I have been doing and start over?

  • @RhapsHayden • 2 months ago

    Where would I add a custom-trained LoRA? After the Load Checkpoint node?

  • @aaagaming2023 • 6 months ago +1

    Is there any way to maintain consistency with the input video? It creates a lot of extra fingers and such with humans. Would adding a second ControlNet, such as OpenPose, help?

    • @koalanation • 6 months ago +2

      Fingers are tricky... if you want better control you may want to use OpenPose or the MeshGraphormer for hands... applying masks to the hands and using HED or lineart may also help. But this is more advanced and elaborate.

    • @aaagaming2023 • 6 months ago

      @@koalanation Have you seen jboogx's workflow for AnimateDiff? I'm thinking about something like that, but with RAVE instead.

    • @koalanation • 6 months ago +2

      Yes, I have seen it. That setup is very complete and can do many things. The idea with RAVE is to give more power to the prompt, but I guess they can combine nicely. Good luck!

    • @aaagaming2023 • 6 months ago +1

      @@koalanation I think the ideal setup for the use case of a consistent transform of a realistic human would be the DWPose, depth, and tile ControlNets with RAVE, and an AnimateDiff pass after.

    • @koalanation • 6 months ago +1

      Good idea!

  • @user-pw4uz2gd5i • 6 months ago +1

    is it possible to use a reference image instead of a prompt?

    • @koalanation • 6 months ago

      In principle yes... but the idea of RAVE is to use the prompt to create something different. To use a reference image, an IPAdapter may be a simpler solution. Check out other videos I have, like this one: th-cam.com/video/Ka4ENd63VBo/w-d-xo.htmlsi=7usgz4pZnfVngOrn

    • @user-pw4uz2gd5i • 6 months ago

      Thank you @@koalanation

  • @ehsankholghi • 5 months ago

    I upgraded to a 3090 Ti with 24 GB. How much CPU RAM do I need for video-to-video SD? I have 32 GB.

    • @koalanation • 4 months ago

      I think that should do...

  • @rayenmajoul • 5 months ago

    does this work with SDXL models?

    • @koalanation • 5 months ago

      I do not see why not... but I have not tested it, to be honest.

  • @tonon_AI • 3 months ago +1

    does Rave work for text to video too?

    • @koalanation • 3 months ago +1

      I understand that RAVE is made for video-to-video... I do not think it will work if you connect an empty latent.
      For text-to-video I think it is better to use AnimateDiff directly. There are great examples out there.

    • @tonon_AI • 3 months ago

      @@koalanation thanks! Yeah I use animatediff but the movements are not the same.

  • @andrejlopuchov7972 • 6 months ago +1

    For some reason my RTX 3090 got a CUDA error, like it ran out of power.

    • @koalanation • 6 months ago

      RAVE uses quite a bit of VRAM. I only manage to get 96 frames with a 4090 (24 GB). Sometimes less is better... Hopefully they add support to reduce the requirements...

    • @SageMolotov • 6 months ago

      @@koalanation Can we change the VRAM settings to low VRAM? Would that solve this issue? My workflow failed at the RAVE KSampler (it also ran out of VRAM, and I have a 4090 with 16 GB VRAM / 64 GB RAM).

    • @espedairsystems • 6 months ago

      torch.cuda.OutOfMemoryError: Allocation on device 0 would exceed allowed memory. (out of memory)
      Currently allocated : 16.42 GiB
      Requested : 2.96 GiB
      Device limit : 23.66 GiB
      Free (according to CUDA): 30.12 MiB
      PyTorch limit (set by user-supplied memory fraction)
      Looks like my RTX 3090 can't take the pace with 24 GB VRAM... time to save for my 5090 with 48 GB.

    • @koalanation • 6 months ago

      I get this error if I try to process too many frames. Try reducing them and see if it works.

    • @koalanation • 6 months ago +1

      It is indeed a thing worth trying... otherwise, I am afraid only fewer frames will make it work.