ComfyUI: RAVE for video transformation (vid2vid)

  • Published 21 Jul 2024
  • Total transformation of your videos with the new RAVE method combined with AnimateDiff. In this video, we explore the endless possibilities of RAVE (Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models). We combine it with AnimateDiff to edit and convert any video into an incredible animation.
    #animatediff #comfyui #stablediffusion
    ============================================================
    💪 Support this channel with a Super Thanks or a ko-fi! ko-fi.com/koalanation
    ☕ Amazing ComfyUI workflows: tinyurl.com/y9v2776r
    🚨 Use RunPod and access powerful GPUs for the best ComfyUI experience at a fraction of the price. tinyurl.com/58x2bpp5 🤗
    ☁️ Starting in ComfyUI? Run it on the cloud without installation, very easy! ☁️
    👉 RunDiffusion: tinyurl.com/ypp84xjp 👉15% off first month with code 'koala15'
    👉 ThinkDiffusion: tinyurl.com/4nh2yyen
    🕸️ Discord: / discord
    ============================================================
    Chapters:
    00:00 Intro
    00:16 RAVE and tutorial approach
    01:04 Part 1 - Base workflow and installation of Custom Nodes
    01:56 Part 1 - Models used in the workflow
    03:02 Part 2 - RAVE workflow development and testing
    05:14 Part 2 - Extending the workflow with AnimateDiff and ControlGIF
    09:08 EXTRA - Second example: convert a car into a warship with LooseControl
    11:06 Outro
    Final workflow (in OpenArt): tinyurl.com/46w4achr
    Base workflow (Github): tinyurl.com/hkfme93v
    RAVE paper: rave-video.github.io/
    Custom Nodes:
    Can all be installed with ComfyUI Manager (tinyurl.com/ms3jkk4m)
    ComfyUI-Rave
    ComfyUI Noise
    ComfyUI's ControlNet Auxiliary Preprocessors
    ComfyUI-VideoHelperSuite
    WAS Node Suite
    ComfyUI-Advanced-ControlNet
    AnimateDiff Evolved
    rgthree's ComfyUI nodes
    ComfyUI Essentials
    KJNodes for ComfyUI
    Checkpoints (copy into models/checkpoints; both can be downloaded from Civit.ai):
    Realistic Vision
    Juggernaut
    ControlNet (copy into the models/controlnet folder):
    Depth: can be downloaded via ComfyUI Manager
    ControlGIF (Hugging Face): tinyurl.com/28y9jkkr
    LooseControl (Civit.ai): tinyurl.com/2dpdnxce
    AnimateDiff version 3 (tinyurl.com/y52cx825):
    Adapter: tinyurl.com/3ctk78xa (copy into models/checkpoints)
    Motion module: tinyurl.com/bdfdj8xv (copy into custom_nodes/ComfyUI-AnimateDiff-Evolved/models)
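    For reference, a rough sketch of where the files above end up (filenames are illustrative; use whatever versions you download):
    ComfyUI/
    ├── models/
    │   ├── checkpoints/   (Realistic Vision, Juggernaut, AnimateDiff v3 adapter)
    │   └── controlnet/    (depth, ControlGIF, LooseControl)
    └── custom_nodes/
        └── ComfyUI-AnimateDiff-Evolved/
            └── models/    (AnimateDiff v3 motion module)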
    Videos
    Pexels: tinyurl.com/46m8zxpk
    Pixabay: tinyurl.com/yc46w8fh
    Music
    Song: It's Our Time
    Music by: CreatorMix.com
    Edited with Canva and Clipchamp. I record the material in PowerPoint.
    © 2024 Koala Nation
    #comfyui #animatediff #stablediffusion
  • Science & Technology

Comments • 61

  • @maxfxgr • 6 months ago

    Amazing video! keep them coming mate, Greetings from Greece!

    • @koalanation • 6 months ago

      Ela! Thanks for the support!

  • @charnel3786 • 5 months ago

    Thanks for the tutorial

  • @zerox9646 • 6 months ago +1

    great work

  • @ryanontheinside • 5 months ago

    Amazing job

    • @koalanation • 5 months ago

      Thank you! Cheers!

  • @skycladsquirrel • 6 months ago +3

    I'm dropping my new AI music video today. Then I see this, lol. Awesome video. The future's looking bright!

    • @koalanation • 6 months ago +2

      Very cool videos you have on IG! Love how artists like you embrace these tools to make great stuff!

    • @skycladsquirrel • 6 months ago +1

      🥰🙏@@koalanation

  • @epelfeld • 4 months ago

    Best complex tutorial I have seen; I managed to reproduce everything even though some nodes have already changed. Subscribed, appreciate your work, hope you will make more like this for noobs like me (there are a lot of us). Thank you.
    PS. When I use the Juggernaut model I get an out-of-memory message on the Unsampler.

    • @koalanation • 4 months ago +1

      Thanks! I try to get to the point and make it as simple as possible, considering these are not beginner tutorials. We are all noobs... especially because things move very fast and we all need to learn new things all the time...
      The RAVE Unsampler uses a lot of memory, unfortunately. You can reduce the number of frames to be processed or reduce the image resolution. You can also try evolved sampling (from the AnimateDiff Evolved nodes) and see if it works, but I have not tried it myself yet.

    • @epelfeld • 4 months ago

      @@koalanation thanks a lot, lower resolution works

  • @xr3kTx • 24 days ago

    This did wonders

    • @koalanation • 24 days ago

      @@xr3kTx it is fun!

    • @xr3kTx • 23 days ago

      @@koalanation I took great inspiration from your workflow because I need to understand the tools at play; I actually did this with SDXL. I am using a frame cap of 100, but the face seems to glitch. Can you suggest anything for the face glitching? I did use IPAdapter with style and composition transfer, but every few frames it seems to redo the context.

    • @koalanation • 23 days ago

      @@xr3kTx I did not dare to use SDXL because of the GPU and VRAM requirements... besides, AnimateDiff for SDXL is also difficult... with Hotshot it is ok, but then you are limited to a context window of 8 frames... not sure if testing with SD 1.5 is an option for you. You can always upscale and refine the output.

    • @xr3kTx • 23 days ago

      @@koalanation I have had better results with SDXL personally (I am using a LoRA, and SDXL respects it more for my character, plus an IPAdapter for style). I am using an RTX A6000 on RunPod, so resources are less of a concern; it's the workflow that I need to improve.

    • @koalanation • 23 days ago

      @@xr3kTx Good to know. I may then give it another try... have you tried FreeInit in AnimateDiff? Not sure how it will work with this setup, though. But it is a lot of trial and error, you know...

  • @hamedsadeghizadeh6660 • 4 months ago

    thanks

  • @D3coify • 4 months ago

    Thanks

    • @D3coify • 4 months ago

      Oh so Depth makes the video more realistic?

  • @drviolet396 • 5 months ago +1

    If I were to add an IPAdapter, where would you recommend connecting the model? At the Unsampler level, or after AnimateDiff, connecting to the last KSampler?

    • @koalanation • 5 months ago

      With IPAdapter, since you want to have control over the output, I would use it on the last KSampler, but for no other reason than that...

  • @drviolet396 • 5 months ago +1

    Can you elaborate on what the unsampling/noise-generation part actually does? There is an empty text prompt connected to a ControlNet, and the cfg is 1.

    • @koalanation • 5 months ago

      Hi! As I understand it, the Unsampler does the reverse of what the KSampler does, so there is no need to guide it with a prompt; that is why it is blank. The GitHub repo indicates best results with a cfg of 1, but there is no reason not to play with other values: github.com/BlenderNeko/ComfyUI_Noise?tab=readme-ov-file
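
      To make "the reverse process" concrete, here is a toy DDIM round trip in plain PyTorch: invert a latent to noise, then sample back down. A fixed function stands in for the UNet and the schedule is made up, so this is only a sketch of the idea, not the node's actual code; the point is that the same deterministic update, run in both directions, nearly recovers the input, which is what the Unsampler + KSampler pair does at a cfg of 1.

      import torch

      T = 50
      alpha_bar = torch.linspace(0.9999, 0.02, T)   # toy noise schedule (made up)

      def eps(x, t):                                # toy stand-in for the UNet's noise prediction
          return 0.1 * torch.tanh(x) * (t + 1) / T

      def ddim_step(x, t_from, t_to):               # deterministic (eta=0) DDIM update
          a_from, a_to = alpha_bar[t_from], alpha_bar[t_to]
          e = eps(x, t_from)
          x0 = (x - (1 - a_from).sqrt() * e) / a_from.sqrt()
          return a_to.sqrt() * x0 + (1 - a_to).sqrt() * e

      x = torch.randn(4)                            # the "clean" latent
      z = x.clone()
      for t in range(T - 1):                        # unsample: clean -> noise
          z = ddim_step(z, t, t + 1)
      for t in range(T - 1, 0, -1):                 # sample: noise -> clean
          z = ddim_step(z, t, t - 1)
      print((z - x).abs().max())                    # small residual: inversion is approximate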

  • @JefHarrisnation • 5 months ago

    I noticed the model versions of Realistic Vision and Juggernaut are SD 1.5. For this to work, do I have to use the 1.5 versions, or can I use the new SDXL versions of the models?

    • @koalanation • 4 months ago +1

      According to the ComfyUI implementation on GitHub, it should work (with limitations), so I guess you can. If you do, use the matching ControlNet model versions (for SDXL): github.com/spacepxl/ComfyUI-RAVE

    • @JefHarrisnation • 4 months ago +1

      @@koalanation Thanks, will try.

  • @Stopsign002 • 6 months ago +1

    Is there any reason to run the RAVE part of this process at more than 12 steps? Also, it seems like I run out of VRAM if I run too many frames through the process (meaning I have to skip every n frames). I would imagine this is expected?

    • @koalanation • 6 months ago +1

      The RAVE example uses 25. Decreasing it to 12 worked for me; in the end you want to find the sweet spot between speed and quality.
      The RAVE KSampler uses quite a lot of VRAM, and depending on your machine you may need to reduce the number of frames. That seems to be one of the limitations of the implementation. Hopefully the developers find some trick to allow more frames...

    • @spacepxl • 5 months ago +1

      If you're running a second pass through AnimateDiff, it's probably not necessary to go higher than 15 with DPM samplers. As for VRAM, the default is a grid_size of 3, which means you're diffusing a 3x3 grid. For example, if you're working at 512x512, it will actually use a 1536x1536 image internally, which is just slower and more memory intensive than a batch of nine 512x512 images; no way around it. You can drop grid_size to 2 for more speed and less memory usage, but less consistency.
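
      To put numbers on that: self-attention cost grows with the square of the token count, and gridding multiplies the token count by grid_size^2. A back-of-the-envelope sketch (the tokens-per-image factor is illustrative; the ratio comes out the same for any fixed factor):

      # Attention work for a 3x3 RAVE grid vs. a plain batch of 9 images.
      side, grid = 512, 3
      def tokens(px):                           # tokens per image at a fixed downscale (illustrative)
          return (px // 16) ** 2
      batch_work = grid**2 * tokens(side) ** 2  # nine separate 512x512 images
      grid_work = tokens(side * grid) ** 2      # one 1536x1536 grid image
      print(grid_work / batch_work)             # -> 9.0: the grid does 9x the attention work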

  • @Nibot2023 • 4 months ago

    Edit: So you get an error when using XL models. Not sure which ControlNets/LoRAs to use to make this workflow work with an XL model. I went with a non-XL model and it worked, but it craps out on the upscale: it says it needs to reconnect and stalls out. I am hoping to find a way to make XL models work. This tutorial is cool, but I am so new that I do not understand the nodes. I also crash out on the Unsampler portion; it says reconnecting, with a close button. Is there a way to reboot it without closing the window to get it back online?
    The file crashes when using 1280x720 footage. I set my resize to the size of the footage but leave the factor at 1. I do not really understand the upscale math where you resize smaller to blow it up. Is there a way to have it use the aspect ratio you want and then upscale to 1920x1080? I get this error when trying to queue the prompt:
    Error occurred when executing BNK_Unsampler:
    mat1 and mat2 shapes cannot be multiplied (4235x2048 and 768x320)

    • @koalanation • 4 months ago +1

      I see you found the issue behind the mat1 and mat2 message: ControlNets and checkpoints need to be the same model version. Check out huggingface.co/ckpt/controlnet-sdxl-1.0/tree/main or search Hugging Face for the specific ControlNet.
      I have not tried the workflow with SDXL myself, so I am not sure I can help you... RAVE is a nice tool but uses a huge amount of VRAM; for that reason I did not try SDXL. If I find time I will try to update the workflow for SDXL.
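
      For anyone else who hits this: the two widths in the traceback are the giveaway. 2048 is SDXL's text-embedding width and 768 is SD1.5's, so SDXL conditioning is being pushed through an SD1.5 cross-attention layer. A minimal reproduction with toy tensors, not the actual workflow:

      import torch
      cond = torch.randn(4235, 2048)    # SDXL-style conditioning: 2048 channels
      to_k = torch.nn.Linear(768, 320)  # SD1.5-style cross-attention projection
      to_k(cond)                        # RuntimeError: mat1 and mat2 shapes cannot
                                        # be multiplied (4235x2048 and 768x320)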

    • @Nibot2023 • 4 months ago

      @@koalanation Rad! Thank you for taking the time to answer and for pointing to a location for the ControlNets! I will let you know if I am successful in that area.
      Last question: I am curious what to do when ComfyUI says "reconnecting" and the pop-up says close. The system crashed on the upscale part. Is there a way to keep your work but reboot it to continue from that portion? Or do I just have to re-open ComfyUI like I have been doing and start over?

  • @RhapsHayden • 2 months ago

    Where would I add a custom-trained LoRA? After the Load Checkpoint node?

  • @aaagaming2023 • 6 months ago +1

    Is there any way to maintain consistency with the input video? It creates a lot of extra fingers and such with humans. Would adding a second ControlNet, such as OpenPose, help?

    • @koalanation • 6 months ago +2

      Fingers are tricky... if you want better control you may want to use OpenPose or the MeshGraphormer for hands... applying masks to the hands and using HED or lineart may also help. But this is more advanced and elaborate.

    • @aaagaming2023 • 6 months ago

      @@koalanation Have you seen jboogx's workflow for AnimateDiff? I'm thinking about something like that, but with RAVE instead.

    • @koalanation • 6 months ago +2

      Yes, I have seen it. That setup is very complete and can do many things. The idea with RAVE is to give more power to the prompt, but I guess they can combine nicely. Good luck!

    • @aaagaming2023 • 6 months ago +1

      @@koalanation I think the ideal setup for the use case of a consistent transform of a realistic human would be the DWPose, depth, and tile ControlNets with RAVE, and an AnimateDiff pass after.

    • @koalanation • 6 months ago +1

      Good idea!

  • @user-pw4uz2gd5i • 6 months ago +1

    is it possible to use a reference image instead of a prompt?

    • @koalanation • 6 months ago

      In principle yes... but the idea of RAVE is to use the prompt to create something different. To use a reference image, an IPAdapter may be a simpler solution. Check out other videos I have, like this one: th-cam.com/video/Ka4ENd63VBo/w-d-xo.htmlsi=7usgz4pZnfVngOrn

    • @user-pw4uz2gd5i • 6 months ago

      Thank you @@koalanation

  • @ehsankholghi • 5 months ago

    I upgraded to a 3090 Ti with 24 GB. How much CPU RAM do I need for video-to-video SD? I have 32 GB.

    • @koalanation • 4 months ago

      I think that should do...

  • @rayenmajoul • 5 months ago

    does this work with SDXL models?

    • @koalanation • 5 months ago

      I do not see why not... but I have not tested it, to be honest.

  • @tonon_AI • 3 months ago +1

    does Rave work for text to video too?

    • @koalanation • 3 months ago +1

      I understand that RAVE is made for video-to-video... I do not think it will work if you connect an empty latent.
      For text-to-video I think it is better to use AnimateDiff directly. There are great examples out there.

    • @tonon_AI • 3 months ago

      @@koalanation thanks! Yeah I use animatediff but the movements are not the same.

  • @andrejlopuchov7972 • 6 months ago +1

    For some reason my RTX 3090 got a CUDA error, like it ran out of power.

    • @koalanation • 6 months ago

      RAVE uses quite a bit of VRAM. I only manage to get 96 frames with a 4090 (24 GB). Sometimes less is better... Hopefully they add support to reduce the requirements...

    • @SageMolotov • 6 months ago

      @@koalanation Can we change the VRAM settings to low VRAM? Would that solve this issue? My workflow failed at the RAVE KSampler (it also ran out of VRAM, and I have a 4090 with 16 GB VRAM / 64 GB RAM).

    • @espedairsystems • 6 months ago

      torch.cuda.OutOfMemoryError: Allocation on device 0 would exceed allowed memory. (out of memory)
      Currently allocated : 16.42 GiB
      Requested : 2.96 GiB
      Device limit : 23.66 GiB
      Free (according to CUDA): 30.12 MiB
      PyTorch limit (set by user-supplied memory fraction)
      Looks like my RTX 3090 can't take the pace with 24 GB VRAM... time to save for my 5090 with 48 GB.

    • @koalanation • 6 months ago

      I get this error if I try to process too many frames. Try reducing them and see if it works.

    • @koalanation • 6 months ago +1

      It is indeed a thing worth trying... otherwise, I am afraid only fewer frames will make it work.