Resources:
Hunyuan Video Diffusion Model
Model Page aivideo.hunyuan.tencent.com/
Github : github.com/Tencent/HunyuanVideo
Model Weights: huggingface.co/tencent/HunyuanVideo
Hunyuan Video Installation In GPU Server : www.patreon.com/posts/hunyuan-video-in-117351026
Nice video :) What cost per generation would you estimate?
Hi. I want to let you know that you do good work. And your videos are extremely helpful and valuable. Thank you. And keep it up!
LTX video with the STSG enhancing and proper prompts is AMAZING and FAST and Local!
Even runs on 11 and 12 GB VRAM (barely). There is a quantized model that cuts VRAM use down by about 1 GB.
I run it on a 1080 Ti with 11 GB!
Set steps to 20, 512x512 at 25 to 97 frames for super fast outputs.
There is image to video also if you use a custom node with guidance.
Edit:
LTX is the video model itself.
STSG = SpatioTemporal Skip Guidance is a set of nodes that increases video quality by a lot.
"Add LTX Latent Guide" is another node, which makes it so that Image to Video works.
What is STSG?
@@FusionDeveloper Yes, for local PC AI video I'd suggest using LTX as well. A good option, with a lot of add-on features to use.
@@michaelbonomo6460 Let's talk about this one later 😉 With LTX and Cog you can see the comparison against CFG: junhahyung.github.io/STGuidance/
@@michaelbonomo6460 LTX is the video model itself.
STSG = SpatioTemporal Skip Guidance is a set of nodes that increases video quality by a lot.
"Add LTX Latent Guide" is another node, which makes it so that Image to Video works.
@@FusionDeveloper Noted. I've used Cog and LTX a bit, but I didn't know people were using a different node.
Amazing content, Benji! Thanks for sharing it!
Glad you liked it! Will keep updating on this AI model; this one truly has potential.
Amazing
"An AI you can run at home" - Great! - "All you need is a GPU with 80Gig of memory!"
Do you know if it is possible to generate an 8-minute 1080p video on an H200 running Hunyuan Video with the tiling strategy, or can it still run out of memory?
Wow, this is so cool. I feel like this would be so expensive.
Can linked cards pool their vram and allow for a larger model? Or do you just get faster processing and the models have to fit on a single cards vram?
I guess that one or two seconds locally, with a given seed, lets you know what you're generating... which, if you're budget constrained, means you're not moving a job to a paid farm until your confidence is high. But are there any unavoidable parameter changes between local and farm jobs? If so, that'd really suck, because it would mean your entire flow from conception to final needs to be farmed out : /
Linking graphics cards at home wouldn't get you much faster anyway. You also have to consider the CLIP processing on the GPU, which CLIP models it uses, etc.
For AI video, no one would generate production footage with the text-to-video method alone, so the approach of gathering a list of seed numbers hasn't worked effectively.
Running only txt2vid is just a gamble.
@@AIBusinessIdeasBenji Ah, that makes sense. Shame about linked cards though : /
But one day... retired server farm H200's will be on eBay, selling in their 1000's as farms move towards higher densities and lower TDPs ; )
... on that day, we'll get ourselves a bargain ; )))))
Multi-GPU inference is on their open-source implementation plan, but it's not ticked off as done yet.
You CAN run multi-GPU on this, but it's not "distributed". I modified the ComfyUI nodes so I can set the device on the sampler, decoder, and text encoder, letting me load each one onto a different GPU, just to prevent the constant loading and unloading of the models. Not everyone would need to do this, but my two 4090s share a single 1x PCIe slot via a riser switch, with a 4080 on the board, so it was taking me longer to load/unload than it did to generate. I also noticed that on the decoder, auto_tile_size defaults to on. Turn that off and set the sizes to 128, and decoding drops from needing 15 GB to about 4 GB.
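To illustrate the device-pinning idea in that comment, here is a minimal PyTorch sketch. The component names (text_encoder, transformer, vae) are generic placeholders, not the actual modified ComfyUI node code.

```python
import torch

def place_components(text_encoder, transformer, vae):
    """Pin each pipeline component to its own GPU so nothing gets swapped
    in and out between steps (a sketch of the idea, not the ComfyUI patch)."""
    devices = {
        "text_encoder": torch.device("cuda:0"),  # prompt encoding
        "transformer":  torch.device("cuda:1"),  # denoising / sampling
        "vae":          torch.device("cuda:0"),  # latent decoding (tile size ~128 to cut VRAM)
    }
    text_encoder.to(devices["text_encoder"])
    transformer.to(devices["transformer"])
    vae.to(devices["vae"])
    return devices

# Intermediate tensors then need to follow the components, e.g.
#   latents = latents.to(devices["transformer"]) before sampling, and
#   latents = latents.to(devices["vae"]) before decoding.
```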
Thanks man! What about the cost per minute generated? Did you calculate it?
On a rented server, about $0.50 for a 121-frame, 50-step sampling run.
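As a rough sanity check on that figure, the per-clip cost is just the GPU hourly rate times the generation time. Both numbers below are assumptions and will vary by provider, resolution, and step count.

```python
# Back-of-envelope check of the ~$0.50-per-clip figure (assumed numbers).
gpu_hourly_rate_usd = 3.00   # assumed rental price for an 80 GB-class GPU
generation_minutes = 10      # assumed time for 121 frames at 50 sampling steps

cost_per_clip = gpu_hourly_rate_usd * generation_minutes / 60
print(f"~${cost_per_clip:.2f} per generation")  # -> ~$0.50
```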
Thank you for making this video, I am really interested in this. I do have a beefy setup, so I will have no problems installing it. However, I do have a PC, and you said that if you're using a PC, then in order to get it to work I would have to use Sage.
If it isn't too big of an ask, could you make an instructional video on how to go about installing it? This AI video generator and Sage on PC. Thanks, that would be greatly appreciated. BTW this was a great review.
A week is a long time in AI, so now we have Kijai's wonderful ComfyUI adaptation with temporal tiling that works on 24 GB cards, or less. Works fine on Windows, but I can't get it to work on RunPod because of torch differences.
How many seconds can you generate with the ComfyUI wrapper node?
@@AIBusinessIdeasBenji It's stable at about 89 frames. After that if you try to go higher, it sometimes seems to lock up during the tiling part. I did find a runpod method too - look for hunyuan lab comfyu, or something like that. But it takes about 15 minutes to start up, and you have to copy to workspace to save your results using a console. Pretty good though! And you can reduce the size to pretty much anything to get faster results, say 512x512. It's also pretty good video to video, adding more life into say a Cogvideo creation.
@@geoffphillips5293 512x512 is very low resolution; it needs to be at least 960x544 or even 1280x720.
What is this hosting server?
@@CharlenoPires I did it on RunPod for the code implementation, and also tried Replicate.com for API requests.
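For anyone who wants to try the API route mentioned above, here is a minimal sketch with the Replicate Python client. The model slug and input fields are assumptions, so check the actual model page on replicate.com, and set REPLICATE_API_TOKEN in your environment first.

```python
import replicate  # pip install replicate; needs REPLICATE_API_TOKEN set

# Assumed model slug and input field -- verify both on replicate.com.
output = replicate.run(
    "tencent/hunyuan-video",
    input={"prompt": "a corgi running along a beach at sunset, cinematic"},
)
print(output)  # typically a URL (or list of URLs) to the rendered video
```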
Anyone tried this on RunPod? Have a deployment template?
They have a Docker image. Or use the Python code, just copy and paste.
Tested the lower quality version. Definitely not as good as what we're seeing from Tencent. Does the full version that basically requires an H100 not have a standalone UI to use outside of ComfyUI? I'd rather not sign up for yet another website.
There's no AI model that has a standalone UI program for it, man...
@AIBusinessIdeasBenji hate that. Was hoping I could run the full version like AI Toolkit or FluxGym in a standalone program. Lame. Hope they crank through their roadmap fast so we can see what it can do on our own servers.
@@michaelbonomo6460 AI Toolkit and FluxGym aren't even made by Black Forest Labs anyway. Most tools and web UIs are made by other people.
3D VAE : wow
😉👍
It's not up to production level yet, then again it's not a million miles away... for little insets it might be good.
Anyone getting into AI and planning to save up to buy an 8 GB VRAM card is making a huge mistake.
16gb VRAM should be your minimum goal.
Model size will continue to grow.
Get the most vram possible.
VRAM is more important than CUDA cores (within reason).
The 5090 should be coming out January 2025, approximately.
I will say 24 GB VRAM is the minimum nowadays. Many diffusion models, LoRA training, etc. basically use up to 20 GB VRAM now.
This whole AI never sleeping thing is getting old...
“AI is addicted to coke.. this week we have some great news…”
Haterrr lol and he’s freaking right though
Can you speak in layman's terms? This feels like a whole different language. There should be a process non-coders or non-techies can use.
Hat
HELP!!!! $$$- Anyone out there want to help me with a 6 minute music video? Paid gig. good artist!! good cause