A1111: IP Adapter ControlNet Tutorial (Stable Diffusion)
- Published May 19, 2024
- This is a comprehensive tutorial on the IP Adapter ControlNet Model in Stable Diffusion Automatic 1111. I showcase multiple workflows using text2image, image2image, and inpainting.
------------------------
PDF File (TH-cam Membership): www.youtube.com/@controlaltai...
ControlNet IP Adapter Models:
huggingface.co/lllyasviel/sd_...
huggingface.co/h94/IP-Adapter...
------------------------
TimeStamps:
0:00 Intro.
1:07 IP Adapter Models, Checkpoints.
2:56 Character Manipulation Workflow.
10:02 Hair Color & Age In Painting Workflow.
14:59 Environment Manipulation Workflow.
20:23 IP Adapter Plus Face Workflow.
Note: The ControlNet Models go in this folder: Stable Diffusion\stable-diffusion-webui\models\ControlNet
Also here is my Tutorial on How to Install ControlNet: th-cam.com/video/JdCyGKVgHKI/w-d-xo.html
Update: Some people might face errors when using ip-adapter-plus_sdxl_vit-h with sdxl checkpoint. There is a problem with the model, please check the official response: huggingface.co/h94/IP-Adapter/discussions/6
Update 2: CLIP Interrogate location has changed. Under the Generate Button - it should be the 4th icon from the left.
So it's not working with A1111 anymore?
@@guedes489 only the ip adapter plus sdxl-vit-h model. To use that you need to pair it with an sd1.5 checkpoint, even in Comfy.
My mind broke when you selected a batch size of 4 in 1024px resolution 🤯
STOKED!!! 🤙 Liked, shared, commenting for the engagement and free upvotes for everyone!
Literally exactly what I was needing to learn next!
3060 go BRRRRRR!
/excited ai bro
As soon as I installed the models and ran A1111 my controlnet stopped working after 9 months of no issues. Wonderful.
That has nothing to do with an IP adapter model. If you don't use the IP adapter, ControlNet should work. If ControlNet with the IP adapter is not working, something is wrong, like a model mismatch. Let me know what error you are getting. Maybe I can take a look and get it working for you.
@@controlaltai Just to say that the AI voice-over you have used for this video is not the voice of an adult white male, but a young black male. So, it doesn't fit with your avatar.
Great info. Looking forward to trying it out.
Awesome tutorials as always🎉!
Thank you! Cheers!
Thanks a lot for this tutorial!
Glad it was helpful!
I've been addicted to this ControlNet model for the last 2-3 days. It's magical! Good tutorial 👍🏽
Glad you enjoy it!
@@controlaltai Yeah, I supposed that.
@@controlaltai There isn't an IP adapter Face Plus for SDXL, right?
No, the Face Plus one is not there yet for SDXL. I think they will come out with it eventually.
Excellent, Thanks !
great tutorial. which ai voice gen method did you use to create the text-to-voice narration for your videos? thanks!
Really helpful video. Thanks a lot
Glad it was helpful!
The last faceswapping example can be further improved. If you are trying to do deepfakes, to give the face the last 20% of the person's look, use roop/ReActor with the same face picture as the IP adapter input, in conjunction with this. You can also use IP adapter face plus Canny, SoftEdge, Depth, or Normal if you want to keep the body type. IP adapter will use the head, hairstyle, and hair color, and also make the skin tone of the body match the skin tone of the face.
Any updates to this? Is the same method used now?
Only the safetensors versions of the models have been released since. The method for the workflows is still the same.
Great, bring more videos.
Thank you, I will
Most people are not going to know or remember where to put the files! You have to show them! My friend asked me to watch your video and explain to him what happened. Remember to say where the files go every time!
Yeh I got that. Thanks!
Great tutorial! What checkpoint did you use to create the house at 15:03? Thanks.
Hi, Thank You! The original house at 15:03 was created using MidJourney. Then img2img rev animated in a1111.
Midjourney prompt: vector image of a house --s 1000
Thank you @@controlaltai
Thanks a lot. For some reason the generation takes a long time to get started when using IP Adapter. I have the same models as you, but it takes about 20s before I can see the progress bar moving. Not sure why.
Tell me your GPU. I will write detailed optimisation instructions which will speed up the generations for you massively.
Where do all the files go? Do all of the files you told us to download go into "Stable Diffusion\stable-diffusion-webui\models\ControlNet", or do some go in other folders?
Hi correct, All the ControlNet IP Adapter Models (links in video description) go here "Stable Diffusion\stable-diffusion-webui\models\ControlNet".
All Checkpoints go here: "Stable Diffusion\stable-diffusion-webui\models\Stable-diffusion"
All VAE go here: "Stable Diffusion\stable-diffusion-webui\models\VAE"
All upscaler models go here: "Stable Diffusion\stable-diffusion-webui\models\ESRGAN".
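To illustrate the layout described above, here is a minimal shell sketch of the default A1111 model folders. The root path is an example; substitute your own install location:

```shell
# Default model folder layout for a stock stable-diffusion-webui install.
# The root path below is an assumption; point it at your own install.
WEBUI="Stable Diffusion/stable-diffusion-webui"

mkdir -p "$WEBUI/models/ControlNet"        # ControlNet + IP Adapter models (.pth/.safetensors)
mkdir -p "$WEBUI/models/Stable-diffusion"  # checkpoints
mkdir -p "$WEBUI/models/VAE"               # VAE files
mkdir -p "$WEBUI/models/ESRGAN"            # upscaler models

ls "$WEBUI/models"
```

Dropping a file into the right folder and clicking the refresh button next to the relevant dropdown in the UI is enough; no restart is needed for new models.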
What is the advantage of using IP Adapter + ControlNet vs. only pose ControlNet + a reference image? And can IP Adapter be replaced by using img2img? Thanks for the video.
IP adapter is more accurate and gives you more control than ControlNet with just a reference image. Img2img is a bit different, and the result won't be the same as with IP adapter. If you want to merge elements of two images with control, IP adapter will give you far more accurate results. You can check out this link for more details: github.com/tencent-ailab/IP-Adapter
Thanks! I've been wondering what that model does and how to get it working. All I have to say is: Aurora Borealis? At this time of year? At this time of day? In this part of the country? Localized entirely above your house? ... Can I see it?
Thanks you! Glad the video helped.
Thanks, awesome video. Please tell us your PC configuration and any script/extensions used, the generations are so fast! I have an i7 13700K and 4080, but generations takes twice as much time as yours.
Hi, thank you. PC specs are AMD 7950X 3D and a 4090 GPU. Only using xformers, python 3.10.9 and torch 2. Extensions only have upscale sd ultimate and controlnet for this specific video.
@@controlaltai You've got a beefy system, that explains faster generations. Thanks.
@@controlaltai Do you think an older Python version can affect the generation speed? I use 3.10.6.
@fidelcrisis Well, when I was trying LoRA training with Kohya, it forced me to use Python 3.10.9 instead of 3.10.6 because I had to use torch 2 with the latest cuDNN DLLs. I don't think the Python version alone increases speed for A1111. I upgraded the processor recently and saw a boost in speed using the same GPU. Also, A1111 does use my RAM as well; I have 64 GB of RAM.
Just verify that torch 2 is what is being used for A1111, as it does improve speed for Nvidia cards.
th-cam.com/video/pom3nQejaTs/w-d-xo.html?si=XSwVzu-YUmAICJOp
Check out this video on how to get torch 2 and increase speeds.
There are some pages on GitHub and reddit as well via Google search.
@@controlaltai Thank you so much for a detailed explanation. I wish there were more people like you in the world!
Hi, love your videos as always. Do you have a guide to install, config and use stable diffusion with an amd gpu(ati radeon) pls!! Best regards.
Hi, Thank you. I found the install instructions, unfortunately I cannot test as I run an amd cpu with an nvidia GPU. Please check this and let me know if it helps: github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs
If you have any questions just ask. If I can help, I will.
@@controlaltai ty so much! Yes i will test it =)
What are you using to show the VAE selection at the top of the GUI?
Hi, Open WebUi, then go to settings - user interface - Quicksettings list - search sd_vae and add that, save, apply, restart ui.
Set the playback speed to 1.5 if you don't want to blow your head off waiting for him to finish a sentence.
Thanks for your tutorial. I assume this only works with images generated by the model, correct? I tried with a stock photo and the model wouldn't add sunglasses as instructed.
I haven’t tried it with the stock photo, let me try one and check, will let you know my findings.
@@controlaltai thanks! I’ve been trying here without success
Hi, I have tried it and it works. However, for a photorealistic model, the problem is for the AI to regenerate the same photorealistic image, so text2image won't work here. For adding elements to a realistic image, try img2img with inpainting. Draw the sunglasses over the image with the inpaint mask (anything rough will do), add "wearing sunglasses" to the positive prompt, use sampling method DPM++ 2M SDE Karras, select resize by 1, and use the IP adapter in ControlNet with ip-adapter-plus-face_sd15; the checkpoint used is Realistic_Vision_V5.1-inpainting.
It drew sunglasses over a real photo portrait image.
@@controlaltai thanks! I will try it
What graphics card do you have? Your render time is fast. My old nvidia 1080 ti would take forever.
nvidia RTX 4090
Please create a tutorial on Reactor also...and can reactor be used with Adetailer or IP Adapter?
Sorry, I can't do Reactor due to a contract I have with the company that holds the copyright. I haven't got clearance from them to make a tutorial on anything that uses their proprietary model, the InsightFace swap model.
2:50 Which base SDXL VAE safetensor files and checkpoints are you talking about?
These are from hugging face. Here are the links
Base sdxl 1.0: huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main
Vae: huggingface.co/stabilityai/sdxl-vae/tree/main
@@controlaltai Where do I put the "control-lora-openposeXL2-rank256.safetensor" file? Is this the thibaud_xl_openpose_256lora model you apply at 4:54?
Hi, No. That is a different open pose model. You can download thibaud_xl_openpose_256lora model from here: huggingface.co/lllyasviel/sd_control_collection/tree/main (last file I believe).
The file should go in this folder: Stable Diffusion\stable-diffusion-webui\models\ControlNet
I forgot, you don't know if there is a SDXL Inpaint Model that works with AUTO's?
Hi, I don't think so. The latest comment I found was on Reddit. Apparently it's in diffusers format and not a safetensors file. www.reddit.com/r/StableDiffusion/comments/167cy0v/sdxl_inpainting_just_dropped/
What's the best way in ControlNet to keep the original clothes the same but only change the style of the image? Like transforming a real person into a cartoon without changing his uniform, but making everything cartoony instead.
Segmentation + OpenPose + Depth/Canny. Mask out the clothes, invert the mask, and apply whatever you want; the clothes will remain untouched. But there will be artifacts. You will need ComfyUI to do this properly; it's not possible in A1111 if you are changing everything.
Can you try to reproduce this: first pass CN 1.1 Tile + second pass CN 1.1 Tile, aka double Tile?
Someone said it's easy in ComfyUI.
Let me give it a shot by Sunday, I am in the middle of making a video for Comfy Control Net Lora version (sdxl). Will reply here with result.
How can I train my custom images? I'm new to all this stuff 😃
Hi, I explained the basics in the video. It depends on what you want to train. You have to take a set of predefined images based on what you want. For example, if you want to train a face, take face shots of the person from multiple angles; around 30 to 40 images will do.
Please tell me how to get the latest version of ControlNet like yours, v1.1.410. When I update, it doesn't help.
First make sure you have the latest A1111 version. Go to the Stable Diffusion\stable-diffusion-webui folder, then right click and open in terminal.
Type: git pull
Press enter.
Open A1111, Go to extensions. Untick ControlNet, Apply and Restart. Close browser and command prompt.
Then Go to this folder: Q:\Stable Diffusion\stable-diffusion-webui\extensions
Delete this folder: sd-webui-controlnet
If Folder is not there ignore.
Open A1111 again - go to extensions - install from url:
Paste this: github.com/Mikubill/sd-webui-controlnet.git
Install, Apply and Restart. Close the browser and command prompt, then start again.
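The steps above can be sketched as a short shell session. This is a sketch only, assuming a default install path; run it with the WebUI closed, and back up any custom ControlNet settings first:

```shell
# Update A1111 itself, then reinstall the ControlNet extension.
# The install path is an assumption; adjust it to your setup.
cd "$HOME/stable-diffusion-webui"
git pull                                         # pull the latest A1111

# Remove the old extension folder (skip if it does not exist),
# then clone the current ControlNet extension in its place.
rm -rf extensions/sd-webui-controlnet
git clone https://github.com/Mikubill/sd-webui-controlnet.git \
    extensions/sd-webui-controlnet
```

Cloning into the extensions folder is equivalent to using "Install from URL" in the Extensions tab; either route works.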
22:49 from where you pasted the seed?
I have a window opened on the left beyond the scope of the recording window, for my notes and other stuff.
23:40 These results are awesome with the pose and swapped faces. But what seed do I have to paste in the seed field? And where can I get it?
@@theawesome2902 Hi, you don't have to paste the seed, just generate randomly, and when you like the result, save the seed number. The seed is useful to regenerate the results on your own device, unless you enable CPU noise, in which case the seed will work cross platform. Since I have to replicate what I do for the video, I save and use the same seed. If you follow the tutorial the results should be consistent: not exactly the same, but consistent in terms of the desired output. You can also get the seed number by loading the generated image in the PNG Info tab of A1111.
@@controlaltai Thank you very much sir 🙏
At 7:23, I don't have the Interrogate CLIP option. How do I turn it on?
They have changed the Location after the update. Under the Generate Button - it should be the 4th icon from the left. Just load the image in im2img tab and click on that.
@@controlaltai Yes. It's the 4th icon from the left. Thanks a lot.
Where do the models go, the IP adapter ones?
Here: "Stable Diffusion\stable-diffusion-webui\models\ControlNet"
I'm having a problem (preview error). I've updated ControlNet to the latest version and I've changed the file to .pth. Please help me.
For which model are you getting the error? Sdxl or sd1.5
I can't get the "ip-adapter-plus_sdxl_vit-h.bin" to show up, and if I rename it to .pth it shows up but then throws a torch.Size error when I try to generate. I already tried different SDXL checkpoints, but it's still the same.
Hi, are you creating in batches? If yes, reduce the batch to one, do not use an upscaler. Do not use a refiner.
It seems some people are having this issue, as posted in the discussions here: huggingface.co/h94/IP-Adapter/discussions
Try this file: ip-adapter_sdxl_vit-h.bin, which is smaller in size, or try the one from the Hugging Face link, ip-adapter_xl (huggingface.co/lllyasviel/sd_control_collection/tree/main)
I am assuming the error is caused by low VRAM, because I had no issues.
I am using A1111 with xformers and torch2 on a nvidia GPU (4090 24GB vRam)
This is my launch command for A1111
set COMMANDLINE_ARGS= --xformers --no-half
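For context, that COMMANDLINE_ARGS line lives in webui-user.bat; a minimal version of that file might look like this. This is a sketch of the stock Windows launcher, not the author's exact file; note that --no-half increases VRAM use, so drop it on low-VRAM cards:

```shell
@echo off
rem Stock webui-user.bat layout; only COMMANDLINE_ARGS is customised here.
set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=--xformers --no-half
call webui.bat
```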
@@controlaltai Hello, no batches and no upscaler, hires fix, or refiner, and I get the same error. So should I rename it to .pth, or should it appear even if it's .bin? Because with .bin it doesn't appear in the models list in CN.
It won't appear as .bin. It has to be .safetensors or .pth. The .safetensors version has not been released, so we have to rename it to .pth. I think there is an issue with the release. Did you try the normal ip-adapter-xl? Also, what is your GPU make and VRAM?
@@controlaltai Thanks for the info. Normal ip-adapter-xl works fine. I have 3090 24GB VRAM
Your card is good. I think it's an issue with a1111 and this model. I will update the comment if there is a fix on GitHub. Not sure why for some people it's not working.
Bro, can I ask how I can recreate a photo that looks similar to the original one?
Is there an (image to prompt) AI that's free to use, bro?
There are multiple ways to do that. Too complicated to explain in a comment. But you can use IP adapter along with some other techniques to achieve the desired results.
@@controlaltai Can you give me more detail about this option on IP adapter, brother? How do I do it? I really need this.
Take an image and do an image-to-image using the IP Adapter Face ID model if doing portraits. Otherwise do an image-to-image using the IP Adapter Plus model. You can use ControlNet, like OpenPose or a depth map, to keep the composition similar to the original.
@@controlaltai Would this work for an image of a house or a landscape/building? And can I get a link to download this?
Yeah, it would. But you would get better control in ComfyUI rather than A1111.
IP adapter FULL face - what is it for?
I am not getting good results like those shown in the video. Please help with that.
You have to give some details as to what you are trying to do, checkpoint, etc.
@@controlaltai I am using Realistic_Vision_V5.1-inpainting.safetensors, vae-ft-mse-840000-ema-pruned.safetensors and ip-adapter-plus-face_sd15 [71693645] on CPU, but I am not getting the same result as in the video.
What are you trying to do exactly?
@@controlaltai I am using the same approach as in the video. I want to change the hair colour only, but with the IP adapter it changes the hairstyle as well.
Have you tried inputting the hair color in the prompt? Also double check the ip adapter model being used.
Bro, you don't say where the downloaded VAE files need to go...
In the models VAE folder...
I prefer Inpaint Anything; its auto segmentation is easier.
Love how you did not say where to put the files. Some people do not know bro. Sigh.
My bad, will mention it next time. It goes in this folder: Stable Diffusion\stable-diffusion-webui\models\ControlNet
10:03 "SDXL 1.5" - there is no SDXL of that version; it is SD 1.5.
Correct that's a mistake. SDXL not SDXL 1.5
If that's what AI thinks a 60 year old looks like, I shudder to imagine what it thinks 80+ YO is. 😂
Did he say image art? 🤯 the road to the bottom.
I hate AI generated voices. But maybe you can't speak English well, in which case, thank you for trying.
So you'd risk missing out on solid info just because you aren't a fan of the way it's conveyed?
People these days.
Quite a few of these guys have gone that route too, so I suppose good luck figuring it out on your own 🤔
@@MaddJakd Um, I still watch the videos? They're harder to understand than real speech, and easier to understand than heavy accents. I literally thanked the guy for going this route instead of using an accent.
@jonmichaelgalindo that's entirely a personal problem then, but a thing I guess.
@@MaddJakd It's not a "personal problem"! It's definitely a fault of the technology. Since the voice-synthesizer AI doesn't understand what it's reading, it can't intone anything right. Try listening to the dialogue in an AI-voiced audio book. It's beyond cringe-inducing how out of context the characters' attitudes sound. For scholarly articles, not only will existing tech like ElevenLabs frequently mispronounce terms, it will completely botch the pauses in long sentences, so you never know what parts of the sentences are parentheticals and what parts are the primary statement.
@jonmichaelgalindo as someone who has many people around speaking in very different accents, no.
Just like one can pick up on those, I certainly don't have an issue deciphering AI speech, and I'm far from alone....
Plus, as mentioned before, they aren't the first to use AI speech nor replicate AI speech, and folk aren't exactly complaining about it.
Nobody else is getting this error? "module 'torch.nn.functional' has no attribute 'scaled_dot_product_attention"
I updated Torch and still no go.
Hi, please check this, I found it on Reddit. See if it works for you.
I found this in regard to ControlNet:
* Please note, diffusers starts to use `torch.nn.functional.scaled_dot_product_attention` if your installed torch version is >= 2.0, and ONNX does not support op conversion for "Aten::scaled_dot_product_attention". To avoid the error during model conversion by "torch.onnx.export", please make sure you are using torch==1.13.1.
blog.openvino.ai/blog-posts/enable-lora-weights-with-stable-diffusion-controlnet-pipeline
@@controlaltai Thank you very much for trying to help. That link doesn't mention IP Adapter at all. All other Control Nets work fine.
I really don't want to downgrade my torch from above 2.0 to 1.13.1; I have a lot of other AI stuff using torch. If that's what it takes, I guess I can't use it, though it looks amazing.
Thanks again for trying to help.
Hi, give me some time. I'm working on the next video and workflow. After Friday I will sit and research this issue, and ask the devs if necessary. If I find anything I will reply here. It should work with torch 2. Just for info, can you tell me which IP adapter model gives you that error? Is it all IP adapters or some specific IP adapter models?
@@controlaltai The ip-adapter_clip_sdxl preprocessor with the ip-adapter_xl model. That errors out. I tried the 1.5 versions with a 1.5 checkpoint; they didn't error out, but also had no effect whatsoever. Thanks for keeping an eye out for a solution!
Hi, there are issues with the IP adapter XL model for many people. What the devs suggest is either to try it in Comfy, or to try the XL model with an SD 1.5 checkpoint, even in Comfy. Let me know how it goes for you.