ComfyUI: Style Aligned via Shared Attention (Tutorial)

  • Published Nov 16, 2024

Comments • 71

  • @controlaltai · 10 months ago +6

    Update 2: ComfyUI creates batches based on available VRAM. If a batch of 8 breaks Style Aligned (i.e. the style applies to the first "x" images and then resets), reduce the batch size. If the entire batch does not fit in VRAM and ComfyUI splits it into sub-batches, Style Aligned will not work as intended. The solution is to reduce the batch size: work in batches of 2-4, keep repeating the first image, and change the prompts of the other images. The tutorial used 24 GB of VRAM; on lower VRAM reduce the batch size, roughly 4 for 12 GB and 2 for 6 GB.
    Update: Ignore the step at 2:10. The dev master branch has been updated; you should now get the node via the default ComfyUI Manager install.

    • @GreaterThanCookie · 8 months ago

      Hello! Can you tell me how to do this if there is a Batch Prompt Schedule node with prompts for 8 frames and it needs to be 2 batches of 4 frames (because of 12 GB of VRAM)? As you pointed out, the first image needs to be repeated at the beginning of each batch somehow. Maybe there is a workflow for this?

    • @controlaltai · 8 months ago

      @@GreaterThanCookie Hi, it is the same workflow I have shown, just use 4 instead of 8. Say: cow, lion, tiger, elephant. After generation, repeat with the same fixed seed and change the keywords to cow, crocodile, camel, shark. Here the cow stays the same; keep repeating the process to get all the animals (see the sketch below).
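
      A minimal Python sketch of that batching idea, outside ComfyUI; the make_batches helper is illustrative, not a ComfyUI API:

      # Split prompts into VRAM-sized batches that each repeat the anchor
      # (style) prompt in slot 0, so every batch shares the same reference.
      def make_batches(anchor, others, batch_size):
          step = batch_size - 1  # slots left after the repeated anchor
          for i in range(0, len(others), step):
              yield [anchor] + others[i:i + step]

      # e.g. 12 GB VRAM -> batch size 4; reuse the same fixed seed per run
      animals = ["lion", "tiger", "elephant", "crocodile", "camel", "shark"]
      for batch in make_batches("cow", animals, 4):
          print(batch)  # run each batch through the same single KSampler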

    • @GreaterThanCookie · 8 months ago

      @@controlaltai Thanks for your reply! But when I run 2 KSamplers sequentially with these prompts in Batch Prompt Schedule:
      "0" :"white cow,",
      "1" :"brown horse,",
      "2" :"yellow lion,",
      "3" :"pink rabbit",
      and
      "0" :"white cow,",
      "1" :"green crocodile,",
      "2" :"blue jay,",
      "3" :"red fox,",
      then in the second batch the last 2 pictures are similar to each other, but not to the others! This means the memory for the second batch is even smaller, and I can't do it the way you describe. I only managed to sequentially generate similar objects with StyleAligned Sample Reference Latents + StyleAligned Reference Sampler, feeding a cow to the Reference Latents and a list of prompts with other animals to the Reference Sampler.

    • @controlaltai · 8 months ago

      Run only 1 KSampler, not 2. Change the keywords and repeat with the single KSampler. Running batches of 4 with 2 KSamplers is the same as running 8 with 1.

    • @GreaterThanCookie · 8 months ago

      @@controlaltai Thanks, this works in batches of 3 images!

  • @WiLDeveD · 8 months ago +3

    Impressive Tutorial ❤❤❤ very useful and informative 💯💯💯 Thanks

  • @Shisgara77 · 10 months ago +2

    OMG 🤩 I'll have to re-watch this video over and over again until I can realise something 🙈Unbelievable! Thanks for sharing ❤

  • @bh0072006 · 10 months ago +2

    This is something I was looking forward to since the paper was released. Thanks so much for letting us know, and for the great tutorial.

  • @g-grizzle · 8 months ago +1

    I have found your videos extremely helpful. Thank you for the effort and time you have put into making these.

    • @controlaltai · 8 months ago

      Thank You!! Appreciate it....

  • @cgartist1447 · 10 months ago +1

    Amazing! thank you! it is really mind blowing! God, where do you get all this? thank you for sharing this treasure!

  • @freshlesh3019754 · 6 months ago

    Thanks for the tutorial. I am using your Style Aligned with ControlNet workflow, and the BLIP analyzer gives: "The size of tensor a (6) must match the size of tensor b (36) at non-singleton dimension 0".
    I've tried updating and doing a clean install, but it looks like the BLIP Analyze Image node breaks the workflow. I bypassed it and the workflow runs, but it doesn't actually produce a mask like in the example. Can you suggest a workaround or a similar node that might resolve it?

    • @controlaltai · 6 months ago +1

      A WD14 Tagger node will give you tags; you can use some of them and manually put together a custom prompt. None of these nodes are that accurate, they are only meant to give you prompt suggestions. You can also avoid them completely, upload the images to GPT-4, ask it to "describe this image", and use partial text from there. Or manually write whatever you think is short and appropriate.

  • @LuxElliott · 10 months ago +2

    Another wonderful video. You are great teachers and I'm learning a lot from what you're doing. If it is OK, I'd love to make a request. I'm becoming frustrated that in nearly anything I create, the subjects are looking at and posing for the camera/viewer. I know there are lots of techniques, and I plan to explore them; however, I'd love to see ways to create scenes where the subjects are not posing for the camera. I want shots as if I were out with my camera taking pictures of the real world in action, or general creative scenes, but still with some control over what is happening in the scene. Thanks again for all you do.

    • @controlaltai · 10 months ago +4

      Thank You! Try these in the positive prompt (separately):
      busy
      candid shot
      busy (doing some action)
      documentary
      Try combining them with a negative prompt: looking at viewer, looking forward (see the example below).
      Let me know if that helps, and tell me which checkpoint you are using plus a sample prompt. I can try it on my end and let you know what works and what does not.
      It could also be a checkpoint issue, if the checkpoint is trained only on looking-at-camera views. I can verify only after I know which checkpoint you are working with.
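
      A hypothetical combined example (my own, not from the video), using the keywords above:

      Positive: candid shot, busy street market, vendor arranging fruit, documentary photography
      Negative: looking at viewer, looking forward, posing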

    • @LuxElliott · 10 months ago +1

      @@controlaltai Thank you, I will try these great suggestions.

  • @damsotaku945 · 9 months ago

    Amazing tutorial , thanks a lot for sharing ^^

  • @sventillack · 9 months ago

    Thank you for this tutorial. When right-clicking the KSampler I can't see "Convert Input to Seed" (3:46)... Can you tell me why? I think I closely and fully followed the instructions.

    • @controlaltai · 9 months ago

      You have to show me a screenshot; I have to see what you see to figure out what the issue is. Also, it's "convert seed to input", not "convert input to seed".

  • @valorantacemiyimben · 3 months ago

    5:25 Hello. The boxes with this prompt do not appear for me. How can I open them?

    • @controlaltai · 3 months ago

      Hi, I cannot understand the issue. Please explain with more clarity. What do you mean by boxes?

  • @chrisfromthelc · 9 months ago

    If I have one (or more) LoRAs that I want to use on the styled image, where should they go in the Reference Image flow?

    • @controlaltai · 9 months ago

      Hi, can you email me? I will share the workflow there in reply. I have made the workflow for you right now with the LoRA node integrated. You can test the LoRA on that workflow. (mail @ controlaltai . com) - without spaces

    • @controlaltai · 9 months ago

      Technically it goes right after the checkpoint, but it should connect to all the conditioning. A rough sketch of the wiring is below.
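
      A rough sketch of that wiring, assuming the stock Load Checkpoint and Load LoRA nodes (exact node names can vary between workflows):

      Load Checkpoint ──MODEL──> Load LoRA ──MODEL──> KSampler / StyleAligned sampler
      Load Checkpoint ──CLIP───> Load LoRA ──CLIP───> CLIP Text Encode (positive) ──> sampler conditioning
                                                      CLIP Text Encode (negative) ──> sampler conditioning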

  • @ywy-v4n · 8 months ago

    Bro, thanks for your brilliant tutorial. Is there any chance to use an existing image instead of generating one? I want the result to have a face I already have, not a random one.

    • @controlaltai · 8 months ago

      Welcome! Yes, you can use the reference workflow for style transfer from a reference image. However, I am not sure what you are asking. Are you saying you want the face from the reference image to come out the same? That's not possible; the reference is only used for style. The generated image can have the same face, but there are different ways to do it.
      You can use a LoRA to generate images of a trained face and Style Aligned to get the styles; the face will remain consistent. You can also use text-to-image with IPAdapter or Face ID and then merge it with Style Aligned.
      The reference workflow will remain the same; only the text-to-image generation part changes.

    • @ywy-v4n · 8 months ago

      Yes, the latter is exactly what I want to ask: using an image of a person instead of generating an image by prompt, so I can get the result with the exact face I want.
      So if I want to change the style of an image without changing its pose, I can add a LoRA to change it and use the same image as the reference to get the style, right? However, I tried it and the result is not very good. Maybe I should change the sampler parameters, or do you know any other efficient way? @@controlaltai

    • @ywy-v4n · 8 months ago

      Maybe ControlNet is a better way? I can't tell the difference between them. @@controlaltai

  • @LuxElliott · 10 months ago +2

    Thanks!

    • @controlaltai · 10 months ago

      Thank You!!!

  • @CerbyBite · 10 months ago

    Any idea why the node "StyleAligned Sample Reference Latents" doesn't show up for me?

    • @controlaltai · 10 months ago +1

      Yes, please check how to get that node from here: 2:10

  • @goor76 · 9 months ago

    Nice! I copied the workflow and added ControlNet, but with SoftEdge using an already-created image, so it's simpler, though it works in more or less the same way. I found an interesting "bug?": with ControlNet included, it throws an error if the batch number is odd. So I can do a batch of 2, 4, 6, etc., but 3, 5, 7, etc. gives an error... scratching my head on this one.

    • @controlaltai · 9 months ago

      Interesting find 🤔. Let me check this on my end and get back to you.

    • @xandervera7026 · 9 months ago

      @@controlaltai
      Also getting the same thing. I've checked all inputs and parameters and can't find a cause for it!
      I also get images batching in groups of two regardless of how many images are being made. If I do 8 images, 1-4 are the same and 5-8 are the same. Same proportion for 4, 8, 16, etc.; always 2 groups?

    • @controlaltai · 9 months ago

      You need to send me the workflow. I just tried with ControlNet and a batch of 3, and it gives no error. mail @ controlaltai .com (without spaces)

    • @controlaltai · 9 months ago

      I have to check your workflow for the batch issue. Send me the workflow at mail @ controlaltai . com (without spaces).

  • @jayolay2364 · 10 months ago

    I followed your workflow for "reference image", but the "StyleAligned Reference Sampler" seems to only output a black image... any idea what I'm missing?
    Thanks a ton!

    • @controlaltai · 10 months ago +1

      I have to take a look at the workflow to check. Can you email it to me? mail @ controlaltai .com (without spaces)

  • @AnimeDiff_ · 10 months ago

    Maybe I'm missing something; I wanted to try to use this with AnimateDiff to create style-aligned video. I'm also wondering, is there a way to use this with img2img? If this could work with AnimateDiff, it would be possible to run ControlNet from a source video to make some very interesting vid2vid. Thank you.

    • @controlaltai · 10 months ago

      Well, I need to do more research on image2image. If I come up with something I will let you know. The Style Aligned tech is tricky; I can't say whether it works or not without extensive workflow testing. Give me some time... :)

  • @lilillllii246 · 9 months ago

    Hello! Is there a way to integrate two JSON files with different functions in ComfyUI? One does the inpaint function, and the other maintains a consistent character through Face ID, but I'm having trouble linking the two.

    • @controlaltai · 9 months ago +2

      Hi, one way to do it: open two browser tabs and load workflow 1 in one and workflow 2 in the other. Copy all nodes from workflow 2, go to tab 1, and paste. Then close tab 2, connect the newly pasted nodes as required, and save the workflow as a new JSON. You can keep a split layout if required, one workflow at the top and one below, but all in one JSON.

    • @lilillllii246 · 9 months ago

      @@controlaltai Thanks. I want to link the result_face (file 1) created with Face ID to the inpaint (file 2), but I'm not sure which one to link.

    • @lilillllii246 · 9 months ago

      @@controlaltai I can change the face in the inpaint file, but only with a prompt, so I want to use the consistent face from file 1.

    • @controlaltai · 9 months ago +1

      @lilillllii246 Send me both your workflows; I will link them and send one JSON back to you. The way to do it: take the VAE Decode output that currently goes to the image preview/save, drag it out, and connect it to a Preview Bridge node (available in the Impact Pack). Use the Preview Bridge to inpaint and pass that forward, as sketched below.
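
      A rough sketch of that rerouting (Preview Bridge is from the Impact Pack; the downstream inpaint nodes depend on your workflow):

      VAE Decode ──IMAGE──> Preview Bridge ──IMAGE/MASK──> inpaint nodes ──> ...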

    • @lilillllii246 · 9 months ago +1

      @@controlaltai drive.google.com/drive/folders/1jCZKufaXZ6G67ju4H-8utAkqJ9HnJDG0?usp=sharing Take a look when you get a chance. I'm a newbie and it's so hard

  • @dflfd · 10 months ago

    is there a way to provide a reference style image rather than generating on the fly?

    • @controlaltai · 10 months ago +3

      Please check workflows 3 and 4 in the tutorial; a reference style image is used there, and the image generation takes its style from it. Use prompts to draw its "attention" to what you want it to do, if the reference image alone is not enough.

  • @qkrxodls3377 · 6 months ago

    Hello Seth, I sent you an email regarding making the thread-car you demoed in the last part of this session, along with my workflow and the result. I tried various ControlNet values and Style Aligned parameters, but was not able to get close to yours. I would deeply appreciate it if you could guide me on this matter. Best,

    • @controlaltai · 6 months ago

      Hi, sure. I got the email. Will respond to it shortly.

    • @controlaltai · 6 months ago

      Hi, replied on email, please check.

  • @KINGLIFERISM · 10 months ago

    Odd, I don't have the StyleAligned Reference Latents node. I have the other two, though. Never mind... I had to go to the GitHub repo, not just use ComfyUI Manager.

    • @controlaltai · 10 months ago

      If you still don't have the node, check from here: 2:10

  • @Steve.Jobless · 10 months ago +1

    Bro, why doesn't mine have the StyleAligned Sample Reference Latents node? I followed all the requirements.

    • @controlaltai · 10 months ago +1

      Uninstall and reinstall from comfy manager. There was an update.

    • @Steve.Jobless · 9 months ago

      @@controlaltai Aight, thanks bro. Anyway, is it possible to make variations of our own images using this Style Aligned?

    • @controlaltai · 9 months ago

      Yes, it can. If you want to maintain your own face you need something additional; this works brilliantly at mimicking the first image. If you have a trained LoRA of a person, using that is golden for variations. Another way would be to use a Face ID approach like ReActor, Roop, or IPAdapter Face ID before the generation.

    • @Steve.Jobless · 9 months ago +1

      ​@@controlaltai you're amazing, man thank you

    • @MS-gn4gl · 9 months ago

      @@controlaltai Weird, I tried this but the extra "ref latents" connection after the "ref latent" connection doesn't connect to anything, and dragging it out to show the possible connections only gives "StyleAligned Sample Reference Latents" as an option. Must be some change in the latest version?

  • @97BuckeyeGuy · 10 months ago

    This AI Voice sounds like Tim Meadows from SNL.