How To Fine-tune LLaVA Model (From Your Laptop!)

  • Published Apr 30, 2024
  • Follow Along: console.brev.dev/launchable/d...
    Join Our Discord: / discord
    In this guide, we fine-tune the popular open-source model LLaVA (Large Language-and-Vision Assistant) on a dataset for use in a visual classification application. You can perform the fine-tuning yourself, regardless of your level of experience or the level of compute you have access to.
    Please leave any future guides you would like made below!

Comments • 44

  • @RehanKhan-ps5tn · 2 months ago · +12

    You know bro is an A-level engineer when he can explain stuff so easily

  • @wajdalkousa9745 · 2 months ago · +10

    Big man ting yeh. Looking good brev

  • @TomanswerAi · 2 months ago

    Best guide/insights on fine tuning I’ve seen. Subscribed 🔥

  • @ae_alg · 2 months ago · +4

    Baxate, you’re the goat. For a beginner like myself, that was a very useful video

    • @Baxate · 2 months ago · +2

      glad you found it useful!

  • @hubertboguski · 2 months ago · +2

    I tried out Brev for my machine learning course. I love the payment options, where I can cut it off after X dollars, the low prices, and the UI looks awesome. I know it said it somewhere, but it took me a minute to realize that my Jupyter notebook takes around 4 minutes to launch, so for blind ppl like me I'd put some more text saying the Jupyter notebook will be created in X minutes.
    Love this vid and the outreach. I'll keep watching, Baxate

    • @brev-dev · 2 months ago · +2

      Thank you so much for the kind comment! I will bring that feedback to the team :).

  • @nady_in_rome8086 · 1 month ago

    Very useful! Subscribed indeed 🙂

  • @supritanellikeri4335 · 1 month ago

    Thank you, this is great

  • @suteguma0 · 2 months ago · +2

    bro the goat

  • @athreesh · 2 months ago · +3

    Bax ate with this one!

    • @Baxate · 2 months ago · +1

      yurrr

  • @ZeyuJiang-ud6hn · 3 days ago

    man you are awesome!

  • @aimattant · 21 days ago

    Thank you

  • @diieggoo0 · 2 months ago

    twin served meat 🔥🔥

  • @atriantafy · 2 months ago

    Great video :) Can you please comment on the dataset size? The one you used consists of roughly 9k samples. How many samples are needed for a decent LoRA fine-tune? I've heard that with LLMs you can achieve a lot with only a few examples. Is that the case for LLaVA as well? Please share any more information you can on dataset creation. Thanks!

  • @shivanshsingh6899 · 1 month ago · +1

    Hello bro, after running the deepspeed script, no file named mm_projector.bin is generated, which is required in the merging process; instead, a non_lora_trainable.bin is generated.
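
    As I understand the LLaVA repo's LoRA fine-tuning path, the projector weights are usually bundled into non_lora_trainables.bin rather than emitted as a standalone mm_projector.bin (which the pretraining stage produces), and the merge step picks them up from there. A small sketch, with a hypothetical helper name, for checking which artifacts a checkpoint directory actually contains:

    ```python
    from pathlib import Path

    def inspect_checkpoint(ckpt_dir):
        """Report which LLaVA checkpoint artifacts exist in ckpt_dir.
        In the LoRA path, projector weights typically live inside
        non_lora_trainables.bin, not a separate mm_projector.bin."""
        names = {p.name for p in Path(ckpt_dir).iterdir() if p.is_file()}
        return {
            "has_lora_adapter": "adapter_model.bin" in names
            or "adapter_model.safetensors" in names,
            "has_projector_standalone": "mm_projector.bin" in names,
            "has_non_lora_trainables": "non_lora_trainables.bin" in names,
        }
    ```

    If non_lora_trainables.bin is present alongside the adapter weights, the merge should not need a separate mm_projector.bin, assuming the standard merge script is used.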

  • @freddyfly8970 · 1 month ago · +1

    Is it possible for the model to tell you where a picture was taken (geographically), based on probability, and to focus purely on this because you give it the information in fine-tuning? (I'm a beginner)

  • @paulmiller591 · 2 months ago

    Cool demo, thank you. Could you share some examples of the training data? That new model is great. Can you share it on Hugging Face? How big did it end up being for inference purposes?

    • @brev-dev · 2 months ago

      Hey Paul! Here is the documentation for the model on hugging face:
      huggingface.co/docs/transformers/en/model_doc/llava
      Here is the training dataset:
      huggingface.co/datasets/Multimodal-Fatima/OK-VQA_train
      Here is the testing dataset:
      huggingface.co/datasets/Multimodal-Fatima/OK-VQA_test
      Note that we did not create the model, nor the training or testing datasets! We are simply using them as an example here

  • @camdencz · 2 months ago

    Came from IG

  • @madhavparikh6747 · 1 month ago

    Hey, I had a query regarding generating the custom dataset using GPT-4, shown at the very beginning. It seems it does not generate a JSON file with the exact format necessary for LLaVA.
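
    For reference, the LLaVA repo's fine-tuning scripts expect a JSON list of records, each with "id", "image", and "conversations" (alternating human/gpt turns, with an "<image>" placeholder in the first human turn). A minimal sketch of converting a generated Q/A pair into that shape; the helper name and field values here are illustrative:

    ```python
    import json

    def to_llava_record(sample_id, image_filename, question, answer):
        """Build one training record in the conversation format LLaVA's
        fine-tuning scripts expect (the training file is a JSON list of
        these). The "<image>" token marks where the image is injected."""
        return {
            "id": str(sample_id),
            "image": image_filename,
            "conversations": [
                {"from": "human", "value": f"<image>\n{question}"},
                {"from": "gpt", "value": answer},
            ],
        }

    # Example: one generated Q/A pair turned into a record and serialized.
    records = [to_llava_record(0, "0001.jpg", "What sport is being played?", "Baseball.")]
    print(json.dumps(records, indent=2))
    ```

    If GPT-4 emits a different schema, a small conversion pass like this is usually enough to bring it into the expected format.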

  • @Gvbr1e1777 · 2 months ago · +1

    came from ig

  • @drsamhuygens · 2 months ago

    For this use case, why didn't you just use prompt engineering (using a very specific prompt) to give you the same output?

  • @Snorlaxer565 · 1 month ago · +1

    Came from TikTok! I have no experience with AI, but I'm surely going to dive in to train a model for my startup application. Do you think this model could be trained to estimate macros from an image, say in buckets or ranges, after identifying the food itself?

    • @brev-dev · 1 month ago

      yes, absolutely! That is a perfect use case

  • @raresracoceanu6039 · 3 days ago

    Can you show how to fine-tune VILA models from Nvidia?

  • @user-zw8do7um9r · 2 months ago

    Have you used this link? I'm getting an error when loading the dataset now; if you can, please take a look. Thank you

  • @aimattant · 21 days ago

    Wouldn't prompting the LLM in various scenarios in the application code be enough to get the right response? I am not clear on fine-tuning.

    • @ashutoshtrivedi2527 · 1 day ago

      @brev-dev Yeah, I feel so too. Fine-tuning is more useful when you have a novel category or a deeper classification. For example, a general model identifies the dog, but the fine-tuned one identifies the breed as well. Then you need to have a tagged dataset of dog breeds.

  • @BR-lx7py · 2 months ago

    Wouldn't it have been simpler to feed the fluffy text to llama3 to come up with the summary?

  • @tysonla181 · 1 month ago

    Do I have to buy credits to follow along?

  • @kukiui · 1 month ago

    I see you're fine-tuning LLaVA 1.5. Is it possible to use this notebook for 1.6 too?

  • @ppdesai434 · 2 months ago

    baxate !

  • @julienblanchon6082 · 2 months ago

    Is this video for elementary school ?

  • @aamirshaikh2100 · 2 months ago · +1

    In what world is this a “beginner friendly machine learning guide”? What💀💀💀😂😂

    • @brev-dev · 2 months ago

      let me know where you struggled! I tried to explain the concepts at a high level and run the cells as they were written.

    • @aamirshaikh2100 · 2 months ago · +3

      @@brev-dev thanks for replying :)
      Carter's insta story said "beginner's guide", so I thought it would be an intro to machine learning or something.
      But after seeing this video …
      A beginner WOULD NEVER be able to comprehend a single sentence in this video 😂😂

    • @brev-dev · 2 months ago · +5

      @@aamirshaikh2100 This is Carter :). I will keep that in mind and maybe make a dedicated intro to machine learning video!

    • @aamirshaikh2100 · 2 months ago

      @@brev-dev thanks for taking the time 💓

    • @germanpancardo7683 · 2 months ago · +1

      @@aamirshaikh2100 I am not Carter (or in any way related to the channel, lol), but if you tell me what your starting point is, I can send you some resources or some more specific pointers. I'm not a pro yet, but it may be useful if you're coming from zero.