LLAMA-2 🦙: EASIEST WAY TO FINE-TUNE ON YOUR DATA 🙌

  • Published on Sep 26, 2024

Comments • 229

  • @teleprint-me (1 year ago, +51)

    I was in the hospital because my lung collapsed and I've been having a seriously rough go of it lately (lifelong issues with family, etc.), so I really appreciate this video. Thanks for all your hard work. Researching these topics and understanding them is no small feat. Keep it up.

    • @engineerprompt (1 year ago, +13)

      I am really sorry to hear that! Hope you are recovering well. Wishing you a quick recovery. Also really appreciate all your contributions. Stay strong my friend!

    • @immortalsun (10 months ago, +1)

      Hope you get better!

  • @LainRacing (1 year ago, +58)

    Very disappointed you didn't show this actually doing anything, or how to verify or test that it's working. I can run a script and have it do nothing... How do we see that it actually worked, or test it?

  • @samcavalera9489 (1 year ago, +14)

    Thanks SO MUCH brother! You are a true hero! Fine-tuning is the most important part of open-source LLMs. That's where the value/wealth is hidden. I cannot wait for your next fine-tuning video.🙏🙏

  • @christianmboula8923 (5 months ago, +1)

    Superb tutorial: clear, simple, and to the point. Big thank you! NOTE bugfix: replace the underscores with the corresponding dashes to make the autotrain command run on Colab.
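
    For anyone hitting that dash/underscore issue, here is a minimal sketch of the dash-style command (the project name is a placeholder, not from the video; the flags mirror the working example further down in the comments):

    !autotrain llm --train \
      --project-name 'my-llama2-project' \
      --model TinyPixel/Llama-2-7B-bf16-sharded \
      --data-path timdettmers/openassistant-guanaco \
      --peft \
      --lr 2e-4 \
      --batch-size 2 \
      --epochs 3 \
      --trainer sft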

  • @photojeremy (1 year ago, +25)

    would be great to have a Colab notebook for this that included inference on the finished, pushed model

    • @MuhammadFhadli (1 year ago)

      hi, have you found a way to do the inference?

    • @manujmalik9843 (1 year ago)

      @@MuhammadFhadli did you find it?

    • @gerardorosiles8918 (1 year ago)

      I was thinking that once you push to huggingface you could use something like text-generation-webui to play with the model
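
      For those asking about inference on the pushed model, a minimal sketch with transformers + peft (the adapter repo id is a placeholder for whatever you pushed):

      from transformers import AutoModelForCausalLM, AutoTokenizer
      from peft import PeftModel

      base_id = "TinyPixel/Llama-2-7B-bf16-sharded"    # base model used in the video
      adapter_id = "your-username/your-project-name"   # hypothetical: your pushed adapter repo

      tokenizer = AutoTokenizer.from_pretrained(base_id)
      model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
      model = PeftModel.from_pretrained(model, adapter_id)  # attach the trained LoRA adapter

      prompt = "### Human: Who is John?### Assistant:"
      inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
      out = model.generate(**inputs, max_new_tokens=128)
      print(tokenizer.decode(out[0], skip_special_tokens=True))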

  • @jersainpasaran1931 (1 year ago, +3)

    Thank you very much, champion! We are getting to the true spirit of open source, allowing science to be truly scalable for the public and the public interest.

  • @arjunv7055 (1 year ago, +4)

    One of the best videos I have come across. I will definitely share this channel with my colleagues and friends who want to learn more about this topic.

  • @garyhuntress6871 (1 year ago, +2)

    I was initially skeptical but this was an excellent short tutorial. Thanks!

  • @ilhemwalker9145 (7 months ago, +2)

    hey please, i copied the same line but i'm getting an error: autotrain [] llm: error: the following arguments are required: --project-name. i don't know what to do

  • @VerdonTrigance (7 months ago, +1)

    How do I train on unstructured data (a book, for example) with a self-supervised training algorithm and eventually make a chat model from it?

  • @PickleYard (1 year ago, +4)

    Wow, just what I needed. I just put together a Flan/Orca-style dataset; I can't wait to try it in Colab! Thank you for your hard work.

  • @anjakuzev592 (1 year ago, +7)

    Please make a video on creating your own dataset and actually using the model

  • @jongheebae6269 (8 months ago, +2)

    I have the autotrain error as follows:
    autotrain [] llm: error: the following arguments are required: --project-name
    So I used '--project-name' instead of '--project_name'. Then I faced another error.

  • @Yash-mk8tc (1 year ago, +5)

    How do I use this trained model?
    Can you please make a video on this?

  • @bardaiart (1 year ago, +3)

    Thank you very much!
    Looking forward to the dataset preparation video :)

  • @OpenAITutor (1 year ago, +3)

    So great! Thank you for being so clear!!! Loving it

  • @DikHi-fk1ol (10 months ago, +1)

    Please make another tutorial on how to fine-tune a model on a custom dataset rather than using the Hugging Face ones.

  • @anantkabra6825 (10 months ago, +1)

    Hello, I am getting this error; can someone please help me out with it: ValueError: Batch does not contain any data (`None`). At the end of all iterable data available before expected stop iteration.

  • @arjunv7055 (1 year ago, +4)

    Some of my friends who followed this tutorial mentioned they saw an argument issue. I think it is because of the command being broken into multiple lines. Running the command across multiple lines requires a '\' at the end of every line. The final command should look like this:
    !autotrain llm --train --project_name '' \
    --model TinyPixel/Llama-2-7B-bf16-sharded \
    --data_path timdettmers/openassistant-guanaco \
    --text_column text \
    --use_peft \
    --use_int4 \
    --learning_rate 2e-4 \
    --train_batch_size 2 \
    --num_train_epochs 3 \
    --trainer sft \
    --model_max_length 2048 \
    --push_to_hub \
    --repo_id / \
    --block_size 2048 > training.log &
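
    Since the command above redirects output to training.log and backgrounds the job, you can watch progress in a Colab cell with something like:

    !tail -f training.log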

    • @nayyershahzad8051 (8 months ago)

      getting the following error, kindly help:
      autotrain [] llm: error: the following arguments are required: --project-name

    • @avataraang3334 (several months ago)

      @@nayyershahzad8051 same here

  • @zhirongchen9861 (1 year ago, +3)

    Hi, how can I choose a method to fine-tune the model? For example, if I want to use LoRA to fine-tune Llama 2, how can I do it?

  • @sharadpatel107 (1 year ago, +4)

    can you please put in a link for a Colab notebook for this

  • @krishnareddy9 (1 year ago, +1)

    Thank you for the video. I am looking forward to the video about how to prepare our own dataset without using a Hugging Face dataset!!

    • @engineerprompt (1 year ago, +1)

      It's up now, enjoy!

    • @bookaffeinated (5 months ago)

      @@engineerprompt video link please.... Also, this one-line command throws an error on Colab: unknown argument. Any suggestions, please?

  • @bagamanocnon (1 year ago, +2)

    How can I incorporate my own data into the 'assistant' fine-tune? For example, a 100-page document about a company product. Do I format it into something similar to what's in the openassistant dataset and add it to the dataset? Or is fine-tuning on my own data a separate step, i.e. after fine-tuning on the openassistant dataset, do I need to run another fine-tune for my own data? Cheers, and thanks for all your hard work sharing your knowledge with us!

  • @alx8439 (1 year ago, +2)

    Does it use LoRA or QLoRA techniques?

  • @miriamramstudio3982 (1 year ago)

    Thanks for the update. Very interesting.

  • @justabacteria (1 year ago, +2)

    Could you explain or make a video on how to use your new fine-tuned model?

  • @swauce507 (1 year ago, +2)

    After you fine-tune the model, how do you use it as a chat interface to query the model and see its results?

  • @dr.aravindacvnmamit3770 (7 months ago)

    Hi, the way you are explaining is very positive!!!! One thing I am not getting: if I want to train my custom data in regional languages, how do I proceed? Can you share your knowledge on this? Which model is best for this, and if we pass the prompt in English, will it get converted to the regional language and generate the output?

  • @serenditymuse (1 year ago, +2)

    The major work looks to be in preparing your dataset properly, which is pretty common. Do you have, or are you planning, another video on training models simply by handing them a lot of files of, say, web content, or better still the raw URLs and perhaps something like tags? In other words, how to add unsupervised learning from a corpus.

  • @learn2know79 (1 year ago, +1)

    Hi, thanks for the detailed explanation. Could you please make another video explaining RLHF with a code implementation?

  • @adriantang5811 (1 year ago)

    Great Sharing again. Many thanks!

  • @dec13666 (11 months ago, +2)

    Nice video!
    A recurring aspect I have seen amongst these tutorials, however, is that they never mention how to use the custom LLM model (i.e., doing some inference with it), or how to obtain metrics about it... Do you have any other video where you discuss those 2 topics?
    Thank you!

  • @ajaym4257 (8 months ago, +1)

    I'm getting this error:
    usage: autotrain []
    AutoTrain advanced CLI: error: unrecognized arguments: --use-int4 --learning-rate 2e-4 --num-train-epochs 3 --model-max-length 2048

  • @MaralSheikhzadeh (10 months ago, +1)

    Well explained video. Thank you :)

  • @8eck (1 year ago, +4)

    What if I only want to feed specific non-instruction data into the model? For example, some financial data, some books, or a glossary. Can I just keep the ###Output empty? Will the model learn from that data? Also, do I need to split that data into train and test parts, or is that optional for pre-trained models?

    • @curtisho5255 (1 year ago)

      i have the exact same question! omg!

    • @phoenixfire6559 (1 year ago)

      If you leave the output empty, then the model will learn to give you empty responses every time you put that type of data in. The best way to make the data for your fine-tune is to think about it in reverse: when you put the input in, what do you expect the output to be? That's what you should be filling the output with.

    • @8eck (1 year ago, +2)

      @@phoenixfire6559 I'm talking about pre-training-style fine-tuning: models in the pre-training phase don't get any output examples, they just learn from the data. That's what I'm trying to understand. Is fine-tuning only about question & answer pairs? How do you continue pre-training of the model with frozen base weights, just like transfer learning?

    • @curtisho5255 (1 year ago, +1)

      @@8eck exactly. He doesn't get it. We want it to train on pure data, not on Q&A responses. He must not have played with chatbase.

    • @robosergTV (1 year ago)

      @@curtisho5255 lmao, the author of the video knows this. The video is clickbait to farm views (which is money) from noobs who can't use a simple Google search.
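
      On the raw-text question in this thread: a rough sketch of continued pre-training (causal LM on plain text, no instruction/output pairs) with plain transformers rather than the video's autotrain flow; the corpus path is a made-up placeholder, and in practice you would add PEFT/LoRA on top to fit in Colab-sized VRAM:

      from datasets import load_dataset
      from transformers import (AutoModelForCausalLM, AutoTokenizer,
                                DataCollatorForLanguageModeling, Trainer, TrainingArguments)

      base_id = "TinyPixel/Llama-2-7B-bf16-sharded"
      tokenizer = AutoTokenizer.from_pretrained(base_id)
      tokenizer.pad_token = tokenizer.eos_token
      model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

      raw = load_dataset("text", data_files={"train": "my_corpus/*.txt"})  # hypothetical corpus
      def tokenize(batch):
          return tokenizer(batch["text"], truncation=True, max_length=512)
      train_ds = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

      # mlm=False makes the collator copy inputs to labels, i.e. next-word prediction
      collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
      trainer = Trainer(
          model=model,
          args=TrainingArguments(output_dir="cpt-out", per_device_train_batch_size=2,
                                 num_train_epochs=1, learning_rate=2e-4),
          train_dataset=train_ds,
          data_collator=collator,
      )
      trainer.train()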

  • @prestonmccauley5467 (1 year ago, +2)

    I followed this exactly in Colab, but it seems that something is wrong with the arguments. Can you share your Colab file?

    • @arjunv7055 (1 year ago)

      if you are breaking the command into multiple lines, please make sure to add \ at the end of each line, so the final command looks like this:
      !autotrain llm --train --project_name '' \
      --model TinyPixel/Llama-2-7B-bf16-sharded \
      --data_path timdettmers/openassistant-guanaco \
      --text_column text \
      --use_peft \
      --use_int4 \
      --learning_rate 2e-4 \
      --train_batch_size 2 \
      --num_train_epochs 3 \
      --trainer sft \
      --model_max_length 2048 \
      --push_to_hub \
      --repo_id / \
      --block_size 2048 > training.log &

  • @caiyu538 (9 months ago)

    How do I save the fine-tuned model to local disk instead of pushing it to the hub? Could you show us the model pushed to the hub? Showing these in the video would make it clearer. Great.

  • @abdullahbinmubarak1 (2 months ago)

    Thanks Brother 😍

  • @kumargaurav2170 (1 year ago)

    I trained Llama 2 13B on my custom dataset for just 200 epochs and the results are unbelievable 😀😀 It even performed really well on tabular data with only numerical inputs.

    • @engineerprompt (1 year ago)

      Interesting, how big was your dataset? Did you run into overfitting issues?

    • @kumargaurav2170 (1 year ago)

      @@engineerprompt I trained it on a curated dataset of around 2k examples. Loss came to around 0.13. Overfitting didn't happen, as the loss was consistently decreasing.

    • @bharatkaushik9916 (11 months ago)

      How did you run inference on the model? Can you please explain?

  • @sohailhosseini2266 (1 year ago)

    Thanks for sharing!

  • @bahramboutorabi5971 (1 year ago)

    Great video. Thank you

  • @ScottzPlaylists (5 months ago)

    🤯 Wow Wow Wow ❗

  • @AA-rd6nm (1 year ago)

    Very detailed, thanks for sharing. I ❤ it.

  • @mdfarhananis8950 (1 year ago, +1)

    Please teach how to create a dataset for fine-tuning

  • @Noshiru (4 months ago)

    Hello!
    The question might be stupid, but how come it is so difficult to teach the AI our own data? I mean, when you talk to ChatGPT, for example, if you tell it stuff, it will remember what you said (if you use the same chat) and will be able to answer your questions about it. Why can't we just give the AI some documentation, for example?

  • @gamingisnotacrime6711 (1 year ago, +1)

    I have a custom dataset with 50 rows. For how many epochs should I fine-tune the model?
    Each line in my dataset is in this format: ### Human: Who is John?### Assistant: John is a famous youtuber
    (My dataset has only a single column named text, and 50 rows with the data in the above format.)
    Also, are there any issues with my dataset?
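
    That single text column matches what the video's command expects (--text_column text). For anyone unsure how to produce such a file, a tiny sketch (the rows are made-up examples):

    import csv

    rows = [
        "### Human: Who is John?### Assistant: John is a famous youtuber",
        "### Human: What does John post?### Assistant: Mostly tech tutorials",
    ]
    with open("train.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["text"])   # column name passed via --text_column
        for r in rows:
            writer.writerow([r])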

  • @sb98052 (11 months ago, +1)

    Thank you for these very clear videos. Do you have any thoughts or pointers on resources for doing this type of training on code models such as CodeLlama?

  • @emrahe468 (1 year ago, +1)

    Finished running the autotrain in about 6h and uploaded the model to huggingface. So what to do next? How to use this?

  • @adapalarajyalakshmi3728 (1 year ago, +2)

    Thank you for the video. Can you explain how to use a Postgres database as the dataset?

    • @Dave-nz5jf (1 year ago)

      you would probably need to pull the data in batches, in the right format, and then run this autotrainer on a batch basis. But it's an interesting question: if you have data that's changed (in the database) and you retrain the model, how does the updated data impact the model output?

  • @Koyaanisqatsi2000 (1 year ago)

    Thank you very much! Where can I view the loss on my training or evaluation data using this method?

  • @ilyaskydyraliev6498 (11 months ago)

    Thank you for the video! May I ask how big a dataset I should have to see that fine-tuning actually worked and the model learnt new data?

  • @oxydol3456 (5 months ago)

    Learnt a lot from the video. Thanks. Is it easy to revert the model to the state before a tuning?

    • @engineerprompt (5 months ago, +1)

      Thanks. Yes: you are merging the extra "LoRA adapter" layers into the model. The base model itself remains unchanged, so you can just reuse it for other purposes.
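
      A sketch of both directions with peft (the adapter repo id is a hypothetical placeholder): reloading the base alone gives the pre-tuning behaviour, while attaching and merging the adapter gives the fine-tuned one:

      from transformers import AutoModelForCausalLM
      from peft import PeftModel

      base_id = "TinyPixel/Llama-2-7B-bf16-sharded"
      adapter_id = "your-username/your-project-name"  # hypothetical adapter repo

      # "Reverting" is just loading the base model without the adapter:
      base = AutoModelForCausalLM.from_pretrained(base_id)

      # Fine-tuned behaviour: attach the adapter, optionally fold it in permanently:
      tuned = PeftModel.from_pretrained(AutoModelForCausalLM.from_pretrained(base_id), adapter_id)
      merged = tuned.merge_and_unload()  # standalone model with LoRA weights merged in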

  • @okopyl (11 months ago)

    Amazing, but how do you do the inference properly with this PEFT thing?

  • @CrazyFanaticMan (1 year ago, +2)

    At 5:08, what file format does it expect? Sorry, my English is not that good

    • @carlsagan9808 (1 year ago)

      I'm pretty sure they mean .csv file

  • @machineUnlearner (7 months ago)

    I have time series data, with 7 to 10 parameters. What should I do?

  • @waelmashal7594 (1 year ago)

    Just amazing

  • @PajakRikiAkbar (1 year ago, +1)

    I haven't tried it on Colab yet but was wondering, do we need Colab Pro or Colab Pro+ for this tutorial?

    • @engineerprompt (1 year ago, +3)

      For this, you can use the sharded model with the free version, but for the full model you will need Pro

  • @ajlahade2201 (1 year ago)

    can you please make a video on how to push this model to Hugging Face (production level, with a model card) and call that model

  • @aiwesee (1 year ago, +1)

    For fine-tuning large language models (llama-2-13b-chat), what should the format (.txt/.json/.csv) and structure (e.g. an Excel or Docs file, prompt and response, or instruction and output) of the training dataset be? And how should a tabular dataset be prepared or organised for training?

    • @rainchengcode4fun (1 year ago)

      timdettmers/openassistant-guanaco has an introduction to the dataset; it should be a list of JSON records with instruction and response in them.

    • @GEfromNJ (11 months ago)

      See, this is one of the things that gets completely glossed over in videos like this. If you take a look at timdettmers/openassistant-guanaco, you'll see that it's some nicely formatted data. It doesn't answer the question of how someone would take their own data and get it into this format.

  • @noraalzamil2660 (11 months ago)

    Thank you very much 🙏
    Can I apply it with TheBloke's llama-2-7b GGML?

  • @AdamTwardoch (1 year ago, +4)

    But what's "a while"? Hours? Days?

    • @adbeelomiunu (1 year ago)

      😂 Funny, but a great question to ask

    • @phoenixfire6559 (1 year ago, +5)

      In the video @11:29 it looks like he's been running the training for 44 minutes and it still has over 43 hours to run. The Guanaco dataset he used has 10k instructions; assuming 250 tokens per instruction, that's a 2.5-million-token dataset. The Alpaca dataset he mentions is 52k instructions and around 10 million tokens.
      Remember he's using a batch size of 2; if he ran with a batch size of 8 (assuming he had enough VRAM), it would take about 1/4 of the time.
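
      Rough arithmetic behind that estimate: 10k instructions × ~250 tokens ≈ 2.5M tokens per epoch, and the number of optimizer steps is (examples ÷ batch size) × epochs, so going from batch size 2 to 8 cuts the step count, and roughly the wall-clock time, by 4×, VRAM permitting.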

    • @fuba44 (1 year ago, +2)

      If it helps, I ran his exact example on an NVIDIA Tesla P40 with 24GB of VRAM (changed the batch size from 2 to 5) and it took me 20 hours.

    • @emrahe468 (1 year ago, +1)

      Fine-tuning on a 6.6K-example dataset took me like 6h on Google Colab Pro, but in some other tutorials this was like 30 min. I'm totally lost.

  • @hamedhaidari4479 (9 months ago, +1)

    I followed everything like you and I get this error:
    autotrain [] llm: error: the following arguments are required: --project-name

    • @onesecondnanba (9 months ago, +1)

      same problem

    • @onesecondnanba (9 months ago)

      autotrain llm \
      --train \
      --project-name 'llama2-openassistant' \
      --model TinyPixel/Llama-2-7B-bf16-sharded \
      --data-path timdettmers/openassistant-guanaco \
      --peft \
      --lr 2e-4 \
      --batch-size 4 \
      --epochs 3 \
      --trainer sft \
      > trainer.log
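
      (Note the renamed flags in this newer CLI: --project-name and --data-path with dashes, and --peft / --lr / --batch-size / --epochs in place of --use_peft / --learning_rate / --train_batch_size / --num_train_epochs, which matches the "unrecognized arguments" errors quoted elsewhere in these comments.)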

  • @vijayendrasdm (1 year ago)

    What is the relation between max token size and the model kind of repeating itself? (The one you talk about in the things to consider.)

  • @pareak (8 months ago)

    What is the difference between the SFT and the Generic trainer?

  • @titangadget (6 months ago)

    I'm using this one-line training code but it's giving me an error... can you update it?

  • @alex_p08 (7 months ago)

    hello, I am a beginner with llama. I don't know how to generate text using the fine-tuned version of it; also, this format does not work there, thx
    from llama_cpp import Llama  # assuming lcpp_llm is a llama_cpp.Llama instance
    lcpp_llm = Llama(model_path="<path-to-your-ggml-model>")

    prompt = ""
    prompt_template = f'''USER: {prompt}
    ASSISTANT:'''
    response = lcpp_llm(prompt=prompt_template, max_tokens=256, temperature=0.5,
                        top_p=0.95, repeat_penalty=1.2, top_k=150, echo=True)
    print(response["choices"][0]["text"])

  • @georgekokkinakis7288 (1 year ago, +1)

    I really love your tutorials, they are deeply informative. I was wondering about the following. Unfortunately 😔 all these LLMs are trained in English, but the world has so many other languages. If I follow the fine-tuning you described in your video, would I be able to fine-tune the Llama model on a specific dataset which has questions about mathematical definitions and methodologies with their corresponding responses written in Greek? The amount of samples is about 100 questions with answers; I know it is really small, but could this give good results for the specific dataset? And one last question: do you know any multilingual LLM which supports Greek? Thanks once more, and keep up with your excellent ❤ presentations.

    • @AymanEL-BACHA (1 year ago, +1)

      hi @georgekokkinakis7288, have you tried training with your 100 samples/questions? any improvements?

    • @georgekokkinakis7288 (1 year ago)

      @@AymanEL-BACHA No I haven't yet

  • @Yash-mk8tc (1 year ago, +1)

    can you make a video on Hugging Face basics?

  • @nexusinfosec (1 year ago, +1)

    Could you please create a video on the dataset creation?

    • @VadiyalaRR (1 year ago)

      th-cam.com/video/-ui8YKz4d-E/w-d-xo.html hope it helps you

  • @PickaxeAI (1 year ago)

    What GPU should we select to complete this training? Could the T4 handle it?

  • @vinayvelugu (11 months ago)

    This does not work on Windows. Is there any similar alternative for Windows?

  • @deepakkrishna837 (10 months ago)

    Hi, great video, thanks a lot for this. Quick question: I am building an information extractor, and the max token length of the training data is 2750, so I have set model_max_length to 3000. Do I strictly need to set block_size to 3000 as well? Please answer!

  • @pickaxe-support (1 year ago)

    Is there a link for the google colab notebook?

  • @fangxiaoyuan-fm6vr (1 year ago)

    Could you introduce how to deploy our model to a website? Thanks!

  • @白泽-x2n (8 months ago)

    Hello, I am a beginner with LLMs. I generated the model folder locally following the video, but the folder size is only about 130 MB. The base model I used is the 7B Llama 2. Is this normal? Why is the model size reduced so much? How do I get the normal-size model? I would be grateful if you could answer this for me.

  • @SadeghShahmohammadi (1 year ago)

    It took a few hours and everything went well, but at the end the model is not in my HF repository! I cannot find it anywhere!

  • @nufh (11 months ago)

    Other than Google Colab, what other platforms can we use? I'm still new; I just started to learn Python.

  • @AtharvaWeginwar (7 months ago)

    I am facing issues with the autotrain line: it's stating the argument should be project-name instead of project_name, and even if I change that, it's not taking arguments like data_path or use_peft. Can someone help me out?

  • @PromptoraApps (10 months ago)

    how to create my own dataset from PDFs?

  • @tubesarkilar (10 months ago)

    can you show a sample of a time series data file to feed into AutoTrain?

  • @MarceloLimaXP (1 year ago)

    Thanks guy ;)

  • @fpena06 (1 year ago, +1)

    What's a sharded version, and why did you go with a sharded version of the model? Thanks

    • @phoenixfire6559 (1 year ago, +6)

      Every LLM works best on a GPU because GPUs excel at parallel calculations. Loading a model into a GPU needs a set amount of VRAM; the amount depends on the parameters of the model and the precision, e.g. a 7bn-parameter Llama 2 model at float16 precision will need around 16GB of VRAM. I believe the free Colab GPU has 12GB of VRAM, so you cannot load the 7b model at fp16 precision (you could at 8-bit precision though).
      One way to get around this is to split the model into shards. This is not the same as splitting the model into 3 files: when you download a model from huggingface it is often in multiple pieces, but this is just for ease of download and fault tolerance, i.e. protection if one piece is corrupted. When loading these models into the GPU, it is done in series, so a 7b fp16 model will still take 16GB of VRAM.
      Sharding also splits a model into pieces, but it does it in a way that each piece can still talk to the others while remaining separate. In a nutshell, you are loading the pieces in parallel. Therefore, as long as you can fit the largest piece, you should be able to load the whole model. For the one in this video, I believe it is sharded into 5GB pieces. Note, sharding has some issues:
      1. A sharded and an unsharded model may behave slightly differently
      2. Sharded models will take longer to train because data has to go between multiple pieces
      3. Combining the sharded model back into an unsharded model may not yield the same results as a trained unsharded model, even using the exact same data

    • @fpena06 (1 year ago)

      @@phoenixfire6559 thank you so much for the detailed explanation 👏
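
      To make the VRAM point concrete, a sketch of loading the sharded base in 4-bit on a small GPU (bitsandbytes via transformers; the config values are illustrative, not from the video):

      import torch
      from transformers import AutoModelForCausalLM, BitsAndBytesConfig

      bnb = BitsAndBytesConfig(load_in_4bit=True,
                               bnb_4bit_compute_dtype=torch.float16)
      model = AutoModelForCausalLM.from_pretrained(
          "TinyPixel/Llama-2-7B-bf16-sharded",
          quantization_config=bnb,
          device_map="auto",  # places shards across GPU/CPU as they fit
      )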

  • @BatoolZ-q5r (3 months ago)

    do we have to add the TinyPixel model to Colab?

  • @Noscov (1 year ago)

    Thanks for the video. I have a further question: at 5:50 your dataset has the columns instruction and input. What is the input column for?

    • @immortalsun (10 months ago, +1)

      For example, a question.

  • @youwang9156 (10 months ago)

    Thank you for your video, it literally saved my life. I just have one little question about the prompt format: you were using ### Human and ### Assistant, so does this format depend on the pre-trained model's prompt format? Llama-2 chat, for example, has a certain unique format, but for something like the Llama 2 base model, if there's no specific mention of one, can we define our own format for the prompt? Do I understand it correctly? Thank you for your video again!!!!

    • @engineerprompt (9 months ago)

      Glad you found it helpful. The template depends on whether you are using the base or the chat version. For the base model, you can define your own template, as I am doing here, because there is no assistant template for it (the base model is really a next-word-prediction model). But if you are fine-tuning a chat version, then you will have to use the specific template that was used for fine-tuning that model. Hope this helps.
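
      For reference, the Llama 2 chat models were trained with their own special template, roughly like this (so chat-model fine-tunes should keep it, while base-model fine-tunes can use the ### Human / ### Assistant style from the video):

      <s>[INST] <<SYS>>
      You are a helpful assistant.
      <</SYS>>

      Who is John? [/INST]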

  • @MicaleAntonio (9 months ago)

    Does AutoTrain do multi-label text classification?

  • @sravanavvaru4473 (1 year ago)

    hey, the thing I did not get is: what data is the model actually getting trained on??

  • @bharatkaushik9916 (11 months ago)

    Can someone tell me how to run inference on this model after pushing it to the hub? Thanks

  • @nitingoswami1959 (1 year ago)

    Can we train this model on any data, or does it require some specific format? Does every LLM require specific tabular data, or can it be any raw data?

  • @bfam7110 (1 year ago)

    Are there embeddings or RAG with this approach?

  • @sanj3189 (1 year ago)

    How can I use Llama 2 for generating synthetic data?

  • @ShiftKoncepts (1 year ago)

    I am a little confused: does the Llama LLM on gpt4all have to be trained first before usage with local docs?

  • @meteor1 (10 months ago)

    Can I fine-tune llama-13b-GPTQ using autotrain-advanced?

  • @JonathanYankovich (1 year ago, +1)

    Can you train against GPTQs using this?

    • @engineerprompt (1 year ago)

      Yeah, I think you will be able to. However, remember this is not the chat version; it's the base model.

  • @contractorwolf (1 year ago)

    subscribed!

  • @efexzium (10 months ago)

    Can this do PEFT?

  • @akibulhaque8621 (9 months ago)

    If the dataset is made in my native language, will the model still be trained for that specific language?

    • @engineerprompt (9 months ago)

      You will need to make sure the tokenizer also supports the language; otherwise you will run into issues

  • @tejasingle9655 (11 months ago)

    When I try to train the model it shows this error: "ValueError: Token must be specified for push to hub"

    • @manubansal7388 (11 months ago)

      Were you able to solve this?

  • @user-wr4yl7tx3w (1 year ago)

    Why are there sharded versions of the model?

  • @carlsagan9808 (1 year ago)

    This command is not working at all. Is this happening to anyone else? I get the repeated error >> RuntimeError: CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`