How To Fine-tune The Llama 1 Models (GPT-3 Alternative)

  • Published 7 Sep 2024

Comments • 61

  • @Brillibits
    @Brillibits  1 year ago +7

    In this video, I just fine-tune the 7B version. However, given more resources (mainly RAM), there is no reason the larger models couldn't be fine-tuned as well. I suspect that at least the 33B model could be fine-tuned this way, but I wouldn't be surprised if 65B worked as well.
    Discord: discord.gg/F7pjXfVJwZ
    Update: Less than 24 hours after this video's release, a breaking change happened in that PR with regards to the tokenizer. It's effectively just a rename, but it breaks the AutoTokenizer. I will work to update it quickly.

    • @akissot1402
      @akissot1402 1 year ago

      How much would it cost on the cloud to fine-tune a model like Vicuna for question answering on technical documents?

  • @jameshughes3014
    @jameshughes3014 1 year ago +35

    Everyone is talking about GPT-4, but this is the exciting stuff right here. Thank you for doing this.

    • @Brillibits
      @Brillibits  1 year ago +7

      I definitely think that being able to directly use and change the weights of these powerful models is exciting.

    • @gankam
      @gankam 1 year ago +1

      Totally agree with James.

  • @frankvitetta
    @frankvitetta 1 year ago +6

    Great to hear that you found the video helpful! I noticed that you've been addressing some important questions on your blog post. Speaking of which, I have a question that I believe you could help me with. I'm curious about the instructions for training the LLaMA model with my own data. Specifically, if I have a library of law cases and want to turn my Alpaca into a robo-lawyer, what would be the process for further training the models based on the existing work? Thank you in advance for your help!

    • @Brillibits
      @Brillibits  1 year ago

      I have a video on creating a custom dataset for GPT-J. The core idea behind it is the same for all of these decoder models.
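For readers wondering what such a dataset looks like in practice, here is a minimal sketch. The filename and record layout are illustrative assumptions, not the exact format used in the GPT-J video:

```python
import json

# Purely illustrative Q&A examples; real data would come from your own corpus.
examples = [
    {"prompt": "What is the capital of France?", "completion": "Paris."},
    {"prompt": "What does CPU stand for?", "completion": "Central Processing Unit."},
]

# Decoder-only models train on plain concatenated text, so each record
# can simply be one string: the prompt followed by its completion.
with open("train.jsonl", "w") as f:
    for ex in examples:
        text = ex["prompt"] + "\n" + ex["completion"]
        f.write(json.dumps({"text": text}) + "\n")
```

The same JSONL-of-text shape works for any decoder model; only the prompt template tends to change between projects.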

    • @haveaniceday7950
      @haveaniceday7950 1 year ago

      This is a cool idea, did you do it?

  • @MachineMinds_AI
    @MachineMinds_AI 1 year ago +1

    Thank you so much for these videos, setting it up now

  • @quantumbyte-studios
    @quantumbyte-studios 1 year ago

    LLaMA's story reminds me of Robin Hood. Thanks for continuing to bring rich content to the people🏹

    • @Brillibits
      @Brillibits  1 year ago

      Thanks for watching!

  • @javideas
    @javideas 1 year ago +7

    Could you make a video about how to set up (and rent?) a GPU server with LLMs in mind?

    • @Brillibits
      @Brillibits  1 year ago +3

      That's a good idea for a video. I may do it; it's just a question of finding time.

  • @NewMateo
    @NewMateo 1 year ago +5

    Any chance you'll cover the Stanford Alpaca retrain? I think the 13B version was just released today.

    • @Brillibits
      @Brillibits  1 year ago +3

      A quick search did not yield any downloads of the model, but it does seem to exist, or recreations of it.
      You can fine-tune it, as it's just an existing fine-tune of LLaMA. Just replace the model name in the video with the path; the model weights need to be in the right format.

  • @muhammadshahzaib9122
    @muhammadshahzaib9122 1 year ago +2

    Gem of a work 👍👍

    • @Brillibits
      @Brillibits  1 year ago

      Thanks for watching!

  • @Jena598
    @Jena598 1 year ago

    Thank you very much @Brillibits. This is a very helpful video and impressive work. I have one question: I would like to fine-tune using a question-answer dataset. Is it possible to fine-tune with such a Q&A dataset? If so, can you point me to a reference document or video on how to do it? GOAL: I need to add more data (which is in plain text and which I'm converting into a Q&A dataset) to customize the model.
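One common approach to the conversion step is to wrap each Q&A pair in a fixed instruction-style template so every training example looks the same to the model. This is an illustrative sketch, not a recipe from the video, and the template wording is an arbitrary choice:

```python
# An instruction-style template; the exact wording is an arbitrary choice,
# not a format required for LLaMA fine-tuning.
TEMPLATE = "### Question:\n{q}\n\n### Answer:\n{a}"

def to_training_text(qa_pairs):
    """Turn (question, answer) tuples into uniformly formatted training strings."""
    return [TEMPLATE.format(q=q, a=a) for q, a in qa_pairs]

pairs = [("What license was LLaMA released under?",
          "A non-commercial research license.")]
print(to_training_text(pairs)[0])
```

Whatever template is used at training time should also be used verbatim at inference time, so the model sees prompts in the shape it was fine-tuned on.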

  • @sseot
    @sseot 1 year ago

    Hi! Thanks for the great video! I noticed the repository has already changed compared with the one in the video. Are there more detailed steps on how to fine-tune with the new files, starting from building the environment? I'm not sure, when building the Docker image, which is the first file I need to run: the build image.sh? And what is the update llama.sh file about?

  • @mattbuscher5401
    @mattbuscher5401 1 year ago +3

    Great videos. We are currently working on building an ML box with two Nvidia RTX 4090s. Do you think we could do the bigger LLaMA?

    • @Brillibits
      @Brillibits  1 year ago +1

      If you want to have your own server like I do, and you want to use LLaMA, make sure you have a Threadripper system and more RAM.
      That gets very expensive compared to top-of-the-line consumer hardware like mine, though, so it starts making more sense to rent.

  • @xalgiadotcom
    @xalgiadotcom 1 year ago +2

    Thank you 🙏😊 right on time

    • @Brillibits
      @Brillibits  1 year ago

      Thanks for watching!

  • @timothymaggenti717
    @timothymaggenti717 1 year ago +3

    I could not figure out the LLaMA, but it downloaded the world onto my machine. For such a small... model, it takes a ton of space. You should make a video on how to LLaMA.

  • @silvacarl
    @silvacarl 1 year ago +1

    Cool video

  • @deltavthrust
    @deltavthrust 1 year ago

    Thanks for sharing. Very good insights.

    • @Brillibits
      @Brillibits  1 year ago

      Glad it was helpful!

  • @bosquillondejenlisarmand354
    @bosquillondejenlisarmand354 1 year ago

    Thank you for this video; I have a question: can you see the weights of the model?
    If so, you could, for instance, train a new neural network composed of Llama and newly added neurons.
    Or is Llama operating as a program for which you do not have access to the NN's source code or structure?
    In that case, can you train an NN that interacts with Llama running on your own computer?

    • @Brillibits
      @Brillibits  1 year ago +1

      Yes, you could add or remove neurons.

  • @DataRockShow
    @DataRockShow 1 year ago

    How do you control your Linux machine from your Mac? If you could give me an idea, I can look up the rest on the internet. Another question: what version of Linux are you using? Thank you very much.

    • @drowningpenguin1588
      @drowningpenguin1588 1 year ago +1

      Pretty sure it's just an SSH connection to the Linux server. Plenty of good documentation online for configuring it 😊

    • @Brillibits
      @Brillibits  1 year ago

      Yes, as the other commenter said, I am using SSH. I am using Ubuntu 20.04.

  • @yuantian8408
    @yuantian8408 1 year ago

    Hi, why can you directly use the path on HuggingFace? Is that because of DeepSpeed?

    • @Brillibits
      @Brillibits  1 year ago +1

      Because the model is on the HuggingFace Hub.

  • @maxencelaurent4885
    @maxencelaurent4885 1 year ago

    Is it possible to train the model with, let's say, some documentation?
    So that I can just ask it to help me resolve a problem, and it can answer fast?

    • @Brillibits
      @Brillibits  1 year ago

      Yes, that would be possible.

  • @eyemazed
    @eyemazed 1 year ago

    How do these LLaMA models perform with foreign-language queries? Are there any stats on this?

    • @Brillibits
      @Brillibits  1 year ago

      Not sure. Worst case, you could fine-tune on a foreign language and it would do fine.

  • @spotterinc.engineering5207
    @spotterinc.engineering5207 1 year ago

    Would this process work for WizardLM 13B?

    • @Brillibits
      @Brillibits  1 year ago

      If that is a LLaMA-based model, most likely yes.

  • @hintzod
    @hintzod 1 year ago

    Is there a way to fine-tune the model to increase the context size?

    • @Brillibits
      @Brillibits  1 year ago +1

      Outside of 2048? Not that I am aware of.
      Having a larger context window is the greatest advantage OpenAI has, in my opinion.

  • @Alvee_AI
    @Alvee_AI 1 year ago

    Do we need to download the weights separately to run this?

    • @Brillibits
      @Brillibits  1 year ago

      At the time I made the video, the weights were on HuggingFace and were automatically downloaded. If that changes, then yes, you would need to download and convert them.

  • @nupersu6307
    @nupersu6307 1 year ago

    What is the minimum VRAM needed to run the smallest LLaMA model?

    • @Brillibits
      @Brillibits  1 year ago +2

      With a modern GPU using bitsandbytes int8, we can get down to around 7GB. Then we need some headroom to query it, so probably around 9-10GB. For float16 or bfloat16, probably around 16-17GB.
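Those figures line up with simple arithmetic on the parameter count. A back-of-the-envelope sketch (weights only; activations and the KV cache are ignored, which is why real usage needs a few extra GB of headroom):

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

params_7b = 7e9  # approximate parameter count of the 7B model

print(f"int8 weights:    {weight_memory_gb(params_7b, 1):.0f} GB")  # 7 GB
print(f"float16 weights: {weight_memory_gb(params_7b, 2):.0f} GB")  # 14 GB
```

The same calculation scales linearly: 13B at float16 is roughly 26 GB of weights before any inference overhead.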

    • @TheRaretunes
      @TheRaretunes 1 year ago

      @@Brillibits I managed to load the 7B one at 4-bit precision on a 6700 XT on Linux.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 year ago

    Can we fine tune it on Colab, using the GPU?

    • @Brillibits
      @Brillibits  1 year ago

      Not enough RAM, so no.

  • @f3lixadam
    @f3lixadam 1 year ago +1

    Hello Blake, I think you said in a video before that you are also open for work.
    My company and partners would be interested. How can we connect, if you are still interested?

    • @Brillibits
      @Brillibits  1 year ago

      Thanks for watching. If you are interested in working with me, the best way to contact me is email. There are also ways to contact me through Discord and other means, but I check email the most. My email is under the About section of the channel.

  • @kc-jm3cd
    @kc-jm3cd 1 year ago +1

    Is it possible to run it on multiple computers at home at once using shared GPUs and RAM?

    • @Brillibits
      @Brillibits  1 year ago

      For training, it could be done through distributed training. With multiple GPUs on the same device, you can also host a model.

  • @khrissxander
    @khrissxander 1 year ago +1

    Is this really 100% on the local computer, or just an API?

    • @Brillibits
      @Brillibits  1 year ago

      Yes. My computer is better than most, but it does train the model, just slowly. If I had more RAM, it would be more feasible. I recommend renting an A100 server.

    • @khrissxander
      @khrissxander 1 year ago +1

      @@Brillibits Wow, huh. But what about all the training datasets? Wouldn't those be a few terabytes? So what gives?

    • @Brillibits
      @Brillibits  1 year ago +1

      @@khrissxander It is not possible to pretrain the model on consumer hardware, but you can fine-tune it. Pretraining takes thousands of GPU-hours and millions of dollars.

    • @khrissxander
      @khrissxander 1 year ago +1

      @@Brillibits I actually understood that. What I'm trying to understand is how the model can generate anything without being attached to its training data. Is the reality that all it needs is its pre-trained weights to do all the new generating? Does it not scan its data when generating a response?
      If you don't have the training data, you must have the pre-trained weights;
      if you have neither,
      then you are only accessing the AI through an API or some online method.

    • @elifnur4955
      @elifnur4955 1 year ago +1

      @@khrissxander As far as I understand, you do have the pre-trained weights, and as you fine-tune, the weights get adjusted (so the original weights do not disappear completely).