Master Fine-Tuning Mistral AI Models with Official Mistral-FineTune Package

แชร์
ฝัง
  • เผยแพร่เมื่อ 1 ส.ค. 2024
  • In this video, I walk you through the official Mistral AI fine-tuning guide using their new Mistral FineTune package. This lightweight code base enables memory-efficient and high-performance fine-tuning of Mistral models. I delve into the detailed data preparation process and explain how to format your datasets correctly in JSONL format to get the best results. We'll also set up an example training run using Google Colab, download necessary models, and validate our dataset. Finally, I'll show you how to execute the training job and verify the results. If you're keen to learn about fine-tuning and LLMs, this is a must-watch. Don't forget to subscribe for more updates on training and rack systems!
    #mistral #finetuning #llm
    🦾 Discord: / discord
    ☕ Buy me a Coffee: ko-fi.com/promptengineering
    |🔴 Patreon: / promptengineering
    💼Consulting: calendly.com/engineerprompt/c...
    📧 Business Contact: engineerprompt@gmail.com
    Become Member: tinyurl.com/y5h28s6h
    💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
    Signup for Advanced RAG:
    tally.so/r/3y9bb0a
    LINKS:
    Github: github.com/mistralai/mistral-...
    Notebook: tinyurl.com/26a3yj8t
    TIMESTAMPS
    00:00 Introducing Mistral FineTune: The Ultimate Guide
    00:35 Deep Dive into Data Preparation for Fine Tuning
    03:57 Setting Up Your Fine Tuning Environment
    06:39 Data Structuring and Validation for Optimal Training
    12:05 Configuring and Running Your Fine Tuning Job
    19:42 Evaluating Training Results and Model Inference
    22:41 Final Thoughts and Recommendations
    All Interesting Videos:
    Everything LangChain: • LangChain
    Everything LLM: • Large Language Models
    Everything Midjourney: • MidJourney Tutorials
    AI Image Generation: • AI Image Generation Tu...
  • วิทยาศาสตร์และเทคโนโลยี

ความคิดเห็น • 26

  • @engineerprompt
    @engineerprompt  2 หลายเดือนก่อน

    If you are interested in learning more about how to build robust RAG applications, check out this course: prompt-s-site.thinkific.com/courses/rag

  • @mike_nyc_212
    @mike_nyc_212 2 หลายเดือนก่อน +1

    In the provided training example the first user "message" looks like it matches the "prompt". Btw, totally agree that my main frustration with fine tuning toolkits has been how to format the training data (especially for multi turn conversion)!

  • @spectre123
    @spectre123 หลายเดือนก่อน

    Thanks for this video. Can you make a video for the pretrain data corresponds to plain text data stored in the "text" key. E.g: {"text": "Text contained in document n°1"} ? and how many text we need for a good fine tuning results? thanks

  • @mohsenghafari7652
    @mohsenghafari7652 2 หลายเดือนก่อน

    thanks

  • @unclecode
    @unclecode 2 หลายเดือนก่อน +5

    Very interesting. I noticed that the library dependency isn't solely Torch and doesn't involve HuggingFace. This means they built all these transformer classes themselves. I wonder about the motivation behind this. Although we can use HuggingFace for fine-tuning, it's intriguing they optimized it for their own model.
    I'm also curious about setting the weight to 0. If it's skewed to a row, what happens if you set the weight to another number, like 0.5 or 1? Does it consider it with a multiplier?
    Another idea: let's create a simple web interface for fine-tuning. What do you think? If you agree, I can create a repo and we can work on it together. We could build a simple interface where people can upload CSV files or other formats, use validation tools, prepare data, and then queue it for fine-tuning. I think it would be very useful for many people. May be you can add it to localGPT, its nice to see localGPT has local fine-tuner too.

    • @engineerprompt
      @engineerprompt  2 หลายเดือนก่อน +2

      I agree, seems like they have a lot custom code and optimizations. I think they might be releasing a lot more and have decided to create dedicated implementation which might be able to support their new models with waiting for HF transformers to be upgrade each time their is some new innovation.
      I suspect the weight is going to be just binary (I plan to look into their code a bit deeper when I get back from the break). Seems like this is part of their data filtering for samples that are not properly formatted.
      I like your idea. I always wanted to integrate something like this in localgpt but never got around to it. One aspect which could actually be really useful is if someone could just upload raw text files and there is a model which create instruct dataset out it. I have had a few requests around it and I think this might be very useful for people who want to fine-tune their own models but don't even know where to start with their datasets. Let's discuss this when I get back towards the end of the week.

    • @unclecode
      @unclecode 2 หลายเดือนก่อน

      ​@@engineerprompt True, they have just updated their library for Codestral, try it, and please make a content for that. Regarding LocalGPT, looking forward, and will be happy if I can be a help.

  • @farazfitness
    @farazfitness 2 หลายเดือนก่อน +1

    I did everything but I'm unable to understand the last step how do I push the model and run it locally like a chat gpt interface??? Can you do a video of how to integrate a model from google colab to gpt4all

  • @pawan3133
    @pawan3133 2 หลายเดือนก่อน

    Thanks for another great video!!
    Can you please make a video or at least share the material on fine-tuning a quantized mistral v0.3 model

    • @engineerprompt
      @engineerprompt  2 หลายเดือนก่อน

      In general, you want to load the model in 4-bit. Look at my finetuning videos using unsloth.

  • @iainattwater1747
    @iainattwater1747 2 หลายเดือนก่อน +1

    Nice video - thanks. Have you tried inferencing Mistral FT with TGI? If the tokenizer contains the chat template then TGI/HG Chat UI should honor it. I'm going to try it.

    • @engineerprompt
      @engineerprompt  2 หลายเดือนก่อน +1

      Not yet, I am looking into TGI and plan to create a tutorial on it.

  • @godataprof
    @godataprof 2 หลายเดือนก่อน

    Can you do a video on function calling fine tuning?

  • @kenchang3456
    @kenchang3456 หลายเดือนก่อน

    Great detail, thank you very much. I'm interested in Mistral fine-tuning using a JSONL dataset for NER. Do you have any videos for that topic or is this video sufficient and really all I would need to do is determine what the JSONL data format should be?

    • @engineerprompt
      @engineerprompt  หลายเดือนก่อน

      I don't have specific video on that topic but I think this will be a good start. My recommendation will be to do few shots first before you start thinking about finetuning. There is a LOT you can achieve with few shots. Finetuning should be the last resort!

    • @kenchang3456
      @kenchang3456 หลายเดือนก่อน

      @@engineerprompt Thank you for responding, I appreciate it.

  • @vivekjainmaiet
    @vivekjainmaiet 19 วันที่ผ่านมา

    Most of people do not have dataset but have unstructured data. Could you make a video to train base model and then convert it to chat Model.

  • @user-gl3it3ki2u
    @user-gl3it3ki2u หลายเดือนก่อน

    How to convert model to GGUF after fine-tuned?

  • @BoHorror
    @BoHorror หลายเดือนก่อน

    If I just wanted the Model to speak in a certain way, and I have a PDF full of examples what would I need to do.

    • @engineerprompt
      @engineerprompt  หลายเดือนก่อน

      If it's just the tone, you could potentially use few shot prompting to get it working

    • @BoHorror
      @BoHorror หลายเดือนก่อน

      @@engineerprompt So just a simple example. Input would be Speak Like Jolly Roger and output would be Jolly Roger speaking

  • @azkarathore4355
    @azkarathore4355 หลายเดือนก่อน

    Can we finetune mistral for machine translation task for a low resource language

    • @engineerprompt
      @engineerprompt  หลายเดือนก่อน

      Yes, I think that can be done

    • @azkarathore4355
      @azkarathore4355 หลายเดือนก่อน

      @@engineerprompt I have some quries about it can you guide me