Nice. Can we do this without uploading anything to Ollama or Hugging Face? I mean, like, offline fine-tuning?
When someone does this all day every day, they can call it EASY. My first time tuning a model took me about a month... and that was about 7 months ago! I don't remember doing any of what you did here back then... I'm not even near the "E" of EASY... so I did what most folks here do: I followed your instructions. Thanks for the vid.
Thanks Mervin. Just did my first fine-tuning!! Colab stopped much earlier, as expected, and Win11 didn't work, but I ran it all again today in WSL2 on my laptop, which worked like a charm.
Thank you, fine sir. I have been watching many videos and THIS IS THE BEST. Training my first Llama 3.1 model successfully now! And I finally understand how to format the data for training!
Hello dude, when I try to run the code in the terminal, I get a subprocess error. I entered the same code shown at 2:34.
@@Krish-lx1jv There is critical information left out. If you are on Windows and not Linux, Unsloth will not work. You would have to install a special Triton package, but it is error-prone. I ended up using a different method without Unsloth and it works.
@@blackswann9555 Please share the method you used.
@@blackswann9555 Hi, which method did you use without Unsloth? Do you have any tutorial or links for that? If you do, please share. I can't use Unsloth on Windows and I am suffering :(
Best finetuning tutorial
Man, you explained everything so so well!
Fantastic detailed tutorial Mervin! Absolutely love this!
It seems we've got different definitions of the word easy.
Hahaha, trust me, this is considered very easy in the realm of coding fine-tuning!
He meant that he spent 20x the time, and it was easy to edit the footage afterwards so it appears effortless.
😂😂😂😂
Maybe I'm off here, but is there a way to just use Llama 3.1 and upload your files to it somehow, or do you have to go through this whole process? Plus I don't want my private data on Hugging Face.
Great tutorial mate!
Thanks for this tutorial! I usually use Unsloth but their Ollama notebook was more advanced so having the video is very helpful.
You explained it amazingly, thanks a lot sir. Let me know: if I fine-tune for another language, will it work?
I watched one of your Florence-2 videos a couple weeks ago and was very impressed by your workflows. Now with Llama 3.1, you can get even better vision (at least for the 8B parameter model). The model I came across was Llama-3.1-Unhinged-Vision-8B by FiditeNemini. It pairs very nicely with mradermacher's Dark Idol 3.1 Instruct models, surely it would work with several other finetunes. Perhaps someone might have done or will do vision projector models for the Llama-3.1 70B and 405B models.
It is super clear to understand and apply to my use case. Thank you so much!!
Is fine-tuning the best way to give data to a model? If the information is updated frequently, like documentation etc., I don't think fine-tuning is the best way. That would be RAG, now that long context is available for Llama 3.1.
I have always considered fine-tuning a model to change "behaviour" or provide static data, like teaching other languages or uncensoring, and RAG to give it my own data.
Do both
@@j0hnc0nn0r-sec Yes, agreed. Try doing both for better responses: fine-tuning + RAG.
How do you choose between fine-tuning and RAG?
OOOOO! SO CLOSE! Great video :) This ALMOST worked... but failed with the error "xFormers wasn't built with CUDA support / your GPU has capability (7, 5) (too old)". I'm running this on an AWS EC2 G4dn.xlarge (16GB VRAM). Gonna try again with TorchTune instead. Wish me luck!
All the best
Super awesome tutorial! Many thanks, Mervin!
A silly question maybe: what if I have to update the model? Can I push the model again under the same name? And how do I define the parameters?
Is it possible to do unsupervised learning by first giving the model a large corpus of domain-specific data to make it context-aware, and then use supervised fine-tuning?
Hi! Awesome video. I didn't understand the input format: what's the difference between "instruction" and "input"? Thanks for your time!
The instruction is the thing you want the model to do. For example, in a medical chatbot an instruction might be: "Please see my report and tell me what I am suffering from." The input holds the context for the instruction; in this case, the input would contain the report itself.
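So one training row in the Alpaca format might look like this (the values here are illustrative, not from the video):
{"instruction": "Please see my report and tell me what I am suffering from.", "input": "Blood test: haemoglobin 9.1 g/dL, ferritin 8 ng/mL.", "output": "The low haemoglobin and ferritin suggest iron-deficiency anaemia; please confirm with your doctor."}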
Where the heck did you get those 4 A6000s? I only have 1 RTX 4090 😃 What I've heard is that 24GB VRAM isn't enough, right? How long did the training run and what were the costs? Anyway, great video, thanks!
Do you have 4x A6000 on your local machine? I have an RTX 4090. I use it for fine-tuning computer vision models, and I've fine-tuned and run some smaller LLMs.
Yes, I have 4x A6000 in the cloud
I bought from Massed Compute: bit.ly/mervin-praison
Coupon: MervinPraison (50% Discount)
My system configuration is an i5 processor and 8GB RAM. Is it sufficient? It is lagging.
Brother, you are becoming the guy with the coolest nickname among me and my friends, like, "Hey did you watch The Amazing Guy's new video?"
Can I use this code on my local machine, or is it just for cloud computing?
Are we able to fine-tune a model that is already available in Ollama?
02:10 Thanks for sharing your machine specs.
Can you please tell us how this can keep company data secure? We are saving our model to Ollama to get the end results.
Training a local model means it's as secure as the regular corporate network it's on.
Unless you end up making it accessible through the internet to other parties, it should not be accessible by them.
That's simple: don't save it in Ollama! Keep it private on HF.
@@unclecode Keeping it private on HF does not imply that the data is not on their server... This needs to run completely locally if possible. Any idea? Thanks
Hello sir, can you tell me how to fine-tune and deploy Llama 3 models on Amazon SageMaker using notebooks?
How do I add Llama 3.1 to a Laravel PHP website?
Create a video on this topic. Please 🙏🙏🙏
Could you maybe make your face cam a bit smaller when the code is shown? Right now the code is behind your face (which is OK to show!).
Excellent, thank you so much!
Is it possible to fine-tune using an online news article dataset in a regional language, to train Llama 3.1 to respond in that regional language?
Great video Mervin.
I have one simple question: can I change the Alpaca prompt language to something besides English, let's say French, if I use a French dataset? Does it work like that?
Yes it should work
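The Alpaca template is just a Python string, so you can translate it. A rough French sketch (the wording is my own, not from the video):

alpaca_prompt = """Voici une instruction décrivant une tâche, accompagnée d'une entrée fournissant plus de contexte. Rédigez une réponse qui complète la demande de manière appropriée.

### Instruction:
{}

### Entrée:
{}

### Réponse:
{}"""

Just keep the dataset and the template in the same language.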
In this video you showed how to train using the terminal. Can we train it on Google Colab and then upload it?
You can. You can find it immediately by googling "google colab unsloth fine tuning"; the answer is at the top.
Really cool, thank you.
I have fine-tuned Llama 3.1 more than 10 times with the Alpaca format using Unsloth. When it comes to deployment and testing, Unsloth models are really bad; they don't have any standard documentation for deployment. My personal suggestion: go with standard-format fine-tuning instead of the Alpaca format.
Can you please provide more details on my Discord?
I would just like to analyse the results and understand why it is not performing better.
@@MervinPraison They recently updated the script; please check the way the data is passed into the Llama 3.1 model in Unsloth.
Hi, I have a question if you don't mind. If I plan to use my fine-tuned model with Ollama, but keep it private at the same time (not publicly available in the Ollama models list), is that possible? I want to integrate it, so running it locally won't work for me.
good tutorial
What if I'm using the same prompt for the same type of data generation across the whole dataset? Will that affect training, or will it fine-tune nicely? Also, I have 2000 data rows; how many epochs should I run?
Hello Mervin, I find that Llama 3.1 8B is not great at calculation. Can I fine-tune it for that?
How small can a custom dataset be? And is there an automated way to create a dataset, e.g. use an existing LLM to understand the dynamic input and generate the dataset based on that?
It can be 1 row if you like, but look into PEFT techniques, because fine-tuning on a small dataset can lead to overfitting.
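For example, a minimal LoRA setup with the peft library looks roughly like this (the hyperparameter values are illustrative, not a recommendation):

from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16,                                 # rank of the adapter matrices
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # which layers get adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, config)  # base_model: your loaded HF model

Only the small adapter matrices get trained, which reduces the risk of wrecking the base model on a tiny dataset.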
Hey, I already have a dataset and tokenizer in JSON format for the Georgian language. I tried to fine-tune Mistral, but the model failed to deliver reasonable text. I was training it in Paperspace but did not like their service that much. So now, I want to know what's the best 8B or 7B small model that can learn a foreign language like Georgian with one GPU. Also, what are the easy ways to do this task? I know it's actually a very hard task, but I want some advice.
Generally, not all LLMs support every language; it depends on the tokenizer they use.
Gemma is one of the models that supports many languages, though not all. Try fine-tuning Gemma with Georgian.
Hopefully in the near future there will be models that support all languages. Also try this with Llama 3.1.
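A quick way to gauge tokenizer coverage is to tokenize a Georgian sentence and count the tokens; fewer tokens per word usually means better support. A rough sketch (the model id is just an example and may be gated on HF):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-7b")
tokens = tok.tokenize("გამარჯობა, როგორ ხარ?")  # "Hello, how are you?" in Georgian
print(len(tokens), tokens)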
I'm new to training LLMs. Can I use my own data, i.e. scraped data, for training? If so, how / what should I research?
You can use any data. Just make sure you format it as a CSV or JSON file with input/output columns like you see in the vids. Load it in your code with pandas, or upload it directly to a Hugging Face repo, and start training with the datasets library.
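A minimal sketch of the pandas route (the file name and column names are just examples):

import pandas as pd
from datasets import Dataset

df = pd.read_csv("my_data.csv")    # columns: instruction, input, output
dataset = Dataset.from_pandas(df)  # ready to pass to the trainer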
Open Interpreter + Groq + Llama 3.1 + n8n + Gorilla AI = lightning-speed, 100% autonomous agent that automates all workflows with a simple prompt, all open source and free, with access to over 1600 APIs.
thx for sharing
You are testing the fine-tuned model with the same data used for training. That does not show that the model is working. You don't even need a model to do that, as you already have the data.
Nice video. Can I ask a question? If I just want to have it locally and merged, how do I do it?
model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit")
Is that correct ?
Yes
Why don't u use RAG?
How can I do this in the cloud?
Hi Mervin. I am trying to sign up for Massed Compute, but the coupon code is not recognised. I'm getting this message: "Coupon code is not valid for this GPU Type and/or Quantity." Could you tell me where this code can be applied?
I will check and get back to you soon
@kannansingaravelu Please try A6000 or A5000 GPUs.
Those are the ones with the 50% discount for now.
52 Easy Steps
What Python app is this?
All data stays local? And how long did it take you?
Where else can it go when you're using local models?
@@Leto2ndAtreides Hugging Face, for instance…
It took approx 15 mins for me. But it varies based on the computer spec, the model, the dataset and also the training configuration you are using.
Why use all of those Alpaca questions and answers if you want to train your model in a different way?
I want to train it on specific hardware documentation, let's say Arduino ESP32. Will this help it generate better code for that hardware?
You can also use RAG for this type of task. Put your docs in a vector database and let your model query it; then you're sure it won't hallucinate, and you can keep adding the most up-to-date documentation without tampering with the model's training.
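A minimal sketch of that idea with ChromaDB (the collection name and documents are just examples):

import chromadb

client = chromadb.Client()
collection = client.create_collection("esp32_docs")
collection.add(
    documents=["The ESP32 has two Xtensa cores.", "GPIO pins can be configured as input or output."],
    ids=["doc1", "doc2"],
)
results = collection.query(query_texts=["How many cores does the ESP32 have?"], n_results=1)
# stuff results["documents"] into the model's prompt as context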
Does Unsloth only support GPUs?
Why are we not using the model already available in Ollama? Why are we taking the base model from Hugging Face instead?
Can't load the code link for the life of me: 502 Bad Gateway.
Is a 4090 enough to train like you did?
That's more than enough
Stop flexing bro, I know you are being sarcastic 😂😊
Is it possible to run this on a MacBook M2 Air?
I will try on an M3 Air, 16GB, and let you know; otherwise use a VM.
Absolutely you can, especially 8B models, using Ollama.
Yes you can. Try MLX: th-cam.com/video/sI1uKhagm7c/w-d-xo.html
This was good, but I feel like you ran through everything too fast.
How do I fix "Error: no slots available after 10 retries"?
LOL .. did you just say "as simple as that" ?? ^^
There is a problem with all of your videos: you never say why!
Do you want me to explain "why" we fine-tune?
@@MervinPraison No, I want you to explain why we should include all of those libraries and other code: we don't know what they are doing or why we should use them.
Unsloth doesn't support Mac. Thank you, goodbye.
I struggled like hell to run it on Windows. Are they using Linux?
Can we email you?