Fine-tuning Tiny LLM on Your Data | Sentiment Analysis with TinyLlama and LoRA on a Single GPU
- Published on Jul 21, 2024
- Full text tutorial (requires MLExpert Pro): www.mlexpert.io/bootcamp/fine-tuning-tiny-llm-on-custom-dataset
Getting bad predictions from your Tiny LLM? Learn how to fine-tune a small LLM (e.g. Phi-2, TinyLlama) and (possibly) increase your model's performance. You'll understand how to set up a dataset, model, tokenizer, and LoRA adapter. We'll train the model (TinyLlama) on a single GPU with custom data and evaluate the predictions.
AI Bootcamp (in preview): www.mlexpert.io/membership
Discord: / discord
Subscribe: bit.ly/venelin-subscribe
GitHub repository: github.com/curiousily/Get-Thi...
00:00 - Intro
00:36 - Text tutorial on MLExpert
01:01 - Why fine-tune a Tiny LLM?
04:38 - Prepare the dataset
09:46 - Model & tokenizer setup
11:32 - Token counts
12:41 - Fine-tuning with LoRA
22:13 - Training results & saving the model
24:00 - Inference with the trained model
28:05 - Evaluation
30:46 - Conclusion
Join this channel to get access to the perks and support my work:
/ @venelin_valkov
#artificialintelligence #sentimentanalysis #llm #llama2 #chatgpt #gpt4 #python #chatbot
Thank you very much for this wonderful video. Among other things, the details you give are really very useful!
Thanks for this tutorial
Hello. Thank you for this work! I don't see the jupyter notebook in the github repo.
Hi, thanks!! A question: for a model in which I have more than 2,000 PDFs, do you recommend improving the handling of vector databases? When do you recommend fine-tuning and when a vector database?
Thanks!
Can you send the notebook for this tutorial?
Such a timely tutorial! I'm working on some SLMs and need insights on fine-tuning parameters. Your video is a huge help, thanks for that! I couldn't find the Colab for this project in the repository — any chance the Colab is available? By the way, I'm one of your MLExpert members.
Here is the colab link: colab.research.google.com/github/curiousily/AI-Bootcamp/blob/master/08.llm-fine-tuning.ipynb
From the GitHub repo: github.com/curiousily/AI-Bootcamp
Thank you for watching and subscribing!
Thanks
Kind of hard to tell if this is a close match to my needs... when I can't see anything at all...
Thanks for the video, but... is that a language model?
I don't know a lot about AI, but it looks like multi-class classification. An LLM is supposed to be like ChatGPT, right?
Please explain one interesting moment: first you add a special token and then enlarge the embedding matrix to take this new token into account. At that point, the new embedding row is initialized with random values. Later you configure LoRA target modules, and the embedding layer is absent from that list. My questions: 1) When will the new embedding you just added be trained? The original model is frozen, and only the LoRA layers are trained by the trainer. 2) Why did you not add ### Title, ### Text, and ### Prediction as special tokens, instead of letting them be part of the text?
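On question 2, one reason to keep the markers as plain text is that they then tokenize into ordinary subword tokens whose embeddings already exist, so no `resize_token_embeddings()` call (and no randomly initialized rows that would need training) is required. A minimal sketch of that approach — the `format_example` helper name is illustrative, not from the video:

```python
# Hedged sketch: keep "### Title" / "### Text" / "### Prediction" as plain
# text in the prompt instead of registering them as special tokens. The
# tokenizer splits them into existing subword tokens, so the embedding
# matrix is untouched and nothing new has to be trained for the markers.

def format_example(title, text, prediction=None):
    """Build a training prompt (prediction given) or an inference prompt."""
    prompt = f"### Title\n{title}\n\n### Text\n{text}\n\n### Prediction\n"
    if prediction is not None:
        prompt += prediction  # target label, present only during training
    return prompt
```

If you do add new special tokens instead, the freshly initialized embedding rows only get updated when the embedding layer is included in the trainable modules (e.g. via PEFT's `modules_to_save`); otherwise they stay random, which is exactly the concern raised above.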
I'm getting NameError: DataCollatorForCompletionOnlyLM not found. I also checked the docs and didn't find any class named DataCollatorForCompletionOnlyLM.
from trl import DataCollatorForCompletionOnlyLM