Subscribed just after watching one video. Thanks to the creator. Please keep posting videos related to GenAI.
Saving this; will watch later when I'm better at ML and Python.
Regarding outputs = model(input_ids=inputs, labels=targets): is there a reason you used the same tensor for both inputs and targets? Shouldn't targets be shifted one place after inputs?
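For context, a minimal sketch of why this works: Hugging Face causal-LM heads shift the labels internally, so passing the same tensor for input_ids and labels is the standard pattern (distilgpt2 here is just an example checkpoint, not necessarily the exact one from the video):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Inside the model, logits[..., :-1, :] are scored against labels[..., 1:],
# i.e. the one-position shift happens for you. No manual shift is needed.
tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
model = GPT2LMHeadModel.from_pretrained("distilgpt2")

inputs = tokenizer("Fever, cough, sore throat", return_tensors="pt").input_ids
outputs = model(input_ids=inputs, labels=inputs)  # same tensor on purpose
print(outputs.loss)
```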
What extensions are you using for code autocomplete in Colab?
Can you provide the inference notebook for using the created model for inference?
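A minimal inference sketch, assuming the whole model object was saved with torch.save(model, "SmallMedLM.pt") as in the video (the prompt and generation settings are placeholders):

```python
import torch
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
model = torch.load("SmallMedLM.pt", map_location="cpu")  # assumes torch.save(model) was used
model.eval()

ids = tokenizer.encode("Migraine", return_tensors="pt")  # placeholder prompt
with torch.no_grad():
    out = model.generate(ids, max_length=60, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```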
Awesome tutorial ❤
Can you please make a tutorial on how to fine-tune a model, especially on textual and image data?
For anyone trying to skip the noise, 6:14
Why are inputs and targets the same?
Oh wow .... Thank you for this tutorial ❤
Glad you like it!
@AIAnytime I have 35 columns; if I need to get all the information based on an ID, how would I train it? Please explain.
I don't think it's a Transformer architecture. Which architecture did you use?
What are the differences between encode and encode_plus?
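In short: encode returns only the list of token ids, while encode_plus returns a dict with input_ids plus attention_mask (and token_type_ids for some tokenizers). A quick illustration, not from the video:

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")

print(tokenizer.encode("fever and cough"))       # a plain list of token ids
print(tokenizer.encode_plus("fever and cough"))  # dict: input_ids + attention_mask
```

In recent transformers versions, calling tokenizer(text) directly is the preferred API and covers the encode_plus behavior.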
How can I load my own dataset?
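Two common options, sketched here with a placeholder file name:

```python
import pandas as pd
from datasets import Dataset, load_dataset

# Option 1: a local file via pandas ("my_data.csv" is a placeholder)
df = pd.read_csv("my_data.csv")
ds = Dataset.from_pandas(df)

# Option 2: any Hugging Face Hub dataset, like the one used in the video
hub_ds = load_dataset("QuyenAnhDE/Diseases_Symptoms")
```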
I have a classification task where I have a paragraph of text and the model classifies labels for the paragraph. Can I use a similar approach to tune it? Also, would Llama be a better choice?
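For label prediction, an encoder with a classification head is usually a more natural fit than a generative model. A minimal sketch, where the checkpoint, label count, and example label are all placeholders:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=3  # set to your number of labels
)

batch = tokenizer("A paragraph of text to classify.", return_tensors="pt")
labels = torch.tensor([1])            # placeholder gold label
out = model(**batch, labels=labels)   # out.loss for training, out.logits for prediction
```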
Sorry, I have a beginner question.
1. What is the difference between using the data in a DataFrame and using it by creating and loading a model as shown here?
2. Is the 'SmallMedLM.pt' model created in the example simply a vector database of the "QuyenAnhDE/Diseases_Symptoms" data?
3. Or do you mean that 'SmallMedLM.pt' is a GPT-2-based LLM fine-tuned on symptoms matched to diseases?
4. So, does this mean the trained 'SmallMedLM.pt' can be used when creating a chatbot related to a specific disease or symptom?
I have 35 columns; if I need to get all the information based on an ID, how would I train it? Please explain.
Please provide the code to push this model to Hugging Face too, sir.
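A hedged sketch of the push step: the repo id is a placeholder and the base checkpoint stands in for the fine-tuned model; authenticate once with `huggingface-cli login` before pushing.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# "distilgpt2" is a stand-in for your fine-tuned model object.
tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
model = GPT2LMHeadModel.from_pretrained("distilgpt2")

model.push_to_hub("your-username/SmallMedLM")      # placeholder repo id
tokenizer.push_to_hub("your-username/SmallMedLM")  # push the tokenizer alongside
```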
Need a video or instructions for integrating the model into a mobile application.
Excellent information 🎉
Glad it was helpful!
My use case: the input will be a product keyword and the output should be the product category it belongs to (example: input: white shirt, output: apparel). Which model would be suitable for this? Is distilgpt2 good, or do you recommend other models from the text-generation section, or should I check models from another section, like RoBERTa or DistilBERT?
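One alternative worth trying before any fine-tuning (a sketch; the candidate label list is an assumption, swap in your real categories): a zero-shot classification pipeline maps a keyword onto your own label set directly.

```python
from transformers import pipeline

# Scores "white shirt" against each candidate label; no training needed.
clf = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
print(clf("white shirt", candidate_labels=["apparel", "electronics", "groceries"]))
```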
Thanks for sharing!
Thanks for watching!
Please tell me if you have any training on how to train and deploy a model given a dataset. I am an experienced developer who wants to learn machine learning, AI, and related stuff 😊😊
Buddies, it runs on Colab, but when I try it in my Windows VS Code environment I get the following error: RuntimeError: PyTorch is not linked with support for mps devices. Anybody else getting the same, and how did you fix it?
Got it now; Gemini told me MPS is just for Apple 🤡
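Right: MPS is Apple's Metal backend and only exists on Apple Silicon builds of PyTorch. A portable device pick that falls back cleanly on Windows/Linux (a sketch):

```python
import torch

# Prefer CUDA, then MPS (macOS only), then CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(device)  # then move model and tensors with .to(device)
```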
Can I train this model on a movies dataset with columns Name, Synopsis, and Genres, and ask the model to recommend similar kinds of movies?
Absolutely, you should do it. Try a T5 model instead of distilGPT2.
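For reference, the T5 setup differs from the GPT-2 one in this video: T5 is an encoder-decoder, so the input and target are separate sequences rather than labels == input_ids. A sketch with placeholder text:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Input and target are tokenized separately (both strings are placeholders).
inputs = tokenizer("recommend a movie like: The Matrix", return_tensors="pt")
targets = tokenizer("Inception", return_tensors="pt").input_ids
loss = model(input_ids=inputs.input_ids, labels=targets).loss  # one training-step loss
```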
The steps presented for training the model are incredibly valuable, thanks for your guidance. Would there be any notable differences in the results if we were to utilize a LaMini model instead of GPT2?
In my experiments, T5 was better. I am working on a video for a similar use case; it's coming shortly.
Are there any metrics to check the performance of LLMs?
Yes, look at my Evaluation of LLMs and RAGs video. It's detailed.
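Besides that video, one simple intrinsic metric you can compute yourself is perplexity, the exponential of the cross-entropy loss on held-out text (lower is better). A sketch where the model and sentence are placeholders for your own:

```python
import math
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("distilgpt2")
model = GPT2LMHeadModel.from_pretrained("distilgpt2").eval()

ids = tokenizer("Fever, cough and sore throat are common flu symptoms.",
                return_tensors="pt").input_ids
with torch.no_grad():
    loss = model(input_ids=ids, labels=ids).loss  # cross-entropy on held-out text
print("perplexity:", math.exp(loss.item()))
```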
Convert it to GGML format so we can use it on CPU.
I wouldn't recommend converting this to GGUF, as the model is already very small (around 380 MB); compressing it further will degrade the performance. But if the data quality is better and other LMs like T5 and LaMini are used, then we can try GGUF. The current one runs smoothly on CPU anyway.
@AIAnytime Agreed. Any model within 1 GB is very much within the "small" model range.
Why didn't you do this with GPT-3.5 instead?
If you don't want to expose your data to OpenAI.
1. GPT-3.5 is closed source; you need to pay. 2. Data protection and privacy. 3. Inference token costs.
@AIAnytime So you don't pay Hugging Face?
Wow, it is just awesome! I have 36,000 docs and was using LaMini from your video, without internet. How can I build a model using those docs so the fetching time will be low?
Much needed
Love it!!!
Why did Manchester United bring a map to the game against West Ham?
MAN U 💔
❤️😜