Navigate to key moments👇
made via tubestamp.com
0:03 - Understanding the concept of fine-tuning models
1:44 - Importance of providing example inputs and outputs for fine-tuning
4:09 - Importance of having 10 different data points for fine-tuning
6:30 - Minimum of 10 example data points required for training
9:46 - Training cost estimation and model success
11:32 - Leveraging fine-tuned models in OpenAI playground
Can you explain what you meant at the end of the video about trying JSON or JavaScript before fine-tuning if prompting didn't work? How exactly do you try JSON and JavaScript, and what did you mean?
Try using code to refine the outputs before resorting to a fine-tuned model. For example, if your outputs keep coming back wrapped in stray quotation marks (""), write a few lines of code that strip them from the output rather than updating the underlying prompt or model.
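A minimal sketch of that idea in Python, assuming the unwanted characters are stray quotation marks wrapping the reply (the function name and example text are made up):

```python
def clean_output(text: str) -> str:
    """Strip stray wrapping quotation marks and surrounding whitespace from a model reply."""
    cleaned = text.strip()
    # If the whole reply is wrapped in matching quotes, drop them.
    if len(cleaned) >= 2 and cleaned[0] == cleaned[-1] and cleaned[0] in ('"', "'"):
        cleaned = cleaned[1:-1].strip()
    return cleaned

print(clean_output('"Here is the summary you asked for."'))
# -> Here is the summary you asked for.
```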
You're a LEGEND. Can't tell you how helpful this was
Awesome video!
I would like to see more content on how you can improve the model and make it even more powerful for clients))
It would be awesome if you could then use the fine-tuned model via other interfaces such as Zapier!
Nice to see some more technical vids, thx.
I get fine-tuning a smaller model to complete a task more cheaply, but I feel like you'd need a really niche task to want to fine-tune ChatGPT.
I'm not tuned in, though; I'm sure people have lots of use cases.
It's less about pricing and more about controlling the output. If you're trying to make a product to sell, you want some aspect of control over the output. Do you want OpenAI linking to competitors? Do you want it to get confused by bad information online? Fine-tuning, in my mind, can often be used to get ahead of those issues and almost set "baselines".
I can speak to a use case for it after battling with developing an auto-grader for a bit: GPT base models just don't have the level of accuracy and consistency needed for certain tasks.
Very helpful and valuable content. Thank you, Corbin!
Glad it was helpful!
More fine-tuning videos like this, please.
TYSM!! I could not find another vid that was actually updated! thx
Cheers!
Nice. Has there been a decrease in costs for using a fine-tuned ChatGPT model? It was unreasonably expensive a few months ago.
The cost is sitting at around $3.00 per 1M input tokens.
That is why I would say to first try creating an effective prompt with the base model, or formatting the outputs in code, before choosing to fine-tune a ChatGPT model like we did in this video.
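As a rough illustration of that pricing, here is a quick back-of-the-envelope estimate in Python; the $3.00 per 1M input tokens rate is the figure quoted above, while the output rate and request sizes are invented assumptions:

```python
# Rough cost estimate for calling a fine-tuned model.
INPUT_RATE_PER_M = 3.00    # USD per 1M input tokens (figure quoted above)
OUTPUT_RATE_PER_M = 6.00   # USD per 1M output tokens (invented assumption)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_RATE_PER_M

# Example: 500 requests, each around 800 input and 200 output tokens.
print(f"${estimate_cost(500 * 800, 500 * 200):.2f}")  # -> $1.80
```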
Thank you for all the content so far. I work in a smaller company and want to be the go-to person for AI, specifically the area of AI you make your videos on. What would you suggest working on first? You should do some lessons, even if those 'courses' are behind a paywall; I want to learn everything.
My suggestion for anyone trying to leverage AI to its maximum capability is to get comfortable with automation platforms. Using platforms like Make or Zapier allows us to access external software and complete tasks automatically.
Check out the playlist on my channel; I have more than 100 videos dedicated to this topic.
Hello, thank you for the video. I would be very grateful if you could answer this question! I have various data from a transport company about origin, destination, and description of the load, prices charged, and other data that the company has collected. I want to create an OpenAI assistant that helps me calculate the cost of transport services. However, from what I have seen, fine-tuning is a good option for teaching the model to answer questions in a certain style and things like that, but not so much for giving it a broader dataset so that it can make calculations or have broader context from this new information. Is this true? What should I do in my case, where I want it to take this data and improve the calculation of transport rates?
Did you figure that out?
@@tismine What I'm trying to do is take the data, convert it into embeddings, and pass it to file search as a CSV. I still don't know if it will work, but it seems to me that it may.
How are you collecting the "various data from a transport company": in Excel or as a PDF file?
LLMs are language models and aren't good with calculations, unfortunately. What ChatGPT would be good at in your case is finding the data that is in your documents (price, description, origin, destination); you could then programmatically make the calculation based on the price the LLM collected from the source.
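A minimal sketch of that division of labor, assuming the OpenAI Python SDK; the model choice, JSON field names, and per-kilometer pricing rule are illustrative assumptions, not anything from the video:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def extract_shipment(record: str) -> dict:
    """Use the LLM only to pull structured fields out of the free-text record."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Extract origin, destination, load_description and "
                                          "distance_km from the text. Reply with JSON only."},
            {"role": "user", "content": record},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

def quote_price(shipment: dict, rate_per_km: float = 1.35) -> float:
    """Do the arithmetic in code, not in the LLM (hypothetical pricing rule)."""
    return round(shipment["distance_km"] * rate_per_km, 2)
```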
What did you mean at the end, around 11:37, by using JSON or JavaScript to format the output??
an example video of this would be great!
So my understanding is that fine-tuning can be used to create chatbots, aka customer service chatbots, for companies. Is there a way you can then get the bot to be called on a website!? Like an API call?
Is it possible to fix the output space to only return certain strings? I have tried implementing a RAG system, but it provides too many possible strings and the LLM does not perform well, so I would like to pre-bake this into the LLM if possible.
Just want to ask if it costs money to run these, or is there some free limit so people can test this?
this is really well explained
Can I do it with, say, an input size of 3,000 tokens and an output size of 1,000 tokens or so?
I'm aware there are different methods:
Chatbot Development Strategies
Large Language Model Fine-Tuning
When considering fine-tuning, you have several platforms to explore:
- OpenAI's GPT models
- Anthropic's Claude models
- Google's PaLM/Gemini models
- Open-source models like Llama
Is OpenAI the best?
I have a big database of categories and special code names. Is fine-tuning right for me? I was thinking of training the chatbot to learn my database rather than giving it the database as data; I think that will be more costly.
It would be more costly; I would approach it by attempting both options, seeing which gives the best results for the cost, and making a decision there. I lean towards untrained models.
I have a similar use case, and fine-tuning provides low-quality results. The trained model does not use the complete training data in its answers, plus it hallucinates a lot.
Thank you, I now have a little direction for fine-tuning!
Great!
Thanks so much. When I fine-tune Gemini 1.5 Flash I get a model with only a 16k-token context. Do you know if there is a way to retain the 1M context window? Why did they do it this way? I guess it's a distilled fine-tuned model?
What will the model ID look like if the suffix is null? I have not defined any kind of suffix, and when I hit the completions API I get a 404.
Thank you so much for this video! It was so helpful
I have a question. I followed all the steps and successfully fine-tuned my model. However, I don't see my fine-tuned model in the playground. The only options available are GPT-4, GPT-3.5 Turbo-1106, GPT-3.5 Turbo, and GPT-3.5 Turbo-16k. Why are other models, including my fine-tuned one, not appearing?
Found the answer: I just hadn't given access to those models in the project settings, under Limits.
very helpful!!
Hello, is there any way I can disable the moderation or profanity filtering while fine-tuning my model?
Thanks. Have you collected interesting examples for fine-tuning somewhere? I see why you would fine-tune, but I'm missing interesting business cases. Also, how do you add your face to your videos AND cut it out from the background?
Why is the fine-tuned model not showing up in the assistant GUI?
It's nice; only the file upload step of the documentation is where I'm swimming right now, but I'm sure I will get this.
To answer your question about why I'm fine-tuning a model: the standard 4o-mini model isn't strong enough to reliably separate languages from one another, and I'm hoping I can overcome that problem for a multilanguage podcast script generator.
Weird, I don't have the fine-tuning option in the left menu (Plus subscription).
How do you fine-tune it to write a book in a certain style for long content? Thx
Your best approach would be developing a function that handles this in VSC or, alternatively, using automation software as shown here: th-cam.com/video/8zO0kJXmSG0/w-d-xo.html
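As a rough idea of what such a function could look like, here is a Python sketch that generates a long piece chapter by chapter so each call stays within the model's output limits; the model name, outline, and context-trimming trick are placeholders, not the approach from the linked video:

```python
from openai import OpenAI

client = OpenAI()

def write_book(outline: list[str], style_prompt: str, model: str = "gpt-4o-mini") -> str:
    """Generate long-form content one chapter at a time (placeholder model name)."""
    chapters = []
    for title in outline:
        context = "".join(chapters)[-4000:]  # naive trick: pass only the most recent text
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": style_prompt},
                {"role": "user", "content": f"Write the chapter titled '{title}'. "
                                            f"The book so far:\n{context}"},
            ],
        )
        chapters.append(response.choices[0].message.content + "\n\n")
    return "".join(chapters)
```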
Amazing video. There is just something that I don't understand: what to put in the "content" fields in the fine-tuning file.
You're the man!
very helpful. thanks
Hi Corbin!! Your videos are so helpful. I would love a tutorial on integrating these fine-tuned OpenAI models easily into an Action in a custom GPT. I have been struggling with getting the ChatGPT interface to always send inputs (in JSON format) to my specific fine-tuned model and output the results (also in JSON format). The model, which is actually a GPT-4 fine-tune, works seamlessly in the Playground, though. Just wondering if you've tried this or plan to.
Thanks for the suggestion. I will look into adding this to my content agenda!
thanks👍
Hi, I think your video is great, but I have a question. I have already created assistants using OpenAI's assistant tools, but how do I apply this fine-tuning to an assistant that I've previously created? How would it be done in that case? Note, I'm talking about the assistants (not the agents) that appear in OpenAI's admin panel.
When creating an assistant within the OpenAI dashboard, we do not yet have the capability to use our fine-tuned model as the base model.
Fine-tuned models are used more in the context of calling the new endpoint to achieve better outputs within our software.
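For anyone wondering what that looks like in practice, here is a minimal sketch of calling a fine-tuned model from your own code with the OpenAI Python SDK; the ft: model ID below is a hypothetical placeholder for the ID shown on your fine-tuning job page:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

response = client.chat.completions.create(
    # Hypothetical fine-tuned model ID; copy the real one from your fine-tuning job.
    model="ft:gpt-3.5-turbo-0125:my-org::abc12345",
    messages=[
        {"role": "system", "content": "You are a helpful support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
    ],
)
print(response.choices[0].message.content)
```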
Awesome vid!!! How do I keep my data file safe and secure when training a model?
Nice...
Hi, is there a way to fine-tune GPT-4?
How do you fine-tune using long-form content?
Hello, could I make a small database with TH-cam URLs?
Great video! Can you do one showing how to customize an assistant and call it over the API? ^^
Urgent question!!! Can I fine-tune a GPT-3.5 model locally on my PC?
Check out this video that shows all the AI models we can install locally on our machine: th-cam.com/video/zrNKfiCuqCs/w-d-xo.html
@@Corbin_Brown Thank you, the video was so helpful … is there a tutorial yet on how to fine-tune an 8B model with Ollama?
Dude, thanks for the video, but when you create a video where the final result is what we want to see, please spend more time showing the results.
Amazing tutorial. Thanks for sharing. PS: what kind of app/software do you use to produce your videos (green screen?)
I know you use OBS, but how do you crop out your background?
I use an Elgato collapsible green screen and then set up the chroma key in the OBS settings!
Cheers Corbin. Keep Rocking \m/
Great video Corbin, thanks. Is 10 enough, or the more the merrier?
In the context of training the model, the more data you provide, the more expensive it will be to train.
Therefore, I would opt for 10 high-quality data points. If you notice more reliability but certain outputs still don't hit the mark, either add more data points or adjust the prompt.
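For reference, a small Python sketch of what a minimal training file could look like; OpenAI's chat fine-tuning format expects one JSON object with a messages array per line, and the example content here is invented:

```python
import json

examples = [
    {"messages": [
        {"role": "system", "content": "You write product descriptions in our brand voice."},
        {"role": "user", "content": "Describe a stainless steel water bottle."},
        {"role": "assistant", "content": "Built to outlast your longest days..."},
    ]},
    # ...add more examples in the same shape (aim for at least 10 high-quality ones).
]

with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```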
Thank you very much. We want to be as accurate as we can regardless of the cost; we just don't want to dilute the data with thousands of entries. @@Corbin_Brown
Corbin, thanks for this and all your other great explainers! I fine-tuned a model on 40 or so Rodney Dangerfield jokes, but I don't know how to call it outside the Playground (it's pretty funny btw, but gets no respect). I am not a coder, any ideas?
How much did it cost you?
Fine.
Tune
Does fine-tuning not support assistants?
If you don't add a system prompt (where you clearly state "Marv, you are a funny/sarcastic chatbot"), it seems the fine-tuned model doesn't know to be funny/sarcastic when you ask it. The question is: if you add that system prompt to the normal model, it knows how to be funny without any training, so what did the fine-tuning accomplish in this case? (Tested on gpt-4o-mini-2024-08-17.)
I tried over 10 fine-tuning jobs, using a 70/30 split for validation. I had 0.8 loss on training and 0.2 on validation, a pretty good score, I think. But it just doesn't work. We need proof that it works by comparing a non-tuned model to the tuned model with the same system prompt, or no system prompt. Yet NOBODY has a video showing this on TH-cam, so I am wondering if this stuff actually works.
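One way to produce that kind of side-by-side evidence yourself, sketched with the OpenAI Python SDK; the model IDs and test prompts are placeholders, and this is only an informal comparison, not a rigorous eval:

```python
from openai import OpenAI

client = OpenAI()

BASE_MODEL = "gpt-3.5-turbo"
TUNED_MODEL = "ft:gpt-3.5-turbo-0125:my-org::abc12345"  # placeholder fine-tune ID
TEST_PROMPTS = ["Example question 1", "Example question 2"]  # your held-out prompts

def ask(model: str, prompt: str) -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],  # same (or no) system prompt for both
        temperature=0,
    )
    return response.choices[0].message.content

for prompt in TEST_PROMPTS:
    print(f"PROMPT: {prompt}")
    print(f"  base : {ask(BASE_MODEL, prompt)}")
    print(f"  tuned: {ask(TUNED_MODEL, prompt)}")
```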
💫🙏🎯👊🤙💪🗿🎬🔥🦅☯️ Thank You CB Great Value Add As Usual Sir
I've been thinking of creating a chatbot that talks like me, so I want to upload all my chats, social media posts, likes, etc. This is for my wife in case something happens to me; maybe I will upload all the things I do around the house and how to fix stuff, etc., lol. So fine-tuning looks like the way to go. I don't like the system content variable they use; it seems repetitive, and my system prompt is usually fairly long, so it would be repeated on each line.
When creating the dataset to train gpt-3.5-turbo for a conversational AI: I want to train the model to answer something specific to a range of similar questions that can occur during a conversation. *The conversation is a script which the AI follows.* Should I include only the question and the response, or should I include all of the conversation up to where that specific question is asked?
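Purely to illustrate the two options being asked about (not a recommendation of either), here is what a single-turn training example versus one that includes the preceding scripted turns could look like in Python dict form; all content is invented:

```python
# Option A: only the specific question and its scripted answer.
single_turn = {"messages": [
    {"role": "user", "content": "Do you offer weekend delivery?"},
    {"role": "assistant", "content": "Yes, Saturday delivery is available in most regions."},
]}

# Option B: include the scripted turns leading up to that question as context.
with_context = {"messages": [
    {"role": "assistant", "content": "Hi! Would you like to place an order today?"},
    {"role": "user", "content": "Yes, and I have a question. Do you offer weekend delivery?"},
    {"role": "assistant", "content": "Yes, Saturday delivery is available in most regions."},
]}
```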
So... I think you will get a lot of views if you find a way to delete a fine-tuned model via the API... because it's very annoying to create a lot of fine-tunes and not be able to delete them afterwards if you don't use them anymore.
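For what it's worth, the API does appear to expose a delete operation for fine-tuned models your organization owns; a minimal sketch with the OpenAI Python SDK, using a placeholder model ID:

```python
from openai import OpenAI

client = OpenAI()  # needs an API key with sufficient (owner-level) permissions

# Placeholder ID; copy the real model name from your fine-tuning dashboard.
result = client.models.delete("ft:gpt-3.5-turbo-0125:my-org::abc12345")
print(result.deleted)  # True if the fine-tuned model was removed
```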
now your video is also 7 months old. I'll make a new one
lied :(
@@deroace wow never thought anyone would follow up on that
@@rodrigovm then I gess dont write, when you dont mean it be a man
@@deroace still no video :(
who doesn't love AI?