Keyvan here, thanks for your great video ❤
Hi Keyvan! femtoGPT is amazing 🤩 thank you!
How did you find this video? 😵💫
Thank you Stephen for answering my last question. I have another one!
Are you supposed to delimit between the different inputs in the training set?
My set looks like:
user_input= agent_output=
user_input= agent_output=
.
.
.
Is the GPT reading my entire file as one input, or do I need to separate each conversation, or does it matter?
This way, when I use the GPT, I want the code to be:
prompt = "user_input= agent_output=
Happy to help! Looks like you are diving right in. What you are doing right now is exactly the plan for one of the upcoming videos. Yes, delimiting is the correct approach; it's the industry's approach, actually, and they use delimiters like yours for this purpose. You have the correct format for input. The Rust code will fully scan the entire dataset.txt file if you train it for long enough, so you'll want to train for as long as possible once you are ready. Good news is you can pause and resume training as much as you want. The training function samples your dataset.txt randomly. github.com/keyvank/femtoGPT/blob/main/src/gpt.rs#L28 is the code that does the sample selection; it takes a random range of text. Something we could do is update that function to look for a "\n" newline denoting the end of an input. That could help improve the model by preventing it from bleeding between sequences. The industry calls that the EOS / STOP token, the EOS_TOKEN.
@@StephenBlum thank you! This is very helpful. The current code is not at all what I need; I want the sampling to be just one line at a time, from beginning to end!! I will make those changes!
@@videos4mydad nice! 🙌😄
@@StephenBlum I'm planning to make something like this too. My dataset is built like this:
{question}
{answer}
I will try to update this function to look for '\n' at the end of input. I'm not a Rust dev, but I think I can do it with help of TH-cam and ChatGPT 😅😆
@@bruninhohenrri excellent idea 👍 '\n' will be a good stop-token for end of input. Rust + YT + ChatGPT = 🎉
Hey! I finally had time to test it. Now that I have the model, how can I run inference with it? Thanks for the video!
Great to hear! Now that you have the model, the code needs some modification to run in inference mode. I'm thinking it makes sense to set up an axum web server, or you can just keep it as a CLI and run it on demand as needed. Here is the inference function:
let inference = gpt.infer(
&mut rng,
&tokenizer.tokenize("Your Prompt Here
"),
100,
inference_temperature,
|_ch| {},
)?;
println!("{}", tokenizer.untokenize(&inference)); // print model response
You could put that in a second binary file, or parameterize main.rs to run "inference mode" or "training mode" based on a command-line parameter. Lots of options! I'm making a follow-on video in a few weeks to show how that would work in a new GitHub fork.
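If you go the single-binary route, a minimal sketch of that mode switch could look like this (run_training and run_inference are placeholder names for whatever your main.rs already does, not femtoGPT functions):

use std::env;

fn main() {
    // First positional argument selects the mode; default to training.
    let mode = env::args().nth(1).unwrap_or_else(|| "train".to_string());
    match mode.as_str() {
        "train" => run_training(),  // the existing training loop
        "infer" => run_inference(), // build the GPT, load training_state.dat, call gpt.infer()
        other => eprintln!("unknown mode '{}', expected 'train' or 'infer'", other),
    }
}

// Placeholder stubs so the sketch stands alone.
fn run_training() {}
fn run_inference() {}

Then cargo run --release -- train or cargo run --release -- infer picks the mode.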
@@StephenBlum I was trying to make it work by myself and I think it finally worked!
I had to recreate the entire GPT object in a new binary file and even reload the dataset into the tokenizer. Not the optimal way to do it, but it's working!
// Imports and the main signature were left out of the original comment; these
// are assumed to mirror femtoGPT's own main.rs (adjust to your version):
use femto_gpt::gpt::{TrainingState, GPT};
use femto_gpt::graph::GraphError;
use femto_gpt::tokenizer::{SimpleTokenizer, Tokenizer};
use std::fs;
use std::io::prelude::*;
use std::path::Path;

fn main() -> Result<(), GraphError> {
#[cfg(not(feature = "gpu"))]
let graph = femto_gpt::graph::CpuGraph::new();
#[cfg(not(feature = "gpu"))]
let is_gpu = false;
#[cfg(feature = "gpu")]
let graph = femto_gpt::graph::gpu::GpuGraph::new()?;
#[cfg(feature = "gpu")]
let is_gpu = true;
let prompt: &str = "I just ";
let training_state_path = Path::new("training_state.dat");
let mut rng = rand::thread_rng();
let inference_temperature = 0.7; // How creative? 0.0 min 1.0 max
let dataset_char =
fs::read_to_string("dataset.txt").expect("Should have been able to read the file");
let tokenizer = SimpleTokenizer::new(&dataset_char);
let batch_size = 32;
let num_tokens = 64;
let vocab_size = tokenizer.vocab_size();
let embedding_degree = 64;
let num_layers = 4;
let num_heads = 4;
let head_size = embedding_degree / num_heads;
let dropout = 0.0;
assert_eq!(num_heads * head_size, embedding_degree);
println!("Vocab-size: {} unique characters", vocab_size);
let mut gpt = GPT::new(
&mut rng,
graph,
is_gpu.then(|| batch_size),
vocab_size,
embedding_degree,
num_tokens,
num_layers,
num_heads,
head_size,
dropout,
)?;
gpt.sync()?;
println!("Number of parameters: {}", gpt.num_params());
// Restore the trained weights saved by the training run, if present.
if training_state_path.is_file() {
let mut ts_file = fs::File::open(training_state_path).unwrap();
let mut bytes = Vec::new();
ts_file.read_to_end(&mut bytes).unwrap();
let ts: TrainingState = bincode::deserialize(&bytes).unwrap();
gpt.set_training_state(ts, true)?;
}
println!();
println!("Starting the inference process...");
println!();
let inference = gpt.infer(
&mut rng,
&tokenizer.tokenize(prompt),
50,
inference_temperature,
|_ch| {},
)?;
println!("{}", tokenizer.untokenize(&inference)); // print model response
Ok(())
}
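If you want to keep that as a proper second binary rather than overwriting main.rs, a [[bin]] entry in Cargo.toml does it (the name and path here are just placeholders):

[[bin]]
name = "inference"
path = "src/inference.rs"

Then run it with cargo run --release --bin inference (add --features gpu if you built the GPU graph).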
@@StephenBlum I trained with a poor dataset. Now I think I'm going to make something cool, like a text-to-sql AI model :D
@@bruninhohenrri Nice! text-to-sql sounds amazing 🤩
I have just started the training on some data.
How do I test it?
Where do I give it a sentence and have it finish it?
I think it's:
let inference = gpt.infer(
&mut rng,
&tokenizer.tokenize("
"),
100,
inference_temperature,
|_ch| {},
)?;
replace the '\n' with my prompt?
thanks
Oh yes, good question! After training is complete, you will want to use the model's inference capability so that it can complete output sequences. Looking at your code, it appears you are on the right track. I remember that during the training cycles there were moments where inference testing occurred and printed the output on the screen. That is the same code you use to run inference. Inference is how you use the model after it has been trained: it generates patterns of characters based on your training data. I will validate the correct function in a follow-on comment here shortly 😄👍
Yes you found it. Confirmed. This is the right place to run your trained model using a prompt: `gpt.infer()` function ✅
What if the dataset consisted only of numeric data? How would we train our custom GPT model then?
Good question! Yes, you can absolutely do this. Your alphabet is "0-9. ": the digits plus a period and a space character. You would be able to predict the next "number" in the series based on your training set, for example a stock quote price.
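For example, dataset.txt could just be a stream of values separated by spaces (these numbers are made up, only to illustrate the character set):

101.25 101.70 101.55 102.10 102.35

The model then learns to continue that digit/period/space pattern the same way it learns to continue text.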
Hi Stephen, this is a great tutorial and perhaps the only one I could find. I'm running it right now and seeing some good results. I wanted to ask: how do I get this retrained model to answer questions? Code please if possible, as I'm not an expert in Rust. Cheers!
Hi Ajay! Good question. How to train the model into a Question/Answer model. You just have to change the dataset.txt to be in the "Question: ..." and "Answer: ..." format. Then you can train the model to answer questions. Note that you have to prefix the input as "Question: your_question_here" and the model will reply with "Answer: model_answer". You'll need a lot of data to get a good result.
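For example, dataset.txt would just be plain text with that pattern repeated over and over (these pairs are made up; the blank line between pairs is optional but gives you a natural stop marker):

Question: What language is femtoGPT written in?
Answer: Rust.

Question: Can training be paused and resumed?
Answer: Yes, the training state is saved so you can stop and continue later.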
@@StephenBlum Thanks for the reply. The dataset is already formatted like that. My question is: how do I run a script or command (if one already exists), or write new code, to ask the model a question so that it replies back with an answer?
@@ajaykumarsinghlondon ah yes, okay. So you'd actually have to write code for this. You need to define an interface, a command-line argument for example, and wire that into the Rust app as the input, then print the output. You can create an updated src/main.rs file that removes the training sections and just executes the gpt.infer() function and prints the result.
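A rough sketch of that interface, assuming the same setup as the inference snippet elsewhere in this thread (gpt, rng, tokenizer, and inference_temperature already constructed; the Question/Answer prefixes just mirror the dataset format described above):

// Hypothetical: take the question from the command line and format it the
// same way the training data is formatted.
let question = std::env::args().nth(1).expect("usage: <binary> \"your question here\"");
let prompt = format!("Question: {}\nAnswer:", question);
let inference = gpt.infer(
    &mut rng,
    &tokenizer.tokenize(&prompt),
    100,
    inference_temperature,
    |_ch| {},
)?;
println!("{}", tokenizer.untokenize(&inference)); // print the model's answer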
@@ajaykumarsinghlondon there is a comment on this page that shows the function needed to call: th-cam.com/video/jEyPQUyNhD0/w-d-xo.html&lc=Ugy7Zy4-ZvZTBQk5G4l4AaABAg.A3xgWTfwS0PA3xiq6nxncm (threaded comment) code example from that comment thread is:
let inference = gpt.infer(
&mut rng,
&tokenizer.tokenize("Your Prompt Here
"),
100,
inference_temperature,
|_ch| {},
)?;
println!("{}", tokenizer.untokenize(&inference)); // print model response
@@ajaykumarsinghlondon here is the code: gist.github.com/stephenlb/9e919c0c2523048aeda022b1fafe91b7
Is there any such program for Python/JavaScript devs?
Yes, totally. For Python you'll see it's already built by Meta: pytorch.org/docs/stable/generated/torch.nn.Transformer.html
@@StephenBlum that's big! thanks for sharing ♥️
I'm guessing it wouldn't be any smarter than the predictive text feature on a phone, since it's only predicting which letter is most likely to come next. If you can understand the code though, it could be interesting as an example of how these work.
With the transformer model, it should outperform your phone's predictive text feature. And the good news is that you can customize how much CPU/memory to allocate to make improvements as you need. It's really powerful! Testing training on a GPU, it was able to learn the entire 1MB of text in a few minutes. Imagine: you can give it specific text, and way more than 1MB. Lots of opportunity! 😄
I'm updating my AI for PHP, HTML, JS, and CSS, and it can build templates
Nice! Updating your AI for PHP + HTML + JS + CSS sounds like a great idea 😄 🙌 Tuning your AI for distinct use cases like this is powerful. I think we'll be seeing a lot more models like this going forward, where we'll have better-performing, use-case-specific models. 🎉
Ironically, I was hoping to use Apple silicon for its Neural Engine, yet this project uses AMD and Intel.
Ah yes, you are right. It uses OpenCL, and Apple wants everyone to deprecate OpenCL and migrate to Metal. That is a drawback. There may be a way to set up a wrapper, though it seems like that could be a bit of effort. femtoGPT will still work on the CPU, and it will use every CPU core on your machine 📈
GitHub Repository: github.com/keyvank/femtoGPT to download the code
Command: cargo run --features gpu --release
There's going to come a day when you can train a GPT-5 level model on an old computer and that's gonna be hilarious and quaint like running 20 different game emulators on a raspberry pi or something.
😂 you are right! The Commodore Amiga level of quaint old computers. The day when GPT-5 level models run on an old computer is closer than we might think.
Really nice explanation!
Thank you! Your feedback is excellent and it helps me continue to make better videos 😄🙌
YES!!!! This Sir! Thank you!
You are very welcome! 😄 Happy to help. This is pretty exciting, as you can use it to recreate anyone's digital likeness. It is powerful and can even recreate a digital you if you have enough data 📈
Would've been nice if you would show how to make it into a chatbot :)
Nice! Good idea. This definitely needs to happen. Adding it to the planned videos list 🙌🎉😊
great video!
Thank you! 😊 If you have questions or ideas to cover, let me know! 😄🙌
Stephen, it's Steven, what's up twin 😂 Malik Yusef (Kanye's main collaborator) and I are launching a platform. I'm super creative, never claimed to be smart, so I tend to get myself in situations like these often, where I know it can be done, it's just the learning curve. LMK if you have time to connect, would love to run the platform by you and possibly get you involved, however that looks 🙏🏼
Hi Steven! Yes let's do it. send email to stephen@pubnub.com
you sharing gems😍😭😭
💎🙌🎉😄 great to hear thank you 😊
Are you Alex Honnold's brother?
Oh yeah! The free climber 🪨🧗that is scary and impressive 😄
Let's GOOOOOOOI
🎉🎉🎉 😄🙌
Hey Stephen. Great explanation.
I am trying to train a model on Jira tickets. Can you suggest how I should format the data in the dataset file?
I want to give the description of the ticket, the comments with the commenter name, the state changes, and the values of other parameters and their changes, like the assignee name.
This is the kind of thing I have in mind:
NUMBER: BACK-356 \n
TITLE: Invoice dump job failure \n
DESCRIPTION: The job for ingesting invoices from the Production tables has failed on June 26th, 2024. We need to resolve this because the financial reporting is due at the end of the month. \n
ASSIGNEE: Ramesh Vesvaraya \n
COMMENT: \n
WRITER: Ram Gupta \n
BODY: @Saurabh Sharma can you look into this.
Yes, that's perfect! You'll want to add a "STOP_TOKEN", something like an "END" character that tells the generator to stop; it can be anything. The format you have is amazing! This is a good start 🎉🙌🚀 What I did is look for a double "\n\n", where the model outputs two new-line characters in a row. Your training data should add that "\n\n" to the end of each training sample, separating each Jira ticket.
@@StephenBlum Fantastic. Thanks.