Your explanations are easy to understand and in-depth at the same time. Thank you for making my life easier.
I don't understand why you don't have many more views and engagement. Your videos are some of the best explanations out there. I've sent my students to your channel multiple times.
Not a great timeline where virality reigns over veracity. Amazing work.
Thanks! This means a lot. I'm just glad the channel is able to provide value, so thanks for sharing it around.
Awesome explanation! I have a few questions, though:
1) At 24:00, you said we can do some matrix multiplication and addition to update the value of Wq, so that the fine-tuned information gets infused into Wq, which in turn gives us faster inference. But won't that hurt performance compared to the case where we don't update Wq and keep A and B separate? Are we just trading performance for inference speed?
2) What if we do the same 'update Wq' trick with additive adapters? Would that also speed up their inference time?
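For anyone else stuck on question 1: merging is lossless. Since the LoRA forward pass is just Wq·x + B·A·x = (Wq + B·A)·x, folding B·A into Wq changes nothing about the outputs; it only removes the extra matmuls at inference. Adapters, by contrast, insert nonlinear layers into the forward path, so they generally can't be folded into an existing weight matrix this way. A minimal NumPy sketch (my own illustration with made-up sizes, not code from the video):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8                      # hidden size and LoRA rank (illustrative values)

W = rng.standard_normal((d, d))    # frozen base weight, e.g. W_q
A = rng.standard_normal((r, d))    # LoRA down-projection
B = rng.standard_normal((d, r))    # LoRA up-projection
x = rng.standard_normal(d)         # one input activation

# Unmerged path: base matmul plus the low-rank detour (three matmuls).
y_unmerged = W @ x + B @ (A @ x)

# Merged path: fold BA into W once offline, then a single matmul at inference.
W_merged = W + B @ A
y_merged = W_merged @ x

print(np.allclose(y_unmerged, y_merged))  # True: same outputs, fewer matmuls
```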
When fine-tuning an LLM we have two options.
1) Change the parameters of the actual base model. But this requires a lot of resources and time.
2) Add new layers, changing the architecture of the model. During fine-tuning we only change the weights of these additional layers, and the base model remains frozen. At inference we use both the base model and the additional layers.
LoRA helps us shrink these additional layers by using low-rank matrices.
This is my understanding. Please react to it so I can verify my knowledge! 😊
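To put rough numbers on the low-rank idea (my own back-of-the-envelope, not figures from the video): a full d×d weight update trains d² parameters, while LoRA's A (r×d) and B (d×r) together train only 2dr, a tiny fraction when r is much smaller than d.

```python
# Back-of-the-envelope LoRA parameter count, with illustrative sizes.
d = 4096   # hidden size of one weight matrix, e.g. W_q in a large model
r = 8      # LoRA rank

full_update = d * d       # training a full delta-W
lora_update = 2 * d * r   # A (r x d) plus B (d x r)

print(f"full: {full_update:,}  LoRA: {lora_update:,}  "
      f"ratio: {full_update // lora_update}x fewer trainable params")
# full: 16,777,216  LoRA: 65,536  ratio: 256x fewer trainable params
```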
This is a good overview 👍
I like methods that are simple yet extremely effective
I understand how LoRA speeds up the fine-tuning, but you mentioned in the video that it also speeds up the inference. Could you please explain how that is possible?
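One way to read this (my interpretation, not the video's wording): merged LoRA isn't faster than the base model, it's faster than the alternatives. Once B·A is folded into W, inference costs exactly what the base model costs, whereas adapters, or LoRA kept unmerged, leave extra operations in the forward path. A rough mult-add count per linear layer, with a hypothetical adapter bottleneck size:

```python
# Rough mult-add counts for one linear layer at inference (illustrative sizes).
d, r, a = 4096, 8, 64   # hidden size, LoRA rank, hypothetical adapter bottleneck

costs = {
    "base model":  d * d,                  # W @ x
    "LoRA kept":   d * d + r * d + d * r,  # W @ x + B @ (A @ x)
    "LoRA merged": d * d,                  # (W + B @ A) @ x, merged offline
    "adapter":     d * d + d * a + a * d,  # W @ x, then a down/up adapter block
}

for name, flops in costs.items():
    print(f"{name:12s} ~{flops:,} mult-adds")
```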
Back again ❤❤❤
Custom GPTs or Gemini Gems are pretty spot on after you get good at making them. I would play around with these before building an AI agent with LangChain and vector embeddings.
I enjoyed this video. Can you do QLoRA next?
When did you explain the benefits of LoRAs over adapters?
I seem to have missed it.
You are not alone
Appreciate it!
Amazing, thank you. Can you do one on latent diffusion?
Cursor with Claude 3.5 or o1-mini is great. Use their shortcuts to save time. It still struggles with new languages and frameworks, though.
The quizzes aren't well connected to the content. Heck, if you could add a timestamp after each quiz, as in "if you got this wrong, check out this timestamp", that would be helpful.
LoRAs are the biggest thing to come out of AI since the transformer