I don't actively look out for new research papers, but your channel keeps me updated on the hottest trends in the field. I love this channel; thank you for uploading awesome content.
Great video, I'm interested in trying this out; I'm curious how it compares to ExLlamaV2. Since exl2 is Llama-only, it would be nice to see a better-optimized QLoRA for other models.
Really loving all of your videos, just one request: can you please share the slides you prepare so we can refer to them whenever needed for quick revision? Thanks so much for the great content.
Is this the same as QA-LoRA?
Thank you for sharing our paper. 👍
Looks like this method is simpler than QLoRA, which does something like double quantization.
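For context: QLoRA's "double quantization" quantizes the per-block quantization constants themselves to save memory. A rough NumPy sketch of the idea (block size and bit widths are illustrative, not QLoRA's exact NF4 scheme):

```python
import numpy as np

def quantize(x, bits):
    # Simple symmetric uniform quantization (illustration only)
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    q = np.round(x / scale)
    return q, scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096)

# First quantization: block-wise, one fp32 scale per 64-element block
blocks = w.reshape(-1, 64)
scales = np.abs(blocks).max(axis=1) / 7          # per-block 4-bit scale

# "Double quantization": the fp32 per-block scales are themselves
# quantized (here to 8 bits), shrinking the per-block overhead
q_scales, scale_of_scales = quantize(scales, 8)
recovered_scales = q_scales * scale_of_scales    # dequantized scales
```

The point is that with many small blocks, storing one full-precision constant per block adds up, so compressing those constants is worthwhile.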
Hello, I would love to see an explanation of LoftQ like the one you did for QLoRA.
I understood what LoftQ is, but what is the difference between LoftQ and QLoRA?
Yeah, same confusion here.
LoftQ = quantization optimized for LoRA + better LoRA adapter initialization + LoRA fine-tuning, vs. QLoRA = regular quantization + regular LoRA fine-tuning.
Low-rank adapters actually increase inference time rather than reducing it, but they do reduce the number of trainable parameters.
Not after merging with the base model
@sadaisystems In that case it stays the same. In the video he mentioned a decrease in time.
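A small NumPy sketch of why merging removes the adapter overhead (shapes and names are illustrative): once B @ A is folded into the base weight, the merged matrix has the same shape as the original, so inference costs exactly what the base model costs.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 2
W = rng.standard_normal((d, d))   # frozen base weight
A = rng.standard_normal((d, r))   # LoRA factors (illustrative shapes)
B = rng.standard_normal((r, d))

x = rng.standard_normal(d)

# Unmerged (training-time) path: extra matmuls through the adapter
y_unmerged = W @ x + A @ (B @ x)

# Merged path: a single matrix of the base model's shape, no extra latency
W_merged = W + A @ B
y_merged = W_merged @ x
```

The two outputs are identical up to floating-point error, which is why merged LoRA inference runs at base-model speed.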