🔥🚀 Inferencing on Mistral 7B LLM with 4-bit quantization 🚀 - In FREE Google Colab

Rohan-Paul-AI

Views: 13,619


Published on 31 Jan 2025

Comments • 28

@MasterBrain182 a year ago
Astonishing content, man 🔥🔥🔥 🚀
@RohanPaul-AI a year ago
Thank you, mate!
@MihneaStefanUngurenau 9 months ago
Nice video, good job!
@RohanPaul-AI 9 months ago
Thank you! Cheers!
@Mai-sq5cc a year ago ⁺¹
Thanks for the tutorial!!
@javiergimenezmoya86 a year ago ⁺¹
Which is better: quantizing with "bitsandbytes" or doing it with "cllama" GGUF? What is the difference?
@venkateshr6127 a year ago ⁺²
Great video. Can you make a video on fine-tuning an LLM with the best method?
@RohanPaul-AI a year ago ⁺²
That's exactly what's planned, Venkatesh. Stay tuned.
@manueljan2117 a year ago ⁺¹
How do I use your model in a LangChain agent? I used this, but it says the llm value is not a valid dict:
agent = initialize_agent(tools,
                         model,
                         agent="zero-shot-react-description",
                         verbose=True,
                         handle_parsing_errors=True,
                         max_new_tokens=1000)
@samketola919 a year ago ⁺¹
thx 😀
@JavMend 6 months ago
Hi, is there a simple change that can be made to the code to run inference in 8-bit?
@anuvratshukla7061 a year ago ⁺²
Can you make a video on how to use an open-source LLM to query a structured database (SQL/pandas) for chat?
@RohanPaul-AI a year ago ⁺²
Sure, will try to do one.
@gazzalifahim 8 months ago
Hello there, this is exactly what I was looking for. Could you please point me to resources or a tutorial where the details of those functions are discussed?
My teammate gave me a Kaggle Notebook with the exact same code and I am continuing it to make a conversational chatbot. But since I am brand new to this, I feel lost now.
@saravanajogan1221 a year ago
Hi Sir,
Could you tell us about your mic setup and how you make your videos with such clear quality? Thanks
@jamalabidalrahem8144 months ago
Can I use the Mistral 7B sharded model as a chatbot, so I can ask it questions about specific data I have, for example a book?
@mikiyasfikadu6422 a year ago
Helpful video
@LaylaBitar-z7z a year ago
Great video, sweet and simple. However, how can we control the max token limit? Also, do we have the option of separating our messages into a system message and a user message, just like in OpenAI?
@stabilitylabs a year ago ⁺³
Thanks for your tutorial. I have a question: how can I generate output up to 32k?
@thehkmalhotra9714 a year ago
Loved your content, buddy ❤. Can we keep this Google Colab instance running for free, and how can we expose this model as a REST API for use in hosted projects, rather than only locally?
@tomasgarcia2420 6 months ago
Hi, I got my token from Hugging Face but I don't know where to put it in Colab.
@vinsmokearifka a year ago
Sir, any advice if I use Japanese or Chinese for RAG? Thanks
@MrunalAshwinbhaiMania-b1d 9 months ago
Can we do this type of quantization with any model?
@RohanPaul-AI 9 months ago
Yes, we very much can. Check out my tweet on this:
twitter.com/rohanpaul_ai/status/1765688184753820073
@onesecondnanba a year ago ⁺¹
Colab file not found, please give the notebook link.
@RohanPaul-AI a year ago ⁺¹
Corrected the link in the description. Here it is:
github.com/rohan-paul/LLM-FineTuning-Large-Language-Models/blob/main/Mistral-7B-Inferencing.ipynb
@onesecondnanba a year ago ⁺¹
How do I fine-tune this?
@RohanPaul-AI a year ago ⁺¹
For fine-tuning, check out this video:
th-cam.com/video/6DGYj1EEWOw/w-d-xo.html&ab_channel=Rohan-Paul-AI
