Thanks for the video, very useful. One question I have is that the migration from Step 2 to Step 3 is not really clear; maybe I need to look at the DPO paper. In Step 3 of the overall approach, where you get the responses from the base and new LLM, how are you deciding which one is the winner and which is the loser? Are you using the base and new LLM themselves to make the winner/loser determination, or are you training a model in Step 2 on the preference data? If you could throw some light on this, it would be helpful. Appreciate that.
Good question. For Zephyr Step 3, we don't even use our model for data generation. We use a dataset called UltraFeedback that contains outputs from Llama, GPT-3.5, GPT-4, etc. The winner is the one GPT-4 preferred, and we use a random output as the loser. Unlike for OpenAI, for us Step 3 comes after Step 2 but does not depend on it.
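If it helps, here's a rough Python sketch of that binarization step. The field names (`instruction`, `completions`, `response`, `overall_score`) are my guesses at the UltraFeedback schema, so treat them as placeholders rather than the exact format:

```python
import random

# Sketch of Zephyr-style preference-pair construction from UltraFeedback.
# Field names are assumed, not verified against the real dataset schema.
def binarize(example, rng=random.Random(0)):
    completions = example["completions"]
    # Winner: the completion GPT-4 rated highest.
    chosen = max(completions, key=lambda c: c["overall_score"])
    # Loser: a random completion among the remaining ones
    # (assumes each example has at least two completions).
    rejected = rng.choice([c for c in completions if c is not chosen])
    return {
        "prompt": example["instruction"],
        "chosen": chosen["response"],
        "rejected": rejected["response"],
    }
```

So Step 3 only needs these (prompt, chosen, rejected) triples; nothing from Step 2 feeds into building them.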
This video is dope, thanks so much!
Thanks! Yeah this stuff is cool.
Great video, Sasha - thanks.
This video is really helpful. Please keep them coming. It would be great if you could also cover the details of multimodal models.
Thanks a lot for the detailed explanation, man ❤
Interesting, I don't _think_ the UltraChat paper explicitly says that it's using Self-Instruct, does it?
Do we just use "Self-Instruct" to mean "some seed of information enhanced by an LLM"?
I do feel like that is the common use at this point. I agree it is a bit subtle, but that paper first popularized the approach.
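To make that loose usage concrete, here's a toy sketch of a Self-Instruct-style step: a few seed instructions go into a prompt and an LLM is asked to produce a new one. The `generate` callable is hypothetical, standing in for whatever LLM API you use:

```python
import random

# Toy Self-Instruct-style bootstrap: sample seed instructions as
# in-context examples, then ask the LLM for a novel instruction.
def self_instruct_step(seed_tasks, generate, k=3):
    examples = random.sample(seed_tasks, k)
    prompt = (
        "Come up with a new task:\n"
        + "\n".join(f"Task: {t}" for t in examples)
        + "\nTask:"
    )
    # `generate` is a placeholder for any text-completion call.
    return generate(prompt).strip()
```

The real Self-Instruct pipeline also filters near-duplicates and generates inputs/outputs for each instruction, but the "seed + LLM expansion" loop above is the core idea people seem to mean.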
Thank you
Good work bro