Teacher-Student Neural Networks: The Secret to Supercharged AI

Computing For All

4,430 views

Published on 21 Jan 2025

Comments • 17

@jeromeeusebius 10 months ago ⁺³
Thank you for sharing this video explaining knowledge distillation and describing how the cross-entropy loss (hard_target_loss) is combined, using the parameter alpha, with the distillation_loss, which uses KLDiv to compare the soft probabilities of the teacher and student models. Thanks also for providing the sample code and walking through it. Having the simple and student models be the same network and see the same amount of data, yet reach different validation accuracies, does show that the student did indeed learn "the dark knowledge" from the teacher model: much richer knowledge, which we can see from the results, with the student's accuracy being better than the simple model's accuracy. Cheers.
@C4A 10 months ago
Thank you for watching and commenting! Have a wonderful day.
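
The loss combination described in the comment above can be sketched in plain Python. This is a minimal illustration, not the video's actual code: the function names, the `temperature` parameter, the temperature-squared scaling, and the choice that `alpha` weights the hard-target term (with `1 - alpha` on the distillation term) are all assumptions made for the example.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 flattens the distribution, exposing the "soft" probabilities.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(student_logits, label):
    # Hard-target loss: negative log-probability of the true class.
    return -math.log(softmax(student_logits)[label])

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions over the same classes.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def combined_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=2.0):
    # Assumption: alpha weights the hard-target term, (1 - alpha) the soft term.
    hard_target_loss = cross_entropy(student_logits, label)
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # Scaling by temperature**2 is a common convention, assumed here.
    distillation_loss = kl_divergence(teacher_soft, student_soft) * temperature ** 2
    return alpha * hard_target_loss + (1 - alpha) * distillation_loss

# Hypothetical logits for one 3-class example whose true class is 0.
student = [1.0, 0.5, -0.5]
teacher = [3.0, 1.0, -2.0]
print(combined_loss(student, teacher, label=0))
```

When the student's logits match the teacher's, the KL term vanishes and only the weighted cross-entropy remains, which is one way to sanity-check a sketch like this.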
@theelysium1597 24 days ago
Thank you very much for your explanation!
Unrelated: In the best positive sense: I love your expressive eyebrows!
@imadsaddik 6 months ago ⁺¹
Man thank you, I loved the explanation
@C4A 6 months ago
Glad to hear it! Thank you for watching and commenting.
@AshutoshKumar-cw8tw 7 months ago ⁺¹
Nice explanation. Thanks :)
@C4A 7 months ago
I am glad to hear that you liked it! Thank you for watching and commenting.
@sharma01ketan 9 months ago ⁺²
Thank you sir :)
@C4A 9 months ago
You are welcome! Thank you for watching.
@LokeshB-l8o 10 months ago ⁺¹
What are the names of the Teacher and Student models here?
@C4A 10 months ago ⁺¹
Thank you for watching. In the example code, both the Teacher and the Student models are examples of artificial neural network models.
The key difference between these models is their complexity and intended role in the training process. The Teacher model is larger and more complex, intended to capture a deep understanding of the data. The Student model is simpler and aims to approximate the performance of the Teacher model while being more computationally efficient.
@LokeshB-l8o 10 months ago ⁺¹
@C4A Now I understand. I asked because we are trying to do this knowledge distillation with two different models. Thank you.
@C4A 10 months ago
@LokeshB-l8o You are most welcome!
@aminedahane5874 11 months ago ⁺²
Good job
@C4A 11 months ago
Thank you!
@ankitghosh3865 11 months ago ⁺³
I love you, sir
@C4A 11 months ago
Thank you for the kind words!
