Thank you for sharing this video explaining knowledge distillation and describing how the cross-entropy loss (hard_target_loss) is combined, via the parameter alpha, with the distillation_loss, which uses KLDiv to compare the soft probabilities of the teacher and student models. Thanks also for providing the sample code and walking through it. Because the simple and student models are the same network and see the same amount of data yet reach different validation accuracies, the results do show that the student indeed learnt the "dark knowledge" from the teacher model, a much richer kind of knowledge: the student's accuracy is better than the simple model's. Cheers.
Thank you for watching and commenting! Have a wonderful day.
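For readers following the thread, here is a rough sketch of the loss combination described in the comment above. It is not the code from the video. The names hard_target_loss, distillation_loss and alpha come from the comment; the temperature parameter, the temperature-squared scaling, and the choice to weight the hard loss by alpha and the distillation loss by (1 - alpha) are assumptions made only for illustration, and the video's code may do it the other way round.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 flattens the distribution ("soft" probabilities).
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(student_logits, label):
    # Hard-target loss: negative log-probability of the true class.
    return -math.log(softmax(student_logits)[label])

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions over the same classes.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def combined_loss(student_logits, teacher_logits, label,
                  alpha=0.5, temperature=2.0):
    # ASSUMPTION: alpha weights the hard loss, (1 - alpha) the soft loss.
    hard_target_loss = cross_entropy(student_logits, label)
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    # ASSUMPTION: scale by temperature**2, a common convention that keeps
    # the soft-loss gradients comparable across temperatures.
    distillation_loss = kl_divergence(teacher_soft, student_soft) * temperature ** 2
    return alpha * hard_target_loss + (1 - alpha) * distillation_loss

# Hypothetical logits for one 3-class example whose true class is 0.
example_loss = combined_loss([2.0, 0.5, 0.1], [3.0, 1.0, 0.2], label=0)
```

When the student's logits match the teacher's exactly, the KL term is zero and only the weighted cross-entropy remains, which is a quick sanity check on any implementation of this idea.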
Thank you very much for your explanation!
Unrelated: In the best positive sense: I love your expressive eyebrows!
Man thank you, I loved the explanation
Glad to hear it! Thank you for watching and commenting.
Nice Explanation..Thanks :)
I am glad to hear that you liked it! Thank you for watching and commenting.
Thank you sir :)
You are welcome! Thank you for watching.
What are the names of the Teacher and Student models here?
Thank you for watching. In the example code, both the Teacher and the Student models are examples of artificial neural network models.
The key difference between these models is their complexity and intended role in the training process. The Teacher model is larger and more complex, intended to capture a deep understanding of the data. The Student model is simpler and aims to approximate the performance of the Teacher model while being more computationally efficient.
@@C4A Now I understand. I asked because we are trying to do knowledge distillation with two different models. Thank you.
@@LokeshB-l8o You are most welcome!
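To make the "larger Teacher, simpler Student" point from the reply above a bit more concrete, here is a small sketch that counts the parameters of two fully connected networks. The layer sizes are hypothetical and are not taken from the video's code; they only illustrate how different the two models can be in size.

```python
def count_dense_parameters(layer_sizes):
    # Each dense layer has (inputs * outputs) weights plus one bias per output.
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# ASSUMPTION: illustrative layer widths for a 784-input, 10-class problem.
teacher_layers = [784, 1200, 1200, 10]  # larger, more complex Teacher
student_layers = [784, 64, 10]          # smaller, cheaper Student

teacher_params = count_dense_parameters(teacher_layers)
student_params = count_dense_parameters(student_layers)
size_ratio = teacher_params / student_params
```

With sizes like these the Teacher has dozens of times more parameters than the Student, which is why a distilled Student is attractive when compute or memory is limited.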
Good job
Thank you!
i love you sir
Thank you for the kind words!