Wow, thank you for this video! This 8-minute video was better than my instructor's 8-hour class on the same topic.
The model that lets you use both L1 and L2 regularization is called Elastic Net. It has an extra parameter that takes values between 0 and 1. I just read about it yesterday in a book. Anyway, thanks for this great series. I am a complete beginner to NNs, and this series is helping me a lot in understanding the big picture and all the basic concepts and procedures of NNs.
Not in the range 0 to 1, but 0 to positive infinity.
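The two comments above may be describing two different parameters. As a rough sketch, assuming scikit-learn's `ElasticNet` parameterization (the thread does not say which book or library is meant), the penalty has a strength `alpha` that can be any non-negative number and a mixing weight `l1_ratio` that must lie between 0 and 1. The function name below is hypothetical and only illustrates the penalty term, not a full model:

```python
def elastic_net_penalty(weights, alpha, l1_ratio):
    """Penalty term only, assuming scikit-learn's ElasticNet convention:
    alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2
    """
    if alpha < 0:
        raise ValueError("alpha (overall strength) must be >= 0")
    if not 0.0 <= l1_ratio <= 1.0:
        raise ValueError("l1_ratio (L1/L2 mix) must be in [0, 1]")
    l1_norm = sum(abs(w) for w in weights)        # sum of absolute values
    l2_norm_sq = sum(w * w for w in weights)      # sum of squares
    return alpha * l1_ratio * l1_norm + 0.5 * alpha * (1.0 - l1_ratio) * l2_norm_sq


weights = [1.0, -2.0]
# alpha = 2.0 is larger than 1 and still valid; only l1_ratio is bounded by 1.
print(elastic_net_penalty(weights, alpha=2.0, l1_ratio=0.5))
```

With `l1_ratio=1.0` only the L1 term remains, and with `l1_ratio=0.0` only the L2 term remains, so under this convention both comments can be right about their own parameter.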
The goal of regularization is to spread the transfer from one layer to the next over as many connections as possible, thereby forcing the network to consider many aspects of the connection between the input and output. This is done by penalizing 'tunnelling' through a few connections. And that is exactly what penalizing large weights does.
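A small numeric sketch of that intuition, under the assumption that the penalty is the plain sum of squared weights (L2). The two weight vectors are made-up illustrations: both pass the same total weight, but one 'tunnels' it through a single connection.

```python
def l1_penalty(weights):
    # Sum of absolute values: indifferent to how the total is split.
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    # Sum of squares: grows faster for one large weight than for several small ones.
    return sum(w * w for w in weights)


concentrated = [2.0, 0.0]  # everything goes through one connection
spread = [1.0, 1.0]        # same total weight, shared by two connections

for name, w in [("concentrated", concentrated), ("spread", spread)]:
    print(name, "L1 =", l1_penalty(w), "L2 =", l2_penalty(w))
```

The L2 penalty is lower for the spread-out weights, while the L1 penalty is the same for both, so the "spreading" effect described above applies to L2 specifically.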
Amazing explanation
OK, I subscribed! I'm a simple NN: I see talent, I converge to my optimum solution.
Thank you, Misra. Great content!
You're very welcome!
At 5:40 you called the parameter "alpha". Shouldn't it be λ?
Hello, Ms. Mısra. I would like to build a model using Ridge regression on a dataset with a small sample size. However, I will work out the solution by hand while building the model. Could you help me with this?
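For a hand calculation, the simplest case is probably Ridge regression with one feature and no intercept. Assuming the objective is sum((y - w*x)^2) + lam * w^2 (an unscaled penalty, which may differ from the convention in the video), setting the derivative to zero gives w = sum(x*y) / (sum(x^2) + lam). The sketch below only checks that arithmetic on made-up numbers; the function name and data are hypothetical:

```python
def ridge_slope(xs, ys, lam):
    """Closed-form Ridge solution for one feature, no intercept.

    Minimizes sum((y - w*x)^2) + lam * w^2, which gives
    w = sum(x*y) / (sum(x^2) + lam).
    """
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + lam)


# Made-up toy data where y = 2 * x exactly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

print(ridge_slope(xs, ys, lam=0.0))   # no penalty: ordinary least squares slope
print(ridge_slope(xs, ys, lam=14.0))  # penalty shrinks the slope toward zero
```

Here sum(x*y) = 28 and sum(x^2) = 14, so the slope is 28/14 without a penalty and 28/28 with lam = 14. With several features the same idea becomes the matrix form w = (XᵀX + λI)⁻¹Xᵀy.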
L1 was solid, I wish L2 was explained as well as L1.
Why can't I use alpha>1? Also doesn't this fail for networks with batchnorm for example?