SVM for REAL data. SVM Intuition Video: • Support Vector Machine... Hard-Margin SVM Video: • SVM (The Math) : Data ... Hinge Loss Video: • Loss Functions : Data ...
This guy deserves to be paid for this stuff. It's brilliant.
Haha, glad you think so!
Absolutely, I have the same thing in my mind
You are a great teacher, hope this channel thrives!
I hope so too!
you saved my life, I will watch all your videos before my exam on machine learning
Awesome video, thank you for clarifying these topics for us. The format is pristine and I get a lot from the different ways you present information because by the second or third video I have a good foundation for the tougher parts to chew. Again, thank you!
Third video on SVM from this guy and I'm now a subscriber. Best explanation so far, and I watched a bunch before getting to these videos! Two thumbs up!
Loved both hard margin and soft margin videos, everything is clear in 25 minutes collectively. Thanks a lot Ritvik! May your channel thrive more, will share a word for you.
You're the absolute best at explaining complex things in such an easy way, it's even relaxing
Thanks for making SVM easy. You’re a great communicator.
Came here after Andrew Ng's videos. Found yours to be way more intuitive. Brilliant!
What an amazing video, absolutely gold. Please make more videos and never stop making them!
Thank you so much!!!! You are a life saver!!! I had been troubled by the soft-margin SVM for a week until your video explained it to me very clearly. What I didn't understand was the lambda part, but now I do!!! THANKS!!!
Definitely the best video about SVM I've found online; better than my university lectures (sadly). Great job!
Great job, man... keep bringing us this kind of amazing stuff!
Thank you very much, Ritvik, for simplifying this topic and even ML. God bless you more and more
Explaining complex concepts in a simple manner: that is how these topics must be taught. Wow!
You deserve more subs and likes. Thank you for this!
I appreciate that!
best teacher, very articulate. looking forward to more videos
These videos are real hidden gems, and they deserve not to be hidden any more.
Gem of lectures!
Brilliant explanation! Thank you!
You are gem to the data science community!
Thanks a lot, I might not have been able to understand SVM without this.
Your videos are the best!
This man has single handedly saved my life.
You are really good at this man
Such a brilliant explanation!
I love this. Thank you so much. Helped me a lot
Thanks so much for your amazing works. Keep it up.
Well explained! Very helpful!
This teacher deserves a Nobel Prize!
AMAZING VIDEO! You are so awesome.
The search for a good YouTube video on SVM has finally ended; gotta watch the other topics too.
Just perfect mate
Crystal clear, thanks a lot.
Thank you from an MSC Data Science student at Exeter University in exam season
thank you for this video, very helpful!
Great explanation!
You are a great teacher; I don't know why YouTube doesn't recommend your videos.
Also please try some social media marketing.
Perfect!
Great clarification video
Awesome.
Wonderful explanation ❤❤❤
So good.
Wow, great lecture, clear explanation... thank you, Ritvik!
Excellent
Really explained well. If you want to get the theoretical concepts one could try doing the MIT micromasters. It’s rigorous and demands 10 to 15 hours a week.
These explanations are so brilliantly and intuitively given, making daunting-looking equations and concepts understandable.
Thank you @ritvikmath, you are truly a gift to data science.
This is clearly explained!! Love your teaching. One question here: how do you choose lambda? What is the impact of a higher or lower lambda?
great explanation thank you
🌟Magnificent🌟 Very nice, thanks; this helps with interview questions.
Great video! I have a question. The optimisation formula for soft-margin SVM that I usually see in textbooks is: min ||w|| + C * (sum over theta). How does the equation in your video relate to this one? Is it pretty much the same, just with different symbols, or is it actually different? Thanks!
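They're essentially the same problem up to rescaling. If the video's objective is (mean hinge loss) + lambda * ||w||^2 (and assuming the textbook form uses the squared norm, as it usually does), then dividing the video's objective by 2 * lambda gives exactly the textbook form 0.5 * ||w||^2 + C * (sum of slacks) with C = 1 / (2 * n * lambda), and a positive rescaling doesn't change the minimizer. A quick numeric sketch (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 3
X = rng.normal(size=(n, d))
y = rng.choice([-1.0, 1.0], size=n)
w = rng.normal(size=d)
b = 0.5
lam = 0.1  # lambda in the video-style objective (assumed form)

# Slack / hinge values for each point
hinge = np.maximum(0.0, 1.0 - y * (X @ w + b))

# Video-style objective (assumed): lambda * ||w||^2 + mean hinge loss
obj_video = lam * (w @ w) + hinge.mean()

# Textbook-style objective: 0.5 * ||w||^2 + C * sum of slacks,
# with C = 1 / (2 * n * lambda)
C = 1.0 / (2 * n * lam)
obj_text = 0.5 * (w @ w) + C * hinge.sum()

# Equal up to the constant factor 1 / (2 * lambda), so both
# objectives are minimized by the same (w, b)
assert np.isclose(obj_video / (2 * lam), obj_text)
```

So the textbook C plays the role of 1 over lambda (times a constant): large C corresponds to small lambda and vice versa.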
greatest teacher ever
wow thanks!
You are the best! Please consider teaching at a university!
GOAT!
Very good, thanks
thanks a lot !
I came here to understand lambda and I am not disappointed. Thank you.
Of course!
Nice video. Thank you.
Underrated channel
Hopefully not for long :D
How do you choose your support vectors, if they are no longer the closest vector to the decision boundary? Does the value of "1" get generated automatically when you plug the values of X and Y in? Or is there some scaling that takes place to set one of the vectors value to "1"?
Hi ritvikmath, many thanks for the wonderful video. I really love the simple notation you use for the equations, which makes them very easy to understand. Can you suggest any books/courses that follow notation similar to yours, or provide the sources that helped you create this content? Thanks in advance!
Shoutout to my previous TA!! Also do you mind uploading pictures of whiteboard only for future videos, as it might be easier for us to check notes? Thank you!
Hi Yifan! Hope you're doing well. Yes for the newer videos I am remembering to show the final whiteboard only
Very nice!!!
Thank you! Cheers!
Awesome Thank you very much
You are very welcome
Great videos, will you be also talking about kernel trick?
Yes I will! It is on the agenda
What if you made observations based upon latent variables? Could that remove the need for parameter lambda for a prior?
Thanks
Hello, thank you for the tutorials. How would I apply an SVM model to classify alpha data, to detect driver drowsiness? Really looking forward to your reply.
Your board work is great!
Why are you using an L2 loss for w, rather than L1 based on what showed up in the previous video?
I guess it's because the L2 penalty is much easier to differentiate than L1; also, L1 is not differentiable at w = 0.
Thanks! Having smooth derivatives does help a lot.
@vldanl Isn't the hinge loss part already pretty hard to differentiate, compared to ||w||?
amazing
Thank you! Cheers!
what was Vapnik on when he invented this?
The margin is taken into account twice in a weird way. The obvious one is the lambda * ||w|| term. But the hinge loss also has the margin as its unit of measurement, so if a data point is at distance five from the support vector, the hinge loss can change drastically depending on the size of the margin. Is this double counting of the margin intended? Should there be a normalization for this? I believe dividing the hinge loss by ||w|| should work.
So can we mathematically solve the soft-margin SVM optimisation problem for the vector w and the value b? And if so, can anyone point me to where to read up on this?
How can we still have some data between the margins even after rescaling the w vector so that min |w^T x + b| = 1? Doesn't that mean we find the closest possible data points to the hyperplane and rescale w so that the distance from the closest data points to the hyperplane becomes 1? That way, there shouldn't be any points between the margins... could you help correct me?
Very detailed explanation! I'd like to know: how are we going to find the w and b params? Using gradient descent or another technique?
I had the same question
I think you can find the optimal params in two ways. The first is minimizing the primal formulation of the problem with respect to w and b. The second is maximizing the dual formulation with respect to a certain alpha (a Lagrange multiplier). In the second case, once you have computed the optimal alpha, you can substitute it into the equation for w (written as a function of alpha) to find the optimal w. To find the best b you have to rearrange some conditions, but I am not sure about that.
You can use gradient descent and update the weights and bias for every example, as shown in this video: th-cam.com/video/UX0f9BNBcsY/w-d-xo.html
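Since this thread mentions gradient descent: below is a minimal subgradient-descent sketch for one common soft-margin objective, lambda * ||w||^2 + mean hinge loss (the exact form in the video may differ slightly; all names and data here are illustrative):

```python
import numpy as np

def fit_soft_margin_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam * ||w||^2 + mean hinge loss by full-batch subgradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1.0      # points with nonzero hinge loss
        # Subgradients of the regularizer and of the mean hinge loss
        grad_w = 2 * lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Illustrative 2-D data: two well-separated Gaussian blobs, labels -1 / +1
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, size=(50, 2)), rng.normal(2, 1, size=(50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

w, b = fit_soft_margin_svm(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
```

In practice a stochastic version of this update (one example at a time, as the linked video suggests) is common, and the dual/Lagrangian route described above is what classic solvers like SMO use.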
Question about Lambda:
Does that mean that when lambda is LARGE, we care more about misclassification error, and when lambda is SMALL, we care about minimizing the weight vector and maximizing the margin???
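If the objective is written as (mean hinge loss) + lambda * ||w||^2, which I believe is the form in the video (if lambda multiplies the hinge term instead, the roles flip), it's the other way around: a LARGE lambda emphasizes shrinking ||w||, which widens the margin and tolerates more misclassification, while a SMALL lambda emphasizes the misclassification (hinge) term. A small numeric sketch with an illustrative subgradient-descent trainer:

```python
import numpy as np

def train(X, y, lam, lr=0.1, epochs=300):
    # Subgradient descent on lam * ||w||^2 + mean hinge loss (assumed form)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1.0
        w -= lr * (2 * lam * w - (y[active, None] * X[active]).sum(axis=0) / len(y))
        b -= lr * (-y[active].sum() / len(y))
    return w, b

# Overlapping blobs, so the two terms genuinely compete
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, size=(50, 2)), rng.normal(1, 1, size=(50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

w_small, _ = train(X, y, lam=0.001)  # small lambda: hinge term dominates
w_large, _ = train(X, y, lam=1.0)    # large lambda: norm term dominates

# Larger lambda shrinks ||w||, i.e. widens the margin, which is 2 / ||w||
assert np.linalg.norm(w_large) < np.linalg.norm(w_small)
```

Note that textbook formulations with a C in front of the slack term behave the opposite way, since C effectively plays the role of 1 over lambda.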
Where does the kernel come in?
Why are we minimizing ||w|| to the power of 2 for soft SVM but only ||w|| for hard SVM?
The margin for a hard margin SVM is pretty intuitive. But not with soft margin SVM. With hard margin, it's a rule that both margin lines must lie on at least one of their respective points. I think with soft margin, there's a rule that for any value of lambda, at least one of the margin lines must lie on at least one of their respective points, but it's not mandatory that both do. Do you concur?
Is hinge loss differentiable?
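Almost: the hinge loss max(0, 1 - z) is differentiable everywhere except at the kink z = 1, where the two one-sided derivatives disagree, so solvers use a subgradient there. A quick numeric check (just a sketch):

```python
import numpy as np

def hinge(z):
    # Hinge loss as a function of the margin value z = y * (w @ x + b)
    return np.maximum(0.0, 1.0 - z)

h = 1e-6
# One-sided slopes at the kink z = 1
right_slope = (hinge(1.0 + h) - hinge(1.0)) / h   # ~ 0
left_slope = (hinge(1.0) - hinge(1.0 - h)) / h    # ~ -1

# The one-sided derivatives disagree, so hinge loss is not differentiable
# at z = 1; any value in [-1, 0] is a valid subgradient there.
assert abs(right_slope) < 1e-6
assert abs(left_slope + 1.0) < 1e-6
```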
Explained a 3-hour lecture in less than an hour.
I love your board work, but you should really have an image of the board without you in it, or just delay your walk into the picture after a second or two at the beginning so I can snag shot for my notes a bit easier, lol.
Noted! I'm starting to remember this for my new videos. Thanks!
Haha. I have dedicated a whole hard disk for ritvik’s data science videos. I just hope he is going to write a book or even better do an end to end data science course on coursera😍😍
It would be nicer if you talked about slack variables.
EUREKA!
Thank you !