If you have any questions, feel free to ask them in the comment section below 👇
This was really helpful. Thank you!
You're welcome!
Pretty great and concise... Perhaps it's worth mentioning that for linear regression you can find the minimum through a closed formula. This is somehow more interesting in the multidimensional case... but amusingly, finding the minimum involves inverting a matrix, and for very large dimensions it is faster to just use gradient descent instead of inverting the matrix. Another important point about all these machine learning algorithms is that finding an "exact" minimum isn't really relevant, because whatever minimum you find is a random variable anyway (you will find different minima for different sets of data points), so getting close enough to the minimum is good enough.
Great observation! You're absolutely right. In the context of linear regression, especially the multidimensional case, closed-form solutions like the normal equation, which involves a matrix inversion, can indeed be used to find the exact minimum. However, as you pointed out, the computational cost of inverting matrices, especially for large datasets, can make gradient descent the more efficient choice; this is something I will explain after I make a video about linear regression with multiple features. Your insight about the variability of the minimum across different sets of data points is also something I can cover in one of the next videos. Thanks for adding this valuable perspective to the discussion, and I shall explain both in the upcoming videos!
Looking forward to the next ones!
I just uploaded the video explaining the normal equation method. I would appreciate your feedback on the clarity of the explanation. Are there any areas that you think need further clarification or improvement?
@TheAIGuyExplains Thanks for keeping me posted... I'll comment on that video!
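To make the comparison in the thread above concrete, here is a minimal sketch (not from the video; the synthetic data and hyperparameters are made up for illustration) that fits the same least-squares problem two ways: the normal equation's closed-form solution and plain gradient descent. It also shows why "close enough" is fine in practice, since both land on essentially the same weights.

```python
# A minimal sketch comparing the normal equation with gradient descent
# on the same linear regression problem. Data and settings are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = X @ true_w + noise
n_samples, n_features = 200, 3
X = rng.normal(size=(n_samples, n_features))
X = np.hstack([np.ones((n_samples, 1)), X])  # prepend a bias column
true_w = np.array([2.0, -1.0, 0.5, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=n_samples)

# Normal equation: w = (X^T X)^{-1} X^T y.
# Solving this system costs roughly O(d^3) in the number of features d,
# which is why gradient descent wins for very high-dimensional problems.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)  # solve() is more stable than inv()

# Gradient descent on the mean squared error.
w_gd = np.zeros(X.shape[1])
lr = 0.1
for _ in range(2000):
    grad = (2 / n_samples) * X.T @ (X @ w_gd - y)  # gradient of the MSE
    w_gd -= lr * grad

# Both land on (approximately) the same minimum; since the data are noisy,
# that minimum is itself a random variable, so "close enough" is good enough.
print(w_closed)
print(w_gd)
```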
Thank you!
Great video!
Thank you!