Gradient Descent Implementation from Scratch in Python
- Published Sep 29, 2024
- In this video we show how you can implement the batch gradient descent and stochastic gradient descent algorithms from scratch in Python.
** SUBSCRIBE:
www.youtube.co...
You can find the Jupyter Notebook for this video on our Github repo here: github.com/end...
** Gradient descent for linear regression video: • Linear Regression with...
** Follow us on Instagram for more endless engineering:
/ endlesseng
** Like us on Facebook:
/ endlesseng
** Check us out on twitter:
/ endlesseng
The stochastic gradient descent implementation is not correct. You are supposed to shuffle the data and, at every iteration, pick only one random data point (or several, in the mini-batch case) for calculating the parameter update.
Please zoom into the notebook for better visibility
Thank you for the feedback! Will do
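For readers following along, here is one way the per-sample update with per-epoch shuffling could look. This is a sketch, not the notebook's code; the function and variable names are my own, and it assumes the same two-parameter linear model y_hat = theta_0 + theta_1 * x used in the video:

```python
import numpy as np

def stochastic_gradient_descent(x, y, alpha=0.01, epochs=50):
    """Sketch of SGD for the model y_hat = theta_0 + theta_1 * x.

    Shuffles the data every epoch, then updates the parameters
    one sample at a time in the shuffled order.
    """
    theta = np.zeros(2)
    rng = np.random.default_rng(0)  # seeded for reproducibility
    n = len(x)
    for _ in range(epochs):
        for i in rng.permutation(n):          # fresh shuffle each epoch
            x_bar = np.array([1.0, x[i]])     # augmented sample (1, x_i)
            y_hat = theta @ x_bar             # current prediction
            theta -= alpha * (y_hat - y[i]) * x_bar  # single-sample update
    return theta
```

Drawing a fresh permutation each epoch is what distinguishes this from simply looping over the dataset in order.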
Hey... Thanks so much for this video... But please, can you do the same with the Octave programming language?
Hey Kant, thanks for watching.
The reason I chose to go with Python is that it is used a lot in the data science / machine learning community. It is also in higher demand by companies compared to Octave. I will think about making some future videos with Octave, but for now I think it will mostly be Python.
how is h(xi) = theta transpose . x bar? Please explain. Thanks in advance
The formulation of the model is derived in detail in my Linear Regression video, see here the explanation for it th-cam.com/video/fkS3FkVAPWU/w-d-xo.html
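For anyone with the same question: x bar is just the input with a constant 1 prepended, so theta transpose times x bar expands to the familiar line equation theta_0 + theta_1 * x_i. A tiny illustration (the numeric values here are made up):

```python
import numpy as np

theta = np.array([2.0, 3.0])   # [theta_0 (intercept), theta_1 (slope)]
x_i = 5.0
x_bar = np.array([1.0, x_i])   # augmented input: prepend a constant 1

# theta^T @ x_bar == theta_0 * 1 + theta_1 * x_i
h = theta @ x_bar
print(h)  # 2 + 3*5 = 17.0
```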
Hey, this was extremely useful 😍😍😍, did you know??
I'm currently reading Python Machine Learning by Sebastian V.
I wish that all the code in this book was as clear as it is in your video.
Thank you for posting it
Thank you! I am glad you found this video clear and useful. Please let me know if there are other topics you would like to see videos on
No "while" loop in your stochastic gradient descent function??? What happened there?
Hi Ying, not sure I understand your question. There is a for loop in the stochastic gradient descent function
@Nutty Jedi The function used to split the data is sklearn.model_selection.train_test_split; that function shuffles the data by default. See the link below for the function's documentation on the sklearn page.
scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
Best explanation EVER!
SUBSCRIBED!!
Thank you for watching Rohan! I am glad you found the video useful
Thank you for the amazing explanation
Hi Rishabh,
You are most welcome! Thank you for watching!
Excellent! We absolutely need more implementations of these algorithms.
Thank you for watching! I'm glad you found this useful
what about a vectorized cost function? :D
Rachel Newell yeah, you can do this and it increases efficiency a lot. In the video above, to calculate the cost and the gradient, he loops over the data. You can vectorize it as follows (note the single quote as the transpose operator!): cost = (y - X*theta)'*(y - X*theta)/2N, where y is Nx1, X is Nx2 (the i-th row is (1, x_i)), and theta is 2x1. The gradient is then grad = (X'*X*theta - X'*y)/N, so the update rule is theta = theta - alpha*grad!
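The vectorized formulas above translate almost directly into NumPy. This is a sketch (function and variable names are illustrative); the gradient is divided by N so it matches the 1/2N scaling of the cost:

```python
import numpy as np

def cost_and_grad(theta, X, y):
    """Vectorized cost and gradient for linear regression.

    X is N x 2 (each row is (1, x_i)), y has N entries, theta has 2.
    """
    N = len(y)
    r = X @ theta - y                        # residual vector, length N
    cost = (r @ r) / (2 * N)                 # (y - X theta)^T (y - X theta) / 2N
    grad = (X.T @ X @ theta - X.T @ y) / N   # averaged gradient
    return cost, grad

# One gradient descent step would then be:
#   theta = theta - alpha * grad
```

Because there is no Python-level loop over the samples, the cost and gradient are computed in a few BLAS calls, which is much faster for large N.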
Very well explained.
It would be much appreciated if you could implement the structure from motion algorithm,
or suggest a good resource to learn how to do that.
Thanks
Also, I think gradient = np.array([1.0, X]) * (y_hat - y)
Sir, Thanks a lot.
Hi, many thanks for providing these materials -- for free! However, I didn't really understand the implementation of the SGD, as it seems you didn't shuffle and randomly choose, but looped over the entire dataset. Kindly clarify.
Hi Obinna, thanks for your question.
I used the sklearn.model_selection.train_test_split functionality to split the data. That functionality has an option to shuffle the data, and it is true by default. See the documentation here: scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
Can you please provide me your email? I have an error in my code and it's been killing me for days.
Hi Annie, you can send an email to endlessengineeringphd@gmail.com
@7:11 What do you mean by params[:,iteration] = params?
Please explain this line.
Thank you
Hi Kabilan, thanks for your questions.
This line stores the value of the parameters at every iteration; since there is more than one parameter, I use [:, iteration] to store a column of parameters. You can take a look at NumPy array slicing to get more information on that.
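A small standalone illustration of that slicing pattern (the array names and values here are made up, not taken from the notebook):

```python
import numpy as np

num_params, num_iterations = 2, 5
params_history = np.zeros((num_params, num_iterations))  # one column per iteration

params = np.array([0.5, -1.0])
for iteration in range(num_iterations):
    # store the current parameter vector as column `iteration`
    params_history[:, iteration] = params
    params = params + 0.1  # stand-in for a gradient descent update

# column 0 holds the initial params [0.5, -1.0];
# later columns track how they evolve over iterations
print(params_history)
```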
How can I turn this into a polynomial model? I need to increase the complexity of the model and see the cost decrease as the complexity goes higher.
You can still use gradient descent with a general n-order polynomial model, you just have to collect your data in a form that fits y = (theta^T) X to do for example batch gradient descent. The wikipedia page on polynomial regression has a good example en.wikipedia.org/wiki/Polynomial_regression
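One way to collect the data in that form is to build a polynomial design matrix, sketched here with NumPy's vander (this is an illustration, not code from the video):

```python
import numpy as np

def polynomial_design_matrix(x, degree):
    """N x (degree+1) matrix whose i-th row is (1, x_i, x_i^2, ..., x_i^degree)."""
    return np.vander(x, N=degree + 1, increasing=True)

# With this X, the model is still linear in theta: y_hat = X @ theta,
# so batch gradient descent works unchanged; only theta grows to degree+1 entries.
x = np.array([0.0, 1.0, 2.0])
X = polynomial_design_matrix(x, degree=2)
print(X)
# [[1. 0. 0.]
#  [1. 1. 1.]
#  [1. 2. 4.]]
```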
@10:35 Is it not "params = params - alpha * gradient/num_samples"?
Only if you make gradient = np.array([1.0, X]) * (y_hat - y). But that is not how I have defined the problem; please see the start of the video for the mathematical description.
I love the format !
Glad you enjoyed it! Thanks for watching
Hi
Thanks for your nice illustration
In your code, you assumed that we already know the derivative of the cost function. Could you please show an example with a more complicated function, and then show us how we can write Python code to differentiate it?
Thanks
Hi Mohammed, thanks for your question.
Please checkout my video Linear Regression with Gradient Descent + Least Squares for all the mathematical details
video here --> th-cam.com/video/fkS3FkVAPWU/w-d-xo.html
Endless Engineering
Thanks for your response,
I already checked that video, and thank you again for the nice job!
My question is how to write Python code to differentiate the cost function!
@@mjar3799 I am not sure I understand your question. Is it that you want Python code to compute the derivative of any cost function? That is a little out of scope here, since for linear regression we assume the cost function has a certain structure, which is why the derivative is computed mathematically. If you want code that computes the derivative of any function, you can try something like SymPy www.sympy.org/en/index.html
If you want numerical computation of the derivative, I would recommend just using the tools in SciPy or NumPy.
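As a quick illustration of the numerical route, here is a hand-rolled central difference (the cost function below is a made-up example, not the one from the video; for serious use you might reach for SciPy's differentiation tools instead):

```python
def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Example: cost(theta) = (3 - 2*theta)^2 / 2, whose exact derivative is 4*theta - 6
cost = lambda t: (3 - 2 * t) ** 2 / 2
print(numerical_derivative(cost, 1.5))  # ~0.0: theta = 1.5 minimizes this cost
print(numerical_derivative(cost, 2.0))  # ~2.0
```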
what about a vectorized cost function? :D
Hey Rachel! Do you mean a cost function that generates a cost vector? That would certainly be mathematically possible, but the math would get a little messy. And I am not exactly sure what that would buy you
A vectorised cost function apparently does all the calculations in one go rather than iteratively, so it is much faster for larger amounts of data.
Also, I'm a noob in ML... just started studying, but I read this in my book.
Ty man, any chance to implement fuzzy c-means (FCM)? I'm struggling to understand it and to implement the kernel fuzzy c-means (KFCM). Nice project, ty again!
Thanks! Glad you enjoyed the video. I do not have one planned for FCM soon, but I will put it on my list!
Nice video, subbed
Awesome! Thank you!
I am glad you enjoyed the video