Linear Regression in Python - Machine Learning From Scratch 02 - Python Tutorial

  • Premiered Sep 29, 2024
  • Get my Free NumPy Handbook:
    www.python-eng...
    In this Machine Learning from Scratch Tutorial, we are going to implement the Linear Regression algorithm, using only built-in Python modules and numpy. We will also learn about the concept and the math behind this popular ML algorithm.
    ~~~~~~~~~~~~~~ GREAT PLUGINS FOR YOUR CODE EDITOR ~~~~~~~~~~~~~~
    ✅ Write cleaner code with Sourcery: sourcery.ai/?u... *
    📓 Notebooks available on Patreon:
    / patrickloeber
    ⭐ Join Our Discord : / discord
    If you enjoyed this video, please subscribe to the channel!
    The code can be found here:
    github.com/pat...
    Further readings:
    ml-cheatsheet....
    ml-cheatsheet....
    You can find me here:
    Website: www.python-eng...
    Twitter: / patloeber
    GitHub: github.com/pat...
    #Python #MachineLearning
    ----------------------------------------------------------------------------------------------------------
    * This is a sponsored link. Clicking on it costs you nothing extra; instead, you will support me and my project. Thank you so much for the support! 🙏

Comments • 141

  • @vazhamamatsashvili7791
    @vazhamamatsashvili7791 3 years ago +12

    In case anyone's wondering why increasing the learning rate decreased the cost, it's because the number of iterations is very low in order for gradient descent to converge. So if lower learning rate gives you higher cost, you should increase the number of iterations.
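To see the tradeoff described above, here is a minimal sketch (synthetic data; the model and hyperparameters are illustrative, not taken from the video): with the same small iteration budget, the larger learning rate reaches a lower cost, and the smaller rate only catches up once it is given far more iterations.

```python
import numpy as np

def fit_gd(X, y, lr, n_iters):
    # Plain batch gradient descent for y = w*x + b
    n = len(y)
    w, b = 0.0, 0.0
    for _ in range(n_iters):
        y_pred = w * X + b
        dw = (1 / n) * np.dot(X, y_pred - y)
        db = (1 / n) * np.sum(y_pred - y)
        w -= lr * dw
        b -= lr * db
    return w, b

def mse(X, y, w, b):
    return np.mean((y - (w * X + b)) ** 2)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 200)
y = 3 * X + 1 + rng.normal(0, 0.1, 200)

# Same small iteration budget: the larger learning rate ends at a lower cost
slow = mse(X, y, *fit_gd(X, y, lr=0.01, n_iters=100))
fast = mse(X, y, *fit_gd(X, y, lr=0.5, n_iters=100))
# Many more iterations let the small learning rate catch up
slow_long = mse(X, y, *fit_gd(X, y, lr=0.01, n_iters=20000))
print(slow, fast, slow_long)
```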

  • @junyanyao6977
    @junyanyao6977 3 years ago

    Very good explanation! When I tried the Boston Housing dataset from sklearn with this regression module, it threw a runtime warning on self.weights -= self.learning_rate * dw. When I reduce the iterations from 1000 to 100, it works, but with a terrible MSE.

    • @patloeber
      @patloeber  3 years ago

      What warning do you get? Maybe you should scale the dataset and clip values that are too large or too close to 0.

  • @conradlewis5689
    @conradlewis5689 2 years ago

    Linear algebraic method:
    ///////////////////////////////////////////////////////////////////////////
    # This is linear least squares (the normal equation)
    b = np.linalg.inv(np.dot(X.T, X)).dot(X.T).dot(y)
    # predict using the coefficients
    y_predicted = X.dot(b)
    ///////////////////////////////////////////////////////////////////////////
    Then you have your solution. I was trying to get the video's implementation to a better overall mean squared error than the linear algebra formulation, but could not do it. Anyway, this is just another way if anyone is looking. Great video series, btw! I have enjoyed going through it.

  • @dr_flunks
    @dr_flunks 1 year ago +5

    This is a really nice summary of the fundamentals. Perfect for brushing up after learning about ML 5 years ago. Really solid.

  • @psgpyc
    @psgpyc 3 years ago +7

    He sounds like Janice from FRIENDS 😂

  • @GodzillaTheInventor
    @GodzillaTheInventor 6 months ago +1

    Thank you very much! I am from China. Not only do you give us good tutorials, but your English is also very clear, so I can improve my English listening skills. Thank you very much! The one true god.

  • @paulaperdomo7921
    @paulaperdomo7921 3 years ago +4

    Hello, I was wondering why, at minute 12:25, when calculating the dw parameter, you don't use np.sum if the formula includes the sum. Also, is there a reason it's included with the parameter db?

    • @ashewood3918
      @ashewood3918 3 years ago +1

      It's because the dot product already performs the summation, so the np.sum command would be redundant. Since db only involves vector subtraction, it needs the np.sum command.
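To make this concrete, a small sketch with hypothetical numbers, showing that the dot product already performs the per-sample summation that np.sum would otherwise provide:

```python
import numpy as np

# Hypothetical small example: X has 4 samples and 2 features,
# r stands for the residual vector (y_predicted - y).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])
r = np.array([0.5, -1.0, 2.0, 0.25])

# The dot product sums over the samples as it multiplies ...
dw_dot = X.T.dot(r)
# ... which matches summing the per-sample terms explicitly.
dw_sum = np.sum(X * r[:, None], axis=0)
print(dw_dot, dw_sum)  # identical vectors
```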

  • @MahdiMashayekhi
    @MahdiMashayekhi 2 years ago +1

    Thank you very much sir❤❤

  • @hoangphanhuy1992
    @hoangphanhuy1992 2 years ago

    regressor = LinearRegression()
    regressor.fit(X_train, y_train)
    predicted = regressor.predict(X_test)

    def mse(y_true, predicted):
        return np.mean((y_true - predicted)**2)

    mse_value = mse(y_test, predicted)

    ---> 11 return np.mean((y_true - predicted)**2)
         12 mse_value = mse(y_test, predicted)
         13 print(mse_value)
    ValueError: operands could not be broadcast together with shapes (20,) (20,80)
    I don't know why I can't run this code; I followed exactly what he did. Can anybody help me? Thanks.

  • @soumyas.tripathi5534
    @soumyas.tripathi5534 4 years ago +10

    Sir, you deserve more than you have right now! You made it possible for a 13-year-old guy to learn those high-level concepts. Loads of love from India!

    • @patloeber
      @patloeber  4 years ago +3

      This is nice to hear :) Thanks!

  • @rxz8862
    @rxz8862 2 years ago

    The second derivative (df/db) is wrong, because d/db( sum_{i=1 to N} b ) = N

  • @osiris1102
    @osiris1102 3 years ago +1

    Why didn't you use np.sum in dw?

    • @patloeber
      @patloeber  3 years ago +2

      because we need the dot product for the vectors (which includes a sum). db (bias) is only a single value so no need for the dot product

    • @osiris1102
      @osiris1102 3 years ago

      @@patloeber thanks! Now it makes sense.

  • @mouadakharraz3460
    @mouadakharraz3460 4 years ago +2

    Thank you so much for those great tutorials. Can you please re-explain line 20 from the fit method? I can't understand why we transpose the X matrix...

    • @patloeber
      @patloeber  4 years ago +9

      Hi. Let's say we multiply two matrices A and B. Then the number of columns of A must equal the number of rows of B.
      The size formula is: A(lxm) * B(mxn) = C(lxn)
      In our example:
      np.dot(X, self.weights) --> (80,1) * (1,) = (80,) --> for every sample we get one value (the 1 is because we have 1 feature)
      and in the transposed case:
      np.dot(X.T, (y_predicted - y)) --> (1,80) * (80,) = (1,) --> for every feature we get one value
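Those shapes can be checked with a quick sketch (dummy arrays; only the shapes matter here):

```python
import numpy as np

# Dummy data matching the explanation: 80 samples, 1 feature
X = np.ones((80, 1))
weights = np.ones(1)
y = np.zeros(80)

y_predicted = np.dot(X, weights)         # (80, 1) @ (1,)  -> (80,)
dw = np.dot(X.T, (y_predicted - y))      # (1, 80) @ (80,) -> (1,)
print(y_predicted.shape, dw.shape)
```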

    • @mouadakharraz3460
      @mouadakharraz3460 4 years ago +1

      @@patloeber Thank you, now that makes sense to me.

  • @valerysalov8208
    @valerysalov8208 3 years ago +1

    Why is there no 2 in the code for df/dw? Also, why use the transpose for one and the dot product when calculating the prediction? And what about this: www.statisticshowto.com/probability-and-statistics/regression-analysis/find-a-linear-regression-equation/ ? Is this a different way to solve linear regression?

    • @andresfernandoaranda5498
      @andresfernandoaranda5498 3 years ago +2

      1. Sometimes people multiply the MSE by 1/2 so it cancels out when you take the derivative (datascience.stackexchange.com/questions/52157/why-do-we-have-to-divide-by-2-in-the-ml-squared-error-cost-function)
      2. For this simple case you can use LSE as a different approach to optimization methods like gradient descent (as shown in the video): towardsdatascience.com/https-medium-com-chayankathuria-optimization-ordinary-least-squares-gradient-descent-from-scratch-8b48151ba756

  • @oxanakovalenko457
    @oxanakovalenko457 4 years ago +1

    Hmmm... I wonder why the MSE is so high; the graph looks perfectly fine. Anyway, great tutorials, thank you so much! Definitely waiting for gradient boosting tutorials!

    • @patloeber
      @patloeber  4 years ago

      Thank you very much! I think it's so high because the range is not normalized (e.g. y ranges from -160 to 200), so when you sum over all the test points and also square the errors, it gets high. MSE is always sample dependent.

  • @msrahman2010
    @msrahman2010 2 years ago

    I don't see the use of m1 and m2 at the end when you plot the regression line

  • @Ayush-cn7tq
    @Ayush-cn7tq 3 months ago

    I think you have applied stochastic gradient descent

  • @eric92920
    @eric92920 2 years ago +1

    Hello; while this video is mostly very clear and easy to understand, I do have one question, namely: doesn't line 23 perform the same action for every element in self.weights? If this is the case, and since every element in self.weights starts as 0, then why do you use an array instead of a constant?

    • @dongtrung6685
      @dongtrung6685 1 year ago

      hey man, did you figure out the answer?

  • @global_southerner
    @global_southerner 4 years ago +1

    J'(w,b) instead of J'(m,b)

    • @patloeber
      @patloeber  4 years ago

      Correct, thanks for the hint!

  • @softknk1422
    @softknk1422 3 years ago +1

    Nice video, dude! Can you tell me which font you use in your editor, please?

    • @patloeber
      @patloeber  3 years ago

      In this video it's the Monokai theme; now I use the Night Owl theme

  • @tscoms5472
    @tscoms5472 1 year ago

    I have seen MSE calculated as (1/2n)... What's the difference between using (1/n) and (1/2n)?

  • @shubhamchoudhary5461
    @shubhamchoudhary5461 2 years ago

    Why is the 2 missing in the bias formula? 🤔🤔
    Can you please help me understand?

  • @fandibataineh4586
    @fandibataineh4586 2 years ago

    I think you forgot to add a termination criterion to your for loop:
    you should add a break condition if |new_mse - old_mse| < epsilon
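A hedged sketch of such a stopping rule (the data, epsilon value, and helper function are illustrative, not from the video):

```python
import numpy as np

def fit_with_early_stop(X, y, lr=0.1, n_iters=10000, epsilon=1e-10):
    # Batch gradient descent that breaks out once the MSE stops improving
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    old_mse = np.inf
    for i in range(n_iters):
        y_pred = X @ w + b
        dw = (1 / n_samples) * (X.T @ (y_pred - y))
        db = (1 / n_samples) * np.sum(y_pred - y)
        w -= lr * dw
        b -= lr * db
        new_mse = np.mean((y - (X @ w + b)) ** 2)
        if abs(new_mse - old_mse) < epsilon:
            break  # cost barely moved: treat as converged
        old_mse = new_mse
    return w, b, i + 1

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 1))
y = 2 * X[:, 0] + 0.5
w, b, used = fit_with_early_stop(X, y)
print(w, b, used)  # stops long before the 10000-iteration cap
```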

  • @tusharshukla9361
    @tusharshukla9361 2 years ago

    Hey, I have a question: what is the 'predicted' in mse_value = mse(y_test, predicted)?

  • @dakshagiwal2593
    @dakshagiwal2593 2 years ago

    Does this not work with datasets that have more than 3 features? The code is giving me a nan array as the prediction.

  • @prigithjoseph7018
    @prigithjoseph7018 2 years ago

    I have a question: how can we add mse as a method of the LinearRegression class?

  • @kloki2k1
    @kloki2k1 2 years ago

    Actually, the gradient is the direction of steepest ascent, so to descend to the minimum we go in the opposite direction of the gradient.

  • @imdadood5705
    @imdadood5705 3 years ago

    Patrick, I was wondering whether scikit-learn uses gradient descent for its linear regression? Any comments on that?

    • @patloeber
      @patloeber  3 years ago +1

      sklearn's LinearRegression uses an Ordinary Least Squares solver, but they also have an SGDRegressor that uses gradient descent
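A small sketch contrasting the two sklearn estimators mentioned in this reply (synthetic data; the hyperparameters are illustrative): both recover essentially the same coefficients on well-scaled data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor

# Synthetic data: 3 features with known coefficients and intercept
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 4.0 + rng.normal(0, 0.01, size=500)

ols = LinearRegression().fit(X, y)  # closed-form OLS solver
sgd = SGDRegressor(max_iter=5000, tol=1e-8, random_state=0).fit(X, y)  # gradient descent

print(ols.coef_)
print(sgd.coef_)  # nearly identical to the OLS coefficients
```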

  • @ahmarhussain8720
    @ahmarhussain8720 2 years ago

    Can you please explain what you mean by features and samples in the linear regression part?

  • @MicahJohns
    @MicahJohns 3 years ago

    Subbed, thanks for the video! A bit over my head (for now), but I'm getting there.

    • @patloeber
      @patloeber  3 years ago

      Thanks, and great to hear you're getting into ML! No worries, I know it can take some time :)

  • @tassoskat8623
    @tassoskat8623 1 year ago

    Excellent work!
    To evaluate our model we could also use the coefficient of determination (R**2), which is not affected by the scale of our measurements the way MSE is.

  • @VeyselDeste-p4l
    @VeyselDeste-p4l 11 months ago

    Is this an ANN example?

  • @psaikiranyadav
    @psaikiranyadav 3 years ago

    Where can I find the Jupyter notebook that was explained?

    • @patloeber
      @patloeber  3 years ago +1

      ML Notebooks are available on Patreon

  • @yangthomas5064
    @yangthomas5064 8 months ago

    Such nice content, it made me so satisfied. Love from China.

  • @dhananjaykansal8097
    @dhananjaykansal8097 4 years ago +1

    ModuleNotFoundError: No module named linear_regression. It always shows this.

    • @patloeber
      @patloeber  4 years ago

      Your file must be named linear_regression.py, and it must be in the same folder

    • @dhananjaykansal8097
      @dhananjaykansal8097 4 years ago

      @@patloeber It is the same. Everything is good. I don't understand what's happening sir.

    • @patloeber
      @patloeber  4 years ago +1

      ​@@dhananjaykansal8097 Try to write "from .logistic_regression import ...", so prefix it with a dot "."

    • @patloeber
      @patloeber  4 years ago +1

      You can also copy my repo from github and try to run this with the exact same folder structure. Or maybe this is helpful: stackoverflow.com/questions/4142151/how-to-import-the-class-within-the-same-directory-or-sub-directory

    • @dhananjaykansal8097
      @dhananjaykansal8097 4 years ago

      @@patloeber Alright sir. I shall try that tomorrow for sure. I'm so glad that you're replying

  • @BrianPondiGeoGeek
    @BrianPondiGeoGeek months ago

    Wonderful.

  • @nirmalyabakshi4001
    @nirmalyabakshi4001 3 years ago

    In linear_regression_test.py, what are noise and random_state?

  • @ashish-blessings
    @ashish-blessings 2 years ago

    Not just the content, I love your accent. :D

  • @valerysalov8208
    @valerysalov8208 3 years ago

    Why is there no 2 in the code for df/dw?

  • @vikramm4967
    @vikramm4967 3 years ago

    N is also a constant when calculating db and dw, like the 2... so can we omit N when calculating dw?

    • @kwasiopoku8589
      @kwasiopoku8589 2 years ago

      N is not necessarily a constant. It depends on the number of data points you have. If you remove N from the equation, you cannot apply the package to different datasets.

  • @shivamdubey4783
    @shivamdubey4783 3 years ago

    Man, great, but please make it in one tab; it's very confusing to see which function you have called.

    • @patloeber
      @patloeber  3 years ago

      thanks for the feedback!

  • @Hugo-go6yq
    @Hugo-go6yq 10 months ago

    Super helpful, thanks a lot!

  • @12WeMet1
    @12WeMet1 4 years ago

    Awesome vid. I would love to access that exact notebook with the actual math equations, but I don't see it in your GitHub. Is it possible to access that?

    • @patloeber
      @patloeber  4 years ago

      Not yet. I am planning to release them on my website

  • @thecros1076
    @thecros1076 4 years ago

    Can't we equate the derivative directly to zero and calculate the minimum? Can anybody answer this?

    • @patloeber
      @patloeber  4 years ago

      See my comment on the logistic regression video.

  • @prajganesh
    @prajganesh 4 years ago

    Great videos! Keep doing it. Just a quick question: with packages like PyTorch and TensorFlow making a lot of these things very simple, what is the thought process for implementing this in plain Python? Is it for better understanding, or do you think it's a better way to learn?

    • @patloeber
      @patloeber  4 years ago +5

      Implementing from scratch is a great way to really understand the concepts behind those algorithms!

  • @ncheymbamalu4013
    @ncheymbamalu4013 2 years ago

    Thanks for the vid. Great work. One question, however: what if you have more than two weights/model parameters, w, to optimize? Couldn't you just use the normal equation, ((X.T.dot(X))**-1)*(X.T.dot(y)), to solve for w... or np.linalg.solve(X.T.dot(X), X.T.dot(y))?
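A minimal sketch of the normal-equation approach suggested in this comment, on hypothetical multi-feature data (np.linalg.solve is generally preferred over forming the inverse explicitly):

```python
import numpy as np

# Synthetic data: 4 features, with the bias folded in as a column of ones
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
Xb = np.hstack([X, np.ones((200, 1))])
true_w = np.array([2.0, -1.0, 0.5, 3.0, 0.25])  # last entry plays the role of the bias
y = Xb @ true_w

# Normal equation: solve (X^T X) w = X^T y without computing an explicit inverse
w = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
print(w)  # recovers true_w
```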

  • @emojiman745
    @emojiman745 3 years ago

    13:15 Why use X.T and not X?

  • @abhishekbhardwaj7214
    @abhishekbhardwaj7214 3 years ago

    Short and sweet :), extremely helpful.

  • @thanhquocbaonguyen8379
    @thanhquocbaonguyen8379 3 years ago

    Thank you, sir, you just saved me from my assignment this semester. The instructions were very clear and visual. Please keep producing videos!

  • @manideepgupta2433
    @manideepgupta2433 4 years ago

    Thank you very much for such wonderful explanations....!! I hope you make more such videos...

  • @flamboyantperson5936
    @flamboyantperson5936 4 years ago

    Bro, I really really really need you. Please make more videos on machine learning, and please make them more frequently. You take 1 week to make 1 video, which is not fair. Please, 2-3 videos a week. It's my humble request.

    • @patloeber
      @patloeber  4 years ago +2

      I hear your request, and I'll try to increase my frequency at some point in the future! But I still have a normal day job, and machine learning videos take time and effort :)

    • @flamboyantperson5936
      @flamboyantperson5936 4 years ago

      @@patloeber Thank you so much, and please continue to use OOP concepts to build models. I want to learn so many things from you.

  • @ScriptureFirst
    @ScriptureFirst 3 years ago

    Dude. That switching dark to *BRIGHT* mode 😳

    • @patloeber
      @patloeber  3 years ago +1

      you mean from jupyter notebook to VS Code and vice versa? :D

    • @ScriptureFirst
      @ScriptureFirst 3 years ago

      @@patloeber ya, I sometimes watch you after class at night when the lights are low & I was blown away. lol. I'm over it, but that was a blast! Thank you for the work you put into this :)

  • @advaithsahasranamam6170
    @advaithsahasranamam6170 1 year ago

    Beautiful explanation!

  • @AdEngineer
    @AdEngineer 4 years ago

    Hi. Is it possible to share the Jupyter notebooks?

    • @patloeber
      @patloeber  4 years ago

      Hi. I'm planning to release them on my website when I find the time :)

  • @carltondaniel8966
    @carltondaniel8966 4 years ago

    Thank god, wonderful explanation

  • @pynation
    @pynation 4 years ago

    This worked really well on this dataset, but it gives an error on a pandas dataframe, even after I have converted it to numpy arrays.

    • @patloeber
      @patloeber  4 years ago

      Which error? Common errors with other datasets are incorrect types or an incorrect shape of your data

    • @pynation
      @pynation 4 years ago

      @@patloeber After a few iterations the value of the weights becomes an array of nan, and so does the bias. The same thing works fine with sklearn's linear regression. I tried fixing it but couldn't, so I finally had to ask you. Thanks for responding.

    • @damianwysokinski3285
      @damianwysokinski3285 4 years ago

      Faraz Khan, does your pandas dataframe contain NaNs?

    • @patloeber
      @patloeber  4 years ago

      Sorry I missed your response earlier, but yes as Damian said it is very likely that your dataset has invalid or incorrect data. My algorithm is not optimized and does not include error checks like in sklearn. You may also try applying a standard scaler before fitting the data.

    • @pynation
      @pynation 4 years ago

      @@patloeber Hey, I have made some changes to both linear regression and KNN to make them compatible with both dataframes and sklearn's data. Can I make a pull request?

  • @kougamishinya6566
    @kougamishinya6566 3 years ago

    This helped me so much, you explain it so clearly and your code is really neat and easy to understand! Thanks so much! Subbed, liked and will be using your videos as my ML bible from now on XD

    • @patloeber
      @patloeber  3 years ago +1

      Awesome, thank you!

  • @Shaitender
    @Shaitender 3 years ago

    Thanks for your amazing explanation

    • @patloeber
      @patloeber  3 years ago

      Glad you like it!

  • @satyakikc9152
    @satyakikc9152 3 years ago

    Thank you so much for the video. Keep up the good work!

  • @Christopher_Tron
    @Christopher_Tron 3 years ago

    Couldn't have done this without your help! And I needed this, thanks man

    • @patloeber
      @patloeber  3 years ago

      Glad it was helpful!

  • @angadhs
    @angadhs 4 years ago

    Which editor is best for Python scripts?

    • @patloeber
      @patloeber  4 years ago +1

      VS Code or PyCharm

  • @szenghoe
    @szenghoe 3 years ago

    Why dJ/dw = dw at 4:48?

    • @patloeber
      @patloeber  3 years ago +1

      dw should just be short for dJ/dw so I can use this expression in the code

  • @rickymacharm9867
    @rickymacharm9867 3 years ago

    Good work with detailed explanations. Always happy when I am struggling with a concept to see you have a video out even remotely related to what I am working on.

    • @patloeber
      @patloeber  3 years ago

      That’s nice to hear :)

  • @patrickali1987
    @patrickali1987 4 years ago

    Very nice tutorial here. Can you please post the code as well?

    • @patloeber
      @patloeber  4 years ago

      I already did. You can find the link in the video description :)

  • @ramakanthrama8578
    @ramakanthrama8578 4 years ago

    Hi, I can't find the IPython notebook in the GitHub link.

    • @patloeber
      @patloeber  4 years ago +1

      It's not yet on GitHub, but maybe I'll add it in the future

    • @ramakanthrama8578
      @ramakanthrama8578 4 years ago

      @@patloeber Can you please make a video on gradient boosting as well, from scratch?
      Thanks :)

  • @linlinsun2833
    @linlinsun2833 4 years ago

    Thanks for your amazing explanations of all the ML algorithms; they helped me a lot!

    • @patloeber
      @patloeber  4 years ago

      Glad it is helpful :)

  • @muhammadzubairbaloch3224
    @muhammadzubairbaloch3224 4 years ago

    Really good. I found the material really informative.

  • @NairodTheBeast
    @NairodTheBeast 4 years ago

    Wonderful video, thank you!

    • @patloeber
      @patloeber  4 years ago

      Glad you like it

  • @sanjibyadav7973
    @sanjibyadav7973 4 years ago

    Thanks. This code worked for most datasets, but it returns nan for the Boston dataset and returned different weights for the diabetes dataset. Using the closed-form normal equation instead of gradient descent gives the correct weights and bias. Why so?

    • @patloeber
      @patloeber  4 years ago +1

      My code is not optimized at all and might not always produce good results. Also try applying feature normalization before passing it to the algorithm. This usually improves performance here a lot

    • @sanjibyadav7973
      @sanjibyadav7973 4 years ago

      @@patloeber Thanks for the reply

    • @vikramm4967
      @vikramm4967 3 years ago

      Yes... I tried the Boston dataset and it didn't work! Did you find any method to make it work?

    • @sanjibyadav7973
      @sanjibyadav7973 3 years ago

      @@vikramm4967 Use the closed-form solution, which has no parameters to tweak and hence always converges and gives the right solution

    • @vikramm4967
      @vikramm4967 3 years ago

      @@sanjibyadav7973 But that has higher time complexity, right?
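On the scaling advice given earlier in this thread, a minimal sketch (hypothetical data; the update rule follows the video's batch gradient descent) of why standardizing features can keep the weights from overflowing to nan:

```python
import numpy as np

def run_gd(data, y, lr=0.01, n_iters=100):
    # Batch gradient descent update along the lines of the video's fit method
    w, b = np.zeros(data.shape[1]), 0.0
    for _ in range(n_iters):
        y_pred = data @ w + b
        w = w - lr * (1 / len(y)) * (data.T @ (y_pred - y))
        b = b - lr * (1 / len(y)) * np.sum(y_pred - y)
    return w, b

# Hypothetical badly scaled feature: values in the hundreds of thousands
rng = np.random.default_rng(3)
X = rng.uniform(1e5, 5e5, size=(100, 1))
y = 0.002 * X[:, 0] + 10.0

# Standardize to zero mean and unit variance before fitting
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

with np.errstate(over="ignore", invalid="ignore"):
    w_raw, _ = run_gd(X, y)        # diverges: the weights overflow to inf/nan
w_scaled, _ = run_gd(X_scaled, y)  # stays finite and well behaved
print(np.isfinite(w_raw).all(), np.isfinite(w_scaled).all())
```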

  • @verberaunt
    @verberaunt 3 years ago

    Thank you very much!

  • @tassoskat8623
    @tassoskat8623 4 years ago

    Wow

  • @hitashukanjani4430
    @hitashukanjani4430 4 years ago

    What if we have more features?

    • @patloeber
      @patloeber  4 years ago

      Hi, the code should work for more features, too :) I just used one dimension so that we can have a simple plot of the results

  • @shatandv
    @shatandv 3 years ago

    That was great, thanks!

    • @patloeber
      @patloeber  3 years ago

      Glad you liked it!

  • @nikolayandcards
    @nikolayandcards 4 years ago

    This is gold

  • @johnparker2486
    @johnparker2486 4 years ago

    Very Helpful. Keep it up