Linear Regression in Python - Machine Learning From Scratch 02 - Python Tutorial

  • Published Dec 22, 2024

Comments • 141

  • @GodzillaTheInventor
    @GodzillaTheInventor 9 months ago +2

    Thank you very much! I am from China; not only do you give us a good tutorial, but your English is also very clear, so I can improve my English listening as well. Thank you very much! The one true god.

  • @dr_flunks
    @dr_flunks 1 year ago +5

    this is a really nice summary of the fundamentals. perfect for brushing up after last learning about ML 5 years ago. really solid.

  • @soumyas.tripathi5534
    @soumyas.tripathi5534 4 years ago +10

    Sir, you deserve more than you have right now! You taught a 13-year-old guy these high-level concepts. Loads of love from India!

    • @patloeber
      @patloeber 4 years ago +3

      This is nice to hear :) Thanks!

  • @vazhamamatsashvili7791
    @vazhamamatsashvili7791 3 years ago +12

    In case anyone's wondering why increasing the learning rate decreased the cost: it's because the number of iterations is too low for gradient descent to converge. So if a lower learning rate gives you a higher cost, you should increase the number of iterations.
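The effect described in this comment is easy to reproduce. A minimal NumPy sketch (the toy data, learning rate, and iteration counts are illustrative, not the video's values): with a small learning rate, 100 iterations leave gradient descent far from converged, while more iterations at the same rate bring the cost down.

```python
import numpy as np

# Toy 1-D least-squares problem, fitted with batch gradient descent on MSE.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 1, size=100)

def fit(lr, n_iters):
    w, b = 0.0, 0.0
    n = len(y)
    for _ in range(n_iters):
        y_hat = w * X[:, 0] + b
        dw = (1 / n) * np.dot(X[:, 0], y_hat - y)   # gradient w.r.t. weight
        db = (1 / n) * np.sum(y_hat - y)            # gradient w.r.t. bias
        w -= lr * dw
        b -= lr * db
    return np.mean((y - (w * X[:, 0] + b)) ** 2)    # final MSE

# Too few iterations for this learning rate: cost still high.
mse_short = fit(lr=0.001, n_iters=100)
# Same learning rate, more iterations: cost drops.
mse_long = fit(lr=0.001, n_iters=10_000)
assert mse_long < mse_short
```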

  • @thanhquocbaonguyen8379
    @thanhquocbaonguyen8379 3 years ago

    thank you sir you just saved me from my assignment this semester. the instructions were very clear and visual. please keep producing videos!

  • @paulaperdomo7921
    @paulaperdomo7921 3 years ago +4

    Hello, I was wondering why at minute 12:25, when calculating the dw parameter, you don't use np.sum even though the formula includes a sum. Also, is there a reason it's included with the parameter db?

    • @ashewood3918
      @ashewood3918 3 years ago +1

      It's because the dot product already performs the summation, so the np.sum call would be redundant. Since db only involves a vector subtraction, it still needs np.sum.
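The point that the dot product already performs the sum can be checked directly; a short NumPy sketch (shapes are illustrative):

```python
import numpy as np

# np.dot(X.T, r) sums over the samples for each feature, so an extra
# np.sum would be redundant; db has no dot product, so it needs np.sum.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 3))   # 80 samples, 3 features
r = rng.normal(size=80)        # residuals y_predicted - y

dw_dot = np.dot(X.T, r)                                       # shape (3,)
dw_explicit = np.array([np.sum(X[:, j] * r) for j in range(3)])
assert np.allclose(dw_dot, dw_explicit)

db = np.sum(r)                 # a single scalar for the bias
```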

  • @rickymacharm9867
    @rickymacharm9867 4 years ago

    Good work with detailed explanations. Always happy when I am struggling with a concept to see you have a video out even remotely related to what I am working on.

    • @patloeber
      @patloeber 4 years ago

      That’s nice to hear :)

  • @tassoskat8623
    @tassoskat8623 1 year ago

    Excellent work!
    To evaluate our model we could also use the coefficient of determination (R**2), which, unlike MSE, is not affected by the scale of our measurements.
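A minimal sketch of R**2, assuming the usual definition 1 - SS_res/SS_tot: scaling the measurements changes MSE but leaves R**2 untouched.

```python
import numpy as np

def r2_score(y_true, y_pred):
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.9])

# Scaling both targets and predictions by 100 multiplies MSE by 10,000
# but leaves R^2 unchanged, since both sums of squares scale equally.
assert np.isclose(r2_score(y_true, y_pred),
                  r2_score(100 * y_true, 100 * y_pred))
```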

  • @yangthomas5064
    @yangthomas5064 11 months ago

    such a nice content , it made me so satisfied. love from China

  • @Christopher_Tron
    @Christopher_Tron 3 years ago

    Couldn't have done this without your help! And I needed this, thanks man

    • @patloeber
      @patloeber 3 years ago

      glad it was helpful!

  • @abhishekbhardwaj7214
    @abhishekbhardwaj7214 3 years ago

    Short and sweet :), extremely helpful.

  • @ashish31416
    @ashish31416 2 years ago

    Not just the content, I love your accent. :D

  • @mouadakharraz3460
    @mouadakharraz3460 5 years ago +2

    thank you so much for those great tutorials. Can you please re-explain line 20 of the fit method? I can't understand why we transpose the X matrix...

    • @patloeber
      @patloeber 5 years ago +9

      Hi. Let's say we multiply two matrices A and B. Then the number of columns of A must equal the number of rows of B.
      The size formula is: A(l x m) * B(m x n) = C(l x n)
      In our example:
      np.dot(X, self.weights) --> (80,1) * (1,) = (80,) --> for every sample we get one value (the 1 is because we have 1 feature)
      and in the transposed case:
      np.dot(X.T, (y_predicted - y)) --> (1,80) * (80,) = (1,) --> for every feature we get one value
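These shapes can be verified with a few asserts (random data, purely to check the dimensions):

```python
import numpy as np

# 80 samples, 1 feature, as in the explanation above.
X = np.random.rand(80, 1)
weights = np.random.rand(1)
y = np.random.rand(80)

# (80,1) @ (1,) -> (80,): one predicted value per sample.
y_predicted = np.dot(X, weights)
assert y_predicted.shape == (80,)

# (1,80) @ (80,) -> (1,): one gradient value per feature.
dw = np.dot(X.T, y_predicted - y)
assert dw.shape == (1,)
```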

    • @mouadakharraz3460
      @mouadakharraz3460 5 years ago +1

      @@patloeber Thank you, now that makes sense to me

  • @ahmarhussain8720
    @ahmarhussain8720 2 years ago

    can you please explain what you mean by features and samples in the linear regression part?

  • @emojiman745
    @emojiman745 3 years ago

    13:15 Why use X.T and not X?

  • @prigithjoseph7018
    @prigithjoseph7018 2 years ago

    I have a question: how can we add mse as a class method in LinearRegression?

  • @MahdiMashayekhi
    @MahdiMashayekhi 2 years ago +1

    Thank you very much sir❤❤

  • @tscoms5472
    @tscoms5472 1 year ago

    I have seen MSE calculated with (1/2n)... What's the difference between using (1/n) and (1/2n)?
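The two conventions differ only by a constant factor: 1/(2n) is chosen so that the 2 from differentiating the square cancels, and the remaining factor of 2 in the gradient is absorbed by the learning rate. A quick NumPy check (toy data):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(0, 0.1, size=50)
n = len(y)
w = 0.5

# d/dw of (1/n)    * sum((w*x - y)^2)  -> keeps the factor 2
grad_full = (2 / n) * np.dot(x, w * x - y)
# d/dw of (1/(2n)) * sum((w*x - y)^2)  -> the 2 cancels
grad_half = (1 / n) * np.dot(x, w * x - y)

# Same direction, just a constant factor 2 between them.
assert np.isclose(grad_full, 2 * grad_half)
```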

  • @eric92920
    @eric92920 2 years ago +1

    Hello; while this video is mostly very clear and easy to understand, I do have one question: doesn't line 23 perform the same action for every element in self.weights? If so, and since every element in self.weights starts at 0, why do you use an array instead of a constant?

    • @dongtrung6685
      @dongtrung6685 1 year ago

      hey man, did you figure out the answer?

  • @advaithsahasranamam6170
    @advaithsahasranamam6170 1 year ago

    Beautiful explanation!

  • @tusharshukla9361
    @tusharshukla9361 2 years ago

    hey I have a question what is the 'predicted' in mse_value = mse(y_test, predicted)???

  • @softknk1422
    @softknk1422 4 years ago +1

    Nice video dude! Can you tell me the font you use in your Editor please?!

    • @patloeber
      @patloeber 4 years ago

      in this video it's the monokai theme, now I use Nightowl theme

  • @dakshagiwal2593
    @dakshagiwal2593 3 years ago

    does this not work with datasets that have more than 3 features? The code is giving me a nan array as the prediction

  • @linlinsun2833
    @linlinsun2833 4 years ago

    Thanks for your amazing explanations of all the ML algorithms; they helped me a lot!

    • @patloeber
      @patloeber 4 years ago

      Glad it is helpful :)

  • @ScriptureFirst
    @ScriptureFirst 3 years ago

    Dude. That switching dark to *BRIGHT* mode 😳

    • @patloeber
      @patloeber 3 years ago +1

      you mean from jupyter notebook to VS Code and vice versa? :D

    • @ScriptureFirst
      @ScriptureFirst 3 years ago

      @@patloeber ya, I sometimes watch you after class at night when the lights are low & I was blown away. lol. I'm over it, but that was a blast! Thank you for the work you put into this :)

  • @Hugo-go6yq
    @Hugo-go6yq 1 year ago

    Super helpful thanks a lot!

  • @imdadood5705
    @imdadood5705 3 years ago

    Patrick, I was wondering: does scikit-learn use gradient descent for its LinearRegression? Any comments on that?

    • @patloeber
      @patloeber 3 years ago +1

      sklearn's LinearRegression uses an ordinary least squares solver, but they also have an SGDRegressor that uses gradient descent

  • @nirmalyabakshi4001
    @nirmalyabakshi4001 3 years ago

    In linear_regression_test.py what are noise and random_state ?

  • @shubhamchoudhary5461
    @shubhamchoudhary5461 3 years ago

    why is the 2 missing in the bias formula???🤔🤔
    Can you please help me understand?

  • @MicahJohns
    @MicahJohns 3 years ago

    Subbed, thanks for the video! A bit over my head (for now) but i'm getting there.

    • @patloeber
      @patloeber 3 years ago

      Thanks, and great to hear you're getting into ML! No worries, I know it can take some time :)

  • @kougamishinya6566
    @kougamishinya6566 3 years ago

    This helped me so much, you explain it so clearly and your code is really neat and easy to understand! Thanks so much! Subbed, liked and will be using your videos as my ML bible from now on XD

    • @patloeber
      @patloeber 3 years ago +1

      Awesome, thank you!

  • @VeyselDeste-p4l
    @VeyselDeste-p4l 1 year ago

    Is this an ANN example?

  • @msrahman2010
    @msrahman2010 2 years ago

    I don't see the use of m1 and m2 at the end when you plot the regression line

  • @satyaki44
    @satyaki44 4 years ago

    Thank you so much for the video..keep up the good work !

  • @valerysalov8208
    @valerysalov8208 4 years ago

    Why is there no 2 in the code for df/dw?

  • @vikramm4967
    @vikramm4967 3 years ago

    N is also a constant when calculating db and dw, like 2... So can we omit N when calculating dw?

    • @kwasiopoku8589
      @kwasiopoku8589 2 years ago

      N is not necessarily a constant. It depends on the data points you have. If you remove N from the equation, you cannot apply the package to different datasets

  • @szenghoe
    @szenghoe 3 years ago

    why dJ/dw = dw in 4:48?

    • @patloeber
      @patloeber 3 years ago +1

      dw should just be short for dJ/dw so I can use this expression in the code

  • @carltondaniel8966
    @carltondaniel8966 4 years ago

    thank god , wonderful explanation

  • @flamboyantperson5936
    @flamboyantperson5936 4 years ago

    Bro, I really really really need you. Please make more videos on Machine Learning and please make more frequently. You take 1 week to make 1 video which is not fair. Please 2-3 videos in a week. Please it's my humble request.

    • @patloeber
      @patloeber 4 years ago +2

      I hear your request, and I'll try to increase my frequency at some point in the future! But I still have a normal day job, and machine learning videos take time and effort :)

    • @flamboyantperson5936
      @flamboyantperson5936 4 years ago

      @@patloeber Thank you so much, and please continue using OOP concepts to build models. I want to learn so many things from you.

  • @muhammadzubairbaloch3224
    @muhammadzubairbaloch3224 4 years ago

    really good. I find the really knowledgeable material

  • @manideepgupta2433
    @manideepgupta2433 4 years ago

    Thank you very much for such wonderful explanations....!! I hope you make more such videos...

  • @Shaitender
    @Shaitender 3 years ago

    Thanks for your amazing explanation

    • @patloeber
      @patloeber 3 years ago

      Glad you like it!

  • @psaikiranyadav
    @psaikiranyadav 3 years ago

    where can I find the Jupyter notebook you explained?

    • @patloeber
      @patloeber 3 years ago +1

      ML Notebooks are available on Patreon

  • @osiris1102
    @osiris1102 4 years ago +1

    Why didn't you use np.sum in dw?

    • @patloeber
      @patloeber 4 years ago +2

      because we need the dot product for the vectors (which includes a sum). db (bias) is only a single value so no need for the dot product

    • @osiris1102
      @osiris1102 4 years ago

      @@patloeber thanks! Now it makes sense.

  • @NairodTheBeast
    @NairodTheBeast 4 years ago

    Wonderful video, thank you!

    • @patloeber
      @patloeber 4 years ago

      Glad you like it

  • @shatandv
    @shatandv 3 years ago

    That was great, thanks!

    • @patloeber
      @patloeber 3 years ago

      Glad you liked it!

  • @junyanyao6977
    @junyanyao6977 4 years ago

    Very good explanation! When I try the Boston Housing dataset from sklearn with this regression module, it throws me a runtime warning on self.weight -= self.learning_rate * dw. When I reduce the iterations from 1000 to 100 it works, but with a terrible MSE

    • @patloeber
      @patloeber 4 years ago

      What warning do you get ? Maybe you should scale the dataset and clip values that are too large or close to 0

  • @oxanakovalenko457
    @oxanakovalenko457 5 years ago +1

    Hmmm... I wonder why the MSE is so high; the graph looks perfectly fine. Anyway, great tutorials, thank you so much! Definitely waiting for gradient boosting tutorials!

    • @patloeber
      @patloeber 5 years ago

      Thank you very much! I think it's so high because the range is not normalized (e.g. y ranges from -160 to 200), So when you sum over all the test points and also square it, it gets high. MSE is always sample dependent.

  • @12WeMet1
    @12WeMet1 4 years ago

    Awesome vid. I would love to access that exact notebook with the actual math equations but I don't see it in your github. Is it possible to access that?

    • @patloeber
      @patloeber 4 years ago

      not yet. I am planning to release them on my website

  • @psgpyc
    @psgpyc 4 years ago +7

    He sounds like Janice from FRIENDS 😂

  • @BrianPondiGeoGeek
    @BrianPondiGeoGeek 3 months ago

    Wonderful.

  • @prajganesh
    @prajganesh 4 years ago

    great videos! keep doing it. Just a quick question: with packages like PyTorch and TensorFlow making a lot of these things very simple, what is the thought process for implementing this in plain Python? Is it for better understanding, or do you think it's a better way to learn?

    • @patloeber
      @patloeber 4 years ago +5

      Implementing from scratch is a great way to really understand the concepts behind those algorithms!

  • @ncheymbamalu4013
    @ncheymbamalu4013 2 years ago

    Thanks for the vid. Great work. One question, however: what if you have more than two weights/model parameters w to optimize? Couldn't you just use the normal equation, ((X.T.dot(X))**-1)*(X.T.dot(y)), to solve for w... or np.linalg.solve(X.T.dot(X), X.T.dot(y))?
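For reference, the normal-equation route this comment describes can be sketched like this (toy data; the bias is folded in as a column of ones). np.linalg.solve produces the same coefficients as the explicit inverse while avoiding forming the inverse:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(80, 2))
y = X @ np.array([1.5, -0.5]) + 2.0 + rng.normal(0, 0.1, size=80)

# Append a column of ones so the last coefficient plays the role of the bias.
Xb = np.hstack([X, np.ones((80, 1))])

# Normal equation with an explicit inverse (works, but less stable)...
w_inv = np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y
# ...versus solving the linear system directly.
w_solve = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
assert np.allclose(w_inv, w_solve)
```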

  • @sanjibyadav7973
    @sanjibyadav7973 4 years ago

    Thanks. This code worked for most datasets, but it returns nan for the Boston dataset and returned different weights for the diabetes dataset. However, using the closed-form normal equation instead of gradient descent gives the correct weights and bias. Why so.....??

    • @patloeber
      @patloeber 4 years ago +1

      My code is not optimized at all and might not always produce good results. Also try applying feature normalization before passing it to the algorithm. This usually improves performance here a lot

    • @sanjibyadav7973
      @sanjibyadav7973 4 years ago

      @@patloeber thanks for reply

    • @vikramm4967
      @vikramm4967 3 years ago

      Yes... I tried the Boston dataset and it didn't work!! Did you find any method to make it work?

    • @sanjibyadav7973
      @sanjibyadav7973 3 years ago

      @@vikramm4967 use the closed-form solution, which has no parameters to tweak and hence always gives the right solution

    • @vikramm4967
      @vikramm4967 3 years ago

      @@sanjibyadav7973 But that has higher time complexity, right?

  • @kloki2k1
    @kloki2k1 2 years ago

    Actually, the gradient is the direction of steepest ascent, so to descend to the minimum we go in the opposite direction of the gradient.

  • @ramakanthrama8578
    @ramakanthrama8578 4 years ago

    Hi, I can't find the IPython notebook in the GitHub link.

    • @patloeber
      @patloeber 4 years ago +1

      It's not yet on GitHub, but maybe I'll add it in the future

    • @ramakanthrama8578
      @ramakanthrama8578 4 years ago

      @@patloeber Can you please make a video on gradient boosting as well? From scratch?
      Thanks :)

  • @verberaunt
    @verberaunt 3 years ago

    Thank you very much!

  • @valerysalov8208
    @valerysalov8208 4 years ago +1

    Why is there no 2 in the code for df/dw? Also, why use the transpose and a dot product for one and not the other when calculating the prediction? And what about this: www.statisticshowto.com/probability-and-statistics/regression-analysis/find-a-linear-regression-equation/ — is this a different way to solve linear regression?

    • @andresfernandoaranda5498
      @andresfernandoaranda5498 3 years ago +2

      1. Sometimes people multiply the MSE by 1/2 so it cancels out when you take the derivative (datascience.stackexchange.com/questions/52157/why-do-we-have-to-divide-by-2-in-the-ml-squared-error-cost-function)
      2. For this simple case you can use LSE as a different approach to optimization methods like gradient descent (as shown in the video): towardsdatascience.com/https-medium-com-chayankathuria-optimization-ordinary-least-squares-gradient-descent-from-scratch-8b48151ba756

  • @nikolayandcards
    @nikolayandcards 4 years ago

    This is gold

  • @AdEngineer
    @AdEngineer 4 years ago

    Hi. Is it possible to share the jupyter notebooks?

    • @patloeber
      @patloeber 4 years ago

      Hi. I'm planning to release them on my website when I find the time :)

  • @angadhs
    @angadhs 4 years ago

    which editor is best for python scripts?

    • @patloeber
      @patloeber 4 years ago +1

      VS Code or PyCharm

  • @fandibataineh4586
    @fandibataineh4586 2 years ago

    i think you forgot to add a termination criterion to your for loop:
    you should add a break condition when |new_mse - old_mse| < epsilon
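A sketch of such a stopping rule (variable names and hyperparameters are illustrative, not the video's code):

```python
import numpy as np

def fit(X, y, lr=0.01, n_iters=10_000, eps=1e-10):
    n, d = X.shape
    w = np.zeros(d)
    old_mse = np.inf
    for i in range(n_iters):
        y_hat = X @ w
        w -= lr * (1 / n) * X.T @ (y_hat - y)    # gradient step
        new_mse = np.mean((y - X @ w) ** 2)
        if abs(new_mse - old_mse) < eps:         # |new_mse - old_mse| < epsilon
            break
        old_mse = new_mse
    return w, i

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0]
w, stopped_at = fit(X, y)
assert stopped_at < 10_000 - 1   # the loop terminated early
```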

  • @pynation
    @pynation 4 years ago

    This worked really well on this dataset, but it gives an error on a pandas DataFrame even after I converted it to NumPy arrays.

    • @patloeber
      @patloeber 4 years ago

      Which error ? Common errors with other datasets are incorrect types or incorrect shape of your data

    • @pynation
      @pynation 4 years ago

      @@patloeber After a few iterations the value of weight becomes an array of nan, and so does bias. The same thing works fine with sklearn's linear regression. I tried fixing it but couldn't, so I had to ask you. Thanks for responding.

    • @damianwysokinski3285
      @damianwysokinski3285 4 years ago

      Faraz Khan does your pandas dataframe contain nans?

    • @patloeber
      @patloeber 4 years ago

      Sorry I missed your response earlier, but yes as Damian said it is very likely that your dataset has invalid or incorrect data. My algorithm is not optimized and does not include error checks like in sklearn. You may also try applying a standard scaler before fitting the data.

    • @pynation
      @pynation 4 years ago

      @@patloeber Hey I have made some changes to both linear regression and knn to make it compatible with both dataframes and sklearn's data. Can I make a pull request ?

  • @dhananjaykansal8097
    @dhananjaykansal8097 5 years ago +1

    ModuleNotFoundError: no module named linear_regression. It always shows like this.

    • @patloeber
      @patloeber 5 years ago

      your file must be named linear_regression.py and it must be in the same folder

    • @dhananjaykansal8097
      @dhananjaykansal8097 5 years ago

      @@patloeber It is the same. Everything is good. I don't understand what's happening sir.

    • @patloeber
      @patloeber 5 years ago +1

      ​@@dhananjaykansal8097 Try to write "from .logistic_regression import ...", so prefix it with a dot "."

    • @patloeber
      @patloeber 5 years ago +1

      You can also copy my repo from github and try to run this with the exact same folder structure. Or maybe this is helpful: stackoverflow.com/questions/4142151/how-to-import-the-class-within-the-same-directory-or-sub-directory

    • @dhananjaykansal8097
      @dhananjaykansal8097 5 years ago

      @@patloeber Alright sir. I shall try that tomorrow for sure. I'm so glad that you're replying

  • @johnparker2486
    @johnparker2486 4 years ago

    Very Helpful. Keep it up

  • @thecros1076
    @thecros1076 4 years ago

    CAN'T WE EQUATE THE DERIVATIVE DIRECTLY TO ZERO AND CALCULATE THE MINIMUM? CAN ANYBODY ANSWER THIS

    • @patloeber
      @patloeber 4 years ago

      See my comment in logistic regression video.

  • @hoangphanhuy1992
    @hoangphanhuy1992 2 years ago

    regressor = LinearRegression()
    regressor.fit(X_train, y_train)
    predicted = regressor.predict(X_test)
    def mse(y_true, predicted):
        return np.mean((y_true - predicted)**2)
    mse_value = mse(y_test, predicted)
    print(mse_value)
    ValueError: operands could not be broadcast together with shapes (20,) (20,80)
    I don't know why I can't run this code; I followed exactly what he did. Can anybody help me? Thanks

  • @rxz8862
    @rxz8862 2 years ago

    The second derivative (df/db) is wrong, because d/db(sum_{i=1..N}(b)) = N
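A finite-difference check on toy data confirms the bias gradient used in the video up to the constant factor 2 that it deliberately drops (which only rescales the step size, not the direction): for J(b) = (1/N) * sum((w*x + b - y)^2), we get dJ/db = (2/N) * sum(y_hat - y).

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=40)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, size=40)
w, b, N = 1.5, 0.3, len(y)

def J(b_):
    # MSE as a function of the bias alone, with w held fixed.
    return np.mean((w * x + b_ - y) ** 2)

# Analytic gradient, keeping the factor 2 from differentiating the square.
db_exact = (2 / N) * np.sum(w * x + b - y)

# Central finite difference of J at b.
h = 1e-6
db_numeric = (J(b + h) - J(b - h)) / (2 * h)
assert np.isclose(db_exact, db_numeric, rtol=1e-4)
```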

  • @shivamdubey4783
    @shivamdubey4783 3 years ago

    great, man, but please put it in one tab; it's very confusing to see which function you have called

    • @patloeber
      @patloeber 3 years ago

      thanks for the feedback!

  • @Ayush-cn7tq
    @Ayush-cn7tq 6 months ago

    i think you have applied stochastic gradient descent

  • @hitashukanjani4430
    @hitashukanjani4430 4 years ago

    what if we have more features

    • @patloeber
      @patloeber 4 years ago

      Hi, the code should work for more features, too :) I just used one dimension so that we can have a simple plot of the results

  • @global_southerner
    @global_southerner 4 years ago +1

    J'(w,b) instead of J'(m,b)

    • @patloeber
      @patloeber 4 years ago

      Correct, thanks for the hint!

  • @patrickali1987
    @patrickali1987 4 years ago

    very nice tutorial here. Please can you post the code also?

    • @patloeber
      @patloeber 4 years ago

      I already did. You can find the link in the video description :)

  • @tassoskat8623
    @tassoskat8623 4 years ago

    Wow

  • @conradlewis5689
    @conradlewis5689 2 years ago

    Linear algebraic method:
    ///////////////////////////////////////////////////////////////////////////
    # This is linear least squares
    b = np.linalg.inv(np.dot(X.T, X)).dot(X.T).dot(y)
    # predict using the coefficients
    y_predicted = X.dot(b)
    ///////////////////////////////////////////////////////////////////////////
    Then you have your solution. I was trying to get the video's implementation to achieve a better overall mean squared error than the linear algebra formulation, but could not do it. Anyway, this is just another way if anyone is looking. Great video series btw! I have enjoyed going through it.
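As a footnote to the snippet in this comment: np.linalg.lstsq returns the same least-squares coefficients as the explicit inverse and is numerically more robust when X.T @ X is ill-conditioned (toy data here, just for the check):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(80, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, size=80)

# Normal equation with an explicit inverse, as in the comment above.
b_inv = np.linalg.inv(X.T @ X) @ X.T @ y
# Least-squares solver: same coefficients, better numerical behavior.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(b_inv, b_lstsq)
```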