Gradient Descent From Scratch In Python

  • Published Jun 29, 2024
  • We'll learn about gradient descent, a technique for training neural networks. We'll then implement gradient descent from scratch in Python by training a linear regression model to predict the weather, so you can understand how it works. In future videos, we'll build on this to create complex neural networks! (A minimal code sketch follows at the end of this description.)
    You can see a full explanation and code here - github.com/VikParuchuri/nnet_... .
    Chapters
    0:00 - Introduction
    01:49 - Linear Regression Intuition
    07:53 - Measuring Loss
    15:28 - Parameter Updates
    16:11 - Gradients And Partial Derivatives
    23:29 - Learning Rate
    28:35 - Implement Linear Regression
    36:09 - Training Loop
    This video is part of our new course, Zero to GPT - a guide to building your own GPT model from scratch. By taking this course, you'll learn deep learning skills from the ground up. Even if you're a complete beginner, the prerequisites we offer at Dataquest will get you started.
    If you're dreaming of building deep learning models, this course is for you.
    Best of all, you can access the course for free while it's still in beta!
    Sign up today!
    bit.ly/4016NfK
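
    As a taste of what the video builds, here is a minimal sketch of gradient descent for single-variable linear regression. The toy data stands in for the video's weather dataset; all names and values are illustrative, not from the course code:

```python
import numpy as np

# Toy data standing in for the video's weather dataset (an assumption)
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 5.0 + rng.normal(scale=1.0, size=200)

w, b = 0.0, 0.0  # parameters
lr = 0.02        # learning rate
for epoch in range(5000):
    pred = w * x + b
    error = pred - y
    # Gradients of mean squared error with respect to w and b
    dw = 2 * np.mean(error * x)
    db = 2 * np.mean(error)
    w -= lr * dw
    b -= lr * db

print(w, b)  # should approach 2.0 and 5.0
```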

Comments • 19

  • @vikasparuchuri
    @vikasparuchuri 1 year ago

    Hi everyone! The code and explanations behind this video are here - github.com/VikParuchuri/zero_to_gpt/blob/master/explanations/linreg.ipynb . You can also find all the lessons in this series here - github.com/VikParuchuri/zero_to_gpt .

  • @hchattaway
    @hchattaway 1 year ago +1

    Please keep doing these, they are really excellent!

  • @envision6556
    @envision6556 10 months ago

    love your work, so clear

  • @TheMISBlog
    @TheMISBlog 1 year ago

    Very useful, thanks!

  • @hussainsalih3520
    @hussainsalih3520 1 year ago

    amazing

  • @fd2444
    @fd2444 1 year ago +2

    Is there a discord to discuss the projects on this channel?

  • @josuecurtonavarro8979
    @josuecurtonavarro8979 1 year ago

    Hi Vik! Thanks so much for the amazing work! Your content is always one of my best choices when it comes to learning data science and ML. I have a doubt, though, about the video at minute 40:56. You mention that in the init_params function, if we subtract 0.5 from the result of np.random.rand(), it would rescale the weights from -0.5 to 0.5. But wouldn't it just give us (randomly) some negative values (depending also on the chosen seed) whenever the ones returned by the np.random.rand() function are less than 0.5? Thanks so much again and please keep doing what you do! I've already come a long way thanks to all your work!

    • @Dataquestio
      @Dataquestio  1 year ago

      Thanks :) np.random.rand returns values from 0 to 1 by default, so subtracting .5 will rescale that range to [-0.5, 0.5].
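
      A quick check makes this concrete (a minimal sketch, not from the course notebook). Both observations are consistent: the whole range shifts to [-0.5, 0.5), and as a result roughly half the values come out negative:

```python
import numpy as np

# np.random.rand draws uniformly from [0, 1); subtracting 0.5
# shifts that range to [-0.5, 0.5)
vals = np.random.rand(100_000) - 0.5
print(vals.min(), vals.max())  # close to -0.5 and 0.5
print((vals < 0).mean())       # roughly half the values are negative
```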

  • @FootballIsLife00
    @FootballIsLife00 7 months ago

    At exactly 19:44, you mention that the derivative of the loss function with respect to b is the same as the loss function, but I don't think so, because:
    dL/db ((wx + b) - y)^2 = 2((wx + b) - y)
    and
    dL/dw = 2x((wx + b) - y)
    Can anyone help me out?
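
    (A finite-difference check is a quick way to settle this; the sketch below is not part of the video's code. Both formulas above are correct: dL/dw simply carries the extra factor of x from the chain rule, while dL/db does not.)

```python
import numpy as np

def loss(w, b, x, y):
    return ((w * x + b) - y) ** 2

def analytic_grads(w, b, x, y):
    err = (w * x + b) - y
    return 2 * x * err, 2 * err  # dL/dw, dL/db

# Compare against central finite differences at an arbitrary point
w, b, x, y = 0.5, 0.1, 2.0, 1.0
eps = 1e-6
num_dw = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
num_db = (loss(w, b + eps, x, y) - loss(w, b - eps, x, y)) / (2 * eps)
print(analytic_grads(w, b, x, y))  # (0.4, 0.2)
print(num_dw, num_db)              # matches closely
```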

  • @anfedoro
    @anfedoro 1 year ago

    Finally I have managed to implement gradient descent for linear regression myself :-).. almost without looking back at Vik's notebook. I can now consider that I understand how it works and all the math underlying it. Just curious, why are my final weights and bias very different compared to what sklearn calculates? I plotted all three - the original test labels, the values calculated via my own procedure, and those calculated via sklearn.. I see that mine is less accurate than sklearn's. Why could that be?

    • @Dataquestio
      @Dataquestio  1 year ago +1

      Congrats on implementing it yourself! Scikit-learn doesn't use gradient descent to calculate the coefficients (I believe they use analytical solutions in most cases). This would lead to a different solution.
      Even when using gradient descent, it is possible to use better initializations or optimizers (i.e., something other than plain SGD).
      I would only be concerned if your error is significantly higher (say more than 50% higher), or your gradient descent iterations aren't improving over time.
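
      To illustrate the difference, here is a minimal sketch (toy data, not the video's dataset) comparing the closed-form least-squares solution (roughly what scikit-learn's LinearRegression computes) with plain gradient descent:

```python
import numpy as np

# Toy data: y = 3x + 1.5 plus noise (an assumption for illustration)
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = 3 * x[:, 0] + 1.5 + rng.normal(scale=0.1, size=100)

X = np.hstack([x, np.ones((100, 1))])  # add an intercept column

# Closed-form (analytical) least-squares solution
w_closed = np.linalg.lstsq(X, y, rcond=None)[0]

# Gradient descent on mean squared error
w = np.zeros(2)
lr = 0.1
for _ in range(1000):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad

print(w_closed, w)  # both should be close to [3, 1.5]
```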

    • @anfedoro
      @anfedoro 11 months ago

      @@Dataquestio thanks.. I played further with more iterations and got a better MAE than sklearn's. But as I understand it, this doesn't matter much due to possible overfitting.. right?

    • @anfedoro
      @anfedoro 11 months ago

      @@Dataquestio playing further, I implemented a floating learning rate and got faster convergence as well as a far better MSE :-)
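
      (The commenter's exact scheme isn't shown; one common way to "float" the learning rate is a decay schedule, sketched here on a toy quadratic loss with illustrative values:)

```python
# Inverse-time decay: the step size shrinks as training progresses
w = 5.0
lr0, decay = 0.3, 0.05
for epoch in range(100):
    lr = lr0 / (1 + decay * epoch)
    grad = 2 * w  # derivative of the toy loss w**2
    w -= lr * grad
print(w)  # approaches the minimum at 0
```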

  • @rjchen83
    @rjchen83 1 year ago +1

    Thanks for the tutorial! Could you also provide access to the data file 'clean_weather.csv'?

    • @Dataquestio
      @Dataquestio  1 year ago +1

      You should be able to download the file here - drive.google.com/file/d/1O_uOTvMJb2FkUK7rB6lMqpPQqiAdLXNL/view?usp=share_link

    • @AI_BotBuilder
      @AI_BotBuilder 1 year ago +3

      @Seekersbay Learn to say that politely rather than as a command when the man's actually putting content out there for everyone. Replace your 'should' with 'could' and add a 'please'; it changes the tone a lot, Ser…

  • @AvinashChandrashukla
    @AvinashChandrashukla 4 months ago

    What I need: step-by-step guidance to become a business analyst

  • @kurtisunknown
    @kurtisunknown 1 year ago

    This is the easy form of the gradient; what about when we have a more difficult form of cost function?

    • @pandalanhukuk804
      @pandalanhukuk804 1 year ago

      That’s your job, to make the next step.