Tutorial 12- Stochastic Gradient Descent vs Gradient Descent

  • Published on 27 Jul 2019
  • Below are the various playlists created on ML, Data Science and Deep Learning. Please subscribe and support the channel. Happy Learning!
    Deep Learning Playlist: • Tutorial 1- Introducti...
    Data Science Projects playlist: • Generative Adversarial...
    NLP playlist: • Natural Language Proce...
    Statistics Playlist: • Population vs Sample i...
    Feature Engineering playlist: • Feature Engineering in...
    Computer Vision playlist: • OpenCV Installation | ...
    Data Science Interview Question playlist: • Complete Life Cycle of...
    You can buy my book on Finance with Machine Learning and Deep Learning from the URL below
    amazon url: www.amazon.in/Hands-Python-Fi...
    🙏🙏🙏🙏🙏🙏🙏🙏
    You just need to do 3 things to support my channel: LIKE, SHARE & SUBSCRIBE to my YouTube channel

Comments • 96

  • @BalaguruGupta
    @BalaguruGupta 3 years ago +14

    Amazing explanation Sir! You'll always be the hero for the AI Enthusiasts. Thanks a lot!

  • @nagesh866
    @nagesh866 3 years ago +5

    what an amazing teacher you are. Crystal clear.

  • @ravindrav1895
    @ravindrav1895 2 years ago +1

    Whenever I am confused about a topic, I come back to this channel and watch your videos, and it helps me a lot, sir. Thank you, sir, for an amazing explanation.

  • @saurabhnigudkar6115
    @saurabhnigudkar6115 4 years ago +5

    Best Deep Learning playlist on YouTube

  • @ajithtolroy5441
    @ajithtolroy5441 4 years ago +2

    I saw many videos but this one is quite comprehensible and informative

  • @lakshminarasimhanvenkatakr3754
    @lakshminarasimhanvenkatakr3754 3 years ago +3

    This is an excellent explanation; with such a granular level of detail, anyone can understand it.

  • @fedisalhi6320
    @fedisalhi6320 4 years ago +8

    Excellent explanation, it was really helpful thank you.

  • @archanamaurya89
    @archanamaurya89 3 years ago +6

    This video is such a light bulb moment for me :D Thank you so very much!!

  • @shashanktripathi3034
    @shashanktripathi3034 3 years ago +5

    Krish sir, your YouTube channel is just like the GITA for me: as one gets all the answers to life in the GITA, I get all my doubts cleared on your channel.
    Thank you, Sir.

    • @kartikdave659
      @kartikdave659 3 years ago

      After becoming a member, how can I get the data science material? Can you please tell me?

  • @VVV-wx3ui
    @VVV-wx3ui 4 years ago +1

    Superb... simply superb. I now understand the concept, starting from the loss function. Well done, Krish.

  • @nitayg1326
    @nitayg1326 4 years ago +15

    My God! Finally I am clear about GD, SGD and mini-batch SGD!

  • @severnsevern1445
    @severnsevern1445 3 years ago

    Great explanation. Very clear. Thanks!

  • @Skandawin78
    @Skandawin78 4 years ago

    Your videos are an excellent reference for brushing up on these concepts

  • @allaboutdata2050
    @allaboutdata2050 4 years ago +1

    What an explanation 🧡. Great!! Awesome!!

  • @taranilakshmi9680
    @taranilakshmi9680 4 years ago

    Explained very well. Thank you.

  • @khuloodnasher1606
    @khuloodnasher1606 4 years ago

    Really, this is the best video I've ever seen; it explains the concept better than famous schools.

  • @tonyzhang2501
    @tonyzhang2501 3 years ago +1

    Thank you, it is a clear explanation. I got it!

  • @gayathrijpl
    @gayathrijpl 1 year ago

    such a clean way of explanation

  • @gauravsingh2425
    @gauravsingh2425 4 years ago

    Thanks Krish !!! very nice explanation

  • @chinmaybhat9636
    @chinmaybhat9636 4 years ago

    Awesome @KrishNaik Sir.

  • @guytonedhai
    @guytonedhai 1 year ago

    How are you so good at explaining 😭😭😭😭😭 Thanks a lot ♥♥♥

  • @ArthurCor-ts2bg
    @ArthurCor-ts2bg 4 years ago

    Krish, you condense the subject most meaningfully

  • @uttamchoudhary5229
    @uttamchoudhary5229 4 years ago +1

    Great video, man 👍👍. Please keep it up. I am waiting for the next videos

  • @sandipansarkar9211
    @sandipansarkar9211 4 years ago +1

    Thanks Krish. Good video. I want to use all this knowledge in my next batch of deep learning by ineuron

  • @rabidub733
    @rabidub733 3 months ago

    thanks for this! great explanation

  • @koustavdutta5317
    @koustavdutta5317 3 years ago +2

    Hi Krish, one request: like this playlist, please make long videos for the ML playlist covering the loss functions and optimizers used in various ML algorithms, mainly classification algorithms.

  • @ashwanikumar-zh1mq
    @ashwanikumar-zh1mq 3 years ago

    Good, good, clearly explained. Nobody else can explain like this

  • @Kurtmind
    @Kurtmind 2 years ago

    Excellent explanation Sir!

  • @vinuvarshith6412
    @vinuvarshith6412 1 year ago

    Top notch explanation!

  • @bhavanapurohit2627
    @bhavanapurohit2627 3 years ago +2

    Hi, is it completely theoretical or will you code in further sessions?

  • @syedsaqlainabatool3399
    @syedsaqlainabatool3399 3 years ago

    This is what I was looking for

  • @rameshthamizhselvan2458
    @rameshthamizhselvan2458 4 years ago

    Excellent!

  • @akfvc8712
    @akfvc8712 3 years ago

    Great video, excellent effort. Appreciated!!

  • @alsabtilaila1923
    @alsabtilaila1923 3 years ago

    Great one!

  • @nansonspunk
    @nansonspunk 1 year ago

    yes i really liked this explanation thanks

  • @rdf1616
    @rdf1616 3 years ago

    good explanation! thankss

  • @aminuabdulsalami4325
    @aminuabdulsalami4325 4 years ago

    Great guy.

  • @sreejus8218
    @sreejus8218 3 years ago

    If we use a sample of the outputs to compute the loss, do we use its derivative to update all the weights, or only the weights of the respective outputs?

  • @nikkitha92
    @nikkitha92 4 years ago +1

    Sir, your videos are amazing. Can you please explain the latest methodologies such as BERT and ELMo?

  • @ting-yuhsu4229
    @ting-yuhsu4229 4 years ago

    You are AWESOME! :)

  • @praneethcj6544
    @praneethcj6544 4 years ago

    Perfect ..!!!

  • @aditisrivastava7079
    @aditisrivastava7079 4 years ago +2

    Just wanted to ask if you could also suggest some good online resources that we can read for more clarity.

  • @response2u
    @response2u 2 years ago

    Thank you, sir!

  • @Anand-uw2uc
    @Anand-uw2uc 4 years ago +9

    Good explanation! But you did not say much about when to use SGD, although you explained GD and mini-batch SGD well.

    • @vishaldas6346
      @vishaldas6346 3 years ago +1

      There is not much to explain about SGD: it processes 1 data point at a time, even when the dataset has 1000 data points.

  • @RaviRanjan_ssj4
    @RaviRanjan_ssj4 4 years ago

    great video !!

  • @siddharthachatterjee9959
    @siddharthachatterjee9959 4 years ago

    Good attempt 👍. Please record with camera on manual focus.

  • @jiayuzhou6051
    @jiayuzhou6051 several months ago

    the only video that explains

  • @SandeepKashyap-ek2hx
    @SandeepKashyap-ek2hx 2 years ago

    You are a HERO sir

  • @achrafkmout9398
    @achrafkmout9398 3 years ago

    very good explanation

  • @ruchikalalit1304
    @ruchikalalit1304 4 years ago +1

    Have you made videos on the practical implementation of all this work? If so, please share the links.

  • @vishaljhaveri7565
    @vishaljhaveri7565 2 years ago

    Thank you sir.

  • @vineetagarwal18
    @vineetagarwal18 1 year ago

    Great Sir

  • @rababmaroc3354
    @rababmaroc3354 4 years ago

    Thank you very much for your efforts. Please, how can we solve a portfolio allocation problem using this algorithm? Please answer me.

  • @phaneendra3700
    @phaneendra3700 3 years ago

    hats off man

  • @percyjardine5724
    @percyjardine5724 3 years ago

    thanks Krish

  • @goodnewsdaily-tamil1990
    @goodnewsdaily-tamil1990 1 year ago

    1000 likes for you man👏👍

  • @louerleseigneur4532
    @louerleseigneur4532 3 years ago

    Thanks buddy

  • @r7918
    @r7918 3 years ago

    I have one question regarding this topic: is this concept applicable to linear regression?

  • @muhammedsahalot8683
    @muhammedsahalot8683 several months ago

    Which converges faster, SGD or GD?

  • @thanicssubakar6303
    @thanicssubakar6303 4 years ago +1

    Nice bro

  • @muralimohan6974
    @muralimohan6974 3 years ago

    How can we take k inputs at the same time?

  • @rohitsaini8480
    @rohitsaini8480 1 year ago

    Sir, please help me with this. As I see it, we do gradient descent to find the best value of m (the slope in linear regression, taking b = 0). If we use all the points, then we already know at which value of m the loss is lowest, so why do we need a learning rate to update the weight when we already know the best value?

  • @sathvikambati3464
    @sathvikambati3464 1 year ago

    Thanks

  • @AjanUnderscore
    @AjanUnderscore 2 years ago

    Thank u sir 🙏🙏🙌🧠🐈

  • @pareesepathak7348
    @pareesepathak7348 3 years ago

    Can you share the paper for reference, and can you also share resources on deep learning for image processing?

  • @manojsalunke2842
    @manojsalunke2842 4 years ago

    At 9:28 you said SGD takes more time to converge than GD. So which is faster, SGD or GD?

  • @bijaynayak6473
    @bijaynayak6473 4 years ago +5

    Hello Sir, could you share the link to the code you explained? This video series is very nice; in a short period we can cover so many concepts. :)

  • @ankitbiswas8380
    @ankitbiswas8380 2 years ago

    When you mentioned that SGD takes place in linear regression, I didn't understand that comment. Even in your linear regression videos, the mean squared error is a sum of squares over all data points. So how is SGD linked to linear regression?

  • @abhrapuitandy3327
    @abhrapuitandy3327 4 years ago

    Please also cover stochastic gradient ascent

  • @_JoyshreeMozumder
    @_JoyshreeMozumder 3 years ago

    What is the source of the data points?

  • @yukeshnepal4885
    @yukeshnepal4885 4 years ago +2

    At 8:58: with GD it converges quickly, while with mini-batch SGD it follows a zigzag path. How?

    • @kannanparthipan7907
      @kannanparthipan7907 4 years ago +1

      With mini-batch SGD we consider only some of the points, so the computed gradient deviates somewhat from the usual gradient descent, which considers all values. A simple analogy: GD is like the total population and mini-batch SGD is like a sample; the sample's distribution is never exactly equal to the population's and always shows some deviation.
      We can't use GD everywhere because of the computation time; mini-batch SGD gives an approximately correct result.

    • @bhargavpotluri5147
      @bhargavpotluri5147 4 years ago +1

      @@kannanparthipan7907 The deviation would show up in the final output, i.e. in the converged result. The question is why we see it during the process of convergence. If we draw different samples for every epoch, I understand that there can be zigzag behaviour during convergence. But if only one sample of k records is considered, why is there a zigzag during convergence?

    • @bhargavpotluri5147
      @bhargavpotluri5147 4 years ago +2

      OK, now I get it. For every iteration the samples are picked at random, hence the zigzag. I just went through some other articles.
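
The point settled in this thread (random mini-batches give a noisy estimate of the full gradient) can be checked numerically. The following is only a minimal sketch, not anything shown in the video: the data, the one-parameter model y = m * x, the batch size of 32 and all names are assumptions made for illustration.

```python
import random

def mse_gradient(m, xs, ys):
    """Gradient of the mean squared error for the model y = m * x (no intercept)."""
    n = len(xs)
    return (-2.0 / n) * sum(x * (y - m * x) for x, y in zip(xs, ys))

rng = random.Random(0)
# Hypothetical data: 1000 points scattered around the line y = 3x.
xs = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
ys = [3.0 * x + rng.gauss(0.0, 0.5) for x in xs]

m = 0.0  # current slope estimate, the same for every gradient below

# Full-batch gradient: uses all 1000 records, so it is a single fixed number.
full_grad = mse_gradient(m, xs, ys)

# Mini-batch gradients: each one uses a different random sample of 32 records.
batch_grads = []
for _ in range(5):
    idx = rng.sample(range(len(xs)), 32)
    batch_grads.append(mse_gradient(m, [xs[i] for i in idx], [ys[i] for i in idx]))

print("full-batch gradient:", round(full_grad, 3))
print("mini-batch gradients:", [round(g, 3) for g in batch_grads])
```

Every mini-batch gradient points in roughly the same direction as the full-batch one but with a slightly different size, so successive updates wobble around the full-batch path instead of following it exactly.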

  • @a.sharan8876
    @a.sharan8876 1 year ago

    py:28: RuntimeWarning: overflow encountered in scalar power
    cost = (1/n)*sum([value**2 for value in (y-y_predicted)])
    Hey bro, I am stuck on this error and could not understand the error itself. Could you suggest a solution? I have only just started practising an ML algorithm.
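
A warning like this generally means the squared errors have grown too large for a float, which in gradient descent often happens when the learning rate is too large for the scale of the inputs. The commenter's script is not shown, so that cause is only an assumption here. The sketch below uses plain Python and made-up data; `run_gd` and its parameters are hypothetical names, and it simply compares the cost history for two learning rates.

```python
def run_gd(xs, ys, learning_rate, steps):
    """Full-batch gradient descent for y = m*x + b; returns the cost before each update."""
    n = len(xs)
    m = b = 0.0
    costs = []
    for _ in range(steps):
        errors = [y - (m * x + b) for x, y in zip(xs, ys)]
        costs.append(sum(e ** 2 for e in errors) / n)  # mean squared error
        grad_m = (-2.0 / n) * sum(x * e for x, e in zip(xs, errors))
        grad_b = (-2.0 / n) * sum(errors)
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return costs

# Hypothetical unscaled inputs; large x values make the cost surface very steep.
xs = [10.0, 20.0, 30.0, 40.0, 50.0]
ys = [2.0 * x + 1.0 for x in xs]

diverging = run_gd(xs, ys, learning_rate=0.01, steps=20)
converging = run_gd(xs, ys, learning_rate=0.0001, steps=20)

print("lr=0.01   first/last cost:", diverging[0], diverging[-1])
print("lr=0.0001 first/last cost:", converging[0], converging[-1])
```

With the larger rate each step overshoots the minimum and the cost grows at every iteration, which over enough iterations ends in an overflow; with the smaller rate the cost shrinks. Lowering the learning rate or scaling the features first are the usual things to try.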

  • @shubhangiagrawal336
    @shubhangiagrawal336 3 years ago

    good video

  • @minakshiboruah1356
    @minakshiboruah1356 3 years ago

    @12:02 Sir, it should be mini-batch stochastic GD.

  • @samiabidah4197
    @samiabidah4197 3 years ago

    Please, what is the difference between GD and batch GD?

  • @khushboosoni2788
    @khushboosoni2788 1 year ago

    Sir, can you please explain the SPGD algorithm to me?

  • @funpoint3966
    @funpoint3966 3 months ago

    Please sort out your camera issue; it seems to be set to autofocus, which causes a little disturbance.

  • @jsverma143
    @jsverma143 4 years ago

    Negative and positive slopes are best explained as follows:
    on the left side of the curve the tangent makes an angle of more than 90 degrees, so the slope is negative; on the right side the angle is less than 90 degrees, so the slope is positive.
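
The sign argument in this comment can be sketched with a toy loss. This is an illustration under assumptions, not the curve from the video: the parabola with its minimum at w = 2, the learning rate and the function names are all made up.

```python
def loss(w):
    """Hypothetical convex loss with its minimum at w = 2."""
    return (w - 2.0) ** 2

def slope(w):
    """Derivative of the loss: negative left of the minimum, positive right of it."""
    return 2.0 * (w - 2.0)

learning_rate = 0.1
for w in (-1.0, 5.0):  # one start point on each side of the minimum
    w_new = w - learning_rate * slope(w)  # standard gradient descent update
    print(f"w={w:+.1f} slope={slope(w):+.1f} -> w_new={w_new:+.2f}")
```

On the left the negative slope makes the update increase w, and on the right the positive slope makes it decrease w, so in both cases the weight moves toward the minimum.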

  • @soheljagirdar8830
    @soheljagirdar8830 3 years ago +1

    At 4:17: SGD uses a minimum of 256 records to find the error/minimum, but you said it is 1 record at a time.

    • @pramodyadav4422
      @pramodyadav4422 3 years ago +1

      I read a few articles which say, "In SGD, one data point is picked at random from the whole data set at each iteration." The 256 records you're talking about may be mini-batch SGD: "It is also common to sample a small number of data points instead of just one point at each step and that is called “mini-batch” gradient descent."

    • @tejasvigupta07
      @tejasvigupta07 3 years ago

      @@pramodyadav4422 Yeah, even I have read that in SGD only one data point is selected and used for the update in each iteration, instead of all of them.
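
The distinction drawn in this thread comes down to how many records feed each weight update. Below is a minimal sketch under assumptions: a one-parameter model y = m * x, synthetic data and a hypothetical `train` helper, none of which come from the video. Setting `batch_size` to the dataset size gives GD, 1 gives SGD, and anything in between gives mini-batch SGD.

```python
import random

def train(xs, ys, batch_size, epochs=100, learning_rate=0.1, seed=0):
    """Fit y = m*x by gradient descent, using batch_size records per weight update."""
    rng = random.Random(seed)
    n = len(xs)
    m = 0.0
    for _ in range(epochs):
        order = list(range(n))
        rng.shuffle(order)  # records are visited in a new random order each epoch
        for start in range(0, n, batch_size):
            batch = order[start:start + batch_size]
            # Gradient of the mean squared error over this batch only.
            grad = (-2.0 / len(batch)) * sum(xs[i] * (ys[i] - m * xs[i]) for i in batch)
            m -= learning_rate * grad
    return m

data_rng = random.Random(1)
# Hypothetical data: 200 points scattered around the line y = 3x.
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(200)]
ys = [3.0 * x + data_rng.gauss(0.0, 0.1) for x in xs]

m_gd = train(xs, ys, batch_size=len(xs))  # GD: one update per epoch, all records
m_sgd = train(xs, ys, batch_size=1)       # SGD: one update per record
m_mini = train(xs, ys, batch_size=32)     # mini-batch SGD: one update per 32 records

print("GD:", round(m_gd, 3), "SGD:", round(m_sgd, 3), "mini-batch:", round(m_mini, 3))
```

All three settings should end near the true slope of 3 on this toy problem; they differ in how many updates they make per epoch and how noisy each update is.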

  • @shekharkumar1902
    @shekharkumar1902 4 years ago

    Confusing one !

  • @atchutram9894
    @atchutram9894 4 years ago

    Switch off the autofocus feature on your camera. It is distracting.

  • @chalapathinagavarmabhupath8432
    @chalapathinagavarmabhupath8432 4 years ago

    Your videos are good, but the camera was bad

  • @devaryan2201
    @devaryan2201 2 years ago

    Do change your method of teaching. It seems like someone has read a book and is just trying to copy that content from one side... Use your own ideas for it.
    :)