Lecture 6: Backpropagation

  • Published on 13 Dec 2024

Comments • 60

  • @sachinpaul2111
    @sachinpaul2111 3 years ago +18

    Prof...stop ...stop...it's already dead! Oh BP you thought you were this tough complex thing and then you met Prof. Justin Johnson who ended you once and for all!
    The internet is 99.99% garbage but content like this makes me so glad that it exists. What a masterclass! What a man!

  • @quanduong8917
    @quanduong8917 3 years ago +84

    this lecture is an example of a perfect technical lecture

  • @ShuaiGe-n3g
    @ShuaiGe-n3g 13 hours ago

    I've just watched 30 minutes, but I'm so excited to comment here that it's definitely the best course for backpropagation!!!!

  • @ritvikkhandelwal1462
    @ritvikkhandelwal1462 3 years ago +33

    Amazing! One of the best backprop explanations out there!

  • @odysy5179
    @odysy5179 3 months ago +2

    I work in ML and am doing review for interviews, this lecture is extremely thorough!

  • @piotrkoodziej4336
    @piotrkoodziej4336 3 years ago +15

    Sir, you are amazing! I've wasted hours reading and watching internet gurus on this topic, and they could not explain it at all, but your lecture worked!

  • @vardeep277
    @vardeep277 4 years ago +7

    Dr. JJ, you sly son of a gun. This is one of the best things ever. 47:39, the way he asks if it is clear. It is damn clear, man. Well done!

  • @rookie2641
    @rookie2641 2 years ago +7

    Best lecture ever on explanation of backpropagation in math

  • @achronicstudent
    @achronicstudent 2 months ago +1

    Finally!! I understood how to apply backpropagation. Thank you sir! Thank you!

  • @ryliur
    @ryliur 3 years ago +10

    Future reference for anybody, but I think there's a typo @ 50:24. It should be dz/dx * dL/dz when using chain rule to find dL/dx
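For reference, the chain-rule step this comment describes can be written out as below. This is only a sketch in the comment's own symbols, assuming a node z = f(x) whose output feeds a scalar loss L; it does not reproduce the slide itself.

```latex
\frac{\partial L}{\partial x}
  \;=\;
  \underbrace{\frac{\partial z}{\partial x}}_{\text{local gradient}}
  \,
  \underbrace{\frac{\partial L}{\partial z}}_{\text{upstream gradient}}
```

The second factor is the gradient with respect to the node's output z, which is why writing dL/dx there would make the equation circular.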

  • @dbzrz1048
    @dbzrz1048 2 years ago +3

    finally some coverage on backprop with tensors

  • @liviumircea6905
    @liviumircea6905 4 months ago

    At 58:56 Prof. Johnson says something huge, IMHO: the final equation is not formed by Jacobians. Finally I got it. Simply the best explanation of backprop. Thank you, Prof. Johnson.

  • @shoumikchow
    @shoumikchow 4 years ago +9

    10:02. Dr. Johnson means, "right to left" not "left to right"

  • @tomashaddad
    @tomashaddad 3 years ago +13

    I don't get how back propagation tutorials by 3B1B, StatQuest, etc, get so much praise, but neither of them are as succinct as you were in those first two examples. Fuck that was simple.

  • @KeringKirwa
    @KeringKirwa 9 months ago

    You earned a like, a comment and a subscriber ... what an explanation .

  • @VikasKM
    @VikasKM 3 years ago +2

    wooooowww.. what a superb lecture on backpropagation. simply amazing.

  • @kentu3892
    @kentu3892 6 months ago

    Such an amazing lecture with easy-to-understand examples!

  • @minhlong1920
    @minhlong1920 2 years ago +2

    Such an awesome and intuitive explanation!

  • @mihailshutov105
    @mihailshutov105 5 months ago

    Thank you very much! I really enjoy this lecture! Hello from Russia with love :)

  • @artcellCTRL
    @artcellCTRL 2 years ago +2

    22:22 the local gradient should be "[1-sigma(1.00)]*sigma(1.00)" where 1.00 is the input to the sigmoid-fcn block
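The formula in this comment can be checked numerically. The following is a minimal sketch, assuming the standard logistic sigmoid and the input value 1.00 quoted in the comment; the function names are illustrative only.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_local_grad(x: float) -> float:
    """Local gradient d(sigmoid)/dx = (1 - sigmoid(x)) * sigmoid(x)."""
    s = sigmoid(x)
    return (1.0 - s) * s


x = 1.00  # input to the sigmoid block, as stated in the comment
s = sigmoid(x)
local = sigmoid_local_grad(x)
print(f"sigmoid({x}) = {s:.4f}, local gradient = {local:.4f}")
```

Note that the local gradient is evaluated at the block's input (1.00), not at its output.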

  • @mohamedgamal-gi5ws
    @mohamedgamal-gi5ws 4 years ago +11

    The good thing about these lectures is that finally Dr.Johnson has more time to speak compared to cs231n !

  • @anupriyochakrabarty4822
    @anupriyochakrabarty4822 2 years ago +2

    how come u are getting the value of e^x as -0.20. Could u explain

  • @AndyLee-xq8wq
    @AndyLee-xq8wq 2 years ago

    Amazing courses!

  • @smitdumore1064
    @smitdumore1064 1 year ago

    Top notch content

  • @훼에워어-u1n
    @훼에워어-u1n 1 year ago

    this is extremely hard. but this is a great lecture for sure. you are awesome Mr Johnson

  • @shauryasingh9553
    @shauryasingh9553 5 months ago

    I finally understand backprop!

  • @akramsystems
    @akramsystems 2 years ago

    Beautifully done!

  • @nityunjgoel1438
    @nityunjgoel1438 2 months ago

    Masterpiece!!!!

  • @apivovarov2
    @apivovarov2 1 year ago

    @49:44 - Mistake in dL/dx formula - 2nd operand should be dL/dz (not dL/dx)

  • @arisioz
    @arisioz 1 year ago +2

    At around 18:20 shouldn't the original equation have a w_2 term that gets added to w_0*x_0+w_1*x_1?

  • @נירבןזכרי
    @נירבןזכרי 3 years ago

    THANK YOU SO MUCH! finally not shallow and excellent explanation.

  • @debasishdas9610
    @debasishdas9610 7 months ago

    19:38 Shouldn't 0.39 be 0.4 and 0.59 be 0.6? Not sure where the rounding errors have crept in.
    49:45 would it not be much easier to use Einstein index notation?
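One plausible explanation for the 0.39 vs 0.4 question is that the slide carries full precision through the backward pass and rounds only for display. The sketch below illustrates that idea. It assumes the sigmoid input is 1.00 (as another comment in this thread states) and uses hypothetical multipliers 2 and 3, which are inferred from the quoted numbers and are not given in this thread.

```python
import math

s = 1.0 / (1.0 + math.exp(-1.0))  # sigmoid output for an assumed input of 1.00
g = (1.0 - s) * s                 # unrounded gradient at the sigmoid input

results = {}
for k in (2.0, 3.0):              # hypothetical multipliers (assumed, see lead-in)
    exact = round(k * g, 2)             # keep full precision, round at the end
    early = round(k * round(g, 2), 2)   # round the intermediate value first
    results[k] = (exact, early)
    print(f"multiplier {k:.0f}: round at end -> {exact:.2f}, round early -> {early:.2f}")
```

Under these assumptions, rounding at the end gives 0.39 and 0.59, while rounding the intermediate gradient to 0.20 first gives 0.40 and 0.60.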

  • @MiD-k7u
    @MiD-k7u 1 year ago

    Great lecture thank you. I have a question, would be great if anyone could clarify. When you first introduce vector valued backpropagation, you have the example showing 2 inputs to the node, each input is a vector of DIFFERENT dimension - when would this be the case in a real scenario? I thought the vector formulation was so that we could compute the gradient for a batch of data (e.g. 100 training points) rather than running backprop 100x. In that case the input vectors and output vectors would always be of the same dimension (100). Thanks!
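One case where a node's inputs have different sizes, offered as an illustration rather than as the lecture's answer, is a matrix-vector product z = W x: the vector x has 3 entries while W, viewed as a flat vector, has 6. The sketch below is plain Python with made-up values; the function names are hypothetical.

```python
def matvec_forward(W, x):
    """z = W x for a list-of-rows matrix W and a list x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matvec_backward(W, x, dz):
    """Given upstream dL/dz, return (dL/dx, dL/dW)."""
    rows, cols = len(W), len(x)
    # dL/dx[j] = sum_i W[i][j] * dz[i]   (W transposed times dz)
    dx = [sum(W[i][j] * dz[i] for i in range(rows)) for j in range(cols)]
    # dL/dW[i][j] = dz[i] * x[j]         (outer product of dz and x)
    dW = [[dz[i] * x[j] for j in range(cols)] for i in range(rows)]
    return dx, dW


W = [[1, 2, 3], [4, 5, 6]]  # 2 x 3 weights (6 numbers), made-up values
x = [1, 0, -1]              # 3-dimensional input, made-up values
dz = [1, -1]                # made-up upstream gradient, same shape as z

z = matvec_forward(W, x)
dx, dW = matvec_backward(W, x, dz)
print(z, dx, dW)
```

Each downstream gradient has the shape of the input it belongs to, so dx has 3 entries and dW has 2 x 3, even though both come from the same 2-entry upstream gradient. Batching is a separate use of the same machinery.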

  • @tornjak096
    @tornjak096 1 year ago

    1:03:00 should the dimension of grad x3 / x2 be D2 x D3?
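A shape check can help with this question. The sketch below assumes x_2 has D_2 entries and x_3 has D_3 entries, and uses the convention quoted elsewhere in this thread, where the downstream gradient is the local Jacobian times the upstream gradient and gradients are column vectors. It is not a transcription of the slide.

```latex
\underbrace{\frac{\partial L}{\partial x_2}}_{D_2}
  \;=\;
  \underbrace{\frac{\partial x_3}{\partial x_2}}_{D_2 \times D_3}
  \,
  \underbrace{\frac{\partial L}{\partial x_3}}_{D_3}
```

For the product to be defined under that convention, the Jacobian has to be D_2 x D_3. Under the opposite layout convention the same Jacobian would be written as D_3 x D_2 and applied transposed.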

  • @dmitrii-petukhov
    @dmitrii-petukhov 4 years ago +3

    Awesome explanation of Backpropagation! Amazing slides! Much better than CS231n.

  • @Nihit-n5n
    @Nihit-n5n 4 years ago

    Great video. Thanks for posting it.

  • @jungjason4473
    @jungjason4473 3 years ago

    Can anyone explain 1:08:05? dL/dx1 should be next to dL/dL, not L when it is subject to function f2'. Thereby back propagation cannot connect fs and f's.

  • @YoshuaAIL
    @YoshuaAIL 6 months ago

    Amazing!

  • @matthewsocoollike
    @matthewsocoollike 10 months ago

    19:00 where did w2 come from?

  • @zainbaloch5541
    @zainbaloch5541 2 years ago

    19:14 Can someone explain computing the local gradient of exponential function. I mean how the result -0.2 comes? I'm lost there!!!

    • @beaverknight5011
      @beaverknight5011 2 years ago +1

      Our upstream gradient was -0.53 right? And now we need the local gradient of e^-x which is -e^-x and -e^-(-1)= -0.36. So upstreamgrad(-0.53) multiplied with local grad (-0.36) is 0.1949 which is approximately 0.2. So 0.2 is not local grad it is local multiplied with upstream

    • @zainbaloch5541
      @zainbaloch5541 2 years ago

      @@beaverknight5011 got it, thank you so much!

    • @beaverknight5011
      @beaverknight5011 2 years ago

      @@zainbaloch5541 you are welcome, good luck with your work

    • @Valdrinooo
      @Valdrinooo 1 year ago +1

      I don't think beaver's answer is quite right. The upstream gradient is -0.53. But the local gradient comes from the function e^x not e^-x. The derivative of e^x is e^x. Now we plug in the input which is -1 and we get e^-1 as the local gradient. This is approximately 0.37. Now that we have the local gradient we just multiply it with the upstream gradient -0.53 which results in approximately -0.20.
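The arithmetic in this thread can be reproduced in a few lines. This is a minimal sketch using only the numbers quoted above (upstream gradient -0.53, gate input -1) and treating the gate as e^x, as the last reply argues.

```python
import math

upstream = -0.53  # gradient arriving at the exp gate, as quoted in the thread
x = -1.0          # input to the exp gate, as quoted in the thread

local = math.exp(x)            # d/dx e^x = e^x, evaluated at the gate's input
downstream = local * upstream  # chain rule: local gradient times upstream gradient

print(f"local = {local:.4f}, downstream = {downstream:.4f}")
```

The sign stays negative because the local gradient e^-1 is positive. With the already-rounded upstream value the product is about -0.195, which is close to the -0.20 discussed in the thread.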

  • @qingqiqiu
    @qingqiqiu 2 years ago

    Can anyone clarify the computation of hessian matrix in detail ?

  • @genericperson8238
    @genericperson8238 2 years ago

    46:16, shouldn't dl/dx be 4, 0, 5, 9 instead of 4, 0, 5, 0?

    • @kevalpipalia5280
      @kevalpipalia5280 1 year ago

      No, the operation is not relu, it's the calculation of the downstream gradient. Since the last row of the Jacobian is 0, changes in that value do not affect the output, so 0.

    • @kevalpipalia5280
      @kevalpipalia5280 1 year ago

      For the point of passing or killing the value of the upstream matrix, you have to decide pass or kill by looking at the input matrix, here that is [ 1, -2, 3, -1] so looking at -1, we will kill that value from the upstream matrix, so 0.
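The pass-or-kill rule described in this reply can be sketched as follows, assuming the gate is an elementwise max(0, x). The input [1, -2, 3, -1] is quoted from the reply; the upstream entries 4, 5 and 9 are inferred from the question, and the second upstream entry (-1) is an assumed placeholder. The function name is illustrative.

```python
def relu_backward(x, upstream):
    """Pass the upstream gradient where the input was positive, zero it elsewhere."""
    return [g if xi > 0 else 0 for xi, g in zip(x, upstream)]


x = [1, -2, 3, -1]        # input to the gate, as quoted in the reply
upstream = [4, -1, 5, 9]  # the -1 is an assumed placeholder, not from the thread
downstream = relu_backward(x, upstream)
print(downstream)
```

The last input is -1, so the 9 in the upstream gradient is zeroed rather than passed through. The decision depends on the sign of the input, never on the upstream value.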

  • @maxbardelang6097
    @maxbardelang6097 3 years ago

    54:51 when my cd player gets stuck on a old eminem track

  • @haowang5274
    @haowang5274 2 years ago

    thanks, good god, best wish to you.

  • @Nur_Md._Mohiuddin_Chy._Toha
    @Nur_Md._Mohiuddin_Chy._Toha 6 days ago

    👍👍👍👍

  • @DED_Search
    @DED_Search 3 years ago

    45:00 The Jacobian matrix does not have to be diagonal, right?

    • @blakerichey2425
      @blakerichey2425 3 years ago

      Correct. That was unique to the ReLU function.
      The "local gradient slices" in his discussion at 53:00 are slices of a more complex Jacobian.

  • @aoliveira_
    @aoliveira_ 2 years ago

    Why is he calculating derivatives relative to the inputs?

  • @jorgeanicama8625
    @jorgeanicama8625 1 year ago

    It is actually much simpler than the way he explained it. I believe he was redundant and used too many symbols, which hide the beauty of the reasoning and the math behind the algorithm. It all could have been explained in less time.

    • @kushaagra098
      @kushaagra098 3 months ago

      do you have any resources that explain this better?

  • @benmansourmahdi9097
    @benmansourmahdi9097 1 year ago

    terrible sound quality !

  • @Hedonioresilano
    @Hedonioresilano 3 years ago +6

    it seems the coughing guy got the china virus at that time

    • @arisioz
      @arisioz 1 year ago +1

      I'm pretty sure you'd be called out as racist back in the days of your comment. Now that it's almost proven to be a china virus...