NN - 10 - Cross Entropy and Softmax - Derivatives

  • Published 3 Feb 2025

Comments • 22

  • @pedrocolangelo5844 · 8 months ago +1

    This video is everything I asked for! Thank you so much, Meerkat!

  • @aytacsekmen2138 · 3 months ago

    Such a good explanation. Appreciate it, boss

  • @aeideevie6237 · 1 year ago +1

    Just like magic, a matrix-vector multiplication turned into a vector difference. Thanks a lot.
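
A quick numerical sketch of that simplification (a hypothetical NumPy snippet, assuming a one-hot target y and categorical cross-entropy loss): the softmax Jacobian times the cross-entropy gradient collapses to the plain difference a - y.

```python
import numpy as np

z = np.array([1.0, 2.0, 0.5])      # pre-softmax logits
y = np.array([0.0, 1.0, 0.0])      # one-hot target
a = np.exp(z) / np.exp(z).sum()    # softmax output

# Softmax Jacobian: J[j, k] = a_j * (delta_jk - a_k)
J = np.diag(a) - np.outer(a, a)

# Cross-entropy gradient w.r.t. the softmax output: dL/da_j = -y_j / a_j
dL_da = -y / a

# Chain rule: the matrix-vector product collapses to a vector difference
dL_dz = J @ dL_da
print(dL_dz)       # equals a - y
print(a - y)
```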

  • @skgudipati3379 · 3 months ago

    Thanks bro, I was struggling to get this for 3 days.. at last got it.. thanks a lot..

  • @yazmat182 · 1 year ago

    Just want to say thank you. I was following Andrej Karpathy's makemore series and trying to implement everything in C# from scratch, with no tensors and no automatic differentiation; I only have raw two-dimensional arrays, and thanks to your explanation I was able to calculate the gradients to train the network! Thanks dude, your explanation is the best!

  • @farhanshadiquearronno7453 · 8 months ago

    Just wanted to say thank you!!! Very well explained & so intuitive 👍

  • @prateekyadav9811 · 6 months ago +1

    Thanks so much brother! I was struggling with this so much. I am following the Neural Networks from Scratch book by Sentdex and I was stuck at the derivative of softmax because I could not understand the notation. Now I understand that j = k refers to the diagonal elements of the gradient (Jacobian) matrix :) Thanks
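
A small sketch of that j = k split (assuming NumPy; the variable names are illustrative): the diagonal (j = k) entries of the softmax Jacobian are a_j(1 - a_j), and the off-diagonal (j ≠ k) entries are -a_j * a_k.

```python
import numpy as np

z = np.array([0.3, -1.2, 2.0])
a = np.exp(z) / np.exp(z).sum()    # softmax output

n = len(a)
J = np.zeros((n, n))
for j in range(n):
    for k in range(n):
        if j == k:                 # diagonal: da_j/dz_j = a_j * (1 - a_j)
            J[j, k] = a[j] * (1 - a[j])
        else:                      # off-diagonal: da_j/dz_k = -a_j * a_k
            J[j, k] = -a[j] * a[k]

# Same matrix built in one line
assert np.allclose(J, np.diag(a) - np.outer(a, a))
```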

  • @dandan1364 · 5 months ago

    Wish you would have described Y better in the video … I eventually figured it out, though I had to go back and forth, but so far this makes a lot of sense, thanks.

  • @ВладимирНовиков-ы5с · 1 year ago +1

    YOU ARE THE BEST, THANK YOU. YOU HELPED ME PREPARE FOR MY EXAM, I LOVE U!!!!! HELLO FROM RUSSIA

  • @月-x9w7m · 1 year ago

    This is one of the best explanations of softmax I have found, thank you so much!!! Hello from Ukraine

    • @MeerkatStatistics · 1 year ago +1

      Glory to Ukraine!

    • @月-x9w7m · 1 year ago

      @MeerkatStatistics Glory to the Heroes!

  • @jvdp9660 · 17 days ago

    Why not use the Jacobian and treat the cross-entropy derivative as a row vector in the final step?

  • @shajidmughal3386 · a month ago

    thanks man

  • @lordcasper3357 · 3 months ago

    thank you boss

  • @sophia17965 · 8 months ago

    At 3:30, I don't understand why e^z1 + e^z2 + e^z3 = 1.
    Can someone please explain?
    Thanks

    • @MeerkatStatistics · 8 months ago

      Because they are also divided by that exact same number (the sum itself)... so it turns into 1.

    • @sophia17965 · 8 months ago

      @MeerkatStatistics oh 🤦‍♀🤦‍♀ duh. Thanks
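
For anyone stuck on the same step, a minimal check (assuming NumPy): each softmax output is e^z_k divided by the same sum, so when you add the outputs the numerator and denominator are identical and the total is exactly 1.

```python
import numpy as np

z = np.array([1.0, 2.0, 3.0])
denom = np.exp(z).sum()        # e^z1 + e^z2 + e^z3
a = np.exp(z) / denom          # softmax outputs

# (e^z1 + e^z2 + e^z3) / (e^z1 + e^z2 + e^z3) = 1
print(a.sum())                 # 1.0
```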

  • @waisyousofi9139 · 1 year ago

    BTW, can you tell us when we will need that diagonal part?

  • @danielmendozacastrillon5469 · 1 year ago

    Excellent video

  • @samuelmcdonagh1590 · 1 year ago

    Quick question. In most other videos, such as the StatQuest neural networks part 7 video deriving the backprop for CCEL and softmax, they seem to arrive at the answer that the derivative of the loss with respect to the inputs to the softmax would be (in the case that y_true = [0, 0, 1, 0], for ease) [a3, a3, a3 - 1, a3], whereas you get [a1, a2, a3 - 1, a4], which is also what I get. Do you know whether this is a discrepancy in their work or yours?
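
One way to settle that kind of disagreement is a finite-difference check. A sketch (assuming NumPy and y_true = [0, 0, 1, 0] as in the comment): the numerical gradient of the cross-entropy loss with respect to the logits matches softmax(z) - y, i.e. [a1, a2, a3 - 1, a4].

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())                 # shift for numerical stability
    return e / e.sum()

def loss(z, y):
    return -np.sum(y * np.log(softmax(z)))  # categorical cross-entropy

z = np.array([0.5, -1.0, 2.0, 0.1])
y = np.array([0.0, 0.0, 1.0, 0.0])          # y_true as in the comment

analytic = softmax(z) - y                   # [a1, a2, a3 - 1, a4]

# Independent check: central finite differences on each logit
eps = 1e-6
numeric = np.array([
    (loss(z + eps * np.eye(4)[k], y) - loss(z - eps * np.eye(4)[k], y)) / (2 * eps)
    for k in range(4)
])

print(analytic)
print(numeric)                              # agrees with softmax(z) - y
```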

  • @waisyousofi9139 · 1 year ago

    wow!!!
    Thanks