What is Automatic Differentiation?

  • Published on Nov 5, 2024

Comments • 111

  • @anjelpatel7918
    @anjelpatel7918 3 years ago +242

    I like how more and more people are adopting 3b1b's style. Makes the content much better and easier to understand. This slowly converts a lot of the more complicated topics into easy-to-digest modules.

    • @Artaxerxes.
      @Artaxerxes. 2 years ago +5

      It literally uses manim

    • @platypusfeathers
      @platypusfeathers 2 years ago +13

      3B1B’s creator Grant Sanderson created an animation library for himself to use to make videos. People forked that library (made a copy of it) and now there is a community-supported version of it for creators, while he continues to use his own (as well as the community one). Pretty cool stuff!

    • @atotoole21
      @atotoole21 1 year ago

      @@Artaxerxes. Nice! I didn't know about manim or that 3B1B's animation technique was Python-based. I assumed it was done by hand using Illustrator or something.

    • @umbraemilitos
      @umbraemilitos 1 year ago +2

      Yes, though I don't think 3B1B wants his videos to be a template to copy. I think he's happy to inspire, but doesn't think that his Manim program is the right tool for most cases. He released a video explaining the SOME criteria, which allow for lots of creative expression in teaching.

    • @andreypopov6166
      @andreypopov6166 6 months ago +1

      3b1b or any other style on its own doesn't mean that the content is easier to understand.

  • @DaveJ6515
    @DaveJ6515 1 month ago +1

    Automatic Differentiation, both forward and backward, plus mixed mode and the Hessian example at the end, all in less than 15 minutes, and totally clear.
    Great content.

  • @arkasaha4412
    @arkasaha4412 4 years ago +44

    Man, this is pure gold. We all use this stuff but hardly have a clear idea about its nitty-gritty details. Thanks for the awesome content and presentation, keep it up! :)

  • @raminbohlouli1969
    @raminbohlouli1969 1 year ago +6

    I knew basically 0 about AD and didn't know where to start, since all the articles, websites, books, etc. that I had looked into explained everything in a really complicated way. I would like to thank you immensely for this very informative yet simple video! Now I know enough to dive deeper into the concept. This video was all I needed. Keep up the great work! You got yourself a new follower.

  • @stathius
    @stathius 1 year ago +6

    Class act, being concise and clear at the same time is no easy feat. Thank you.

  • @andrewbeatty5912
    @andrewbeatty5912 4 years ago +25

    Best summary I've ever seen !

  • @arnold-pdev
    @arnold-pdev 3 years ago

    Went from complete ignorance to understanding in 15 min. Thank you!

  • @chandank5266
    @chandank5266 4 years ago +7

    Your way of explaining is outstanding... love from India, sir ♥️

  • @TheLokiGT
    @TheLokiGT 1 year ago +1

    Very good job. One of the very few good videos I've seen around about autodiff.

  • @jorgeanicama8625
    @jorgeanicama8625 1 year ago +2

    Thank you, Ari. I used symbolic computation in the past, but this novel way of calculating derivatives is quite interesting. Learnt lots by watching your video. For sure, I will follow up with the recommended literature.

  • @koushik7604
    @koushik7604 1 year ago

    This is highly motivated by Andrej Karpathy's lecture, but it's a very clear explanation. It is indeed a good addition to my resource list.

    • @ram-my6fl
      @ram-my6fl 3 months ago

      Did Andrej Karpathy use the same graphs or images?

  • @abhishek.shenoy
    @abhishek.shenoy 3 years ago +7

    This is so well explained! I love the quality of your videos!

  • @esaliya
    @esaliya 3 years ago

    This is a neat summary that's hard to find in a single place!

  • @VHenrik007
    @VHenrik007 5 months ago +1

    Just as a note for anyone wondering, the arxiv link doesn't work because it includes the closing parenthesis. Otherwise great video!

  • @jaf7979
    @jaf7979 2 years ago +2

    Well done, superbly explained in context of other differentiation methods. Exactly what I needed!

  • @tom-sz
    @tom-sz 6 months ago +1

    Great video! Where can I learn more about the rounding and truncation errors plot at 2:06? I need to make an analysis of these errors for a project. Thanks :)
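
    The trade-off behind that plot is easy to reproduce yourself. A minimal sketch, not the video's code; the test function and step sizes are arbitrary choices:

        import numpy as np

        f  = lambda x: np.sin(x)            # test function
        df = lambda x: np.cos(x)            # exact derivative for reference
        x  = 1.0

        for h in 10.0 ** -np.arange(1, 16):
            fd  = (f(x + h) - f(x)) / h     # forward difference
            err = abs(fd - df(x))
            print(f"h = {h:.0e}   error = {err:.2e}")

        # The error first shrinks as h decreases (O(h) truncation error),
        # then grows again once rounding error, roughly eps/h, dominates.

    The references linked in the description discuss this error analysis in more detail.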

  • @stansilverman1901
    @stansilverman1901 4 years ago +1

    In order to explain this to my wife, I differentiated voter rights, the analog process by which humans decide who should be allowed to vote: someone who looks like me, or everyone? I think she got it. Brilliant, Ari.

  • @BrianAmedee
    @BrianAmedee 4 years ago +3

    Excellent presentation mate. That was an awesome explanation and a nice trip down memory lane (university days).

  • @datamike7457
    @datamike7457 4 years ago +8

    Ari, this is great content! I used to call symbolic differentiation 'analytical'. It is obnoxious to track all of the coefficients.

  • @pandatory1108
    @pandatory1108 4 years ago +6

    Excellent video Ari. Thanks for such a great explanation!
    Also, your animations were really well done. I suspected you might be using manim based on the style and then I read the description :)

  • @SohailKhan-zb5td
    @SohailKhan-zb5td 2 years ago

    Thanks a lot. Videos like this take a lot of hard work to produce. Thanks a lot.

  • @GordonWade-kw2gj
    @GordonWade-kw2gj 6 months ago

    Wonderful video. The detailed example helps tremendously.
    And I think there's an error: at 6:24, since $v_6 = v_5\times v_4$, shouldn't there be a plus sign in $\dot{v}_6$ where you've got a minus sign?
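
    As a check on the sign (a worked product-rule step in the notation quoted above, where $v_6 = v_5 \times v_4$):

        $\dot{v}_6 = \dot{v}_5\, v_4 + v_5\, \dot{v}_4$

    which agrees with the creator's errata replies elsewhere in these comments.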

  • @pulusound
    @pulusound 4 years ago

    Very well explained video with lovely calm background music. I need to brush up on my vector calculus and come back, but this gave me a good intuition. Hope you make more of these!

  • @amadlover
    @amadlover 1 year ago

    Timely information about source code manipulation and Google Tangent. It was a kind of confirmation for me that it was indeed possible.
    I had started to learn metaprogramming hoping to generate code for the differentials based on the function, without actually knowing if it was possible; basically a shot in the dark.
    Cheers

  • @AJ-et3vf
    @AJ-et3vf 3 years ago

    Awesome presentation! I understand autodiff a little bit more. I'll rewatch it several more times in the future until I completely understand it :)

  • @KulvinderSingh-pm7cr
    @KulvinderSingh-pm7cr 1 year ago

    This is exceptionally well explained.

    • @KulvinderSingh-pm7cr
      @KulvinderSingh-pm7cr 1 year ago

      And thanks a lot for the references too, they're very useful.

  • @jorgeanicama8625
    @jorgeanicama8625 1 year ago

    One more note, Ari: I think there is a small typo. From 7:36 until 7:46, the derivative of v_6 should have a "+" instead of a "-".

  • @ΔημητρηςΚατσικης-π9η
    @ΔημητρηςΚατσικης-π9η 3 years ago

    Thank you so much. This video really helps me understand a little more about what automatic differentiation is.

  • @halneufmille
    @halneufmille 3 years ago

    Thanks! I never understood this before, but it became obvious in one second.

  • @dullyvampir83
    @dullyvampir83 10 months ago

    Great video, thank you!
    Just a question: you said a main problem with symbolic differentiation is that no control flow operations can be part of the function. Is that in any way different for automatic differentiation?
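
    A minimal sketch of why trace-based (operator-overloading) AD copes with control flow: the derivative is propagated along whichever branch actually executes for the given input, so no single closed-form expression is ever needed. The class and function names below are purely illustrative, not from any particular library:

        from dataclasses import dataclass

        @dataclass
        class Dual:            # forward-mode pair: value and derivative
            val: float
            dot: float

            def __mul__(self, other):
                return Dual(self.val * other.val,
                            self.dot * other.val + self.val * other.dot)

            def __neg__(self):
                return Dual(-self.val, -self.dot)

        def f(x):
            # Ordinary Python control flow: only the branch taken is differentiated.
            if x.val > 0:
                return x * x
            else:
                return -x

        print(f(Dual( 3.0, 1.0)).dot)   # 6.0  -> d/dx x^2 at x = 3
        print(f(Dual(-2.0, 1.0)).dot)   # -1.0 -> d/dx (-x) at x = -2

    The caveat is that the result is the derivative of the branch that ran, which may not be meaningful exactly at a branch boundary; symbolic differentiation, by contrast, needs the whole function as one closed-form expression up front.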

  • @YorkiePP
    @YorkiePP 4 years ago

    Fantastic video on autodiff, really cleared up a lot of things I wasn't sure about.

  • @ccgarciab
    @ccgarciab 4 years ago +1

    Looking forward to your future videos

  • @jkkang9666
    @jkkang9666 4 years ago +2

    Thanks for the great summary and the nice video.

  • @newbie8051
    @newbie8051 1 year ago

    Beautiful video, but I lost track quite a few times. Are there any prerequisite topics I should know before trying to understand this?

  • @aldaszarnauskas27
    @aldaszarnauskas27 1 year ago

    Great video, well presented, clearly explained, nice visualisation... Thank you!

  • @juandavidnavarro
    @juandavidnavarro 1 year ago

    Excellent video!! Thank you so much. I have a question: is there any AD reverse mode based on dual numbers?

  • @prydt
    @prydt 3 years ago

    Amazing explanation of Autograd and wonderful visualizations!!! Thank you so much.

  • @PahenPWNZ
    @PahenPWNZ 3 years ago +1

    Awesome explanation, thanks!
    But I still have one question; can someone please explain the right column (adjoints) at 12:05?
    I don't understand how we got these values (e.g. v̄_5 = v_4 * v̄_6, etc.). Where did these values come from?
    If I use the formula on the previous slide with the sum over children nodes, I get different values (see the worked rule after this thread).

    • @MarkKrebs
      @MarkKrebs 3 years ago

      Hi, I have the same question. The moment when adjoints are defined is a break for me. v̄_5 = v_4 * v̄_6 seems "backwards." I see that it matches the formula given on the prior graph page, but not the intuition for it. "The sum of the output values, weighted by my leverage in creating them" is as close as I can get.

    • @abhaysolanki9284
      @abhaysolanki9284 2 years ago +1

      I know, when he said children I automatically thought of v3 and v4. But the children of v5 consist only of v6, and the children of v4 are v5 and v6. Children are the nodes that a node points to.
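
    For anyone else stuck at this step: with "children of $v_i$" meaning the nodes that consume $v_i$ (the nodes $v_i$ points to, as the reply above says), the rule from the previous slide reads

        $\bar{v}_i = \sum_{j \in \mathrm{children}(i)} \bar{v}_j \, \frac{\partial v_j}{\partial v_i}$

    Since $v_6 = v_5 v_4$ and $v_6$ is the only child of $v_5$, this gives $\bar{v}_5 = \bar{v}_6 \,\partial v_6/\partial v_5 = v_4 \bar{v}_6$, exactly the entry in the adjoint column. For $v_4$, whose children are $v_5$ and $v_6$, the two contributions are summed: $\bar{v}_4 = \bar{v}_5 \,\partial v_5/\partial v_4 + \bar{v}_6 v_5$.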

  • @ぶらえんぴん
    @ぶらえんぴん 1 year ago

    I like your tutorial video because it is short and good.

  • @paulpassek6118
    @paulpassek6118 4 years ago +2

    Thanks for the superb video. I think you made a little mistake in the forward mode example at 6:24. Shouldn't it be v̇_6 = v̇_5*v_4 + v̇_4*v_5?

    • @ariseffai
      @ariseffai  3 years ago

      Thanks Paul, good catch. Placed this under errata.

  • @nathanielscreativecollecti6392
    @nathanielscreativecollecti6392 3 years ago

    Bravo! I have a final today and now I get it!

  • @ktugee
    @ktugee 1 year ago

    Slight typo at 6:29: v6' = v5'v4 + v4'v5. (There should be a + instead of a -.)

  • @weinansun9321
    @weinansun9321 4 years ago +2

    more videos please, this is amazing!

  • @Roshan-xd5tl
    @Roshan-xd5tl 2 years ago

    Brilliant video, Ari. Thank you!

  • @홍성의-i2y
    @홍성의-i2y 1 year ago

    Anyway, the point is that it doesn't store everything in closed form and recompute the gradient from that every time. Each time a computation is performed, the gradient value is computed and stored along with the output value, to be used later during the forward / backward pass.

  • @thivinanandh4430
    @thivinanandh4430 3 years ago

    Awesome Explanation..!!!!!
    Keep rocking..!!!

  • @asdf56790
    @asdf56790 2 years ago

    Exactly what I was looking for! Thank you :)

  • @garlictoastreviews
    @garlictoastreviews 2 months ago

    Is the reason we know the tangents of v_-1 and v_0 that we are taking the partial with respect to x_1?
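
    Assuming the video follows the convention of the Baydin et al. survey linked in the description, $v_{-1} = x_1$ and $v_0 = x_2$ are the input nodes, so their tangents are not computed but chosen as seeds. To get $\partial f/\partial x_1$ one sets

        $\dot{v}_{-1} = \dot{x}_1 = 1, \qquad \dot{v}_0 = \dot{x}_2 = 0$

    and every other tangent then follows from the chain rule; a second forward pass with the seeds swapped gives $\partial f/\partial x_2$.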

  • @chnlior
    @chnlior 3 years ago

    Great summary, Ari. Thank you.
    I think there is a small error at 6:23: v6' = v5'v4 + v4'v5, not "-".

    • @ariseffai
      @ariseffai  3 years ago

      Thanks Lior, good catch. Placed this under errata.

  • @rachelellis6655
    @rachelellis6655 2 years ago

    The derivative at 0:43 would actually be f'(x) = (2x)e^(2x-1) - 3x^2... would it not?
    Great video, I've subscribed! I'm just learning derivatives and the chain rule, so I want to be sure I'm understanding the concepts/rules/procedures correctly. I'm probably wrong though, that's why I'm asking for verification... thanks!

  • @UnnamedThe
    @UnnamedThe 3 years ago

    At 12:26, may I ask where you got that c?

    • @ariseffai
      @ariseffai  3 years ago +1

      Baydin (arxiv.org/abs/1502.05767) references this bound in Sec. 3.2. I don't have the exact location for it in Griewank and Walther.

    • @UnnamedThe
      @UnnamedThe 3 years ago +1

      @@ariseffai Thank you a lot! That is already very helpful.
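
    For reference, the bound in question is the "cheap gradient principle": for $f:\mathbb{R}^n \to \mathbb{R}$, reverse mode guarantees

        $\mathrm{ops}(\nabla f) \le c \cdot \mathrm{ops}(f)$

    with $c$ a small constant independent of $n$; Baydin et al. (Sec. 3.2) state the bound, and Griewank & Walther is the cited source for the specific constant.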

  • @Abhinavneelam
    @Abhinavneelam 4 months ago +1

    One thing I don't understand is why the forward pass can't do it for multiple input variables. Is there a limitation I'm unaware of?
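
    One way to see the limitation being asked about: a single forward pass propagates a single tangent vector, i.e., one directional derivative. Seeding the inputs with $\dot{x} = e_i$ yields

        $\dot{y} = J_f(x)\, e_i$

    which is one column of the Jacobian, so a function with $n$ inputs needs $n$ forward passes (or vector-valued tangents) to assemble the full gradient, whereas a single reverse pass recovers a whole row at once.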

  • @jishnuak3000
    @jishnuak3000 1 year ago

    Very intuitive explanation, thanks

  • @superagucova
    @superagucova 3 years ago

    Loved this video! Are you using 3b1b's Manim?

    • @ariseffai
      @ariseffai  3 years ago +1

      Yep! Manim is awesome

  • @proweiqi
    @proweiqi 4 years ago +2

    This is very good, but some of the material moves too fast and doesn't explain things like the primal part clearly enough.

  • @garlictoastreviews
    @garlictoastreviews 2 months ago

    Is it a typo when you first show the primal and tangent values? Shouldn't the v_6 tangent be the sum of v̇_5*v_4 and v_5*v̇_4, i.e., using the product rule?

  • @andersgadlauridsen1533
    @andersgadlauridsen1533 1 year ago

    This is such great content, please keep making more :)

  • @advitranawade3039
    @advitranawade3039 4 months ago

    For an ML application, why is O(ops(f)) time for automatic diff considered faster than O(n) for numerical diff? It seems to me that the number of inputs should be a lower bound on how many operations there are between those inputs... if that's the case, then why use automatic diff at all for ML?
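
    A note on the accounting here (a sketch of the standard argument, not a quote from the video): the costs are per full gradient, and numerical differentiation needs one function evaluation per input, so

        numerical differences:  n + 1 evaluations of f          ->  O(n · ops(f))
        forward mode:           n tangent passes                ->  O(n · ops(f))
        reverse mode:           one forward + one reverse sweep ->  O(ops(f))

    which is why reverse mode is the method of choice in ML, where $n$ (the number of parameters) can be in the millions.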

  • @alfcnz
    @alfcnz 7 months ago

    @Ari, this is really great! 🤩🤩🤩

    • @ariseffai
      @ariseffai  7 months ago

      Thanks Alfredo!

  • @bryanbischof4351
    @bryanbischof4351 4 years ago +3

    This is quite good. I’m wondering if a part 2, digging deeper into how implementations take advantage of the concepts you introduce here, would be possible?

    • @ariseffai
      @ariseffai  4 years ago +1

      Thanks Bryan. That's a possibility. It would certainly be interesting to dig deeper into the implementation schemes, which were only briefly described here. In the meantime, check out some of the links for further information on implementations.

  • @sandropollastrini2707
    @sandropollastrini2707 3 years ago

    Beautiful and clear!

  • @setsunakevin6861
    @setsunakevin6861 3 years ago

    Amazing video! Very well explained.

  • @vijaymaraviya9443
    @vijaymaraviya9443 4 years ago

    Awesome summary👌

  • @deepanshuchoudhary4598
    @deepanshuchoudhary4598 3 years ago +1

    Please reply to my question: where do you learn these things, and how are you able to grasp them completely? I'm a data science student and I need to know this badly. Please share insights.

    • @ariseffai
      @ariseffai  3 years ago +1

      I found the survey by Baydin et al. to be particularly helpful. See the description for links!

  • @garlictoastreviews
    @garlictoastreviews 2 months ago

    Could anybody explain conceptually why v̄_6 is equal to 1?
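
    Assuming $v_6$ is the output node $y$ in the video's example, the seed follows directly from the definition of the adjoint $\bar{v}_i = \partial y / \partial v_i$:

        $\bar{v}_6 = \partial y / \partial v_6 = \partial y / \partial y = 1$

    It plays the same role on the output side that the seed $\dot{x}_i = 1$ plays on the input side in forward mode.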

  • @manumerous
    @manumerous 3 years ago

    This video is genius! love it.

  • @SuperDonalByrne
    @SuperDonalByrne 9 months ago

    Great video!

  • @Vaporizer41
    @Vaporizer41 3 years ago

    Great video! I love your content, hope you keep making many more :)

  • @kong1397
    @kong1397 3 years ago

    Wow, that's a great explanation.

  • @amirrezarezayan8121
    @amirrezarezayan8121 5 months ago

    Great, great, great. Thanks a million 😃

  • @tom_verlaine_again
    @tom_verlaine_again 3 years ago

    Great lesson! Thank you.

  • @hadik4497
    @hadik4497 3 years ago

    Thanks! This is phenomenal!

  • @M3rtyville
    @M3rtyville 5 months ago

    Reverse-on-Forward sounds like ACA.

  • @gabrielmccartney7975
    @gabrielmccartney7975 2 years ago

    Hello! Can we use dual numbers for integration?

  • @diodin8587
    @diodin8587 2 years ago

    No mention of *dual numbers*?

  • @jianwang7433
    @jianwang7433 2 years ago

    thanks for sharing

  • @Rems766
    @Rems766 2 years ago +1

    chain rule rules

  • @bitahasheminezhad2887
    @bitahasheminezhad2887 3 years ago

    That was awesome, thank you

  • @sofa33
    @sofa33 3 years ago

    Thank you so much!

  • @softerseltzer
    @softerseltzer 4 years ago +1

    Love it!

  • @rtcoffee1235
    @rtcoffee1235 3 years ago

    thanks for this!

  • @germangonzalez3063
    @germangonzalez3063 3 years ago

    Very useful

  • @sirallen2591
    @sirallen2591 2 years ago

    Thanks!

  • @zappist751
    @zappist751 1 year ago

    THANK YOU LORD THANK YOU JESUS AND THANK YOU SIR

  • @a.osethkin55
    @a.osethkin55 3 years ago

    Thanks!!!

  • @9888622400
    @9888622400 3 years ago

    thanks bro!

  • @bokibogi
    @bokibogi 2 years ago

    4:27 automatic differentiation ...

  • @sarvasvarora
    @sarvasvarora 3 years ago +1

    Reddit gang?

  • @심재훈-q7g
    @심재훈-q7g 3 years ago

    Do you get paid to make videos like these? You definitely should.

  • @yavarjn2055
    @yavarjn2055 2 years ago

    Wooow

  • @Manishsingh-dl6ho
    @Manishsingh-dl6ho 3 years ago

    Fking Great!!!

  • @maxyazhbin826
    @maxyazhbin826 3 years ago +1

    please no music, fantastic otherwise

  • @MariaFernandez-pv9hn
    @MariaFernandez-pv9hn 3 years ago

    You should point on the screen to what you are talking about when doing examples.

  • @ollllj
    @ollllj 11 months ago

    On expression swell:
    One of my proudest computations (and hardest-to-debug pieces of code) is the automatically differentiated 3rd derivative of the general quotient rule within [shadertoy ... /WdGfRw ReTrAdUi39], with identical parts already pre-multiplied out according to how often they repeat.
    WebGL code:
    struct d000{float a;float b;float c;float d;}; // domains t, dt, dt², dt³ (value plus first three derivatives); sure, this could just be a vec4, but I REALLY needed my custom labels for debugging
    d000 di(d000 a,d000 b){return d000( // autodiff of division, up to 3 derivatives (up to 3 iterations of the quotient rule within the chain rule)
    a.a/b.a // 0th derivative: simple division
    ,(a.b*b.a-a.a*b.b)/(b.a*b.a) // dx: first derivative (quotient rule)
    ,((a.c*b.a+a.b*b.b-a.b*b.b-a.a*b.c)*(b.a*b.a)-2.*(a.b*b.a-a.a*b.b)*(b.a*b.b))/(b.a*b.a*b.a*b.a) // dxdx: second derivative
    ,((((a.d*b.a+a.c*b.b+a.c*b.b+a.b*b.c-a.c*b.b-a.b*b.c-a.b*b.c-a.a*b.d)*(b.a*b.a)
    +(a.c*b.a+a.b*b.b-a.b*b.b-a.a*b.c)*(b.b*b.a*b.a*b.b))
    +(-2.*(a.c*b.a+a.b*b.b-a.b*b.b-a.a*b.c)*(b.a*b.b)
    +(a.b*b.a-a.a*b.b)*(b.b*b.b+b.a*b.c)))*(b.a*b.a*b.a*b.a)
    -((a.c*b.a+a.b*b.b-a.b*b.b-a.a*b.c)*(b.a*b.a)
    -2.*(a.b*b.a-a.a*b.b)*(b.a*b.b))
    *4.*(b.b*b.a*b.a*b.a))/(b.a*b.a*b.a*b.a*b.a*b.a*b.a*b.a)) // dxdxdx: the 3rd-derivative quotient rule sure is something
    ;}
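
    For contrast with the hand-expanded shader above, the same "first three derivatives of a quotient" can be computed without expression swell using truncated Taylor (jet) arithmetic and the standard power-series division recurrence. A minimal Python sketch, independent of the shader code (coefficients are Taylor coefficients, i.e., the k-th derivative divided by k!):

        def jet_div(a, b):
            """Divide two degree-3 jets c = a / b via c_k = (a_k - sum_{i<k} c_i * b_{k-i}) / b_0."""
            c = [0.0] * 4
            for k in range(4):
                acc = a[k]
                for i in range(k):
                    acc -= c[i] * b[k - i]
                c[k] = acc / b[0]
            return c

        # Example: jets of p(t) = 1 + t and q(t) = 2 + t^2 around t = 0
        p = [1.0, 1.0, 0.0, 0.0]
        q = [2.0, 0.0, 1.0, 0.0]
        print(jet_div(p, q))   # [0.5, 0.5, -0.25, -0.25]

    Each derivative order is just a number carried alongside the value, so the quotient rule never has to be written out symbolically.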