Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
Corrections:
13:05 When the residual is negative, the pink circle should be on the left side of the y-axis. And when the residual is positive, the pink circle should be on the right side.
This is an amazing explanation!!! Thanks
@@adolfocarrillo248 Thank you very much! :)
Got my copy of The StatQuest Illustrated Guide to Machine Learning today! Quadruple BAM!!!!
@@anushreesaran Hooray! Thank you very much! :)
@@statquestwhat do mean by the last term not containing the intercept?
I started my machine learning journey a month ago and stumbled onto a myriad of resources that explain linear models using the RSS function, but no one, and I mean no one, managed to explain it with as much clarity and elegance as you have in just under 20 minutes. You, sir, are a boon to the world.
Thank you!
Did I just UNDERSTAND the CHAIN RULE ? SURREAL, thank you!
:)
Amazing pedagogy. Slow pace, short sentences, visuals consistent with the talk. Great job ;-) Thanks
Glad you liked it!
Man you are amazing. You should get a Nobel prize!
Thank you! :)
Agree!
more than a nobel! book bought
Yes Yes Yes
Or a Grammy!
I am a biostatistician, proclaiming that you are really a good teacher.
Thank you very much!
Over the past three years, I have been studying neural networks and delving into the world of coding. However, despite my best efforts, I struggled to grasp the true essence of this complex subject. That is until I stumbled upon your enlightening video.
I cannot emphasize enough how much your video has helped me. It has shed light on the intricate aspects of neural networks, allowing me to comprehend the subject matter with greater clarity and depth. The way you presented the material was truly remarkable, and it made a profound impact on my understanding.
What astounds me even more is that you provide such valuable content for free. It is a testament to your passion for educating and empowering individuals like myself. Your dedication to spreading knowledge and fostering learning is truly commendable.
Thanks to your channel, I have been able to unlock the true essence of mathematics and its relationship with neural networks. The confidence and clarity I now have in this subject are invaluable to my personal and professional growth.
Your video has been a game-changer for me, and I am grateful beyond words. Please continue your fantastic work and know that your efforts are deeply appreciated.
Thank you very much! BAM! :)
The way you link equations to visuals and show how everything is working along with the math at the SAME time. Beautiful, elegant, easy to follow.
Wow, thank you!
As someone who is doing medical research and needs to learn little-by-little about statistics, neural networks and machine learning as my project goes on, your channel is a literal life-saver! It has been so hard to try to keep my M.D. stuff together with my PhD research, all the while learning statistics, programming, neural network structures and machine learning. Trying to arrange courses from my uni to fit in with all the other stuff is simply impossible, so I've been left to my own devices to find a way to gain knowledge about said subjects, and your channel has done just that.
Your teaching is great and down-to-earth enough to be easily grasped, but you also delve deep into the subject after the initial baby steps, so the person watching isn't just left with "nice to know"-infobits. Love it! Keep up the great work!
Thank you!
Take my word for it, Josh, you are the best teacher on the internet who teaches statistics........ and the chain rule used to drive me crazy.......... until your explanation.
Wow, thanks!
@@statquest ❤️
Awesome!! None of my math teachers in high school or college ever explained to me WHY the chain rule works this way, but you explained it with a very simple example. I'm certain that from now on I'll never forget the chain rule formula. Thanks a million. 👌✔
BAM! :)
Best chain rule explanation I have ever seen.
Thank you!
We could have had a "dreaded terminology alert": "decomposition of functions". But even without it, this was a perfect explanation of the chain rule, with great practical examples. Bravo, Josh!
Thank you!
BY FAR the best explanation of the chain rule I have ever seen (and trust me - I've seen A LOT)
You, sir, just earned yourself yet another well-deserved subscriber.
F'n brilliant!!!
Thank you very much!!! BAM! :)
If I watched your videos during my college, my career trajectory would be totally different. BIG BAM!!!!
Thanks!
this channel was suggested by my professor, and I always watch the videos while doing machine learning tasks. Big appreciation to you :D
Cool, thanks!
Josh, you are a master of teaching; you make difficult topics so easy to understand, which is really amazing. My mother tongue is not English, but you explain so well and clearly that I can understand everything. Congratulations, Sir, please keep doing this job.
Thank you very much! :)
Bro, yours is the only tutorial that actually helped me grasp this concept, thank you so much.
Glad it helped!
@@statquest I know this isn't related to this video, I just want you to help me because you replied to this comment.
With gradient descent, how am I supposed to get the derivative for each weight and bias in a loss function dynamically? Surely for networks with more than 100 neurons there must be a way; I know there is, I just don't know it.
When I calculate the derivative for one variable in the loss function to optimize it, I get some overly complicated function, but I see some papers on it and it isn't complicated.
@@mr.shroom4280 See: th-cam.com/video/IN2XmBhILt4/w-d-xo.html th-cam.com/video/iyn2zdALii8/w-d-xo.html and th-cam.com/video/GKZoOHXGcLo/w-d-xo.html
@@statquest thank you so much, I watched those but I totally forgot about the chain rule lol
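For readers with the same question: the trick is that the chain rule is applied mechanically, one layer at a time (backpropagation), so you never have to expand one giant derivative per weight. Below is a minimal sketch of that bookkeeping for a hypothetical one-neuron-per-layer network (my own toy example, not code from the video or the linked StatQuests):

```python
# Toy backpropagation sketch (hypothetical example, not from the video).
# Network: y_pred = w2 * relu(w1 * x), loss = (y - y_pred)^2.
# Each derivative below is one link in the chain rule.

def relu(z):
    return max(z, 0.0)

def gradients(x, y, w1, w2):
    # Forward pass: save every intermediate value.
    z = w1 * x                                     # pre-activation
    a = relu(z)                                    # hidden activation
    y_pred = w2 * a                                # prediction
    # Backward pass: chain rule, one factor at a time.
    dloss_dpred = -2.0 * (y - y_pred)              # d(y - y_pred)^2 / d(y_pred)
    dloss_dw2 = dloss_dpred * a                    # d(y_pred)/d(w2) = a
    dloss_da = dloss_dpred * w2                    # d(y_pred)/d(a) = w2
    dloss_dz = dloss_da * (1.0 if z > 0 else 0.0)  # derivative of relu
    dloss_dw1 = dloss_dz * x                       # d(z)/d(w1) = x
    return dloss_dw1, dloss_dw2

print(gradients(x=1.0, y=2.0, w1=0.5, w2=1.0))     # (-3.0, -1.5)
```

Libraries like PyTorch and JAX automate exactly this reuse of intermediate derivatives, which is why networks with millions of weights stay tractable.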
I’ve watched videos like this for work, yours is the best, I fully grasp what a derivative is!
Glad you liked it!
You are a genius at this I can't believe I hadn't heard of this channel before.
Thanks!
Your videos are fantastic, even without the sound effects... but the sound effects really bring them over the top.
Thank you! And thank you so much for supporting StatQuest!!! BAM! :)
As always, clear and in simple language. Thank you!
Glad it was helpful!
This dude explains things clearly. Huge thanks!
Thanks!
dear @stat quest, you must have come from heaven to save students from suffering
just an unbelievable explanation
Thank you! :)
Nobody:
The demon in my room at 3am: 7:56
Dang! :)
jesus, this was funny xD
This one outdoes all the best videos on the topic.
Thank you!
This is probably the best video on this topic on the internet!! Thank you so much for taking the time to do it!!
Glad it was helpful!
I love StatQuest! I got my SQ mug in the morning and just got the Illustrated Guide to Machine Learning. Super excited to start! Thank you for all the great content!
That is awesome! TRIPLE BAM!!!! :)
Such beautiful intuition! The weight-predicts-height, then height-predicts-shoe-size example was just commendable.
Thanks!
These seriously are some of my favorite videos on youtube!
Thanks!
One of the best videos I have ever watched. Thank you guys for providing such wonderful content for free.
Thanks!
you have great videos that help explain a lot of concepts very clearly, step by step. You have helped a lot of students for sure.
Thank you very much! :)
Top notch visualization.
Thank you! :)
You have made my machine learning path easy!
Glad to hear that!
I would insert a BAM at 5:25. :) ...also, I realized the thing I like about your videos is you explain things, not only in a clear way, but in a different way. It adds to the depth of our understanding. Thank you!
That is definitely a BAM moment! And thank you. One of my goals is to always explain things in a different way, so I'm glad you noticed! :)
Best reference for learning statistics. Btw, I would just like to point out that at 6:16, there appears to be a minor mistake. Actually, for every 1 unit increase in Weight, there is a 2 unit increase in Shoe Size, because the equation would be Size = (1/2)*Weight, or 2*Size = 1*Weight
This video is actually correct. For every one unit increase in Weight, there is only a 1/2 unit increase in Shoe Size. What your equation shows is that for every unit increase in Size, there is a 2 unit increase in Weight. That's not the same thing as "for every unit increase in Weight, there is a 2 unit increase in Size".
@@statquest I calculated through the equation, and you are correct. Thanks for the verification!
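A quick numeric check of that reply (a sketch, assuming the video's relationship Size = (1/2)*Weight):

```python
# Numeric check (a sketch, assuming Size = (1/2) * Weight as at 6:16):
# one extra unit of Weight adds half a unit of Shoe Size, not two.
size = lambda weight: 0.5 * weight
print(size(5.0) - size(4.0))  # 0.5
```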
The best video on the internet about the Chain Rule!
Thank you!
Very clear explanation. I saw different people explaining this topic but you are the best.
Thank you so much.
Thank you!
Genius serious sincere
I’m a mathematician and am convinced you are a born sage
Thanks!
I graduated with stats degrees from college 10+ years ago and never touched it. Now I feel I re-learned everything overnight!!!!!
BAM! :)
Dear Josh Starmer, Thank you so much. May God bless you with more knowledge so that you can energize learners like me. ❤. Thank you again.
Thank you very much!
Guess I will not be afraid of ***THE CHAAAAAINNNN RULE***
Thank you, Josh! Always Waiting for your videos!
Bam! :)
Simply the best explanation of chain rule!
Now I understand CR better to teach my kid when she needs it...
Thank you!!!
Do you publish a book on calculus? I would love to buy it!
Thanks! I don't have a book on calculus, but I have one on machine learning: statquest.org/statquest-store/
i'm so moved to finally understand this, thank you!
bam! :)
After this awesome statquest, I will hear 'The Chain Rule' with the echo playing in my head
bam! :)
Teaching is an art. thank you StatQuest
Thank you!
thanks for clearing up the confusions i had with chain rule!
bam!
Hi, I think I found a mistake. (?) The pink ball in the graph from 13:08 should be on the other side of the Y axis. It doesn't change the educational value of the whole video but it caught my eye.
Oh, I see someone already brought this up.
yep
I was reading about loss in neural networks and optimization from 20+ sources and could not understand it until watching this video. Big BAM!
Hooray! Thank you!
this is epic and simple, and it applies the chain rule to real life too - we need more videos like this, damn
Thank you! :)
An epically clear explanation. Thank you so much!
Thank you! :)
you deserve a Nobel prize, Nobel man
Thank you!
Thank you Sir for the amazing Tutorial.
Thanks!
Great teaching Josh Starmer!
Thank you kindly!
Awesome Explanation Mr. Starmer! I wish your videos existed back when I was taking Calculus in the university!!! ( which was a long time ago =) )
Wow, thanks!
Now I can't read "the chain rule" without hearing your voice!
:)
You are an amazing teacher!
Thank you! 😃
I would like to thank you from the bottom of my heart for such wonderful videos.
Such a difficult topic made simple, you are awesome man, keep rocking!!!!
And Triple BAM!!!!
Thank you very much! :)
Another concept well explained ❤
Thanks a lot 😊!
13:27 When the residual is negative, the pink circle is shown to be on the right side of the y-Axis, but shouldn't it be on the left side?
Aside from that, great content! Cheers from Germany
Yep. Thanks for catching that! I've added a correction to the pinned comment.
Your explanation is awesome. Make more videos.
Thank you!
Despite how good you are at explaining, I'm still having a hard time with it all. My confidence isn't exactly helped by the fact that all the other people in the comments seem to somehow be doing PhDs and stuff, but okay...
How can I try to understand it even better?
Can you tell me what time point, minutes and seconds, you first got confused?
I think you must be an alien! This is the best, simplest, and most complete explanation I have seen - ever. Fantastic job you did ❤️ thanks
Thank you!
Such a clean and simple explanation! Can't wait for more math and statistics videos. You are the awesomeness on TH-cam!
Thank you! :)
I'm getting strong MST3K and Star Control II vibes from this guy and that's pretty cool
bam!
13:15 Is the residual (squared) graph mirrored? Since residual = (observed - predicted), wouldn't that mean that when the intercept is zero on the original graph, the residual is positive (2 - 1 = 1), so the position on the residual (squared) graph should be on the positive x-axis (x = 1), as opposed to the negative side in the video, and vice versa?
Yes! You are correct. Oops!
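A quick check of the corrected sign (a sketch, using the example's observed = 2 and predicted = intercept + 1):

```python
# Sign check for the correction (a sketch, using observed = 2 and
# predicted = intercept + 1 from the example): a small intercept
# under-predicts, so residual = observed - predicted is positive.
observed = 2.0
for intercept in (0.0, 1.0, 2.0):
    residual = observed - (intercept + 1.0)
    print(intercept, residual)  # residuals: 1.0, 0.0, -1.0
```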
This video is not just an explanation of "The Chain Rule"; it also explains the intuition behind the various loss functions.
That's right. The chain rule is used a lot in machine learning so I tried to explain it from that perspective.
@@statquest Thanks for all of the videos, they all really help a lot
@@dhruvsharma7992 Thanks!
Thanks for the video!
In the last example, why not just plug in height = 2 and weight = 1 to solve for the intercept:
When residual = 0, height - ( intercept + (1*weight)) = 0, so intercept = 1?
Sure, you could solve the equation directly, but the goal is to show how the chain rule works. Furthermore, by using the chain rule, we solve for the general equation and not just a specific equation tied to this specific data.
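To see that general approach in action, here is a minimal gradient-descent sketch (my own code, assuming the single data point height = 2, weight = 1 from the example); the chain-rule derivative steers the intercept to 1 without solving the equation directly:

```python
# Gradient descent sketch (my own code, assuming one data point:
# height = 2, weight = 1). Minimize (height - (intercept + weight))^2.
height, weight = 2.0, 1.0
intercept, learning_rate = 0.0, 0.1
for _ in range(100):
    residual = height - (intercept + weight)
    gradient = 2.0 * residual * (-1.0)  # chain rule: outer times inner derivative
    intercept -= learning_rate * gradient
print(round(intercept, 4))  # ~1.0, matching the direct algebraic solution
```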
BAM! best explanation so far
Thank you! :)
A video also on probability chain rule would be awesome
Noted! :)
At 6:54 you said that you fit an exponential line to the graph and got hunger = time^2 + 1/2. I have a few questions about that.
1. I've never heard the phrase 'exponential line' before. Do you just mean an exponential 'line' of best fit?
2. You said that the equation is exponential, but that looks quadratic to me. Am I missing something?
I really like the way you explained this. Once you think about problems in the 'real world' like this it really starts to make sense how changing one function affects and changes the other and then why you need the chain rule to find the rate of change.
1. I just mean that we fit a curve defined by the function hunger = time^2 + 1/2
2. I should have said quadratic instead of exponential. I apologize for any confusion that this may have caused.
@@statquest Thanks for replying so quickly on an older video like this! I'm making some math videos of my own right now and I can't believe how easy it is to misspeak or write something wrong. You've done an amazing job with all your videos. This is the only video I've found that attempts to explain the chain rule in an intuitive way without using the limit definition.
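Since hunger = time^2 + 1/2 is quadratic, its derivative is d(hunger)/d(time) = 2*time; a tiny finite-difference check (my own sketch, not from the video) confirms it:

```python
# Finite-difference check (my own sketch) that hunger = time^2 + 1/2
# is quadratic with slope 2*time, not exponential.
hunger = lambda t: t**2 + 0.5
t, h = 3.0, 1e-6
numeric_slope = (hunger(t + h) - hunger(t - h)) / (2 * h)
print(numeric_slope, 2 * t)  # both ~6.0
```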
Oh boy, that's a teaser for neural nets. Been looking forward to this!!
YES!!! This is the first video in my series on Neural Nets!!!!!!! The next one should be out soon (hopefully late July, but I always run behind so maybe early August).
6:52 that's not an exponential line (2^x), it's just a parabola (x^2). Anyhow, you're awesome! BAM! Just subscribed!
Thanks for catching that. :)
Awesome Statquest...
Loved the song played at the beginning, and the concept too!!😎😎😎
Thanks! :)
Another explanation I saw on reddit that solidifies my understanding of the chain rule:
"Mommy is a function with a baby function inside. This is how we find out where her bumps are:
Differentiate the mommy - keep the baby inside
Differentiate the baby
Times them together
And you're done :-)
Example: (3x^2 + x)^4
The mommy is ( )^4, the baby is 3x^2 + x
Differentiate the mommy: 4( )^3
Keep the baby inside: 4(3x^2 + x)^3
Differentiate the baby: 6x + 1
Times them together: (6x + 1) * 4(3x^2 + x)^3 = (24x + 4)(3x^2 + x)^3
And we're done"
For your last example: why is observed and (1*weight) = 0 though? The chain rule makes sense now, but I'm trying to grasp that math. You say it's because the terms do not contain the intercept, but what does that mean?
When we take the derivative of something, we take the derivative relative to something. In this case, we take the derivative relative to the intercept. Thus, we are interested in how much things change when the intercept changes. (1 * weight) does not change when the intercept value changes - it stays the exact same, regardless of what value we plug in for the intercept. Since there is 0 change in (1 * weight) when we change the intercept, then the derivative of (1 * weight) with respect to the intercept = 0.
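A symbolic check of that reply (a sketch using sympy, not code from the video):

```python
# Symbolic check (a sketch): differentiate the squared residual with
# respect to the intercept.
from sympy import symbols, diff

observed, intercept, weight = symbols("observed intercept weight")
residual_squared = (observed - (intercept + 1 * weight)) ** 2
print(diff(residual_squared, intercept))
# 2*intercept - 2*observed + 2*weight, i.e. -2*(observed - intercept - weight):
# observed and (1 * weight) contribute no derivative of their own because
# they don't contain the intercept; they only appear via the outer factor.
```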
Thanks a lot, Sir Josh. JazakAllah. 😊 Emotional
Thank you very much! :)
The best Chain Rule tutorial! Do you have any for ReLU? Thank you!!
Coming soon!
Thanks for the informative video.
Thanks!
Awesome Tutorial...
Thank you 🙂!
awesome work, man!!!! you have created the best content...... I wish you were teaching us at our college🥺
Thank you so much 😀
Amazing video! Back to basics 😄👍
Thanks!
Thank you, you're doing a great job!
bam! :)
This is the first time I am laughing while learning stats🤣 Thanks a lot!
Hooray! :)
Thank you so much for your videos! I got a StatQuest Shirt for my Birthday... hurray! :)
BAM! :)
Thanks for all your amazing videos. I'm still learning from you :)
Thank you!
Beautiful. Just beautiful.
Thank you! 😊
your videos are fantastic
Glad you like them!
Amazing video thanks!
Thanks!
I gradually find myself waiting for the "BAM"..... I've become addicted to it....
bam! :)
There are people who love StatQuest and there are people who don't know about StatQuest yet... poor souls
Thanks! :)
This video couldn't be better! It should have at least 1 million views!~!!!!
Thank you! :)
Simply beautiful. you are the best.
Wow, thank you!
LOVE YOUR CONTENT BEST FUN LEARNING EVER!!! (The chain rule is COOL)
bam!
Josh Starmer, you are a beast!
Thank you! :)
Amazing Video. Helps a lot! Does anyone know an example of an empirical research paper in which the chain rule (two step procedure) is applied in the context of empirical testing of the research question/hypothesis? Thank you very much for a reply!
Logistic Regression uses Gradient Descent, which, in turn, often uses the chain rule
@@statquest Dear Josh, thank you for your answer. I want to concretise my question. I understood from your videos that the chain rule is used in neural networks to solve the optimization problem, and also in logistic regression using gradient descent, etc. I'm currently looking for a content example of published research (= a concrete study) in which the modelling approach is that weight (some independent variable) predicts height (some other independent variable) and height predicts shoe size (the dependent variable). Does anyone know an example of such an empirical research paper? Thank you very much for a reply!
Awesome. You made my day!
Hooray! :)
Bam! You are awesome. Thanks a lot.
bam! :)
Awesomeness = like statquest squared 😆 🤣
Thank you!
Yet another bravo tutorial video! Thank you, Josh! One question: what visual software/tool do you use to draw those beautiful plots? Did you, like 3Blue1Brown, write a JS front-end tool yourself? Thanks!
I'm glad you like the videos! I draw the pictures in Keynote.
3b1b does not use a JS front-end tool; it's a Python animation library powered by Cairo (a C library), and now it uses OpenGL.
So you can think of the derivative here as the velocity of awesomeness w.r.t. likes-StatQuest. Neat. So it can be thought of as something like 21 A.S.Q., or 21 awesomenesses per stat-quest.
That's awesome! :)