best channel on ML ever, clean, crisp, and a beautiful voice !!!
I'm amazed at how amazing this channel is. I watch one video and somehow the next video is even more gold. I'm 28 videos into the playlist and every video is so extremely well edited, put together, narrated, and visualized. The content you make turns the most confusing things to visualize into simple, very clearly laid out ideas. Even taking an entire video to explain the definitions and notation is so incredibly valuable. Instead of spewing jargon that might as well be another language, you let the viewer be on the same page before starting. I love this channel, I am so happy I found it, and I can't stop sharing it with the people in my class/lab.
Please continue making videos because I honestly haven't found another channel that even comes close to the level of clear, concise explanations that this channel is producing!
Wow, Moose! Thank you x 1000 for such a thoughtful and genuine comment! We're really glad that you found our channel and you're here now and part of the community :) Thanks so much for sharing the content with your class and lab!
It's jargon only if you've not studied high school math
this channel is gold
Thanks, Gilang! Glad you think so!
Simply awesome. Not sure why this channel is not mentioned frequently. Intuition, deep maths, programming. So many hard concepts explained with such ease. Respect !!
your lectures are a hidden treasure. Currently working in devops and now studying for my AWS machine learning certification and this is great for filling my gaps in this space, along with Andrew Ng's content. thanks
I was trying to find an entry point into machine learning somewhere on the internet and this course is just that, I wish I had found it sooner
I love this teacher. This channel needs more subs man!
I love your videos, I love your voice. Recommended your channel many times to friends!
Backpropagation explained | Part 1 - The intuition
th-cam.com/video/XE3krf3CQls/w-d-xo.html
Backpropagation explained | Part 2 - The mathematical notation
th-cam.com/video/2mSysRx-1c0/w-d-xo.html
Backpropagation explained | Part 3 - Mathematical observations
th-cam.com/video/G5b4jRBKNxw/w-d-xo.html
Backpropagation explained | Part 4 - Calculating the gradient
th-cam.com/video/Zr5viAZGndE/w-d-xo.html
Backpropagation explained | Part 5 - What puts the “back” in backprop?
th-cam.com/video/xClK__CqZnQ/w-d-xo.html
Machine Learning / Deep Learning Fundamentals playlist: th-cam.com/play/PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU.html
Keras Machine Learning / Deep Learning Tutorial playlist: th-cam.com/play/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL.html
0:00 Introduction
0:40 Outline
1:26 Backpropagation Recap
2:50 Definitions and Notation
7:20 Review
10:00 About the next video
I hope this is useful!
Perfect, thank you! Added to the description :)
I have never subscribed to anything on TH-cam before ... but I was just so happy with these cleanly executed and correct tutorials :). One suggestion would be to add certificates to your courses and give people something to work towards.
Thank you very much for this video! It was great that you used visualizations to explain the notation! I usually don't like looking at mathematical notations but the visualizations made it fun and enjoyable to understand!
This video is so awesome!!, thank you for the clear explanations on Backpropagation. You lady are a great teacher!
You are a savior ! This series is so amazing !
You are 100 times better than the Professor of ML at my uni.
The videos are so nicely explained, especially the math. Thanks so much
Very detailed and carefully done. Thank you so much for your work.
Love it and especially that you zoom in. Entirely personal, I know, but I prefer minimal use of bright white backgrounds and the spinning/rotating red rectangles, for focusing attention on certain areas, don't need the rotating effect. I find the rotation distracting and a stressor.
Firstly, I would like to congratulate you on such a well presented video. This has clarified the existing technique of back propagation. It seems to me however that the technique would work best if the loss function is linearly proportional to the weights on each layer. The technique works quite well and is used in many current AI applications. It does however require large training data sets and is not as efficient as the brain which does not use back propagation. Thanks again, this will help me in my research. I will definitely be subscribing.
Amazing work!! Thank you for making ML so easy to understand
It may be worth noting that instead of partial derivatives one can work with derivatives as the linear transformations they really are.
Also, looking at the networks in a more structured manner makes clear that the basic ideas of BPP apply to very general types of neural networks. Several steps are involved.
1.- More general processing units.
Any continuously differentiable function of inputs and weights will do; these inputs and weights can belong, beyond Euclidean spaces, to any Hilbert space. Derivatives are linear transformations and the derivative of a neural processing unit is the direct sum of its partial derivatives with respect to the inputs and with respect to the weights. This is a linear transformation expressed as the sum of its restrictions to a pair of complementary linear subspaces.
2.- More general layers (any number of units).
Single unit layers can create a bottleneck that renders the whole network useless. Putting together several units in a single layer is equivalent to taking their product (as functions, in the sense of set theory). The layers are functions of the inputs and of the weights of the totality of the units. The derivative of a layer is then the product of the derivatives of the units; this is a product of linear transformations.
3.- Networks with any number of layers.
A network is the composition (as functions, and in the set theoretical sense) of its layers. By the chain rule the derivative of the network is the composition of the derivatives of the layers; this is a composition of linear transformations.
4.- Quadratic error of a function.
...
---
With the additional text down below this is going to be excessively long. Hence I will stop the itemized previous comments.
The point is that a sufficiently general, precise and manageable foundation for NNs clarifies many aspects of BPP.
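Item 3 above can be written out as a formula. This is only a sketch, assuming the network is a composition of differentiable layers with the weights held fixed; the symbols N, f_i, and x_i are introduced here for illustration and do not come from the comment.

```latex
N = f_L \circ f_{L-1} \circ \cdots \circ f_1,
\qquad x_0 = x, \quad x_i = f_i(x_{i-1})

DN(x) = Df_L(x_{L-1}) \circ Df_{L-1}(x_{L-2}) \circ \cdots \circ Df_1(x_0)
```

Each factor Df_i(x_{i-1}) is a linear transformation; in finite dimensions the composition becomes a product of Jacobian matrices, which is what backpropagation evaluates starting from the output end.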
If you are interested in the full story and have some familiarity with Hilbert spaces please google for our paper dealing with Backpropagation in Hilbert spaces. A related article with matrix formulas for backpropagation on semilinear networks is also available.
We have developed a completely new deep learning algorithm called Neural Network Builder (NNB) which is orders of magnitude more efficient, controllable, precise and faster than BPP.
The NNB algorithm assumes the following guiding principle:
The neural networks that recognize given data, that is, the “solution networks”, should depend only on the training data vectors.
Optionally the solution network may also depend on parameters that specify the distances of the training vectors to the decision boundaries, as chosen by the user and up to the theoretically possible maximum. The parameters specify the width of chosen strips that enclose decision boundaries, from which strips the data vectors must stay away.
When using the traditional BPP the solution network depends, besides the training vectors, on guessing a more or less arbitrary initial network architecture and initial weights. Such is not the case with the NNB algorithm.
With the NNB algorithm the network architecture and the initial (same as the final) weights of the solution network depend only on the data vectors and on the decision parameters. No modification of weights, whether incremental or otherwise, needs to be done.
For a glimpse into the NNB algorithm, search this platform for our video:
NNB Deep Learning Without Backpropagation.
In the description of the video, links to free demo software can be found.
The new algorithm is based on the following very general and powerful result (google it): Polyhedrons and Perceptrons Are Functionally Equivalent.
For the conceptual basis of general NNs see our article Neural Network Formalism.
Regards,
Daniel Crespin
{
  "question": "Indices are important to understand the interaction between layers and nodes in the backpropagation algorithm. Where l is the layer index:",
  "choices": [
    "j is the node index for l and k is the node index for l-1.",
    "k is the node index for l and j is the node index for l-1.",
    "j is the node index for l and k is the node index for l+1.",
    "k is the node index for l and j is the node index for l+1."
  ],
  "answer": "j is the node index for l and k is the node index for l-1.",
  "creator": "Chris",
  "creationDate": "2020-04-17T17:36:43.623Z"
}
More great questions, thanks Chris!
Just added your question to deeplizard.com/learn/video/2mSysRx-1c0 :)
Thank you very much! This is very important
3:44 you've defined that l and l-1 are indexed as j=0,1,...,n-1 and k=0,1,...,n-1, respectively. But what if layers l and l-1 have a different number of nodes?
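On the question above, one way to read the notation is that j and k each run up to the size of their own layer, so the two layers need not match. A minimal sketch, assuming hypothetical sizes and made-up values (none of the names or numbers below come from the video):

```python
# Hypothetical layer sizes: layer l-1 has 3 nodes, layer l has 2 nodes.
n_prev = 3  # k = 0, 1, ..., n_prev - 1 indexes nodes in layer l-1
n_l = 2     # j = 0, 1, ..., n_l - 1 indexes nodes in layer l

# weights[j][k] connects node k in layer l-1 to node j in layer l,
# so the table has n_l rows and n_prev columns.
weights = [[0.1, 0.2, 0.3],
           [0.4, 0.5, 0.6]]

a_prev = [1.0, 2.0, 3.0]  # activation outputs of layer l-1 (made-up values)

# z[j] is the weighted sum of the inputs arriving at node j in layer l.
z = [sum(weights[j][k] * a_prev[k] for k in range(n_prev)) for j in range(n_l)]
```

Here the n in each index range is simply that layer's own node count, which is consistent with j indexing layer l and k indexing layer l-1.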
Notation is often omitted in videos. Good work
I have observed one topic missing in this playlist!!! " BIAS ", if you add one video which explains the importance of biases that makes this playlist perfect.
Thanks for the suggestion, Pawan! I have bias on my list to cover in a future vid!
Now, there is a video on bias 😎
th-cam.com/video/HetFihsXSys/w-d-xo.html
That's great!!! Hitting the point!!! I'm glad that "deeplizard" has considered my suggestion.
Hey, first-time listener here! I'm amazed at the quality of your videos: easy to listen to, a calm, beautiful voice, and the right pacing for the concepts you are teaching. I like the way you simplify complex concepts for newbies in ML like me (a Ph.D. student in Biomedical Engineering btw). I was wondering about the Scientific notebook you use in these videos...who makes this software? Also, I'm interested in de-noising images. Is it readily possible to train a network to recognize noise within the image, develop a noise template and use it to remove the noise w/o softening the image? The current ML algorithms I've seen, which mostly involve reducing MSE, blur sharp image details. I'm looking at noisy images from low dose CT scanners as that's the rage now, to try to reduce the harmful x-ray dosage to the patient.
Your channel is gold, as others have said. I watch one video after another without stopping and they are just about the right length before one loses focus. If it wasn't for the fact that we are on Spring break this week, I'd be skipping classes by mistake, as I'm currently engrossed in watching your videos. Video #24 so far since last night(!)... 14 more to go. Hope to finish this evening. Not sure what to do next :) !
Thanks again for this great channel, deeplizard ... Cool and ingenious name, btw.
Hey David - Haha that's great! Glad you're enjoying content :D
Regarding the notebook, it was created using MacKichan Software's Scientific Notebook 5.5.
Regarding denoising images, autoencoders have been used to accomplish this. I've not worked much with them, but I cover them a bit in the episode on unsupervised learning and show a brief denoising example.
deeplizard.com/learn/video/lEfrr0Yr684
Regarding where to go next, I'd recommend the Keras series :)
deeplizard.com/learn/playlist/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL
Thanks for the quick response and helpful links, deeplizard. One more question, and that's regarding the back-propagation (BP) video. Does one BP to the preceding inner layer L-1 and have weights recalculated or do you BP to the very first inner layer and then have new weights re-calculated for all proceeding layers throughout the network? Thanks again!
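As a sketch of the usual procedure (a toy example with hypothetical names and made-up numbers, not the code from the video): one backward pass carries the error from the output back through every layer, producing a gradient for each weight along the way, and only then are all the weights updated together.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    a1 = sigmoid(w1 * x)   # hidden layer output
    a2 = sigmoid(w2 * a1)  # network output
    return a1, a2

def loss(x, y, w1, w2):
    _, a2 = forward(x, w1, w2)
    return (a2 - y) ** 2

def gradients(x, y, w1, w2):
    a1, a2 = forward(x, w1, w2)
    # Start at the output and work backwards, one layer at a time.
    d_z2 = 2.0 * (a2 - y) * a2 * (1.0 - a2)  # dLoss/dz2
    d_w2 = d_z2 * a1                         # gradient for the last weight
    d_a1 = d_z2 * w2                         # error passed back to the hidden layer
    d_z1 = d_a1 * a1 * (1.0 - a1)            # dLoss/dz1
    d_w1 = d_z1 * x                          # gradient for the first weight
    return d_w1, d_w2

x, y = 1.0, 0.0      # made-up training sample
w1, w2 = 0.5, -0.3   # made-up initial weights
lr = 0.1             # made-up learning rate

d_w1, d_w2 = gradients(x, y, w1, w2)
# Both weights are updated only after the full backward pass has finished.
new_w1, new_w2 = w1 - lr * d_w1, w2 - lr * d_w2
```

The network here is deliberately tiny (one node per layer, no bias) so that every step of the backward pass is visible; real networks do the same thing with vectors and matrices.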
Great tutorials, very well done. Thank you for them.
A small fix for the code on the site: note the j++, otherwise you get an infinite loop :).
int sum = 0;
int j = 0;
while (j < a.length) {
    sum = sum + a[j++]; // j++ advances the index; without it the loop never ends
}
From 1:24 to 2:50 you are talking and describing things BUT there is nothing on the screen. You need more pictures that illustrate your words. Also, the next slide, "Definitions & Notation", has a list of items with no arrows pointing to what each one is in the diagram on the right. I was lost... This is a visual medium!
During that time in the episode, I am summarizing the process that was taught in the previous episode. The corresponding visuals for that process, along with a more detailed explanation, can be found in the previous episode. Additionally, there are corresponding written posts for most videos that you can find on deeplizard.com that may be helpful for you. The one for this video is here:
deeplizard.com/learn/video/2mSysRx-1c0
The previous episode I mentioned is here:
deeplizard.com/learn/video/XE3krf3CQls
Why is the argument of the derivative in theory sum(w*x), the input to the activation function, while in all code implementations the argument of the derivative is the output of the activation function?
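One likely reason, sketched here for the sigmoid only (an assumption, since the comment does not say which activation function or which code it refers to): the sigmoid's derivative can be rewritten in terms of its own output, so code tends to reuse the output already stored from the forward pass instead of recomputing it from the weighted sum. The function names below are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime_from_z(z):
    """Derivative written in terms of the weighted sum z, as in the theory."""
    s = sigmoid(z)
    return s * (1.0 - s)

def sigmoid_prime_from_output(a):
    """Same derivative, written in terms of the stored output a = sigmoid(z)."""
    return a * (1.0 - a)

z = 0.5          # made-up weighted sum, i.e. sum(w*x)
a = sigmoid(z)   # output saved during the forward pass
```

Both functions return the same value when a = sigmoid(z). The shortcut depends on the activation: tanh has a similar output-based form (1 - a**2), but this is not true of every activation function.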
love ur knowledge
Hi Mandy, where does the weight of the connection come from?
Thank you soooooo much! PLEASE do a video on LSTMs
You're welcome! Also, I have LSTMs on my list to cover in a future video!
Great to hear that!
Whew! Now onto the next ... :)
Glad to see you progressing through the content, Richard!
Much better explanation than Andrew Ng!
The node indexing implies that every layer has the same number of nodes, which shouldn't be a restriction
Where are the weights from layer l to the output?
Can you share that "mathematic scientific notebook" that you are reading from in the video?
Hey Sebastian - Download access to code files and notebooks is available as a perk for the deeplizard hivemind. Check out the details regarding deeplizard perks and rewards at: deeplizard.com/hivemind
If you choose to join, you will gain download access to the math notebook from the backprop series here:
www.patreon.com/posts/22080906
Note, the notebook was created using MacKichan Software's Scientific Notebook 5.5. The notebook is a .tex file. To open this file, you can download Scientific Viewer for free from the link below, which will allow you to view the notebook but not edit it.
www.mackichan.com/index.html?products/sv.html~mainFrame
You may also want to check out Scientific Notebook for purchase or a free trial from the link below if you want to create, edit, and save your own notebooks.
www.mackichan.com/index.html?products/dnloadreq.html~mainFrame
Do you not consider bias? I assume it's excluded for simplicity?
Thank you so much for the explanation! Is knowing the math operations in detail mandatory if I plan to use deep learning for image processing? Or is it enough to know how it works without the math details?
You're welcome! It isn't required to understand the math in order to build or use a neural network for image processing. For example, you'll see in the Keras playlist (th-cam.com/play/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL.html) that we don't use much math when building and coding our networks.
Developing an understanding of the math does, however, give you a deeper understanding of what otherwise looks like magic. Also, if you do understand the math, then it may help you with designing your architecture, tuning your model, and even troubleshooting when the model is not performing in the way that you'd expect.
deeplizard yes I watched 3 videos so far but I stopped a little bit to finish this great playlist! Honestly I’m so excited and don’t want this playlist to end so please add more and more videos to this playlist!
I'm so happy to hear how much you're enjoying the content! Thanks for letting me know :)
More videos to come soon!
الانترنت لحياة أسهل Net4Easy exactly, I also want more and more videos like these
Amazing
You said, "use the math for backprop moving forward" ?
Hey Daniel - By "moving forward," I meant this as in "moving forward in our process of learning backprop." I wasn't meaning it in a way that suggests backprop is used in a forward manner. Hope this helps clarify! Let me know if it doesn't.
I was loving you, but now I love you so much. Thank you so much.
So I assume n is number of nodes in a layer. That's the only logical answer that works here.
That's correct.