Very nice explanation of the concept, brief and understandable. Awesome!
The most confusing part of this video is how he managed to write everything backwards on the glass so flawlessly
can't they write on their normal side then flip the video?
@@sanataeeb969 no that would be way too easy
Bro just focus on the gradient descent topic
@@sanataeeb969 Oh shit, you're clever.
Nope, he isn't writing backward. If you watch, he seems to be writing with his left hand, but he was actually using his right hand.
As always, great video from IBM
It is wrong.
Thank you, Martin, really helpful for my uni exam
Good explanation. It is also important to note that the curve should be differentiable.
didn't know Steve Kerr works at IBM
The best video I could find. Thank you.
Let me clarify the concept of learning rate and step size in gradient descent:
Learning rate:
The learning rate is a hyperparameter that we set before starting the optimization process. It's a fixed value that determines how large our steps will be in general.
Step size:
The actual size of each step is determined by both the learning rate and the gradient at that point. Specifically:
step_size = learning_rate * magnitude_of_gradient
So:
The learning rate itself is not the size of the steps from point to point.
The learning rate is a constant that helps determine how big those steps will be.
The actual size of each step can vary, even with a constant learning rate, because it also depends on the gradient at each point.
To visualize this:
In steep areas of the loss function (large gradient), the steps will be larger.
In flatter areas (small gradient), the steps will be smaller.
The learning rate acts as a general "scaling factor" for all these steps.
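A minimal sketch of the point above, assuming a toy loss f(x) = x**2 that is not from the video. The function names, the starting point, and the learning rate of 0.1 are made up for illustration. It keeps the learning rate constant and records learning_rate * magnitude_of_gradient at each step, so you can see the step size shrink as the curve flattens near the minimum.

```python
def gradient(x):
    # Derivative of the illustrative loss f(x) = x**2 (an assumed example).
    return 2.0 * x


def gradient_descent(x0, learning_rate, num_steps):
    """Run plain gradient descent and record the size of every step."""
    x = x0
    step_sizes = []
    for _ in range(num_steps):
        grad = gradient(x)
        # Step size depends on both the constant learning rate and the
        # gradient magnitude at the current point.
        step_sizes.append(learning_rate * abs(grad))
        # Move against the gradient.
        x = x - learning_rate * grad
    return x, step_sizes


final_x, step_sizes = gradient_descent(x0=5.0, learning_rate=0.1, num_steps=5)
for i, size in enumerate(step_sizes):
    print(f"step {i}: size = {size:.4f}")
```

The learning rate never changes here, yet each printed step is smaller than the one before it, because the gradient of x**2 gets smaller as x approaches 0.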
Very good explanation of the high-level concept of GD.
Thank you so much!
Wow best explanation ever 👏
I'm always confused by these screens or boards, whatever.
Like how do you write on them? Do you have to write backwards or do you write normally and it kinda mirrors it?
great lecture
Nice, I learned more from this 7-minute video than from a boring 1-hour lecture
Your neural network is wrong.
Yeah the neurons are not fully connected 1:43
ANY CHANCE TO GIVE 1000 LIKES???😩
ibm: "how to make a neural network for the stock market?"
I was expecting a mathematical explanation :(
I couldn't visualise, I saw nothing on the screen...
can see it
Too many words