A learning rate that is too small can get trapped in a local minimum with no escape. A learning rate that is too large will never find the global minimum because it will keep "stepping" over it.
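To make that concrete, here is a minimal sketch (my own toy example, not from the video) of plain gradient descent on a 1-D function that has both a shallow local minimum and a deeper global one:

```python
# Toy function (an assumption, just for illustration): f(x) = x^4 - 3x^2 + x
# has a shallow local minimum near x ≈ 1.13 and a deeper global minimum
# near x ≈ -1.30.
def grad(x):
    return 4 * x**3 - 6 * x + 1

def descend(x, lr, steps=1000):
    for _ in range(steps):
        x -= lr * grad(x)
        if abs(x) > 1e6:           # steps keep growing: we've diverged
            return float("inf")
    return x

print(descend(2.0, lr=0.001))  # ≈ 1.13: too small, trapped in the local minimum
print(descend(2.0, lr=0.5))    # inf:    too large, keeps stepping over everything
print(descend(2.0, lr=0.1))    # ≈ -1.30: happens to land in the global minimum
```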
Thank you, these videos are very helpful :)
What helpful videos! Thank you! :)
Been tinkering for a day (and night lol)....
Say your training set has only 10 items, with 10 class labels assigned to them.
You go through the list and make a prediction for every element. At first everything is predicted wrong, because your network is fresh off the shelf and doesn't know what you want from it.
If an item is predicted wrong, backprop only once and then move on to the next item. Skip items that are predicted correctly.
Every item in the dataset has a counter that's incremented every time you have to backprop, and the learning rate for that item is count * 0.9 (really). On a correct prediction the count resets to 1.
So.... items with a low success rate get trained on every pass through the data, and at a monster rate, while successful items, or items that only need occasional training, get a lower learning rate.
The overall process is done when no backprop happened during a full loop (see the sketch below). 😘 And this is my criterion: it has to learn incredibly fast at a high framerate while doing all the other stuff, like extracting unknown objects from my desktop screen, all things that Johnny-Boy has never seen before. 😁
Training should not be the bottleneck of AGI. I mean.... am I really that clever (lol)? Why not run me on a 500 MHz computer? It should be doable 😎
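Here is a minimal sketch of that loop in Python. The random data, the softmax model, and the gradient update are stand-ins of my own; only the per-item counter, the count * 0.9 learning rate, the skip-on-correct rule, and the stop-when-no-backprop criterion follow the comment (and the comment leaves the increment-vs-use ordering of the counter ambiguous):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 16))      # 10 items, 16 features (stand-in data)
y = np.arange(10)                  # 10 classes, one label per item
W = np.zeros((16, 10))             # a single linear layer as the "network"
count = np.ones(10)                # per-item counter, reset to 1 on success

def predict(x):
    logits = x @ W
    p = np.exp(logits - logits.max())   # numerically stable softmax
    return p / p.sum()

for epoch in range(1000):          # safety cap (the comment's loop has none)
    any_backprop = False
    for i in range(10):
        p = predict(X[i])
        if p.argmax() == y[i]:
            count[i] = 1           # correct prediction: reset the counter...
            continue               # ...and skip training on this item
        lr = count[i] * 0.9        # struggling items get a "monster" LR
        target = np.eye(10)[y[i]]
        W -= lr * np.outer(X[i], p - target)  # one backprop step, then move on
        count[i] += 1              # increment-after-update is my assumption
        any_backprop = True
    if not any_backprop:           # a full clean pass: overall process is done
        break
```

On this toy data the loop terminates after a handful of passes; whether the ever-growing learning rate stays stable on real data is exactly the kind of thing worth watching.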
Another phenomenon with a constant learning rate is that the last elements in the list finish training first, because the network seems to forget about the beginning of the list. Learn something, forget the other items! That's not what I want!
It seems that "dynamic monster learning", where the LR is constantly fluctuating, doesn't have as much trouble with forgetting, so it's more independent of the ordering of the list!
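One way to check this claim on the same toy setup as above: train with a plain constant LR (no skipping, no counters) and print which items are currently correct after each pass. If items from the start of the list flip back to wrong while later items are being trained, that's the forgetting effect, and shuffling the list order should change the picture. A measurement sketch, not a guaranteed reproduction:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 16))      # same stand-in data as before
y = np.arange(10)
W = np.zeros((16, 10))

def predict(x):
    logits = x @ W
    p = np.exp(logits - logits.max())
    return p / p.sum()

for epoch in range(20):
    for i in range(10):            # fixed order, constant LR, train every item
        p = predict(X[i])
        W -= 0.1 * np.outer(X[i], p - np.eye(10)[y[i]])
    correct = [int(predict(X[i]).argmax() == y[i]) for i in range(10)]
    print(epoch, correct)          # watch whether early items flip back to 0
```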
Btw, if you'd stuck with learning, you'd probably have discovered the answer, huh 😁
you are so pretty..