You can consider the equation to be:
output = f(W y(t-1) + U x(t) + b)
because the left side, y(t), might be making it a bit confusing. In the video, I am referring to y as the output of the hidden layer of the RNN, not the final output of the entire network, so writing the left side as "output" is more meaningful. However the activation is designed, the main idea is that the past output of the hidden layer feeds back into the RNN together with the current input. 😃
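Here is a minimal NumPy sketch of that recurrence. The tanh activation and the tiny dimensions are just illustrative assumptions, not choices from the video; the point is only that each step combines the previous hidden output y(t-1) with the current input x(t):

```python
import numpy as np

hidden_size, input_size = 4, 3

rng = np.random.default_rng(0)
W = rng.normal(size=(hidden_size, hidden_size))  # recurrent weights on y(t-1)
U = rng.normal(size=(hidden_size, input_size))   # input weights on x(t)
b = np.zeros(hidden_size)                        # bias

def rnn_step(y_prev, x_t):
    """One step of output = f(W y(t-1) + U x(t) + b), with f = tanh here."""
    return np.tanh(W @ y_prev + U @ x_t + b)

# Unroll over a short sequence: the hidden output loops back each step.
y = np.zeros(hidden_size)                        # y(0): initial hidden output
for x_t in rng.normal(size=(5, input_size)):     # 5 time steps of input
    y = rnn_step(y, x_t)
print(y)
```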
You should make a video deriving the backpropagation through time algorithm for training them.
Noted! Thank you for watching and the suggestion.
@C4A Any update? I need this video too.