Tutorial 31- Back Propagation In Recurrent Neural Network

  • Published on 11 Sep 2024
  • Please join as a member on my channel to get additional benefits like materials in Data Science, live streaming for members, and many more
    / @krishnaik06
    Please do subscribe to my other channel too
    / @krishnaikhindi
    Take the best online course on Data Science
    www.appliedaic...
    Connect with me here:
    Twitter: / krishnaik06
    facebook: / krishnaik06
    Instagram: / krishnaik06

Comments • 115

  • @TheFirstObserver
    @TheFirstObserver 4 years ago +44

    I have to admit, I was stuck on Recurrent networks for the longest time until I found your forward and backward propagation videos. Everything seems to have finally clicked. Thank you!!

  • @Official-tk3nc
    @Official-tk3nc 4 years ago +46

    If you are watching this in lockdown, you are one of the rare species on the earth. Many students are wasting their time on Facebook, YouTube, Twitter, Netflix, watching movies, playing PUBG, but you are working hard to achieve something. All the best... NITJ student here

    • @Virtualexist
      @Virtualexist 4 years ago

      Yo ! Thanks for sharing your Director with us ;-).

    • @charukant100
      @charukant100 1 year ago

      @@Virtualexist Director?

    • @Virtualexist
      @Virtualexist 1 year ago

      @@charukant100 Yes. 😂😂 It's an inside college joke.

  • @manikantasai4766
    @manikantasai4766 4 years ago +13

    I definitely recommend your tutorials to any data science learner who asks me... before entering this field.
    When I think of giving up, I just go through the video and realize how easy you have made it... you motivate me a lot.
    Thanks for helping, Sir.

  • @nuzhat_tasfia3431
    @nuzhat_tasfia3431 2 years ago +5

    Only thanking you is not enough. Lots of dua for you Sir.

  • @overdrivegain
    @overdrivegain 4 years ago +5

    Danke from Germany!

  • @creed908
    @creed908 6 days ago

    The video is good.
    If you want, take this as a suggestion:
    try to write the generalized equation at the end.
    Thank you for a good video.

  • @dheerajkumar9857
    @dheerajkumar9857 3 years ago +2

    Krish, "what would have happened if you weren't there", I am thinking :), amazing explanation. Could you please make a video on what a loss function is?

  • @Vijay-cz7pe
    @Vijay-cz7pe 1 year ago +1

    We are computing the gradients at every time step, right? What is the use of all these time-step gradients, as we update the weights only after one iteration, right? Moreover, the weights are the same throughout the iteration at every time step.

  • @romananalytics2182
    @romananalytics2182 3 years ago +2

    Great content Krish! Just a suggestion: try using different colour markers to explain such content, where you have the existing forward propagation and are explaining back propagation on top of it!

  • @raahulkalyaan8391
    @raahulkalyaan8391 3 years ago +7

    How do the weights get updated during reverse time? As he mentioned, the diagram is the same and he has drawn it only to show how it works across time frames. So is it something like, for every time frame we use a different set of weights? Or did I not understand the concept properly?

    • @deepknowledge2505
      @deepknowledge2505 3 years ago

      th-cam.com/video/Xeb6OjnVn8g/w-d-xo.html
      He has explained it in a simple way

    • @tkhankhoje39
      @tkhankhoje39 2 years ago

      Back propagation is in a different direction than in a normal ANN, i.e. through the time steps, I guess.

  • @nitayg1326
    @nitayg1326 4 years ago +6

    Bought your book. Liked it so far!

  • @nahidzeinali1991
    @nahidzeinali1991 several months ago

    Great video Krish. Keep it up. Thanks

  • @sambitnath9853
    @sambitnath9853 4 years ago +4

    Great content 🙏

  • @ashishmehta2198
    @ashishmehta2198 4 years ago +5

    Hello sir. The videos and the entire deep learning playlist are very informative. Learnt a lot from you. Thank you so much.
    I had a doubt:
    Don't we apply a learning rate while updating the weights in an RNN?

  • @islamicinterestofficial
    @islamicinterestofficial 4 years ago +2

    Sir, please make a video on how the backpropagation equations are derived in an RNN. Thanks Sir.......

  • @akashpawar9058
    @akashpawar9058 4 years ago +2

    You work hard bro

  •  2 years ago

    watching the first 10 lectures will make your process faster than ever!

  • @22shubhankar22
    @22shubhankar22 9 months ago

    Bro, I have an exam tomorrow and I guess I am doomed because all this is too much for my small brain. Thanks for explaining... idk what u did but it sounds correct... keep it up man

  • @namansharma9490
    @namansharma9490 2 years ago +1

    Hi, I have a doubt regarding the topic, please see to it:
    1st: when we find the weight w, do we have to find it only once (since the weight is shared)?
    2nd: can you please tell how we can calculate/update the weight w'?
    Please reply, thanks

  • @ashishjain871
    @ashishjain871 3 years ago +1

    Nicely done. The notation could have been slightly different to make it slightly less confusing.

  • @dmg8529
    @dmg8529 1 year ago +1

    5:30 While doing backprop after finding the loss, do we update the same w 4 times, since it's the same w in all time steps?

  • @mizgaanmasani8456
    @mizgaanmasani8456 4 years ago +4

    Is y^ only dependent on w'' and not on O4? Why do we neglect O4 and w'' while updating the weights in back propagation? Waiting eagerly for the answer...

    • @hasiburrahman96
      @hasiburrahman96 4 years ago

      Where did you find that O4 and w'' are neglected in back propagation? Look at the derivative dL/dw.

  • @9465-e7m
    @9465-e7m 4 years ago +3

    Are the input weight at layer t-1 and the output weight at layer t-1 the same or different?
    If different, what is the relation?

  • @pankajjoshi8292
    @pankajjoshi8292 4 years ago +2

    Happy new year, you are putting in so much effort, man. God bless

  • @sahubiswajit1996
    @sahubiswajit1996 4 years ago +4

    Sir,
    @ 3:41 I think it should be
    W'' = W'' - [ (learning rate) * (dL/dW'') ]
    But in the video, it is written as
    W'' = W'' - (dL/dW'') =================> the learning rate is missing
    Am I right, sir?

    • @krishnaik06
      @krishnaik06  4 years ago +8

      Yes, it is... sorry, I missed the learning rate.
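
    A minimal sketch of the corrected update rule this thread agrees on, with the learning rate included. The names (w_out standing in for the video's w'', learning_rate for eta) and the numbers are placeholders for illustration, not values from the video:

      # Corrected update rule: w'' := w'' - eta * dL/dw''
      learning_rate = 0.01        # eta, the factor missing on the whiteboard
      dL_dw_out = 0.37            # hypothetical gradient dL/dw'' from backprop
      w_out = 0.5                 # current value of w''

      w_out = w_out - learning_rate * dL_dw_out
      print(w_out)                # 0.4963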

  • @rijulsingh9803
    @rijulsingh9803 3 years ago +2

    Great explanation sir! Just one doubt: since the weights are the same for the inputs, do we need to propagate back through all the time sequences?

    • @ansariarbaz3374
      @ansariarbaz3374 3 years ago +4

      Good question bro. Yes, each input (a word here, in the NLP case) has its own features and affects the output in its own way. So that effect factor, or weight, for each input must be different. To find the optimal value we have to change each weight.

    • @rijulsingh9803
      @rijulsingh9803 3 years ago +1

      @@ansariarbaz3374 oh I get it now! Thanks a lot!

  • @commonboy1116
    @commonboy1116 4 years ago

    Krish, you are the best as always

  • @donaldzhou895
    @donaldzhou895 4 years ago +2

    So for each weight-update step, the RNN updates T times, where T is the length of the sequence?

  • @harshstrum
    @harshstrum 4 years ago +1

    thank you for the effort you are making

  • @apunbhagwan4473
    @apunbhagwan4473 3 years ago +1

    So much for Happy new year 2020😀😁

  • @delllaptop5971
    @delllaptop5971 4 years ago +2

    You mentioned in the beginning that w and w' are the same for all inputs and outputs, but during back propagation you state that each w and w' will change? Aren't they all the same?

    • @TheFirstObserver
      @TheFirstObserver 4 years ago +2

      The goal of back propagation is to determine how much the weights need to change to minimize error. While this is a single-layer network, we treat it as a t-layer network (with t being the number of time steps), but all using the same neuron. Thus, the weights are all the same at each layer. If I understood correctly, we determine how much the weights change at each layer (as if this were a normal t-layer network), sum the changes, and update w and w'. TL;DR: they're the same during calculation, then changed at the end.
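
    To make the sum-then-update-once idea above concrete, here is a minimal sketch of backpropagation through time for a single recurrent neuron with a many-to-one output. This is not the video's code; the names (w_rec standing in for the video's w', w_out for w'') and all numbers are illustrative assumptions:

      import numpy as np

      def forward(xs, w, w_rec, w_out):
          """Unroll the single recurrent neuron over the sequence xs."""
          o, outputs = 0.0, []
          for x in xs:
              o = np.tanh(w * x + w_rec * o)    # same w and w_rec at every time step
              outputs.append(o)
          return outputs, w_out * outputs[-1]   # many-to-one: one prediction at the end

      def bptt_step(xs, y, w, w_rec, w_out, lr=0.1):
          """One training step: accumulate per-time-step gradients, then update once."""
          outputs, y_hat = forward(xs, w, w_rec, w_out)
          loss = 0.5 * (y_hat - y) ** 2

          dw_out = (y_hat - y) * outputs[-1]
          do = (y_hat - y) * w_out              # gradient flowing into the last hidden output
          dw, dw_rec = 0.0, 0.0
          for t in reversed(range(len(xs))):
              da = do * (1.0 - outputs[t] ** 2)             # back through tanh
              o_prev = outputs[t - 1] if t > 0 else 0.0
              dw += da * xs[t]                  # contributions from every time step add up
              dw_rec += da * o_prev
              do = da * w_rec                   # pass the gradient one step further back

          # The shared weights are the same at every layer; they are updated once.
          return w - lr * dw, w_rec - lr * dw_rec, w_out - lr * dw_out, loss

      w, w_rec, w_out, loss = bptt_step(xs=[0.5, -0.1, 0.3, 0.8], y=1.0,
                                        w=0.2, w_rec=0.1, w_out=0.3)
      print(loss)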

  • @pawanbisht3488
    @pawanbisht3488 4 years ago

    Happy new year 2020 (seriously) [Just kidding ] love your work and appreciate your hard work.

  • @ITSimplifiedinHINDI
    @ITSimplifiedinHINDI 8 months ago

    I am learning on "Happy New Year 2024". 😀

  • @divyanshushrivastava8451
    @divyanshushrivastava8451 3 years ago

    So in this situation, are we not going to add the learning rate in the formula to compute the new weights,
    like we were doing in the ANN back propagation?

  • @DineshBabu-gn8cm
    @DineshBabu-gn8cm 3 years ago +3

    Is each hidden layer giving an output y^, or do we get a single output y^ at the end?
    Because at the beginning of this video he said that this is a single hidden layer unrolled over different time intervals.
    What is w''? Are w' and w'' the same?
    Please, anyone explain.

    • @amitprajapati928
      @amitprajapati928 2 years ago

      yes I guess

    • @dhirendra2.073
      @dhirendra2.073 2 years ago

      So basically this is a many-to-one RNN: multiple inputs, single output. An example can be predicting the last word of a sentence from the previous words. So each hidden layer does not give an output y^; it gives the current information as output, which is transferred to the next layer, and so on. w' and w'' are different: one is in each hidden layer for multiplying the input, and w'' is for multiplying the information/output being transferred from one layer to the next.
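
    A minimal forward-pass sketch of a many-to-one RNN of this shape may help: W below plays the role of the shared input weight (the video's w), W_rec the shared weight on the carried-forward hidden output (w'), and W_out the weight applied only once before the final softmax (w''). The names, shapes, and values are assumptions for illustration, not code from the video:

      import numpy as np

      def softmax(z):
          z = z - z.max()                      # numerical stability
          e = np.exp(z)
          return e / e.sum()

      hidden, n_in, n_classes = 4, 3, 2
      rng = np.random.default_rng(0)
      W     = rng.normal(size=(hidden, n_in))       # shared input weights (w)
      W_rec = rng.normal(size=(hidden, hidden))     # shared recurrent weights (w')
      W_out = rng.normal(size=(n_classes, hidden))  # output weights (w''), used once

      xs = [rng.normal(size=n_in) for _ in range(5)]   # e.g. 5 words in a sentence
      o = np.zeros(hidden)
      for x in xs:                                  # same W and W_rec at every step
          o = np.tanh(W @ x + W_rec @ o)            # hidden output passed forward

      y_hat = softmax(W_out @ o)                    # single output at the final step
      print(y_hat)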

  • @harshvardhanagrawal
    @harshvardhanagrawal 2 months ago

    I have a question: why does the input of the softmax have a different weight w'' and not w'?

  • @manishbolbanda9872
    @manishbolbanda9872 3 years ago +1

    In the previous video, you said that the weight assigned to the input features, that is x1*w or x2*w, is the same, represented by w (as in this video as well), and while explaining the weight update at 4:48 you said that various equations will be considered while updating the weights. If the weights are the same, then why do they need to be updated independently? I am quite confused by this video.

    • @adwaitpatil8300
      @adwaitpatil8300 3 years ago

      Because the output of each layer is different, so while backpropagating, w for each time step will change.

    • @gouravnaik3273
      @gouravnaik3273 2 years ago

      Did you get the actual reason?

  • @sachinpriya88
    @sachinpriya88 3 years ago

    Are "Elman Recurrent NN" and "Simple Recurrent NN" synonyms for each other, or are they different in theory?

  • @arjundev4908
    @arjundev4908 4 years ago

    Small confusion sir... please clarify: dL/dw = (dL/dy^) x (dy^/dO4) x (dO4/dw)?
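
    For the four time steps used in the video, the chain for the shared input weight w usually expands into a sum over every path back through time. A sketch in the video's notation (treating O_t as the hidden output at step t and ŷ as the final prediction), written as LaTeX:

      \frac{\partial L}{\partial w}
        = \frac{\partial L}{\partial \hat{y}}\,
          \frac{\partial \hat{y}}{\partial O_4}
          \left(
            \frac{\partial O_4}{\partial w}
            + \frac{\partial O_4}{\partial O_3}\frac{\partial O_3}{\partial w}
            + \frac{\partial O_4}{\partial O_3}\frac{\partial O_3}{\partial O_2}\frac{\partial O_2}{\partial w}
            + \frac{\partial O_4}{\partial O_3}\frac{\partial O_3}{\partial O_2}\frac{\partial O_2}{\partial O_1}\frac{\partial O_1}{\partial w}
          \right)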

  • @ganeshsubramanian6217
    @ganeshsubramanian6217 2 years ago

    Sir, did you miss mentioning the learning rate to be multiplied in while updating the weights?

  • @amitjajoo9510
    @amitjajoo9510 4 years ago +1

    Thanks 😊 sir

  • @rajathslr
    @rajathslr 2 years ago

    In this video, where is the output 'Y' defined? Is it part of the dataset, where there is a collection of sentences and their sentiments are represented as the "Y" output column?

  • @rohanprabhu9899
    @rohanprabhu9899 4 years ago +1

    What about the learning rate that needs to be multiplied by the partial derivative when calculating the new weights while backpropagating?

    • @omadeus
      @omadeus 1 year ago

      He forgot about it, or ignored it for simplicity, I think.

  • @sounakmojumder808
    @sounakmojumder808 4 years ago

    Hi, just one thing: it's not the global minimum, it's a local one... in any case.

  • @subhajitmondal5230
    @subhajitmondal5230 2 years ago

    You have said you will reach the global minimum... how can you prove that an RNN will reach the global minimum... any reference?

  • @kavinvignesh2832
    @kavinvignesh2832 several months ago

    Sir, I think this video is incomplete, which makes it kind of wrong. Can you please re-upload an updated and correct version?

  • @RodrigoLopezF
    @RodrigoLopezF 2 years ago

    Thanks Krish!

  • @abhisai594
    @abhisai594 2 years ago

    Anyone answer please. So an input of 10 words is passed to the RNN. After the first word is processed, we get 'y1' and also 'o1'. Now the output we get is not final. But will 'y1' be passed through the activation layer as an intermediate result, or does the activation happen only when the 10th word is trained?

  • @alphonseinbaraj2959
    @alphonseinbaraj2959 4 years ago

    Are input weights and output weights the same? I know the input weights are the same among themselves and the output weights are the same among themselves.

  • @yashumahajan7
    @yashumahajan7 4 years ago

    Is the hidden state the same as the hidden layer? Also, is this hidden layer the same as the one we have in an ANN, and how is its dimension calculated?

  • @sanjaybalasubramanian9468
    @sanjaybalasubramanian9468 4 years ago +1

    Hello bro...
    Which laptop is best for machine learning (for beginners) under 40000?
    Can I use it for a long time (after becoming an intermediate)? Or should I upgrade?

    • @kishanlal676
      @kishanlal676 4 years ago +1

      Any laptop with a minimum of 8GB of RAM. You can use Google Colab once you get into Deep Learning, or you can upgrade your RAM. It's up to you! Even though 4GB is enough for beginners, you can't depend on 4GB for some algorithms like Random Forest. So I recommend you buy any laptop with a minimum of 8GB RAM.

    • @srikarkodakandla2825
      @srikarkodakandla2825 4 years ago

      You can use Kaggle or Google Colab for training; no need for an excellent computer.

  • @saranzeb2183
    @saranzeb2183 3 years ago

    But in an RNN the backpropagation is done across time steps, as opposed to other neural networks.

  • @moeshams4504
    @moeshams4504 4 years ago

    you are the best

  • @louerleseigneur4532
    @louerleseigneur4532 3 years ago

    Thanks Krish

  • @FamFitFun
    @FamFitFun 4 years ago

    short and precise :)

  • @rahuldey6369
    @rahuldey6369 3 years ago

    The gradient dL/dw = (dL/dO4)(dO4/dw)

  • @harishlakshmanapathi1078
    @harishlakshmanapathi1078 3 years ago

    I have a doubt: during forward propagation the same weights are used for each time step, right? But during back propagation we are talking about different weights when taking derivatives and updating them. I am confused now...

    • @gourav9608
      @gourav9608 2 years ago

      Did you get to a conclusion?

  • @apica1234
    @apica1234 3 years ago

    How do we use RNN for multivariate time series models?

  • @YoussefBerro
    @YoussefBerro 5 months ago

    🔥🔥🔥🔥

  • @sriranjaniganesan
    @sriranjaniganesan 3 years ago

    What is the difference between w' and w1?

  • @EkNidhi
    @EkNidhi 4 years ago

    thanku sirrr

  • @ayushsingh-qn8sb
    @ayushsingh-qn8sb 3 years ago

    So no learning rate in RNN?

  • @vatsal_gamit
    @vatsal_gamit 3 years ago

    I think the learning rate should also be there in the formula written.

  • @ronylpatil
    @ronylpatil 3 years ago

    Please make a separate video on how the chain rule actually works in back propagation, and please, someone help me with what the chain rule equation will be for the w of x11 and the w' of O1.

  • @Official-tk3nc
    @Official-tk3nc 4 years ago +1

    yeah happy 2020 :(:(:(

  • @shahrinnakkhatra2857
    @shahrinnakkhatra2857 3 years ago +1

    But in Andrew Ng's video from the Coursera deep learning course, there were outputs (y^) computed for each feature separately. But you calculated only one final output. Will these two methods generate different results?

    • @midun2977
      @midun2977 3 years ago +2

      That's a many-to-many architecture; here we're assuming a many-to-one architecture. That's why we have many input units but one output.

    • @shahrinnakkhatra2857
      @shahrinnakkhatra2857 3 years ago

      @@midun2977 yes I got that part later as I progressed through the coursera course. Thank you.

  • @adityajacob2246
    @adityajacob2246 4 years ago +1

    I am planning to buy a laptop for machine learning... but confused between the MacBook Air and the Lenovo Legion Y540... which will be better?

    • @srikarkodakandla2825
      @srikarkodakandla2825 4 years ago +2

      Try buying a laptop with a good GPU so that you can complete training faster.

  • @adityapatnaik6079
    @adityapatnaik6079 4 years ago +1

    YOU LOOK TIRED ! TAKE REST AND BOUNCE BACK

  • @nitayg1326
    @nitayg1326 4 years ago +2

    Isn't there a bias in an RNN?

  • @RajibDas-kq2uz
    @RajibDas-kq2uz 3 years ago

    How will I find w'? Any suggestion?

  • @maheshbiradar374
    @maheshbiradar374 3 years ago

    Can anyone clear my doubt? My doubt is: is the number of words equal to the number of hidden layers?

  • @vlogwithdevesh9914
    @vlogwithdevesh9914 4 years ago

    Thus we can conclude that the only difference in weight updation between RNN and ANN is the absence of the learning rate, because it seems like the rest of the process is the same. Please tell!

    • @adhiyamaanpon4168
      @adhiyamaanpon4168 4 years ago +1

      Krish sir forgot to include the learning rate.

  • @nithinm.kannal467
    @nithinm.kannal467 1 year ago

    When does it know to end, i.e. that it has reached the global minimum?

  • @ashwathjadhav2859
    @ashwathjadhav2859 1 year ago

    Hi everyone,
    can anyone tell me how the chain rule applies to the w' weights, and how we update them?

    • @bukkalapraneeth7177
      @bukkalapraneeth7177 1 year ago

      Actually you need to add up all the errors to calculate dL/dw, like
      dL/dw = dL1/dw + dL2/dw + dL3/dw + ...
      (dL1/dw is for time step 1, dL2/dw for time step 2, ...).
      Then you can update w = w - dL/dw, as w is the same across all the layers' losses.
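
    Written out, and adding the learning rate η that other comments in this thread note was left off the board, the rule this reply describes (assuming a loss term L_t at each time step, as in the reply) is, in LaTeX:

      \frac{\partial L}{\partial w} = \sum_{t=1}^{T} \frac{\partial L_t}{\partial w},
      \qquad
      w \leftarrow w - \eta\,\frac{\partial L}{\partial w}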

  • @aravindnaidu1286
    @aravindnaidu1286 3 years ago

    If we change the weight "w" of the neural network, will that weight "w" change in all the neural networks? Please help!!!

  • @RanjanKumar-ue5id
    @RanjanKumar-ue5id 4 years ago

    Will only the last updated value of W be kept in the record?

    • @onemanshow3274
      @onemanshow3274 3 years ago

      I have the same question. How can we find derivatives w.r.t. the weights of all previous time steps?

  • @gopikishanmahto001
    @gopikishanmahto001 8 months ago

    Watching this on 1st Jan 2023... happy new year 😂😂

  • @jakaseptiadi9845
    @jakaseptiadi9845 4 years ago

    LSTM please

  • @kingsmengames9950
    @kingsmengames9950 2 years ago

    Now it's 2022

  • @kanishksaxena7735
    @kanishksaxena7735 1 year ago

    Brother, you are very irritating.

  • @sandipansarkar9211
    @sandipansarkar9211 4 years ago

    Great video Krish. Keep it up. Thanks

  • @suvarnadeore8810
    @suvarnadeore8810 3 years ago

    Thank you sir