MIT 6.S191 (2020): Recurrent Neural Networks

  • Published Dec 21, 2024

Comments • 203

  • @antonstafeyev3606
    @antonstafeyev3606 4 years ago +132

    If you can't go to MIT, make MIT come to you. Thanks to everybody who made it possible.

  • @vigneshgj6061
    @vigneshgj6061 4 years ago +9

    I am the first graduate in my family. It would be nearly impossible for me to listen to an MIT lecture without an initiative like this.
    Now education/knowledge is open-sourced.

  • @tusharsolanki8979
    @tusharsolanki8979 4 years ago +57

    They explain the topic in such an easy manner that even someone with no background in ML can understand it.
    I wish they also had tutorials for the practical sessions.

    • @nitroyetevn
      @nitroyetevn 4 years ago +16

      github.com/aamini/introtodeeplearning

    • @kpr7717
      @kpr7717 4 years ago

      @@nitroyetevn Thank you so much!

    • @nitroyetevn
      @nitroyetevn 4 years ago

      @@kpr7717 No problem!

  • @josephwong2832
    @josephwong2832 4 years ago +119

    Unbelievable series!! I'm learning so much more from these lectures than from other YouTube vids.

  • @eventhisidistaken
    @eventhisidistaken 3 years ago +9

    While stuck at home over the summer, I decided to code up the infrastructure for a bunch of different kinds of AI. I started with perceptron layers, then added recurrent perceptron layers, then LSTMs (that was particularly hard - I had to intensely study at least a dozen research papers to piece it together before all the unstated pieces gelled), then convolutional nets, then a transformer (encoder/decoder) infrastructure.

    What I discovered in this process is that the sheer amount of "intro"-level information, as well as "how to AI in Python" material, completely drowns out the nuts and bolts. In the end, you're stuck reading research papers, which are targeted at an audience that is already expert in the subject. I suppose universities are supposed to fill that gap, but honestly, this stuff is just not that hard once you decode the language of the field. It's just differential calculus and a bit of optimization theory. Good 3rd-year engineering students have the math background; combine that with some coding skills and you're golden.

    The other thing I learned in this process is that convolutional nets and encoder/decoders are just amazing. Even though I wrote every line of code, understand how and why they work, and trained them myself, it feels like magic to watch them work.

  • @davidschonberger8609
    @davidschonberger8609 3 years ago +1

    Excellent presentation! Typo alert in the slide shown around the 18:19 timestamp: the loss corresponding to y_t should be L_t, not L_3.

  • @DarkLordAli95
    @DarkLordAli95 3 years ago +1

    Everything about this course is phenomenal. It's so good that sometimes I get distracted thinking about how amazing it is. The language and the pace are perfect. The slides are perfect; there's just the right amount of information on them, so I don't get overwhelmed by having too much to read while listening (something I struggle with in my regular classes).
    It's just so fascinating. Teachers all around the world should take notes.
    Thank you so much for sharing this with us.

    • @AAmini
      @AAmini  3 years ago

      Thank you!!

    • @DarkLordAli95
      @DarkLordAli95 3 years ago

      @@AAmini Thanks for the reply, Alexander. Could you please let me know if you're going to upload these slides anytime soon?

    • @AAmini
      @AAmini  3 years ago

      @@DarkLordAli95 Sure, the slides have been published since last year on the 2020 course site: introtodeeplearning.com/2020/. The most recent course iteration contains the 2021 slides (which are also published, but slightly different from this talk).

  • @ashutoshbhushan6107
    @ashutoshbhushan6107 3 years ago +1

    I can't believe such useful information is available to us for free. Thanks!

  • @noviaayupratiwi5613
    @noviaayupratiwi5613 3 years ago +4

    I would like to say thank you, Alexander and Ava, for making this happen! I'm making my personal notes on convolutional neural networks and RNNs at 2 am while learning from MIT. Thank you!

  • @Otis6475
    @Otis6475 4 years ago +2

    Thanks Alexander and Ava for this free but complete content about NNs. Education at its best.

  • @yashsolanki069
    @yashsolanki069 4 years ago +2

    Not from IIT, NIT, or IIIT, but I'm learning from MIT. Thanks for providing such a great learning experience.

  • @ahmednagi7074
    @ahmednagi7074 2 years ago +1

    I loved Ava's way of explaining such hard topics and breaking them up into easy pieces that can be understood.

  • @ireenisabel988
    @ireenisabel988 2 years ago +1

    Amazing delivery! I never thought I would be able to understand a lecture from MIT 100%, because of my lack of prerequisite knowledge. Looking forward to seeing more videos on GNNs, GCNs, etc.

    • @AAmini
      @AAmini  2 years ago +1

      Thanks! You may also want to check out next week's lecture, which will also be on deep sequence modeling but will contain a lot of cool new material on Transformers and Attention. The link will be here, but it is not published yet: th-cam.com/video/QvkQ1B3FBqA/w-d-xo.html

    • @ireenisabel988
      @ireenisabel988 2 years ago

      @@AAmini Thank you very much. Looking forward to it.

  • @zhenmingwang9363
    @zhenmingwang9363 3 years ago +1

    This one is nice. It fits nicely with my class's slides. The best part for me is that it clearly reveals the concept of the timestep computational graph, which I had not seen in previous introduction videos.

  • @basilihuoma5300
    @basilihuoma5300 3 years ago +1

    These lectures have been super cool and have clarified a lot of things for me. Can't wait for the 2021 series.

  • @saisankarborra2930
    @saisankarborra2930 4 years ago +1

    This is nice. Leaving the mathematics aside, the explanation of how LSTMs overcome the vanishing gradient is a nice one for RNNs.

  • @ancbi
    @ancbi 4 years ago +1

    Please correct me if I'm wrong. At 36:18, "uninterrupted gradient flow" is essentially due to the fact that the operations along the route c0, c1, c2, c3, ... have no weights to be updated at all. From what I can see, there are 3 kinds of operations along that route:
    [1] point-wise multiplication (type C x C -> C)
    [2] point-wise addition (type C x C -> C)
    [3] copying (type C -> C x C)
    where C is whatever data type c0, c1, c2, c3 is.

    • @louisryan6902
      @louisryan6902 4 years ago +1

      I struggled with this when I came across it and ended up delving into the maths to grasp why LSTMs solve the vanishing/exploding gradient problem. "Uninterrupted gradient flow" alone does not explain it sufficiently.
      I found this article useful (if you can wade through the maths, it really explains why LSTMs handle gradient problems better than vanilla RNNs):
      medium.com/datadriveninvestor/how-do-lstm-networks-solve-the-problem-of-vanishing-gradients-a6784971a577
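
      A minimal numeric sketch of that point (the numbers below are made up for illustration): along the cell-state route, the per-step backprop factor is just the forget-gate activation, applied elementwise, with no recurrent weight matrix in it.

        import numpy as np

        # Illustrative only: product of per-timestep gradient factors along
        # an LSTM cell-state path vs. a vanilla RNN hidden-state path.
        # Along c_t = f_t * c_{t-1} + i_t * g_t, dc_t/dc_{t-1} = f_t.
        T = 50
        f = np.full(T, 0.95)   # forget-gate activations that stay near 1
        print(np.prod(f))      # ~0.077: the gradient decays slowly
        w = 0.5                # vanilla RNN: repeated |W_hh * tanh'| factor
        print(w ** T)          # ~8.9e-16: the gradient vanishes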

  • @aritraroygosthipaty3662
    @aritraroygosthipaty3662 4 years ago

    For a bootcamp, I never thought the materials could be so meaty and well made. I love the videos. Great job!

    • @DarkLordAli95
      @DarkLordAli95 3 years ago

      I don't think she meant "bootcamp" in the literal sense.

  • @BharathirajaNarendran
    @BharathirajaNarendran 4 years ago +13

    Thanks for sharing these lectures as open source. Looking forward to the rest of the boot camp videos and will attempt the lab exercise :)

  • @hrsight
    @hrsight 3 years ago +1

    Great material

  • @worldof6271
    @worldof6271 4 years ago +2

    Great course. I didn't think that, sitting in Almaty, I could watch MIT courses.

  • @rubeniaborge4652
    @rubeniaborge4652 3 years ago +1

    I love this lecture. I am learning so much with this series. Thank you very much for sharing! :)

  • @firmamentone1
    @firmamentone1 4 years ago +1

    Hi everyone. At 31:00, does the block in the middle of the slide represent a single neuron or a layer?

    • @ronniechatterjee4368
      @ronniechatterjee4368 4 years ago

      It's a single RNN cell with more information (basically the addition of a cell state and the gates). One unit in the layer, not quite a single neuron.
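
      A minimal Keras sketch of the same point (the dimensions are arbitrary): one such block maps whole vectors to whole vectors, i.e. it is a layer with a chosen number of hidden units.

        import tensorflow as tf

        # One diagram "block" = one LSTM cell over vectors, not a scalar neuron.
        cell = tf.keras.layers.LSTMCell(units=64)
        x_t = tf.random.normal([1, 100])                # batch 1, input dim 100
        state = [tf.zeros([1, 64]), tf.zeros([1, 64])]  # [h_{t-1}, c_{t-1}]
        out, state = cell(x_t, state)
        print(out.shape)                                # (1, 64): 64 units, one step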

  • @HarpreetKaur-vd1lb
    @HarpreetKaur-vd1lb 4 years ago +1

    Hi Ava,
    Thank you for sharing the knowledge. I have a couple of questions:
    1. First of all, how does backpropagation in an RNN lead to vanishing gradients but not in a deep feedforward network? Mr. Amini did not bring this up in the introductory lecture.
    2. Secondly, you mentioned that initializing the weights to the identity matrix helps with the problem of vanishing gradients. How is that possible? What is the math behind it?
    3. Thirdly, I am confused as to how you show the weights as an n*m matrix in the first place. The example you took was a sentence, which seems like an n*1 vector (where n is the number of words, isn't it?). If what is shown is correct, then how will you multiply the weight matrix with the example you described (focusing on the dimensions here)?
    4. And lastly, I could not follow how ReLU results in a derivative greater than 1. When you use the ReLU function, isn't the product of the weights with x forced to a constant value of 1? No matter what the dot product of the weights and x (plus the dot product of the state with another weight matrix) is, as long as it is greater than zero it will result in a y value of 1, if I understand correctly. But if the value of the function is constant no matter what, wouldn't that yield a derivative of 0, since the derivative of a constant is 0? Maybe I have misunderstood something, but further clarification would certainly help.
    Thanks in advance!
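
    One clarifying note on question 4 (a quick check with illustrative values, not from the lecture): ReLU does not clamp positive inputs to a constant 1; it is the identity for x > 0, so its derivative there is exactly 1.

      import numpy as np

      relu = lambda x: np.maximum(0.0, x)
      x = np.array([-2.0, 0.5, 3.0])
      print(relu(x))  # [0.  0.5 3. ] -- identity for x > 0, not clipped to 1
      # So d(relu)/dx = 1 for x > 0 and 0 for x < 0: on the active path the
      # derivative factor is exactly 1, never a value smaller than 1.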

  • @rushimashru4630
    @rushimashru4630 4 years ago +4

    This series of MIT lectures was very effective and productive, especially in this lockdown and WFH situation. I learned a lot. Thank you, Alexander Amini sir, for making it possible!

  • @wangsherpa2801
    @wangsherpa2801 4 years ago

    Despite not being very good at maths, I am understanding all these lectures.
    ❤️Love from India❤️

  • @bluealchemist6776
    @bluealchemist6776 4 years ago +1

    Outstanding educational sharing... I love you MIT... no place like you in this world... or the next.

  • @sarthakshukla5251
    @sarthakshukla5251 3 years ago +1

    In the LSTM diagram, what operation does the intersection of the wires perform?

    • @mattrowlands5751
      @mattrowlands5751 3 years ago

      It's vague in the diagrams... in reality, the input to each gate is calculated as weight * h_{t-1} + weight * x_t + bias, where each gate has its own weights and bias, h_{t-1} is the previous 'hidden state', and x_t is the new input.

    • @sarthakshukla5251
      @sarthakshukla5251 3 years ago

      @@mattrowlands5751 I thought about it, and most probably they represent the flow of information, with the tanh box representing tanh(W_hh * h_{t-1} + W_xh * x_t), i.e. the entire function/expression has been abstracted into that box.

    • @sarthakshukla5251
      @sarthakshukla5251 3 years ago

      @@mattrowlands5751 Thanks btw

    • @mattrowlands5751
      @mattrowlands5751 3 years ago

      SARTHAK SHUKLA Yep, that is correct. I personally think these diagrams are very misleading and confusing. Best of luck to you, my friend.
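
      Putting the replies above together, a minimal NumPy sketch of one LSTM step (the parameter layout is illustrative; real implementations such as tf.keras.layers.LSTM fuse the four gates into larger matrices):

        import numpy as np

        def sigmoid(z):
            return 1.0 / (1.0 + np.exp(-z))

        def lstm_step(x_t, h_prev, c_prev, W, U, b):
            # W, U, b are dicts holding separate parameters for each gate
            f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])  # forget gate
            i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])  # input gate
            o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])  # output gate
            g = np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])  # candidate (the left tanh)
            c_t = f * c_prev + i * g   # new cell state
            h_t = o * np.tanh(c_t)     # the right-hand tanh squashes c_t into h_t
            return h_t, c_t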

  • @osvaldonavarro3292
    @osvaldonavarro3292 1 year ago

    How is the vanishing gradient problem solved by the LSTM? If the LSTM includes sigmoid operations, doesn't that contribute to making gradients smaller?

  • @xiyupan299
    @xiyupan299 2 years ago +1

    The best one on YouTube

    • @AAmini
      @AAmini  2 years ago

      Thanks! You should also check out the new 2022 version, it's even better!!!

  • @sichaoyin7675
    @sichaoyin7675 4 years ago +5

    Very nice course, lots of new stuff. Looking forward to new releases.

  • @manishbolbanda9872
    @manishbolbanda9872 4 years ago

    At 34:38, what is the use of the RHS tanh block in the LSTM? Please answer if you know. Thanks.

  • @teeg-wendezougmore6663
    @teeg-wendezougmore6663 2 years ago

    Great course, thanks for sharing. Is an RNN module equivalent to a single neuron? I am a little confused by the terms "cell" and "module".

  • @jithuk8693
    @jithuk8693 4 years ago +1

    I have a doubt!
    What is the initial step in the first iteration? That is, what is h_{t-1} in the first iteration?
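
    A common convention, and the Keras default, is to initialize the state to zeros, so h_{t-1} at the first step is a zero vector. A minimal sketch (dimensions arbitrary):

      import tensorflow as tf

      rnn = tf.keras.layers.SimpleRNN(units=32)
      x = tf.random.normal([1, 10, 8])  # (batch, timesteps, features)
      h = rnn(x)                        # initial_state defaults to zeros
      print(h.shape)                    # (1, 32)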

  • @nickglidden9220
    @nickglidden9220 4 years ago

    Really amazing video! A ton of info in 45 minutes, but in an easy-to-understand manner!

  • @stephenlashley6313
    @stephenlashley6313 2 years ago

    Absolutely excellent explanation! To further motivate and inspire your great work: where the ball goes next is the holy grail of all models of everything, quantum mechanics. You can easily infer the quantum "genome" sequence of all reality with this technology. Again, great presentation and content.

  • @Harini.R
    @Harini.R 4 years ago +1

    Excellent! This has been super useful for getting my head around ML terminologies and potentially using them in my ongoing project. Thank you very, very much!

  • @kushangpatel983
    @kushangpatel983 4 years ago +1

    Thanks MIT for providing access to such an amazing series of lectures!

  • @praveenkumarverma9470
    @praveenkumarverma9470 2 years ago +1

    Great lecture

  • @prabhavarora1992
    @prabhavarora1992 4 years ago

    Hi all, I was just confused about something. The point of RNNs is to preserve the order in a sequence. Say our sequence is "I took my cat for a walk". If we use a really large fixed window and make it into a fixed vector to put into a vanilla feedforward network, is the order preserved? I guess what I am really asking is: can a feedforward network preserve order?

  • @footstepsar992
    @footstepsar992 4 years ago +1

    Great lecture series! I just tried to download the slides from your site, but they are unavailable; the site just states that slides and videos are upcoming for the 2021 lectures. Do you know where to find the 2020 lecture slides?

    • @DarkLordAli95
      @DarkLordAli95 3 years ago

      They're still not there :(

  • @RishitDagli
    @RishitDagli 4 years ago +12

    Wonderful lectures, I love them. Everything is so simplified and made easy to learn.

  • @c9der
    @c9der 4 years ago

    Thanks for providing high-quality content... always wanted to go to MIT.

  • @kennethqian6114
    @kennethqian6114 4 years ago +1

    This was an incredibly well-organized lecture on recurrent neural networks. Thank you so much for the video!

  • @marwasalah2640
    @marwasalah2640 3 years ago +1

    This is a good lecture and a great instructor :)

  • @emenikeanigbogu9368
    @emenikeanigbogu9368 4 years ago +1

    I have been eagerly awaiting this video

  • @Matteopolska
    @Matteopolska 4 years ago +5

    Thanks from Poland for this great and valuable content :)

  • @saeeduchiha5537
    @saeeduchiha5537 4 years ago

    I didn't get the "sharing parameters across the sequence" part! (9:35)
    What kind of parameters are we talking about here?
    How can they actually be "shared" across the sequence?

    • @mukul2610
      @mukul2610 4 years ago

      Sharing parameters means the position of a feature (in this example, a word) should not be fixed.
      The word can be anywhere in the sequence and can also appear multiple times.
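
      Concretely (a minimal sketch with made-up dimensions): "sharing parameters" means the same weight matrices W_xh, W_hh and bias b are applied at every timestep; only the hidden state changes.

        import numpy as np

        rng = np.random.default_rng(0)
        W_xh = rng.normal(size=(16, 8))   # input-to-hidden weights
        W_hh = rng.normal(size=(16, 16))  # hidden-to-hidden weights
        b = np.zeros(16)

        h = np.zeros(16)
        for x_t in rng.normal(size=(10, 8)):        # 10 timesteps
            h = np.tanh(W_xh @ x_t + W_hh @ h + b)  # same weights every step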

  • @avdhutchavan3044
    @avdhutchavan3044 4 years ago +1

    I am getting an error while playing the songs ('C:\Users\name\PycharmProjects\tensors\venv\lib\site-packages\mitdeeplearning\bin\abc2wav' is not recognized as an internal or external command, operable program or batch file.)

  • @blakef.8566
    @blakef.8566 4 years ago

    36:20 Does this suggest that all loss information is propagated through the internal state of the cell?

  • @Kevin-gm7gx
    @Kevin-gm7gx 4 years ago +2

    Thank you so much for making top class education accessible, especially such an important topic!

  • @omidzare1934
    @omidzare1934 4 years ago +1

    Great lecture, very informative.

  • @a.yashwanth
    @a.yashwanth 4 years ago +1

    If a neural net generates music after training on copyrighted music, do the rights to the generated music still belong to the copyrighted music's creator?

  • @abhishekrungta4056
    @abhishekrungta4056 4 years ago +1

    A question related to lab session 2:
    Why does the last code block show that there are no songs in the text when clearly I generated it using the generate_text function in the previous block?

  • @alhikmah6265
    @alhikmah6265 3 years ago

    Any suggested readings?

  • @junqima5127
    @junqima5127 4 years ago

    I know I can't go to MIT as I have tons of questions and they only seem to have a couple

  • @aliyaqoob8779
    @aliyaqoob8779 4 years ago

    How do you choose between different non-linearity functions? In this example, we used a tanh, but could we have used a sigmoid instead?

  • @siminmaleki4818
    @siminmaleki4818 4 years ago +5

    Great courses, thank you for sharing. You rock! Proud of you, from Iran.

  • @HazemAzim
    @HazemAzim 4 years ago +1

    Very good, great lecture. Thanks!

  • @ميكاساالعبيدي
    @ميكاساالعبيدي 4 years ago

    Can I get master's or PhD theses on this topic [Time-Series Deep-Learning Classifier for Human Activity Recognition Based on Smartphone Built-in Sensors]?

  • @SumukaG
    @SumukaG 4 years ago

    Could somebody explain in detail the shared parameters she talks about? Maybe with another example?

  • @mattiapennacchietti9224
    @mattiapennacchietti9224 4 years ago

    Well done MIT, always one step ahead

  • @vincentlius1569
    @vincentlius1569 4 years ago

    I have a question: do feedforward networks have the vanishing gradient problem, and if so, how do you fix it?

  • @mukul2610
    @mukul2610 4 years ago

    When we talk about h_t, can we say that we are giving some sort of feedback to the perceptron?

  • @charliedelagarza9686
    @charliedelagarza9686 4 years ago

    Hi, I haven't finished the video (currently at 16:24), but I was wondering how you input a string or text into the RNN. Do you input it as a string, or do you change it to numbers with a tokenizer? If you don't use a tokenizer and input it as numbers, what method do you use to change strings to numbers?

    • @namekuldeep
      @namekuldeep 3 years ago

      In an RNN you don't enter the complete string; you enter each word as an input (the first word as the first input, the second word as the second input) in the form of a vector. You need to convert each word into a vector; you can use a word2vec library for that.
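
      A minimal sketch of that pipeline (the vocabulary and dimensions here are made up; the course labs use a character-level variant of the same idea): tokenize to integer ids, embed the ids as dense vectors, and feed those to the RNN one timestep at a time.

        import tensorflow as tf

        vocab = {"i": 0, "took": 1, "my": 2, "cat": 3, "for": 4, "a": 5, "walk": 6}
        ids = tf.constant([[vocab[w] for w in "i took my cat for a walk".split()]])

        embed = tf.keras.layers.Embedding(input_dim=len(vocab), output_dim=8)
        rnn = tf.keras.layers.SimpleRNN(units=16)
        h = rnn(embed(ids))  # vectors are consumed one timestep at a time
        print(h.shape)       # (1, 16)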

  • @shivamkumar-ff5ui
    @shivamkumar-ff5ui 4 years ago

    What can we do if we're stuck on the TODO portion of the lab?

  • @susheelmaskar7992
    @susheelmaskar7992 4 years ago

    What if I give a solution to this problem, will I get a job?

  • @barbaracarnauba1507
    @barbaracarnauba1507 4 years ago +1

    Thanks for sharing this great lecture!!

  • @dianaamiri9520
    @dianaamiri9520 4 years ago +1

    Wow, you explain it amazingly clearly and understandably. Thanks MIT for sharing.

  • @neuroling
    @neuroling 4 years ago +1

    These lectures are fabulous, but this one is top tier. Excellent. Thank you!

  • @lksmac1595
    @lksmac1595 3 years ago +1

    Excellent!

  • @Gustavo_Rojas
    @Gustavo_Rojas 4 years ago +1

    For those of us watching online, is there other material we can use to go along with these lectures?

  • @RAZONEbe_sep_aiii_0819
    @RAZONEbe_sep_aiii_0819 4 years ago

    Can we also have some coding lectures associated with the topics you are discussing here?
    I would be glad if you could include them.

    • @amantayal1897
      @amantayal1897 4 years ago

      If you check their website, you can find some coding problems.

  • @ShadyAshrafMohammed
    @ShadyAshrafMohammed 4 years ago

    I would like to attend or get my hands on the lab material. :)

    • @daxj9133
      @daxj9133 3 years ago

      There is lab material in the 2020 section of their website.

  • @sumithdommati6392
    @sumithdommati6392 4 years ago +1

    Sir, can we get solutions to the lab TODO problems?

    • @AAmini
      @AAmini  4 years ago +1

      They are all available on the course github repo. Please check the website for more details.

  • @star831
    @star831 2 years ago

    This was really helpful!

  • @digvijayyadav3633
    @digvijayyadav3633 4 years ago

    Are there any hands-on projects for Applications of RNN in music generation?

  • @Shah_Khan
    @Shah_Khan 4 years ago +1

    Thanks Ava for sharing these informative lectures with us... learning a lot from your videos. Can I ask you questions, either on Slack or somewhere else?

  • @hemingwaykumching3812
    @hemingwaykumching3812 4 years ago +1

    Thank you for sharing knowledge and making it available to all. I'm finding it difficult to understand the lecture videos; I had never learnt anything about ML or DL before, but I really enjoyed the first video.
    Is there any prerequisite knowledge required for this series of lectures?

    • @SantiSanchez28
      @SantiSanchez28 4 years ago +1

      I'd say basic linear algebra, calculus and programming

  • @kingeng2718
    @kingeng2718 4 years ago +1

    The lab parts are really useful, thanks a lot.

  • @tamdoduc9804
    @tamdoduc9804 4 years ago +1

    Amazing course! Thank you!

  • @alk8773
    @alk8773 4 years ago

    Does anyone know where the projects are?

  • @aliyaqoob8779
    @aliyaqoob8779 4 years ago

    How is c_t different from h_t, i.e. how is the "cell state" different from the hidden state, and why are both of them used?

  • @srinivasprabhu9884
    @srinivasprabhu9884 4 years ago

    How exactly are the counts calculated?

  • @sirjohn8629
    @sirjohn8629 4 years ago

    I am learning new things!

  • @venugopalbv2069
    @venugopalbv2069 4 years ago

    Prerequisites to learn from this course?

  • @smartteluguguru
    @smartteluguguru 4 years ago +3

    Thank you very much for giving knowledge from another dimension. Great course. :)

  • @schnittstelle8492
    @schnittstelle8492 4 years ago

    Amazingly explained in only 45 minutes. Why don't they teach like this at my university?

  • @yuanyuan23191212
    @yuanyuan23191212 3 years ago

    Why do I feel like an RNN is quite similar to a Kalman filter...

  • @ANKITPAL-ro8ue
    @ANKITPAL-ro8ue 4 years ago

    Where can I get the data for the lab?

  • @aromax504
    @aromax504 4 years ago

    I am 30 now, and I wish I'd had this sort of content in my school/college days. Still trying to learn as much as possible. Thank you for your contribution to democratizing world-class education.

  • @DarkLordAli95
    @DarkLordAli95 3 years ago

    THE SLIDES ARE NOT AVAILABLE AT THAT LINK.

  • @matthewphares4588
    @matthewphares4588 4 years ago

    Great lecture, Ava. Is there any chance I can consult with you on a project I'm working on? You could choose the hourly tutor rate.

  • @chiragpalan9780
    @chiragpalan9780 4 years ago

    Where can I get the lab tutorials?

  • @rahultripathi9457
    @rahultripathi9457 4 years ago

    How do I get the relevant lab content?

  • @fatnasaeed2937
    @fatnasaeed2937 4 years ago

    Thanks very much, well explained and easy to understand.

  • @michaelcjakob
    @michaelcjakob 4 years ago +1

    Amazing series, thank you! Would love to see more from MIT! (:

  • @hsiang-yehhwang2625
    @hsiang-yehhwang2625 4 years ago

    Great lecture from MIT!!

  • @RajaSekharaReddyKaluri
    @RajaSekharaReddyKaluri 4 years ago +1

    Thank you @ava soleimany

  • @RizwanAli-jy9ub
    @RizwanAli-jy9ub 4 years ago

    Thank you for this series of lectures.