Neural Networks - The Math of Intelligence #4

  • Published 30 Sep 2024

Comments • 131

  • @PaulGoux
    @PaulGoux 7 years ago +8

    This video went from 0 - 100 real quick.

  • @danielparrado3605
    @danielparrado3605 6 years ago +4

    wow... this 11 min video took me 2 hours to understand most of it. You did a really good job putting ALL that information in such a short amount of time. Great job Siraj, keep up the good work!

  • @JordanShackelford
    @JordanShackelford 7 years ago +9

    breh why are people still using sigmoid? I thought ReLU was superior

    • @nandanp.c.7775
      @nandanp.c.7775 7 years ago +8

      As far as I know, sigmoid is still used in the last layer because it outputs values in [0, 1] that can be read as probabilities for binary classification problems (softmax covers the multiclass case), which is not what ReLU gives you. ReLU doesn't suffer from the vanishing gradient problem, so it is used in all the hidden layers so that errors can be propagated back effectively.

    • @SirajRaval
      @SirajRaval  7 years ago +4

      what Nandan said is true.
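
A minimal NumPy sketch of the convention Nandan describes: ReLU in the hidden layers, sigmoid only at the output layer. The shapes and random values here are illustrative, not taken from the video.

```python
import numpy as np

def relu(z):
    # ReLU for hidden layers: gradient is 1 for positive inputs,
    # so errors propagate back without vanishing
    return np.maximum(0.0, z)

def sigmoid(z):
    # Sigmoid for the output layer: squashes to (0, 1),
    # readable as a probability for binary classification
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.standard_normal(3)           # one input example
W1 = rng.standard_normal((3, 4))     # hidden-layer weights (illustrative)
W2 = rng.standard_normal((4, 1))     # output-layer weights

hidden = relu(x @ W1)                # hidden activations, all >= 0
p = sigmoid(hidden @ W2)             # output in (0, 1)
```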

  • @novansyahherman6788
    @novansyahherman6788 5 years ago +2

    Thank you mas Siraj, I was given an assignment because of you

    • @RyanRamadhanii
      @RyanRamadhanii 5 years ago

      You're welcome, mas Novan
      -mas Siraj

  • @saminchowdhury7995
    @saminchowdhury7995 5 years ago +2

    Take a shot every time he says function.
    Great vid btw

  • @ongjiarui6961
    @ongjiarui6961 7 years ago +3

    Hi Siraj! Here's my solution for this week's coding challenge: github.com/jrios6/Math-of-Intelligence/tree/master/4-Self-Organizing-Maps

    • @hammadshaikhha
      @hammadshaikhha 7 years ago

      I read over your notebook; I liked the nice and simple vectorized code. I am trying to understand the general intuition behind how you did the MNIST example. Correct me if I am wrong, but your output lattice of nodes is 20 x 20, so you have 400 weight vectors lying in dimension 784 (the number of pixels in an image). You then represented this information as a 3D matrix of size 20x20x784. After training, this matrix has the finalized weights. It's not clear to me what you're doing next. Are you now using these 400 weights to form 400 clusters in your data, and then plotting each image in the clusters on the 20x20 lattice to get the visualization?

    • @ongjiarui6961
      @ongjiarui6961 7 years ago

      hammad shaikh Yeah, you're right. To visualise the 3D tensor, we have to transform it to a 2D matrix first. So each 784-dim weight vector is converted to a 28x28 matrix and aligned according to the parent node in the lattice.
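
The reshaping described above can be sketched like this. Random values stand in for the trained SOM weights; the 20x20x784 shape follows the thread.

```python
import numpy as np

# Stand-in for trained SOM weights: a 20x20 lattice of 784-dim vectors
weights = np.random.rand(20, 20, 784)

# View each 784-dim weight vector as a 28x28 image tile
tiles = weights.reshape(20, 20, 28, 28)

# Tile the lattice into one (20*28) x (20*28) mosaic; each tile sits
# at the position of its parent node in the lattice
mosaic = tiles.transpose(0, 2, 1, 3).reshape(20 * 28, 20 * 28)
```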

    • @SirajRaval
      @SirajRaval  7 years ago

      u rule Ong

  • @tanmayrauth7367
    @tanmayrauth7367 6 years ago +1

    Can anyone please explain
    why the derivative of the sigmoid function is taken as x*(1-x)?

  • @DrewBive
    @DrewBive 7 years ago

    I have a problem with the last line of code. In your notebook you have this:
    #testing
    print(activate(np.dot(array([0, 1, 1]), syn0)))
    [ 0.99973427 0.98488354 0.01181281 0.96003643]
    When I just copy-pasted this I got a NameError. Then I added 'from numpy import array' and got a different result from the activation function: [ 0.36375058]. What's the problem?
    P.S. You have a mistake in this code - github.com/llSourcell/neural_networks/blob/master/simple_af_network.ipynb:
    #Use it to compute the gradient
    layer2_gradient = l2_error*activate(layer2,deriv=True)
    This line uses l2_error; instead you need to use layer2_error. Thank you
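
For context, the four-number output in the notebook comes from a wider weight matrix; a (3,1) syn0 gives a single output value, which would explain the [0.36...] result. Below is a hedged reconstruction of the simple AF (single-layer) network in that spirit, with consistent variable names and the import included; the training data is illustrative (the target just copies the first input column), not the notebook's exact code.

```python
import numpy as np

def activate(x, deriv=False):
    # Sigmoid; with deriv=True, x is assumed to already be a sigmoid
    # output, so x * (1 - x) is its derivative
    if deriv:
        return x * (1 - x)
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative training data: the output copies the first input column
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
y = np.array([[0], [0], [1], [1]])

np.random.seed(1)
syn0 = 2 * np.random.random((3, 1)) - 1   # (3,1) weights -> ONE output value

for _ in range(10000):
    layer1 = activate(np.dot(X, syn0))
    layer1_error = y - layer1              # same name used below, no NameError
    layer1_gradient = layer1_error * activate(layer1, deriv=True)
    syn0 += np.dot(X.T, layer1_gradient)

prediction = activate(np.dot(np.array([0, 1, 1]), syn0))  # close to 0 here
```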

  • @tyhuffman5447
    @tyhuffman5447 7 years ago +7

    Simplified AF network, not familiar with that one.

    • @SirajRaval
      @SirajRaval  7 years ago +2

      Hmm, yeah. The technical definition is a single-layer feedforward network; the older terminology is perceptron. I should've said that instead. Thanks

    • @tyhuffman5447
      @tyhuffman5447 7 years ago +1

      No, keep it. You're entertaining AF! Best channel for learning AI. Keep up the good work.

  • @anti_globalista
    @anti_globalista 3 months ago

    "Clowning the explanation of neural networks". Is this some kind of American talk show?

  • @Wherrimy
    @Wherrimy 6 years ago +1

    Can someone clarify the part at 2:26 about dot product and matrix multiplication? It says that they're the same, while they're completely different, dot product producing a scalar, and matrix multiplication producing a matrix.
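
For what it's worth, np.dot covers both cases, which may be what the video means: on 1-D arrays it is the scalar dot product, and on 2-D arrays it performs matrix multiplication.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
s = np.dot(a, b)        # 1-D inputs: a true dot product, returns a scalar

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])
C = np.dot(A, B)        # 2-D inputs: matrix multiplication, returns a matrix
```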

  • @davidutra2304
    @davidutra2304 7 years ago +1

    Hello Siraj, make a video about what is necessary to start learning machine learning, like the basic math and the programming languages to learn before starting.
    Sorry for my English, I'm Brazilian. Thanks

  • @Murderface666
    @Murderface666 7 years ago +1

    All the talk about neural networks, from conferences to individual series, is cool, but what a lot of people aren't clearing up is exactly how to apply it to real-world examples. It's like giving a person an engine and showing how the engine itself works, but one person may want a car engine, another a boat engine, another a jet engine, and another whatever engine the Starship Enterprise uses. So in actuality, there is not really any information on how to use neural networks so that programmers can apply them to whatever problem.

    • @SirajRaval
      @SirajRaval  7 years ago

      +Partisan Black see my intro to deep learning playlist

    • @Murderface666
      @Murderface666 7 years ago

      ***dammit man, Your explanations are awesome!

  • @ericalcaidealdeano7674
    @ericalcaidealdeano7674 7 years ago +1

    Hey, I was unable to install PIL via pip, so I changed 3 lines and it worked:
    import matplotlib.pyplot as plt
    from scipy.misc import toimage
    # from pillow import Image
    def show(self):
        plt.imshow(toimage(self.weights.astype('uint8'), mode='RGB'))
        plt.show()

  • @ebimeshkati4729
    @ebimeshkati4729 7 years ago +6

    Siraj, could you kindly provide us with an example (tutorial) on how to properly update a trained deep learning model based on new data (let's say from a sensor)?

  • @Tozziz
    @Tozziz 10 months ago

    This video is awesome!!!! Thank you so much :)

  • @j1nchuika
    @j1nchuika 7 years ago

    Kind of late, but could somebody explain why the random weight matrix at 2:15 is multiplied by 2 and minus 1? I tried without them and it worked pretty much the same, but I'm doing the simple AF one...

  • @MrDominosify
    @MrDominosify 7 years ago +3

    Siraj, I wonder:
    the sigmoid function is y = 1 / (1 + e^-x). Its derivative is equal to e^x / (e^x + 1)^2.
    Why in this video are you using a different function as the derivative, x*(1 - x)?

    • @simonmandlik910
      @simonmandlik910 7 years ago +1

      that is exactly what I was thinking as well. The derivative can be rewritten as s(x)*(1-s(x)), where s(x) is sigmoid function, but definitely not as x*(1-x). His training seems to be working though :O

    • @simonmandlik910
      @simonmandlik910 7 years ago

      I get it now. I am probably used to a different order of computation. Error is defined as the partial derivative of the cost function w.r.t. the weighted input z (W*x + b). If you want to calculate the error in the last layer, according to the chain rule you have: error = dC/da * ds/dz, where C is the cost function, a is the activation in the last layer, and s is the sigmoid/activation function. To compute the exact value of the second term, you should plug z into sigmoid prime, but Siraj plugs in the activation (sigmoid already applied), and that's why we don't have to apply sigmoid in the function

    • @XRobotexEditz
      @XRobotexEditz 7 years ago

      @Simon Mandlik
      I still do not understand. See:
      activation(np.array([2.0,1.0,-1.0]),True) and
      np.array([2.0,1.0,-1.0])*(1-np.array([2.0,1.0,-1.0]))
      generate the same result. I do not see how x*(1-x) is the same as
      S(x)*(1-S(x))?

    • @rahls7
      @rahls7 7 years ago +1

      So, it does appear that nonlin returns x*(1-x) when deriv=True, however when it is called, the x that is passed to it is itself a sigmoid function L1, effectively making it the same thing. I guess, it just helps to represent it as x instead of typing it again.
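
Numerically, the two forms in this thread agree once you account for what gets passed in: x*(1-x) applied to an activation a = S(z) equals S(z)*(1-S(z)). A small check, using the values from the comments above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([2.0, 1.0, -1.0])   # pre-activations
a = sigmoid(z)                   # activations, sigmoid already applied

textbook = np.exp(z) / (np.exp(z) + 1.0) ** 2   # e^z / (e^z + 1)^2
notebook = a * (1.0 - a)                        # x*(1-x) with x = activation
# identical, because S'(z) = S(z) * (1 - S(z)) and a = S(z)
```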

  • @transitioningtech
    @transitioningtech 1 year ago

    I'm a business programmer and I just have one thing to say: if I ever have to program like this to keep a job... I'm screwed. What the hell is a "sigmoid function?"

    • @SirajRaval
      @SirajRaval  1 year ago

      A sigmoid function is a type of activation function for neural networks. Search "activation functions siraj" on YouTube and watch that vid, you'll love it

  • @kondziossj3
    @kondziossj3 7 years ago

    @Siraj Raval can you create a video about overfitting?
    And of course solutions for the problem...
    I've tried your previous challenge many times, and sometimes I have big overfitting problems: with training data I get 100% accuracy, but with test data only ~10% -_- (of course I check the best prediction in TensorBoard, but that isn't a great solution for it. Correct me if I am wrong :D)
    Do you have any better solution for overfitting problems?

  • @khalidjaradat
    @khalidjaradat 5 years ago

    Siraj Raval, thank you for all these great videos.
    Can you speak a little bit slower? Our first language isn't English.

  • @ozzzer
    @ozzzer 5 years ago

    Ngl, siraj has the weirdest sense of humour

  • @Throwingness
    @Throwingness 7 years ago

    I really appreciate all the work you're doing with these videos. Sorry for my caustic comments before. I am a rank amateur. Your videos are getting better and better.

  • @manojnagthane
    @manojnagthane 7 years ago

    Hi... Could you please explain the difference and relation between big data, data science, machine learning, and neural networks? Please please make a video on that.

  • @rishikksh20
    @rishikksh20 7 years ago

    Please make a video on how to configure, train, and use TensorFlow's new Object Detection API with your own dataset and model

  • @emperorjustinianIII4403
    @emperorjustinianIII4403 7 years ago +4

    I heard you practising Dutch, which triggered me because I'm a Dutchie.

    • @emperorjustinianIII4403
      @emperorjustinianIII4403 7 years ago +1

      BTW, if you want a suggestion for learning material I'm using Duolingo and I like it.

    • @bobcrunch
      @bobcrunch 7 years ago

      What's the correct way to say:
      Met een windmolen in het hoofd slaan
      or
      Door een windmolen in het hoofd slaan
      ?

    • @pinkiethesmilingcat2862
      @pinkiethesmilingcat2862 7 years ago

      Duolingo is fine, but I think practice is a better way to learn than a boring course in whatever language you wish. For example, my English is not perfect, but I learned a lot making the English and Spanish subs for Math of Intelligence. Before that, I had been a collaborator on other videos about philosophy, memes, reviews, etc. without basic English, and now I'm here, typing to you.

    • @emperorjustinianIII4403
      @emperorjustinianIII4403 7 years ago +1

      Bob Crunch, I have never heard either of those sentences. But that's possible because even I sometimes don't know an idiom that's not used very often. But I'd choose 'Met een windmolen in het hoofd slaan', because you can hit someone in the head with a windmill (as in a windmill-toy), but one can't literally 'through a windmill hit in the head'. Note that the word 'door' means 'through' in this sentence.

    • @bobcrunch
      @bobcrunch 7 years ago

      I heard from a Dutch native speaker that "Hit in the head by a windmill" was an idiom for someone who is crazy or maybe someone with a bad idea. Thanks for the reply.

  • @lionelt.9124
    @lionelt.9124 5 years ago

    Those beats... deserved a rewind all on their own. A beat souffle I would say.

    • @jefkearns
      @jefkearns 5 years ago

      The flute beat is mine. Hurricane. Video is on my channel.

  • @ranojoybarua6468
    @ranojoybarua6468 7 years ago

    Hey, how do we optimize the total number of hidden layers and the number of neurons in each layer for a model?
    E.g., an image recognition problem can be solved with 2 hidden layers of 100 neurons each, but the same problem can be solved with 5 layers of 400 neurons each.
    So how do we optimize these numbers?

  • @rgrimoldi
    @rgrimoldi 7 years ago +1

    yo - what's the name of the song? it's amazing!

    • @jefkearns
      @jefkearns 6 years ago +1

      Hurricane - Jef Kearns

  • @ThomasHauck
    @ThomasHauck 7 years ago

    You got it goin on ...

  • @wibiyoutube6173
    @wibiyoutube6173 6 years ago

    Thanks for the amazing info, mate.
    In the fast.ai course they say one should learn the code first and then the theory, but you prove them wrong, in my opinion.
    Thanks again my friend.

  • @rajscuba
    @rajscuba 7 years ago

    yay

  • @Tagraff
    @Tagraff 7 years ago

    Why don't we use each channel/layer as a form of captured time? Say a 5-second length as the captured time. Then use that as a channel/layer and apply it to the system. Action as a symbol.

  • @morkovija
    @morkovija 7 years ago +1

    why is the print function parameter censored? hehe =)

    • @Gioeufshi
      @Gioeufshi 7 years ago +1

      It is not censored, it is a reference to "black box"

  • @venu589
    @venu589 6 years ago

    Excellent lecture bro, but I have some doubts... Why do neural networks need a hidden layer with multiple neurons; why can't they get by with one neuron in the hidden layer? Moreover, the same inputs are connected to each neuron in the hidden layer, which would seem to give the same output. Do we need a different set of weights for each neuron so that one is differentiated from another? What is every neuron in the hidden layer computing?
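
On the "same inputs, same output" doubt above: the inputs are shared, but each hidden neuron has its own weight vector, and random initialization is what keeps the neurons from all computing the same thing. A small illustration (the shapes and values are made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

# If every hidden neuron had IDENTICAL weights, every hidden activation
# would be identical and the neurons would stay redundant
W_same = np.full((3, 4), 0.5)
h_same = 1.0 / (1.0 + np.exp(-(x @ W_same)))

# Random initialization gives each neuron its own weight vector, so each
# one computes a different feature of the same input
rng = np.random.default_rng(42)
W_rand = rng.standard_normal((3, 4))
h_rand = 1.0 / (1.0 + np.exp(-(x @ W_rand)))
```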

  • @nickellis1553
    @nickellis1553 7 years ago

    when you realize siraj knows several languages and probably actually went to go practice his Dutch

  • @alverlopez2736
    @alverlopez2736 7 years ago

    BTW your short-hair girl friend is beautiful

    • @SirajRaval
      @SirajRaval  7 years ago +1

      thx shes not my gf u guys r my gf

  • @JapiSandhu
    @JapiSandhu 7 years ago +1

    this channel is so underrated

    • @simetry6477
      @simetry6477 6 years ago

      Japi Sandhu he is a great communicator.

  • @LogOfOne
    @LogOfOne 7 years ago

    great vid again... really helpful

  • @sivaprasad-pw3xt
    @sivaprasad-pw3xt 7 years ago

    Hi Siraj, kindly provide links to learn machine learning; I am new to this field

  • @computersciencebasis6051
    @computersciencebasis6051 6 years ago

    When time matters in the input sequence, RNNs come in. Good.

  • @robinranabhat3125
    @robinranabhat3125 7 years ago

    Hey Siraj, are you a speedreader/speedlearner? If yes, please make a video series on your fast learning style too

  • @larryteslaspacexboringlawr739
    @larryteslaspacexboringlawr739 7 years ago

    thank you for the Math of Intelligence video

  • @leonnowsden7802
    @leonnowsden7802 6 years ago

    This video is very helpful

  • @dimitrilambrou
    @dimitrilambrou 7 years ago

    Why would you need to practice dutch?

    • @SirajRaval
      @SirajRaval  7 years ago +1

      i live in amsterdam

  • @SubarnoPal
    @SubarnoPal 7 years ago

    Explaining LSTM and Conv Net implementations would be very helpful in upcoming tutorials!

  • @basharjaankhan9326
    @basharjaankhan9326 7 years ago

    Hey Siraj, you're AWESOME! Nothing less. I am watching your videos to learn machine learning while my college admissions are going on. Never stop, cuz I too want to see AI solved in my lifetime.

    • @basharjaankhan9326
      @basharjaankhan9326 7 years ago

      Did I forget to mention that your videos are easy to understand. Sorry for that.

  • @RAHULGUPTA-ce6zb
    @RAHULGUPTA-ce6zb 7 years ago

    Can anyone tell me how we calculated the gradient?

  • @superchefliumaohsing
    @superchefliumaohsing 7 years ago

    Hi Siraj, all of your videos are playable offline except this one. I'm trying to learn machine learning, and I downloaded all of your videos to watch while I'm travelling to work. I hope that in a few weeks I can send an entry for your GitHub contests. Anyway, can you change the setting so it can be saved offline?

    • @SirajRaval
      @SirajRaval  7 years ago

      hmm use keepvid dot com

  • @BiranchiNarayanNayak
    @BiranchiNarayanNayak 7 years ago

    I liked the "LOVE" equation, it was too good... Thanks Siraj :)

  • @Nightphil1
    @Nightphil1 7 years ago

    Hey Siraj, why do we add the gradients after we backprop them instead of subtracting? We are going for the minimum, right?!

  • @phuccoiinkorea3341
    @phuccoiinkorea3341 7 years ago

  • @MIGuy
    @MIGuy 7 years ago

    saved4

  • @ManojChoudhury99
    @ManojChoudhury99 6 years ago

    Learning from you is amazing.

  • @Leon-pn6rb
    @Leon-pn6rb 7 years ago

    *On Sigmoid* : I was just reading about it
    The derivative of a sigmoid function
    S'(x) = S(x) * (1-S(x))
    But here you did:
    S'(x) = x * (1-x)
    *Can someone please explain?*

    • @Leon-pn6rb
      @Leon-pn6rb 7 years ago

      oly shit i got it now
      u were being cheeky smart with puttin those 2 important parts in one function
      or mayb i am a dumb shit
      god, i m so laggin you

    • @Christian-mn8dh
      @Christian-mn8dh 5 years ago

      They are the same thing, just in different syntaxes.

  • @JordanShackelford
    @JordanShackelford 7 years ago

    Wow amazing!

  • @yasar723
    @yasar723 6 years ago

    This video is GOLD!!!!

  • @anastasia_onion
    @anastasia_onion 7 years ago

    King of memology!

  • @412kev2
    @412kev2 7 years ago

    Was crackin up at 1:15

  • @akashvaidsingh
    @akashvaidsingh 7 years ago

    Hey Siraj, please make lots of tutorial videos on neural networks, for students (just like me) who want to learn about ANNs.

    • @pinkiethesmilingcat2862
      @pinkiethesmilingcat2862 7 years ago

      akash vaid perhaps if you support him on Patreon.
      I don't know.

    • @SirajRaval
      @SirajRaval  7 years ago

      I have countless neural network videos; see my intro to deep learning playlist. I will make more

  • @Leon-pn6rb
    @Leon-pn6rb 7 years ago +3

    In In[43] at 2:22, can someone tell me what this line means:
    *synaptic_weights = 2 * np.random.random((3,1)) - 1*
    What is the significance of (3,1) and the - 1, and why was his code working without the 'np' prefix (like I used)?
    And why random.random (random two times)?

    • @MultiverseHacker
      @MultiverseHacker 7 years ago +5

      np was "affixed" (you mean imported) by
      import numpy as np
      It sounds to me like you are a total beginner, but I'm going to answer anyway.
      (3,1) is a python data structure called tuple, it's packing 2 values into one variable. It's supposed to describe the dimensions of the output matrix which will be 3 rows and one column. By default random() returns a matrix with random values between 0 and 1, the matrix size is specified by this tuple.
      np is the numpy module you imported
      np.random is the random number generator inside of numpy
      np.random.random((3,1)) calls the function random() on the random number generator, requests matrix dimensions 3x1
      2*np.random.random((3,1)) multiplies all values in this (3x1) matrix by 2, resulting in a matrix with random values between 0 and 2
      2*np.random.random((3,1))-1 the minus one subtracts one from each value, making a matrix with random values between -1 and 1
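
The same scaling, runnable: random() draws from [0, 1), so 2*r - 1 recenters the initial weights to [-1, 1), i.e. random weights centered on zero.

```python
import numpy as np

np.random.seed(1)
r = np.random.random((3, 1))   # uniform in [0, 1), shape 3 rows x 1 column
w = 2 * r - 1                  # rescaled to [-1, 1), centered on zero
```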

    • @Leon-pn6rb
      @Leon-pn6rb 7 years ago

      nono i knew np was numpy , my question was that he didnt import it and yet his code worked
      I always consider myself a beginner in everything, but I am brand new to python
      Ohhh i get it now , u r a BOSS
      danke very mush

    • @MultiverseHacker
      @MultiverseHacker 7 years ago +1

      He simply didn't bother to show the import. If you look at his code on github you'll see it's there

    • @marr73
      @marr73 7 years ago

      12345a scroll up, there is his import

  • @UsmanAhmed-sq9bl
    @UsmanAhmed-sq9bl 7 years ago

    awesome siraj

  • @AkashMishra23
    @AkashMishra23 7 years ago

    is this Deep Learning or is this Art?

  • @0newhero0
    @0newhero0 7 years ago

    love ur vids, keep up the good work!

  • @UdderChaos79
    @UdderChaos79 7 years ago

    I came here looking to learn the math of intelligence and left looking for a math tutor. :|

    • @SirajRaval
      @SirajRaval  7 years ago +1

      :( i will do better

    • @UdderChaos79
      @UdderChaos79 7 years ago

      Thanks! I’m looking forward to your future videos!

  • @kryptoshi4706
    @kryptoshi4706 7 years ago

    Hi Siraj

  • @abhisheksinghchauhan6115
    @abhisheksinghchauhan6115 7 years ago

    What are the prerequisites for ML?

    • @6388-s2n
      @6388-s2n 7 years ago +1

      Calculus + basic knowledge of programming

    • @nickellis1553
      @nickellis1553 7 years ago +1

      Abhishek Singh Chauhan patience

    • @abhisheksinghchauhan6115
      @abhisheksinghchauhan6115 7 years ago +1

      Nick Ellis I mean to say which programming language

    • @nickellis1553
      @nickellis1553 7 years ago +1

      Abhishek Singh Chauhan well, Python. But honestly that doesn't matter as much as having the patience to go line by line and equation by equation and trusting that yr brain will make sense of it all. Also a good "statistics vocabulary", and familiarity with linear algebra and matrix operations lol.

    • @abhisheksinghchauhan6115
      @abhisheksinghchauhan6115 7 years ago +1

      Aditya Abhyankar voice responsive automated system like assistant

  • @manishadwani386
    @manishadwani386 7 years ago

    siraj please dont upload a videos per week. please upload maybe like 2 or 3 in a week.

    • @SirajRaval
      @SirajRaval  7 years ago +2

      'don't upload a videos?' was this a typo i dont understand

    • @Leon-pn6rb
      @Leon-pn6rb 7 years ago +1

      what he meant was, upload more videos in a week cause we are hooked now
      we need a dose of your mind

    • @manishadwani386
      @manishadwani386 7 years ago

      Siraj Raval yup that was a typo I meant 1 video per week

    • @SirajRaval
      @SirajRaval  7 years ago +1

      kk thx