I cannot put in words how much talent you have for teaching. It is just impressive. Thank you so much.
You do these fantastic explanations with your original behind-the-whiteboard technique AND you're giving the book this information comes from for free?
??????????
What a gift you are to the internet, thank you so very much. I love your teaching style.
Wow, thank you!
If they had a Nobel prize for teaching on TH-cam, this guy would be one of the top contenders
th-cam.com/video/0Ahj8SLDgig/w-d-xo.html
I'm highly critical of so-called TH-cam "educators." I just watched several on SVD from MIT and Stanford, all of which were garbage. But this... this is art in its purest form. You are a scholar among scholars! Absolutely beautiful to watch unfold. Thank you!!!
Brunton is unparalleled until others adopt the clear board as well. Re MIT though, Strang is no joke; don't sleep on him. Abstruse but distilled well, a passionate if elderly linear algebra notable.
@@phdrxakadennyblack4098 No doubt. If I could possess the mind of a modern mathematician, Strang is in my top 5, without question. But this video really emphasized the simple elegance of a good TEACHER.
@@ethan6708 Foucault and Baudrillard always made me feel smarter, like I had learned something significant that they had pondered long and hard over. When it is so good that the lecture and the concepts stick clearly in your head for years and years, that to me is the measure of a great teacher.
I hope SB can address the latest topics I'm seeing, like the math behind KAN networks as opposed to MLPs, or how ternary filters approximate so efficiently, cutting through data without losing information, if he hasn't already.
MIT and the like are either simplistic to the point of superficiality, wasting a lot of time, or they teach as if to senior researchers... Strang is passionate but teaches in a unique way; after that you are on your own to study, or you restart another similar course.
This guy is the refreshed version of professor Gilbert Strang. Professor Steve Brunton, you're an amazing linear algebra teacher. Glad I found these videos. Going to recommend them to all my computational linear algebra classmates.
Thank you so much
Strang will always be the GOAT.
Engaged delivery with exceptional content clarity. If Mr. Brunton isn't tenured, it would be an indictment of the US university system. Cheers!
Professor Brunton is great. I saw SVD covered briefly in one of his in-class lectures. Now he dedicates his time to making these video lectures. Super!
I could complain about these videos being too fast-paced and technical for a more novice audience, or that the explanations are not very thorough, or even about the idiosyncratic notation (m for the number of columns; come on, matrices are m x n by default). But it still wouldn't change the fact that these videos motivated me to learn more about SVD and served as an invaluable resource for getting introduced to the math I need for my job. Thank you. I hope there is more content in the making.
OMG.... These videos really moved me. I haven't felt this kind of excitement and freshness while learning in a long time. Thank you so much for being such a good teacher. I feel the beauty of math.
A real genius can dissect complex subject matter and explain it in a simple and intuitive way, and this guy is just a splendid example of such genius.
This is mind-blowing. Everything falls into place nicely: the fact that the row and column correlation matrices are symmetric positive semi-definite, giving rise to real, non-negative eigenvalues and orthogonal eigenvectors, makes the prior assumptions true, namely that the singular values are real and non-negative and that U and V comprise orthogonal column vectors!
I learned SVD 4 years ago, and nobody ever explained it so well! Will recommend this series to every math student!
Usually some of these explanations can be skipped using Einstein notation, but it is often overlooked how very educational it can be to explicitly walk through the process. Respect!
One of the best TH-cam explanations I have ever seen.
Sir, do you even have a slightest idea what a superawesomely-superawesome teacher you are?!!!
Thank you, SO MUCH!
🙏🙏🙏
Everyone is applauding the wonderful lesson, which is very true!!
BUT no one is applauding Steve's wonderful ability to write inverted.
the best grad school prof ever!
This is one of the best things I've seen on the internet..... explicit, and taught with so much passion, rigor, assertiveness in presentation, clear intuition, etc. With this stepwise approach, even a complete novice is equipped to solve real-world problems. You don't want to know how grateful I am. Thanks
this series is like a gift, thank you so much. Will definitely buy the book when I can :)
Hope you enjoy it! (check out databookuw.com/databook.pdf until then)
One of the best things to have happen in Jan 2020 is seeing you share more of your knowledge! Much Appreciated!
Wow, these tutorials need a high level of attention!
thanks for the extraordinary lectures
Your descriptions here are, by far, the best I have seen. Maybe it is because you teach in the way I learn but these have been really useful.
That was so clear and well explained. It's really connecting a lot of dots in the maths I've had to learn for modelling in engineering
Dominant correlations? More like "Dang good pieces of information!" Being serious, this is some of the absolute best content on TH-cam.
This is so good. We need more teachers like this!
Amazing lecture, thank you for making the SVD series! I appreciate the importance of SVD much more after watching your lectures. Thank you again!
Thank you for this fantastic series, definitely among the best educational content i have come across on youtube!
Best video on the SVD topic; the concepts are crystal clear, and the video is movie quality. Really enjoyed it.
I don't think I would ever have seen this sort of intuitive explanation of the U and V matrices in terms of eigenvalues and eigenvectors. Thanks a lot, Professor, for helping humanity understand SVD.
Thank you for making such excellent learning material freely available, you are a godsend.
Wow, thank you. I just found your channel today, and I'm grateful for these intuitive explanations. Cheers!
I am not clear on how X^tX is a correlation matrix, which I understand to be a matrix of correlation coefficients between variables. Really enjoying this series.
Yeah, that's my problem too! The video is incredible, but I don't get why those matrices are correlation matrices.
It technically isn't a correlation matrix. If you center each variable (column) by subtracting a mean, then X'X yields a covariance matrix (if you divide the whole result by n-1). If you normalize each column in X by its standard deviation, then X'X yields a correlation matrix. I think he leaves out these details to simplify.
@@neurochannels ty .. that seems plausible
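To make @neurochannels' distinction concrete, here is a minimal numpy sketch (toy data and variable names are my own) comparing the raw Gram matrix X^T X from the video with the covariance and correlation matrices you get after centering and scaling:

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))        # 100 samples (rows), 5 variables (columns)

gram = X.T @ X                           # raw inner products, as used in the video
Xc = X - X.mean(axis=0)                  # center each column
cov = Xc.T @ Xc / (X.shape[0] - 1)       # covariance matrix
std = X.std(axis=0, ddof=1)
corr = cov / np.outer(std, std)          # correlation matrix

print(np.allclose(cov, np.cov(X, rowvar=False)))        # True
print(np.allclose(corr, np.corrcoef(X, rowvar=False)))  # True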
this video is on a different level of teaching, thanks
You are the best teacher I have seen
For sure, the best explanation I've seen of the whole topic. I'm also very impressed by the production and staging. Prof. Brunton seems to be writing on a transparent surface with actual pens. However, it's not possible to film it from the perspective we are seeing unless he is doing mirror writing, like Leonardo da Vinci. Involved mathematics, fluent speaking, and mirror writing at the same time! Too much, even for a sharp mind. So, how was it done? Maybe the scene is reflected in a physical mirror and the camera points at that mirror, with some linear algebra to correct the perspective 🙂, or maybe everything is advanced post-processing. Please, we want to know how you shot this fantastic video. Greetings from Spain!
This is as good as it gets, thank you.
I got the book last week, and the Python implementation from the website is a great addition. Thanks again for these priceless videos.
God Bless you, Steve because you make a big difference in how to teach ML salut from Brazil
Thank you for the great video!
Couple questions:
1. Is the column space of the matrix X what people usually refer to as the feature space?
2. How does it relate to the standard basis on R^n?
3. When you take the inner products in (X^T)(X), is that the inner product in R^n, or some other inner product, i.e. Pearson's correlation on the space of random variables?
The first SVD video I've ever liked, and I like spreading it.
I wanted to provide a linear algebra refresher at the beginning of the school year for students enrolling in a data science master's. I guess I'll just give the students the link to your videos 😊 I love your presentation of the SVD! Very visual and useful for data science.
Your lectures are a treasure for us ...........👍👍❤️❤️
So nice of you!
Thanks a lot.
I haven't watched the series yet, but I already know it's fantastic.
I already liked your classroom course on the same topic.
You just have a genius for teaching complex things.
Thank you so much! This seemed so complicated and you made it so clear and easily understandable. Amazing!
As always, it is really nice to have more Prof. Brunton content to learn from. This is the nudge I needed to get your new book and dive deeper into data-driven modeling and machine learning control. Thanks, Prof. Brunton. I just finished the series, and I am really looking forward to the rest of it.
If X^T X represents the correlation between the columns, does that mean that the diagonal has the largest elements? And does X X^T represent the correlation between the rows? Can that be interpreted in any meaningful way?
Thank you for the great content, I have a question:
At 3:54 you say "if these are people's faces then the ij-th entry of this matrix ( X^t X ) is the inner product between person i and person j 's face, so if there's a value of this matrix that is large that means that those two people had a large inner product, their faces are similar, they have the same basic face structure. If you have a small value of this inner product that means that they are nearly orthogonal and they're very different faces".
The inner product (as usually defined for a Euclidean space) of two vectors gives you an idea of the "similarity" of the two vectors; in this case each vector is a "list" of numbers, each representing a pixel of a picture of a face. The idea that "if the two vectors are similar then the two faces they represent are similar" supposes that the pictures were somehow taken such that the distribution of the pixel values is related to the facial characteristics, correct? Otherwise it's not clear to me how the similarity between two faces could be "detected" (for example, two photos of the same face under different lighting/positions would be associated with two very different vectors).
Thanks!
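Right, the interpretation implicitly assumes the images are cropped, aligned, and reasonably consistently lit; face datasets of the kind used in the book are typically preprocessed that way. A hypothetical numpy sketch of the similarity score being discussed, with made-up "images" (the mean subtraction here is my own crude stand-in for lighting normalization):

import numpy as np

def face_similarity(img_a, img_b):
    # Cosine similarity between two flattened, mean-subtracted images
    a = img_a.ravel().astype(float); a -= a.mean()
    b = img_b.ravel().astype(float); b -= b.mean()
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(0)
face1 = rng.random((64, 64))
face2 = face1 + 0.05 * rng.standard_normal((64, 64))  # slightly perturbed copy of face1
face3 = rng.random((64, 64))                          # unrelated "face"

print(face_similarity(face1, face2))   # close to 1
print(face_similarity(face1, face3))   # close to 0 (nearly orthogonal)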
Thank you so much for uploading these amazing lectures!
This lecture cleared a misunderstanding I had about SVD. Thank you so much !
Glad it was helpful!
Hi Prof,
at 6:43, U^ (U^)T = I is substituted. But U^ is the economy matrix; earlier it was mentioned that these are no longer unitary matrices once truncated. Please clarify this point.
In the previous lecture, he mentioned that for truncated matrices, U'U=I, but it is not the case that UU'=I. In this lecture, he only made use of the first fact. So you are correct they are not "full" unitary, but you only need "half" unitary for this proof to go through.
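A quick numpy illustration of the "half unitary" point, with a toy tall matrix of my own:

import numpy as np

X = np.random.randn(10, 4)                            # tall data matrix, n > m
Uhat, s, Vt = np.linalg.svd(X, full_matrices=False)   # economy SVD: Uhat is 10 x 4

print(np.allclose(Uhat.T @ Uhat, np.eye(4)))   # True: the columns of Uhat are orthonormal
print(np.allclose(Uhat @ Uhat.T, np.eye(10)))  # False: Uhat @ Uhat.T is only a rank-4 projector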
For these amazing videos you are going to heaven.
Great video, thanks for it, but I have a question regarding your explanation of positive semi-definiteness at minute 4:36. You said that X*X is symmetric and positive semi-definite because the elements are inner products. Nevertheless, according to the definition of an inner product, positivity is only guaranteed for the inner product of a vector with itself, and that is not the case for all the elements of that matrix. Can anyone explain to me whether that was a mistake in the video or whether I am wrong, please?
Hi! Yes, off-diagonal elements of a matrix can be negative, but this does not mean that the matrix is not positive semi-definite. Positive semi-definite ensures that the eigenvalues are not negative. You can google about the Gram matrix :)
At first I also had this question )
Making my research ever so meaningful! Thank you.
Thanks for the video!
Does the data have to be centered/standardized for transpose(X)*X to be the "correlation matrix" ?
Thank you very much for these amazing lecture videos, I bought the book after watching a few of them. I am also very happy that you made it available as an ebook as well, it works great on my Kindle :)
Great lecture Steve. You say that we usually want the economy SVD. In what situations would we want to compute the full svd please?
I am grateful for such a good series of videos with you, Steve, as you are really talented. I have a question that might be basic: to obtain a proper correlation matrix, one should center X and then scale it by the standard deviations? I asked ChatGPT but trust you more on this ^^'
I know this specific video is about intuition, I'm just asking to confirm my understanding, thanks :)
Thanks, I'm glad you like them! Yes, typically you would center X (subtract the column-wise mean or row-wise mean first, and then take inner products with either all pairs of columns, or all pairs of rows). We discuss this in databookuw.com/databookV2.pdf if you want more details.
Ok super thanks!@@Eigensteve
Your teaching is so impressive and awesome.
You're awesome. I wish I could watch this video a few years earlier.
Incredibly well explained. Thank you!
greatly put. again, linear algebra is magic.
Great thanks professor!!! At last I understood what SVD is
Hi. Great explanation! Just had a slight confusion. Isn't it that the inner product signifies similarity but won't give a correlation, since "technically" correlation means a measure of the linear relationship between two variables (that is, how much one value changes in response to the other)? But in the video it is labeled a correlation.
Brilliant. I think there is one little error at 6:10. Sigma has size m x n, so Sigma transpose is of size n x m. They both have sigma_1, ..., sigma_m on the diagonal. So instead of Sigma^2 we get another m x m matrix. It doesn't change the end result, but I thought it was important to mention to avoid confusion.
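For anyone checking the shapes, a small numpy sketch (toy sizes of my own, keeping X as n x m): Sigma^T Sigma comes out m x m and equals the diagonal matrix of squared singular values, which is exactly the Sigma-hat squared used in the derivation, so as noted the end result is the same.

import numpy as np

n, m = 6, 4
X = np.random.randn(n, m)
U, s, Vt = np.linalg.svd(X)                    # full SVD: U is n x n, Vt is m x m
Sigma = np.zeros((n, m))
Sigma[:m, :m] = np.diag(s)                     # Sigma is n x m

print((Sigma.T @ Sigma).shape)                 # (4, 4), i.e. m x m
print(np.allclose(Sigma.T @ Sigma, np.diag(s**2)))  # True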
Thank you, your explanation is beautiful.
This video is incredible. Such a great explanation
THAT WAS A W E S O M E!
Thank you so much! So clear; this is what you need to be a good engineer/scientist in any field.
I think, if the n x m matrix X has n features of m faces, then (X-p)(X-p)^T, where p is the n x 1 vector of row means (here p is broadcast from n x 1 to n x m), is the n x n covariance matrix of the features across those faces.
(X-q)^T (X-q), where q is the 1 x m row vector of column means (again broadcast to n x m), is the m x m covariance between the faces across those features.
After dividing by the number of data points minus 1.
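That matches numpy's built-in covariance, for what it's worth; a small sketch (toy dimensions of my own) verifying the two centerings:

import numpy as np

n, m = 5, 8                                    # n features (rows) x m faces (columns)
X = np.random.randn(n, m)

Xr = X - X.mean(axis=1, keepdims=True)         # subtract the row means (the vector p, broadcast)
feat_cov = Xr @ Xr.T / (m - 1)                 # n x n covariance of the features
print(np.allclose(feat_cov, np.cov(X)))        # True (np.cov treats rows as variables by default)

Xc = X - X.mean(axis=0, keepdims=True)         # subtract the column means (the vector q, broadcast)
face_cov = Xc.T @ Xc / (n - 1)                 # m x m covariance between the faces
print(np.allclose(face_cov, np.cov(X, rowvar=False)))  # True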
Is V the singular vectors of the correlation matrix or of the data matrix? 8:12
I performed X^T*X for V and X*X^T for U on a matrix X. But X does not equal U*S*V^T. I even orthonormalized the columns of U and V. It still does not equal the original X. What am I doing wrong?
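Hard to say without seeing the code, but a common culprit is that eigenvectors are only defined up to a sign (and an ordering), so taking U and V from two independent eigendecompositions gives sign combinations that don't match. One workable sketch (my own toy matrix) is to take V and the singular values from X^T X and then derive U from X directly:

import numpy as np

X = np.random.randn(6, 4)

evals, V = np.linalg.eigh(X.T @ X)    # eigendecomposition of the small Gram matrix
idx = np.argsort(evals)[::-1]         # sort eigenpairs in descending order
V = V[:, idx]
s = np.sqrt(np.clip(evals[idx], 0, None))

U = X @ V / s                         # U = X V Sigma^{-1}, so the signs of U and V stay paired

print(np.allclose(X, U @ np.diag(s) @ V.T))   # True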
Very clean and clear explanation of the computational process for SVD. However, it doesn’t explain much the intuitive concepts of SVD in terms of what it is conceptually as a decomposition of a linear transformation with basis change.
There is just one little thing I am not following. Around 11:18, he says that sigma hat squared gives the eigenvalues of the correlation matrix. But when I look at the equation X^T X V = V \sigmahat^2, it does not look like what I expect to see for an equation about eigenvalues. I expect something like Av = \lambda v. Of course, that is an equation about a single eigenvalue, and he is saying that \sigmahat^2 is all of the eigenvalues. I just don't see how that is possible, since \sigmahat^2 is an m x m matrix. Can anyone point me in the right direction?
Ugh... I see my mistake! V is not a vector. It is a matrix! So that equation is equivalent to lots of Av = \lambda v equations.
Why would you say U gives the eigenvectors of the column space of the data, when U corresponds to X @ X.T, which is the correlation between the rows of X?
Great video! Thanks so much! I have a question... what if you have data where m >> n? So my data points outnumber the parameters I have evaluated for each of them. Can you still interpret the SVD such that the columns of U and V are the eigenvectors of the row and column correlation matrices, respectively?
Ah sorry! I think that may have been a silly question! I think I'll just switch the rows and columns of my data so that n >> m and pull out whichever of U or V is more meaningful for my data!
very clear. I love it professor
In the book (p. 13) you say: "This provides an intuitive interpretation of the SVD, where the columns of U are eigenvectors of the correlation matrix XX* and columns of V are eigenvectors of X*X."
Isn't it rather that the rows of U are eigenvectors of the correlation matrix XX* and the rows of V are eigenvectors of X*X?
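For what it's worth, the columns do satisfy the eigenvector relation; a quick numpy check (toy 5 x 3 matrix of my own):

import numpy as np

X = np.random.randn(5, 3)
U, s, Vt = np.linalg.svd(X, full_matrices=False)   # U is 5 x 3, Vt is 3 x 3

print(np.allclose(X @ X.T @ U, U * s**2))        # column j of U satisfies (X X^T) u_j = sigma_j^2 u_j
print(np.allclose(X.T @ X @ Vt.T, Vt.T * s**2))  # column j of V satisfies (X^T X) v_j = sigma_j^2 v_j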
Hello professor, many thanks for the videos. I have been watching both your electrical engineering (Laplace transform) videos and this one on SVD. I am a bit confused about where the approximation you state (towards the end of the video) comes from. Is the original SVD itself approximate? In the case of the truncated SVD, we are just dropping the columns that would be zeroed out anyway, correct? So at what point do things become approximate?
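In case it helps others with the same question, a small numpy sketch (toy matrix of my own): the full and economy SVDs reconstruct X exactly; the approximation only appears once you truncate below the rank of X.

import numpy as np

X = np.random.randn(8, 5)                          # generically rank 5

U, s, Vt = np.linalg.svd(X, full_matrices=False)
print(np.allclose(X, U @ np.diag(s) @ Vt))         # True: the economy SVD is still exact

r = 2                                              # keep only the 2 largest singular values
Xr = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]
print(np.linalg.norm(X - Xr))                      # nonzero: truncating below the rank is approximate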
Hi. Why isn't there a hat on V, since we are using the truncated SVD? Thanks.
Why is the dot product of X with itself a correlation matrix? Did we subtract the mean somewhere?
Why are the values of the correlation matrix inner products of the columns of X and not of the rows of X? Don't we want to see the correlations between different features of the faces, not between the different faces themselves?
Great question. We dig into this more in the other videos in the playlist. But short answer is that you can look at the row-wise or column-wise correlation matrices, and they will have the same non-zero eigenvalues (really quite surprising when you first think about it). Both of these can be used to extract the SVD.
@@Eigensteve ahh ok thanks so much for the reply! Will watch the whole playlist now. Great videos btw
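To see the shared non-zero eigenvalues numerically, a toy numpy check (sizes of my own):

import numpy as np

X = np.random.randn(6, 3)                     # 6 "pixels" x 3 "faces"

col_eigs = np.linalg.eigvalsh(X.T @ X)        # 3 eigenvalues of the column-wise Gram matrix
row_eigs = np.linalg.eigvalsh(X @ X.T)        # 6 eigenvalues of the row-wise Gram matrix

print(np.sort(col_eigs)[::-1])                # three positive values
print(np.sort(row_eigs)[::-1])                # the same three values, followed by (numerical) zeros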
Hi, so I'm just wondering, does a column in X here represent a feature column, and the rows represent the different samples, or do the columns represent the different samples and the rows represent the different features?
@7:28 can we say that the factorization V Σ^2 V^T is equally well represented by the Cholesky decomposition L D L^T?
Hi Prof. Steve,
I don't understand why the eigenvectors given by the right singular vectors do not correspond to the vectors I get from the Eigenvectors function (in the Mathematica software) applied to the matrix X^T.X. Could you please help me?
I have a singular matrix X = {{-1, 0, -1}, {1, -1, 0}, {0, 1, 1}}, with rank 2; with the economy SVD I get V = {{1/Sqrt[2], -(1/Sqrt[6])}, {0, Sqrt[2/3]}, {1/Sqrt[2], 1/Sqrt[6]}}. These clearly satisfy the definition of eigenvectors, i.e., X^T.X.v = 3 v.
However, when I run Eigenvectors[X^T.X] it returns {{1, 0, 1}, {-1, 1, 0}, {-1, -1, 1}}, where the first vector is fine and corresponds to 1/Sqrt[2]*{1, 0, 1}, but the remaining ones have no correspondence with v2 = {-(1/Sqrt[6]), Sqrt[2/3], 1/Sqrt[6]}. However, I know that the last eigenvector {-1, -1, 1} corresponds to the (right) null space of X. Am I doing something wrong?
Maybe it is because this matrix has the four fundamental subspaces, and that is the reason these vectors do not correspond??
So, my question is, why did I get different eigenvectors from the two different techniques?
Shouldn't they span the same space?
Thank you
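If it helps, a small numpy check with the same matrix suggests what is going on: X^T.X has the repeated eigenvalue 3, so any basis of that 2-D eigenspace is a valid answer. Mathematica returns an unnormalized, non-orthogonal basis of it, while the SVD picks an orthonormal one; the individual vectors differ, but they span the same space (and the third vector, for eigenvalue 0, is indeed the null space of X).

import numpy as np

X = np.array([[-1, 0, -1], [1, -1, 0], [0, 1, 1]], dtype=float)

evals, evecs = np.linalg.eigh(X.T @ X)        # eigenvalues are approximately [0, 3, 3]
E3 = evecs[:, evals > 1e-10]                  # orthonormal basis of the eigenvalue-3 eigenspace

U, s, Vt = np.linalg.svd(X)
V2 = Vt[:2].T                                 # right singular vectors for the two nonzero singular values

# The two bases differ vector-by-vector, but the projectors onto their spans agree:
print(np.allclose(E3 @ E3.T, V2 @ V2.T))      # True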
Thank you very much! Your video is most intuitive!!!😊
There is a mistake at min 8:55. Let * denote the transpose: (X*X)* = X*X => X*X is symmetric, and therefore XX* is not the transpose of X*X (see math.vanderbilt.edu/sapirmv/msapir/prtranspose.html).
Excellent content, really love it! Thanks
Great explanation! thanks for sharing
My pleasure!
This is gold.
The class is amazing. thanks. I just wonder where you store the python code for this class
Glad you like it! Python code is at databookuw.com or at github.com/dynamicslab/databook_python
Thank you so much, your videos helped a lot !
I have a question. Does anyone know why the value of the inner product is related to the correlation?
The class is just incredible, and you are a very good teacher. But I just want to know: how do you write it in front of you without the text and math symbols appearing inverted to us? That's a nice magic trick.
Beautifully explained :)
Is it a guarantee that X^T X is positive semi-definite?
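Yes, and here is a one-line way to see it: for any vector v, v^T (X^T X) v = (Xv)^T (Xv) = ||Xv||^2 >= 0, so X^T X is always positive semi-definite (and it is symmetric, since (X^T X)^T = X^T X).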
Nicely explained again
Hi, why is U^T*U the identity matrix?
TH-cam should add an option to give multiple likes
great video series
Why doesn't X.X(T) cancel out, like V.V(T) and U.U(T)?
X is the original matrix, it is not orthonormal like U and V are. SVD works for all types/shapes of matrices.
Just to note, it is not the correlation matrix but the covariance matrix that is obtained from X^T X etc. (and only after centering). Otherwise great video; I really needed this type of intuition for SVDs.
I know that your post is a year old, but I was surprised that you seem to have been the only person that noticed the "correlation"/"covariance" confusion. Too bad that Steve Brunton did not respond to your post.