Your way to present your knowledge is outstanding. The combination of a whiteboard, yourself and the code is great!
I rarely comment on YouTube. But I have to say, you are by far the best teacher around here. This is the first channel I've turned notifications on for. I want to watch every single one of your amazing videos. I hope you never stop uploading.
Our math professor just said as an example that when compressing an image "you can throw away the remaining numbers and only keep the large sigmas of an image," and I asked myself why. And now I know why. Thank you so much.
For anyone confused about why np.diag is necessary: it is actually turning a vector into a matrix here, rather than the other way around. S comes out of np.linalg.svd as a 1-D array of length n, and therefore needs to become the diagonal of an n x n matrix in order to work with the matrix multiplication @ operators. That is what np.diag is doing. Oddly, it will go either way depending on whether you give it a matrix or a vector.
After learning linear algebra from scratch, it's time to watch these videos and apply the knowledge as a programmer. Thank you.
Thank you so much for invaluable lectures and book!
Nothing beats this
This is truly impressive, better than any university lectures on these topics. One question: how do you deal with RGB images?
You can process the RGB values separately with the same method. RGB is just a 3-entry vector: each entry is a single number representing the intensity of its respective color (R, G, or B) in the pixel, just like in the grayscale case. Separate them out as an R-image, a G-image, and a B-image, each of which can be treated as a grayscale image, and then process them separately as in the video's example. When you're done processing, put the results back into one tensor that holds all 3 entries for each pixel to represent the color version.
Probably MATLAB and/or Python have packages that process RGB images in one go, so you don't need to tear the image apart and reassemble it afterwards; the software will do it for you. In any case, the underlying method remains the same as for grayscale.
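A minimal sketch of the per-channel approach described above. The helper name `truncate_channel` and the random stand-in array are my own; real code would load an actual RGB image:

```python
import numpy as np

def truncate_channel(channel, r):
    """Rank-r SVD approximation of one 2-D color channel."""
    U, S, VT = np.linalg.svd(channel, full_matrices=False)
    return U[:, :r] @ np.diag(S[:r]) @ VT[:r, :]

# Stand-in for an H x W x 3 RGB image
A = np.random.rand(64, 48, 3)
r = 10

# Compress each channel separately, then restack into one color image
approx = np.stack([truncate_channel(A[:, :, c], r) for c in range(3)], axis=-1)
print(approx.shape)  # (64, 48, 3)
```

With r equal to the full rank (here 48), `truncate_channel` reproduces the channel exactly; smaller r trades fidelity for compression, channel by channel.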
Thank you so much for the resources, it's so good to have the code and use it ourselves.
Awesome!!! Thank you very much, merci bcp , gracias!!!! I loved your way of teaching. It is very understandable. This content is very useful for my Ph.D. Thank you again!!!!!
Thank you, much better with the supporting code and seeing how the maths is applied; certainly better than loading slide after slide...
I have to say these courses, slides, and presentations are so well prepared, thanks professor, and hope to learn more from you.
An image is an easy case for understanding the approximation that SVD produces relative to the original data. It would be helpful to see a tall-skinny case, like recordings of the phases of an experiment.
Thank you Steve, I have always wanted to know more about SVD and how it helps in dimensionality reduction... I would also like to know use cases where it can play a major role. Thank you in advance.
G.O.A.T ...... Such a greattt explanation🎉🎉🙏🙏🙏
Amazing work... thank you very much for the help... Could you please expand on the FISTA algorithm in a different lecture sometime?
Thanks for the suggestion -- I'll add it to the list.
Loved this video! Very good explanations. Thanks!
Excellent, and very helpful!
I'm impressed! This is exactly what I need, thanks a lot.
Good explanations, keep it up!
Amazing videos, very eloquent!!!! Could you tell me how you made them?
thank you for sharing how it actually is implemented!
Very good example.
Professor, you are so cool
Thanks! 😃
cute dog, love the series, thank you
excellent video!
Glad you liked it!
Great explanation!!
Hi Steve, enjoyed your session. I have a question about the SVD. In many scenarios, I see people do SVD on the covariance matrix, or simply svd(A^T * A), instead of svd(A). Can you explain why or why not to do so?
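For what it's worth, the connection can be checked numerically: the singular values of A are the square roots of the eigenvalues of A^T A, so an SVD of A and an eigendecomposition of A^T A carry the same spectral information. A quick sketch (variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))   # tall-skinny: many rows, few columns

# Singular values of A, computed directly
_, s, _ = np.linalg.svd(A, full_matrices=False)

# Eigenvalues of the small 5 x 5 matrix A^T A
evals = np.linalg.eigvalsh(A.T @ A)   # returned in ascending order
s_from_ata = np.sqrt(evals[::-1])     # sigma_i = sqrt(lambda_i), descending

print(np.allclose(s, s_from_ata))  # True
```

One practical reason people do this: when m is much larger than n, A^T A is only n x n, so the eigenproblem is much smaller. The caveat is that forming A^T A squares the condition number, so small singular values come out less accurately than from svd(A) directly.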
Professor, thanks for making the concept of SVD very easy for us with the help of some real examples. I have a question. In the book, the amount of saved storage is also written. Can you please tell me how you got it? I tried __sizeof__() and sys.getsizeof(), but they both return the same value for X and Xapprox, which means they take the same memory. When I checked the shapes of both matrices, they are also the same. So how is our image compressed, if we think of the image as an array of numbers?
You decide the amount of compression by the "rank". Basically you just throw away all of the coefficients that don't noticeably contribute to the final image. Your intermediate output is the same size as the image, but the final output can be as small as you want depending on how faithfully you want to represent the original image after decompression.
So, here I'm getting the intermediate output, right? If so, how do I get the final output? Just by cutting off coefficients smaller than the threshold value?
@@Ajwadmohimin0 I think my reply to another comment may help clarify. (And I would call Xapprox the final output of the decompression, in the compression step it's not involved at all.)
th-cam.com/video/QQ8vxj-9OfQ/w-d-xo.html&lc=UgySxL8I3zJqiqPtUlt4AaABAg.9Owk8AGSt769ijFmwq9Vli
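To make the storage count above concrete: the compression lives in storing the truncated factors, not Xapprox itself. A rank-r truncation of an m x n image stores r*(m + n + 1) numbers instead of m*n. A back-of-the-envelope check (the 2000 x 1500 dimensions and r = 100 are just assumed examples):

```python
m, n = 2000, 1500        # assumed image dimensions
r = 100                  # truncation rank

full_numbers = m * n                 # store every pixel
truncated_numbers = r * (m + n + 1)  # U[:, :r] + S[:r] + VT[:r, :]

print(full_numbers)       # 3000000
print(truncated_numbers)  # 350100
print(full_numbers / truncated_numbers)  # ~8.6x fewer numbers stored
```

Xapprox, once reconstructed, is the same shape as X, which is why __sizeof__() reports no savings; the savings exist only while the image is stored as the three truncated factors.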
I'm grateful to you, because this video is perfect for my modeling and simulation project. Thanks!
thank you so much for your video!
I have a question that I cannot figure out. The output Xapprox and X have the same dimensions in terms of the number of rows and columns so they will take the same memory (I think). What is the value of such image compression?
Nice video. However, since "n" in this example is the height of the image, isn't the reduction of the matrix (height x r), not r x r, at 2:10? It is still a very nice illustration and example of feature reduction using SVD.
I don't hear him specify the exact reduction, but I'm also not sure you got it right. In the MATLAB video they have a calculation of the stored size, and I think it should be r*(n+1+m) here. You want r columns, r rows, as well as r elements from S, not only the columns.
th-cam.com/video/QQ8vxj-9OfQ/w-d-xo.html&lc=UgySxL8I3zJqiqPtUlt4AaABAg.9Owk8AGSt769ijFmwq9Vli
This is an excellent video! Thank you. I subscribed.
Hello Steve, thanks for the video. How do I change back to the RGB image after the reduction?
That was lost forever. See other comment. th-cam.com/video/H7qMMudo3e8/w-d-xo.html&lc=Ugyc_eW3a77tYuxpXyN4AaABAg
In the lectures, the X matrix has m columns and n rows, where columns correspond to images and rows correspond to pixels. But in this example we have only one image, and the columns no longer correspond to images. Could you explain the difference, please?
That is a really good point. Sometimes the columns of the matrix will be entire images that have been reshaped into column vectors; then we are looking for correlations between images. But here, the entire matrix X is the image itself, so we are looking for correlations between the columns (vertical strips) of a single image.
@@Eigensteve If we wanted to compress a single RGB image, would each column of our data matrix be a concatenation of vertical strips of the image from each color channel? e.g. The first column of X would be the vertical concatenation of the first column of the image in the R channel + the first column of the image in the G channel + the first column in the B channel?
Loving the lectures so far!
I see this question (how to interpret columns) keeps coming up still. I'd like to clarify that a matrix is a more abstract concept than one might think when exemplified by images. SVD doesn't care much about how you interpret the columns of X, as long as you stay consistent. You can reorder columns (just as you could add images of faces into X in any order you wanted) and SVD will work just as well. The eigenvectors are anyway ordered by importance.
Even more interesting is that you could as well put rows of the image into columns of a matrix T. That would be T = X' = V S U' (same U,S,V as from svd(X)) so in essence you have the correlation among horizontal rows of the image in there too. See earlier video for that hint:
th-cam.com/video/WmDnaoY2Ivs/w-d-xo.html
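The transpose identity mentioned above is easy to verify numerically: transposing X = U S V^T gives X^T = V S U^T. A small sketch with a random stand-in matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 4))     # stand-in for an image matrix

U, S, VT = np.linalg.svd(X, full_matrices=False)

# Transposing X = U S V^T gives X^T = V S U^T
T = VT.T @ np.diag(S) @ U.T
print(np.allclose(T, X.T))  # True
```

So the same three factors describe correlations among both columns and rows; only the roles of U and V swap.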
@@dwardster This should have been a separate comment. Did you try it? It should work ok. See other comment. th-cam.com/video/H7qMMudo3e8/w-d-xo.html&lc=Ugyc_eW3a77tYuxpXyN4AaABAg
Respect 👐
What does the @ operator represent when you multiply U, S, and VT?
That is the multiplication.
@@Eigensteve np.dot(A,B) is the same as A @ B ?
@@vivinvivin3020 Yes.
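This is easy to confirm for 2-D arrays; note that np.dot and @ (which is np.matmul) only start to differ for arrays with more than two dimensions:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)

# For 2-D matrices, @, np.matmul, and np.dot compute the same product.
print(np.array_equal(A @ B, np.dot(A, B)))     # True
print(np.array_equal(A @ B, np.matmul(A, B)))  # True
```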
My PyCharm takes about a minute to compute the new X (with a specific rank); is that normal? The results look right.
1990s - Lena
2020 - Mordecai
Thank you, sir. Can you explain JPEG compression in Python?
Without converting it to grayscale, can we still truncate it to the first r columns of U and the r x r diagonal block of S?
It should work ok. See other comment. th-cam.com/video/H7qMMudo3e8/w-d-xo.html&lc=Ugyc_eW3a77tYuxpXyN4AaABAg
When I try to open the image I get a file not found error but I’ve downloaded the data - how do I fix this?
Hello sir, nice video. I want to compress a DICOM image; please help me. Thank you in advance.
Could you explain how taking the average of the third channel converts it into a grayscale image?
You see, the three values you are adding and averaging represent the RGB values for that pixel. So it's analogous to taking red, green, and blue paint in certain amounts (your values) and mixing them to get a new color, which in this case would be a particular shade between black and white, i.e. gray.
He didn't take the average of the third channel; he took the average along the third axis! If you look at the numpy.mean help, the first parameter that accepts an integer is the axis parameter. By running the code, you will see that the A array (the variable containing the original image) has 3 dimensions and its shape is (2000, 1500, 3). When he calls np.mean(A, -1), he's taking the mean over the last axis of A. So he indeed takes the mean of all 3 channels, not only the third one.
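A tiny example of that axis argument (the 2 x 2 "image" is made up):

```python
import numpy as np

# Made-up 2 x 2 image with 3 color channels: shape (2, 2, 3)
A = np.array([[[10.,  20.,  30.], [  0.,   0.,   0.]],
              [[255., 255., 255.], [  9.,   9.,   9.]]])

gray = np.mean(A, -1)   # average over the last axis: the 3 channels
print(gray.shape)       # (2, 2)
print(gray)             # [[ 20.   0.]
                        #  [255.   9.]]
```

Each pixel's R, G, and B values collapse to their mean, so the 3-channel array becomes a single 2-D grayscale array.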
What is the link to the python code?
I tried to run the code, but it returns an error in the loop where Xapprox is calculated, saying "too many indices for the array". How do I fix this?
Check the variable where you are storing the diagonal. After S = np.diag(S), S should be a 2-D matrix, like U and VT above; if S is still a 1-D array, the indexing in the loop fails.
How do I access the code?
How can we compress the image without turning it into grayscale?
The A array has 3 dimensions; its shape is (2000, 1500, 3). You could split it into three 2-D arrays, one per color channel: R = A[:,:,0], G = A[:,:,1], and B = A[:,:,2], then run SVD on each channel and join them again afterwards. Maybe SVD can also act on arrays with more than 2 dimensions; I would have to check.
@@chrlemes This is a valid way; it handles the channels separately and assumes nothing about them. Another option would be to insert all columns into X as if they were all grayscale (just keep track of order, e.g. R, G, B for each column in that order). This may benefit edges or shadows (an orange will be visible in multiple color channels), and I'm guessing it would compress better. The problem with color, however, is that errors might be much more visible than in grayscale.
a possible mistake at 5:28
Why are you converting it into a grayscale image?
Because the code would be needlessly messy if everything were repeated three times to compress each channel of the image. It's an example, not a production-ready implementation.
Cool!
I thought each column of X was supposed to be a single image.
Python is the bees knees.
Who is the single guy to dislike the video ?