SVD application in Image Processing. It's beautiful and unimaginable
This is really brilliant. Amazing work professor!
Cool! I was thinking about that all week: why convolutional neural network instead of SVD? Now I have the answer! Thank you again, Professor!
Couldn't you do a Fourier transform to lose the spatial misalignment?
Face landmark extraction uses such an approach as well. But I suspect that a deep network with lots of hidden layers will learn such transformations on its own, instead of having them specified explicitly?
Image processing has techniques to address related issues, such as the Hough transform.
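The Fourier idea in this thread can be checked on a toy case. The sketch below is an illustration, not something from the video: it assumes a random 32x32 array as a stand-in for an image and uses a circular shift (`np.roll`) as the misalignment. Under those assumptions the magnitude of the 2D FFT is unchanged by the shift, while the raw pixels are not.

```python
import numpy as np

# Hypothetical stand-in for an image: a random 32x32 array.
rng = np.random.default_rng(0)
image = rng.random((32, 32))

# Misalign it with a circular shift (5 pixels down, 3 pixels left).
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))

# A circular shift only changes the phase of the Fourier coefficients,
# so the magnitudes of the two spectra should agree.
mag_original = np.abs(np.fft.fft2(image))
mag_shifted = np.abs(np.fft.fft2(shifted))

pixel_gap = np.max(np.abs(image - shifted))
magnitude_gap = np.max(np.abs(mag_original - mag_shifted))

print("max pixel difference:    ", pixel_gap)
print("max magnitude difference:", magnitude_gap)
```

This only covers circular translation. Rotation and scale still change the magnitude spectrum, and dropping the phase throws away information, so it is at best a partial answer to the alignment problem.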
Hi, Mr. Brunton.
This video is about SVD, but I could not find it in the SVD playlist. I don't really know if it belongs to that playlist, but I think that it should be there.
Good catch, thanks! Just added it to the playlist.
OK, we need translation, rotation and scale adjustments, but don't we also need a transformation that accounts for the effect of the focal length?
Thanks.
So we can't apply these dimensionality reduction techniques to sequential data like videos?
If I have a clip of 32 frames and want to find the key/important frames among them, can't we use SVD for this task?
@Steve:
How do we make sense of these alignment caveats in your example of image compression from earlier? I mean, the different columns of the image data matrix could have no (humanly) agreed-upon meaning for what their elements stand for.
To rephrase the question, why/how does SVD manage to see enough correlations between the columns that it can get away with representing all of them with just a truncated set
Could this issue be mitigated by subtracting the mean of the data matrix X and thus centering the data? Thank you in advance for sharing your time with us doing these extremely helpful tutorials. Greetings from Spain!
Hi there,
I believe that normalizing the data (0 mean, 1 variance) does not solve the problem. Normalization makes the mean of pixel (i, j) equal to 0. It does not change the location of the face in the image, e.g. the mouth is still in the rectangle (i, j, k, l) (top, left, bottom, right). Given an unaligned dataset (e.g. photos on Facebook), you need to locate the bounding box of the face and crop to that box. The best tool for this is now convolutional nets, i.e. you need deep learning.
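A small numerical sketch of the point above, under made-up assumptions: a 1D Gaussian bump stands in for a face, the "aligned" dataset varies only in brightness, and the "misaligned" dataset contains the same bump at 32 different positions. The helper `rank_for_energy` is a hypothetical name introduced here. It counts how many singular values are needed to capture 99% of the energy.

```python
import numpy as np

def rank_for_energy(X, energy=0.99):
    """Number of singular values needed to capture the given energy fraction."""
    s = np.linalg.svd(X, compute_uv=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(cumulative, energy) + 1)

# Toy "face": a Gaussian bump on a 64-pixel line (an assumption for illustration).
x = np.arange(64)
template = np.exp(-0.5 * ((x - 32) / 3.0) ** 2)

# Aligned dataset: same position in every column, only the brightness varies.
rng = np.random.default_rng(0)
amplitudes = rng.uniform(0.5, 1.5, size=32)
aligned = np.outer(template, amplitudes)

# Misaligned dataset: same bump, shifted to a different position per column.
shifts = np.arange(-16, 16)
misaligned = np.column_stack([np.roll(template, int(s)) for s in shifts])

# Centering: subtract the mean column (the "average face") from every column.
centered = misaligned - misaligned.mean(axis=1, keepdims=True)

aligned_rank = rank_for_energy(aligned)
misaligned_rank = rank_for_energy(misaligned)
centered_rank = rank_for_energy(centered)

print("aligned:             ", aligned_rank)
print("misaligned:          ", misaligned_rank)
print("misaligned, centered:", centered_rank)
```

In this toy setup the aligned data is exactly rank one, while the shifted data needs more than one mode both before and after mean subtraction. Centering removes the average column but leaves each bump where it was.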
Hi Professor Brunton. I have been attempting to apply PCA/SVD to some time-varying data, and your video confirmed the issues I was having with time alignment (or lack of!). You mention that this is a current field of study, i.e. attempting to adapt linear algebra methods such as this so that they become invariant to rotations etc. Can you point me in the direction of any good papers that cover such research? (I've been doing a bit of googling but nothing is coming up for me.) Many thanks, Rory
Shameless plug, and the reference within the AAAI paper
th-cam.com/video/fDYPAj9WAbk/w-d-xo.html
Comment for algo