Man, I'm not used to commenting on YouTube, but I love your videos
Just as I was going to open the "next video" I noticed this was posted 19 hours ago
Very good problem statement and justification for PCA based on a realistic data set.
Also, this looks a lot like LSA/LSI, where you have a sparse matrix of words and documents that you factorize with SVD, except here the documents are short strings and the words are subword n-grams.
I love the idea of densifying a sparse matrix, but I wonder if PCA is the best method. PCA produces principal components that preserve the most variance. Could you, or should you, use another matrix factorization, like non-negative matrix factorization or truncated SVD?
Oh TruncatedSVD would also totally work!
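For anyone who wants to try the alternatives mentioned above, here is a rough sketch using scikit-learn. The `names` list is made up for illustration (it is not the data set from the video), and the n-gram range and the number of components are arbitrary assumptions. `TruncatedSVD` does not center the data, so it can work on the sparse matrix directly, and `NMF` needs non-negative input, which n-gram counts satisfy.

```python
from sklearn.decomposition import NMF, TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical short strings standing in for the real data set.
names = [
    "john smith", "jon smith", "john smyth",
    "mary jones", "marie jones", "mary johns",
]

# Character n-grams (within word boundaries) give a sparse
# string-by-ngram count matrix, analogous to a document-term matrix in LSA.
vectorizer = CountVectorizer(analyzer="char_wb", ngram_range=(2, 3))
X = vectorizer.fit_transform(names)  # scipy sparse matrix

# Truncated SVD: no centering step, so the input can stay sparse.
svd = TruncatedSVD(n_components=2, random_state=0)
X_svd = svd.fit_transform(X)

# NMF: both factors are constrained to be non-negative.
nmf = NMF(n_components=2, init="nndsvda", random_state=0, max_iter=500)
X_nmf = nmf.fit_transform(X)

print(X.shape, X_svd.shape, X_nmf.shape)
```

Whether either of these beats PCA probably depends on the downstream task, so it seems worth comparing them on the real data rather than trusting this toy example.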