Awesome!!! Thank you so much. I'm glad you liked that part of the video - it's my favorite part as well. If you have time, now that you understand what it means to reduce dimensions, you should check out the new and improved version of this video (the new version doesn't do as good a job explaining dimensions - but it does a better job explaining how PCA actually works): th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
Most excellent explanation of PCA I've ever seen (well, at least the first 11 minutes). Massive help in grasping the basics of PCA without mentioning eigenvectors.
Great!!! Love the way you explain from zero. I have been reading and even went to a class trying to understand this concept but this video is the TOP! Thank you!
Whichever video I start, first I hit the like button, because I am fully confident that your video will give me enough understanding of the given title. You are just awesome😁Thank you.
Thank you so much for the clear explanation, which helped me understand the main idea that is mostly buried when reading about PCA. I hope other math concepts can be explained in a similarly simple way.
This was great! I just got my first RNA-seq data (which somebody analyzed with PCA) and now I can understand how it was generated and what it means! Thank you so much Joshua!
Thank you so much! If you have time, you should check out the new and revised version of my PCA video. Believe it or not, it's even better! th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
Really Brilliant... Thank you very very much...I have not found a more simple way of explaining the concept any where..The best resource on understanding the concept.
Carry on please, your videos are some of the most useful I have seen. In my case they make the connection between theoretical explanations and common sense, so I can truly understand the topic and use that method in real-world applications. Once again, great job!
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
My first "dumb" question is why the examples proceed using cells plotted vs genes (cells being dependent variable) rather than the other way (genes plotted vs cell where genes are treated as dependent va). Is this possibly because dimension of genes (i.e. #types) is larger than dimension cells (ie #cells in study)? Maybe it makes no difference in the end, but this point has me stuck...
@@jazznomad The format of the data in this video is consistent with how genomic data is formatted in real-life. I used to have a job working with this data on a day-to-day basis, and a typical file would have 6 columns, representing 6 different mice, and 20,000 rows, representing the 20,000 genes in the mouse genome.
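To make the layout described above concrete, here is a minimal sketch in Python. The gene names, mouse names, and counts are made up for illustration (a real file would have roughly 20,000 rows and 6 columns). It shows the genes-as-rows table and how it can be flipped so that each sample is described by all of its genes.

```python
# Hypothetical toy counts: rows are genes, columns are samples (mice),
# mirroring the file layout described in the comment above.
genes = ["gene_a", "gene_b", "gene_c"]
samples = ["mouse_1", "mouse_2"]
counts_by_gene = [
    [10, 8],   # gene_a measured in mouse_1, mouse_2
    [0, 2],    # gene_b
    [14, 11],  # gene_c
]

# Transpose so that each row is one sample and each column is one gene.
counts_by_sample = [list(column) for column in zip(*counts_by_gene)]

for name, row in zip(samples, counts_by_sample):
    print(name, dict(zip(genes, row)))
```

Either orientation holds the same numbers; the choice only decides what counts as a "point" and what counts as an "axis" when the data are plotted.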
You are the reason I am surviving my masters program. Seriously, thank you so much for this and all your videos. Best explanations on TH-cam, you're leading a generation of bioinformaticians/data analysts into successful futures. Thank you StatQuest!
Thanks and good luck! :)
@@statquest Thank you!
3 years later and helping me survive my masters program!
There will never ever be someone in the whole world like you. Anytime I get stuck with ML basics I end up in your channel, and you never disappoint. It's amazing esp. after looking how 10 internet pages couldn't explain it as clear as you did. Bravo!
Thank you very much. :)
Any book gathering all your knowledge? I'm impatient to hear that this idea is on its way. Thank you for all that you are doing.
It is true
I've been almost crying for five days trying to find a video, paper or book that actually explained to me, step by step, with a clear example, how to interpret a PCA. I am doing geometric morphometrics with insect wings and I didn't know what the PCA was telling me about my data. Now I know and I am so grateful, thank you so much for this! I will recommend your channel to future generations of morphometricians at my lab :D
Thank you very much! If you have time, check out the new and updated version of this video: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
i absolutely love how you threw in the jargon at the end! i learned what i needed and then just put the name to the process rather than being bombarded with fancy words and having to solidify what they meant
Thank you! I'm terrible with jargon - I can't remember it - so I try to focus on the concepts that I can draw. If you're interested in learning a little more about PCA, check out my new and improved video: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
whoa! Have been doing PCA for past 5 years but this is the first time I understood it. Thanks a ton!!!
more at @ th-cam.com/video/a9jdQGybYmE/w-d-xo.html
What you do really well with these videos is start at the beginning. So many technical people just assume prior knowledge, whereas you really don't assume anything. Thank you for this explanation.
Thanks!
This hands down, the best explanation of PCA in the context of RNA-seq data (and data in general) that I have ever seen. I have seen many tutorials fumble around the concepts, or more often hand wave away the details of the PC1/PC2 coordinate calculations as if it were intuitive (it is not!). Thank you so much for clearing it all up, especially at 12:40 where you fill in the gap of my knowledge! THANK YOU! Thank you!
After being a user of youtube for more than 7 years, this is my first comment ever. I just wanted to express how appreciative I am about your efforts and videos..... what a wonderful explanation!!
Just spent 4 hours using this video as lecture notes!! Can't imagine how confusing this would have been in a standard uni lecture! This is an amazing vid!
Glad it helped!
Better than my professor. I had watched my professor's video four times but still hadn't got a clue about what PCA is, but I only watched this video once and now I finally understand what all those fancy words mean and what's the logic behind it. Best video ever, life saver, thanks!
:)
I have to say this is by far the best PCA explanation that I have seen
If every video on youtube were this clear, the world would be a better place
This video is AMAZING. You can never get too granular in explaining this type of stuff since it is so theoretical and important to understand, so I appreciate the hell out of you for making this.
EDIT: Also love the approach of explaining it in an intuitive way then bringing in the terminology at the end. I wish all teachers did this, a vocab word means nothing to me unless I understand its use case.
Thank you very much! :)
I tried different tutorials before, trying to understand PCA, and only got more confused in a certain way. You are the best! Your explanation is just the scratch to the itchy part in my mind, no more no less, simple and clear enough, just perfect! Thanks!
Josh, that was by far the best structured and most comprehensive bit on PCA I have ever seen. Thanks so much!
Hooray! I'm glad you like the video! If you have time, check out the new and improved version - I know it's hard to imagine, but it's even better (although I like how this one explains dimensions): th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
Thank you! I am doing Machine Language and for the life of me I couldn't get my head around PCAs. I came across this video by accident. You made it sound so easy. The lecturers made it so complicated and I was worried about the exams coming up. Much appreciated.
Hooray! We've made it to the end of another video on "PCA - clearly explained". Both the versions (2015 and 2018) are super helpful!
Awesome!!! :)
So helpful after an hour and a half lecture on PCA from my professor that made absolutely no sense and a midterm coming up leaving me feeling hopeless. Thank you so much!
Awesome Video! I love this so much, I love how you start from 1d, 2d, 3d and CLEARLY EXPLAIN IT!!! Thank you so much from a Suffering Biochem PhD Student Trying to Understand PCA in Stats class
Hooray!!! I'm glad the video is helpful. If you have time, check out the new and revised version, which goes a little deeper into how PCA actually works (but it's just as easy to understand): th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
Perfect explanation! Very few people have the art of explaining highly abstract concepts in simple and understandable real-life language and examples, instead of math-style crappy explanations which only math-faculty people are able to understand! Thank you so much Joshua!
Best video explaining PCA on youtube. Thank you Joshua!
I have been hearing a lot about eigenvectors since my undergraduate degree. Never understood it, and I didn't expect to learn much about it. But looking at this video, everything changed and now life has become a lot easier. Thank you so much.
bam!
Best video ever on PCA .Thanks a lot Joshua
I agree, best video on PCA. I had been struggling since this morning, now it's clear. Thank u
Hey Joshua, I have a quick question. Why is it that genes on the fringes of the PC axis get the highest magnitude influence or loading value? Genes that fall on the line in the example you showed with 2 cells seem to be approximately evenly expressed between the 2 cells (Take for example, gene A, 10 vs. 8) and so I dont understand why that should be considered a gene that contributes to high variance between Cell 1 and 2. Or, am I understanding it incorrectly? Do they have high influence because they contribute the most to the length of the line drawn across the varying genes, or PC? If so, it seems like I'm not getting the full picture here haha.
Agreed!
Dear sir, I have to do a metabolomics analysis of 1D H'NMR data. How should I do it? I tried some software but am facing a lot of problems, as I am totally new to this field. If you could suggest a better option, it would be very helpful for me.
Sir, thank you so much! I have read dozens of articles on PCA and got no understanding whatsoever. Now I have watched your video and it all became clear.
Thanks! If you like this video, check out the new and improved version: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
best video tutorial on PCA.. thanks a lot for sharing.. now the basics of PCA are clear to me.
The most clear explanation for PCA on TH-cam even for a chemometrician. Thanks!
The Movie and TV 3D to 2D example was the most mind-blowing and catchiest explanation of something really complicated I've ever listened to. Thank you very much for that!!
P.S. The second most mind-blowing explanation was yours of the kernel trick in your SVM video!
I just recognize, both explanations deal with dimensionality reduction... maybe this is just some mind blowing thing! :)
BAM! :)
My goodness. Simply Brilliant. Simply Brilliant. The video is long but ABSOLUTELY WORTH WATCHING ALL THE WAY THROUGH. THANK YOU!
If you have some familiarity with math and statistics, play the video at 'speed 1.25' and it will be the best video ever.
Wow I learned more about PCA in that 20 minute video than I did all semester in my stats course. Thank you so much for this great explanation!
THANK YOU for dissipating the darkness, at last!!!
Teaching the topic without the scary words definitely helps. Thank you
Glad it was helpful!
Absolute gem... thanks for your time Josh for all of us
Glad you like the video! If you have time, you should check out the new and revised version of this video. It's just as easy to understand, but it does a better job explaining how PCA works: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
OMG! Simplicity at its best. Astonishing. What an explanation it was. Hats off to you.
You have just saved a chemistry student!!! Thank you so much😃
Happy to help!
The ultimate video for the PCA. Thanks a lot dear Joshua for this wonderful and illustrative video. I am requesting you to post similar videos on CVA, CCA, LDA......I am speechless...very happy to say that finally I understood what is PCA and how is PCA..
Sir you should start posting tutorials on deep learning. Much needed going by this quality of explanation skills.
Hooray! I'm glad you like the video! I've started to do a few videos on machine learning topics and I hope to do more in the future. If you have time, check out the new and revised version of this PCA video. It's just as easy to understand, but goes into more detail on how PCA actually works: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
I've learned about PCA a couple of times but never quite got what some of the concepts in it were in reference to... When you said that the array of the "influence weights" is called an eigenvector, i felt like my mind exploded since it finally all clicked. thanks again man!
This is so amazing. Has to be most informative video about PCA. The dimensions blew me away, beautiful!
Thank you! If you like this video, you might also like the updated version: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
This right here, just saved me 2 hours before an exam, (used for BCI competition IV). Thank you.
Good luck! If you have more time, check out the updated version: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
By far the best video on PCA. Thank you sir :)
Dear Joshua, I don't usually leave comments on TH-cam videos at all, but I have to for your video(s). Definitely the best PCA explanation ever, hands down. Thank you very much and please keep up your great work!!
Hooray Joshua! very well explained! I'm subscribing.
6 years on still one of the best explanation, Thank you @StatQuest
Thank you! :)
Loved it! Thank you.
I've been trying to figure out this topic for two weeks. The professor and two textbooks were no help. But this video is perfect. Thanks so much!
I'm glad you like this StatQuest! If you have time, you should check out the new and improved version of this video: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
This old video feels more lucid than the later video published. Thanks. Regards from India, sir!
Thank you!
the one talked here takes cells as dimensions for calculation of the first two components.
What about taking genes as dimensions for calculation of the first two components. In this case, we can draw cells directly in the plot without further calculation using the loadings (influence score).
Which PCA plot is better?
+Joshua Starmer Thank you for the great video!
But I have the same confusion. To me the variation is in the expression of genes, so the question is which single gene has the highest expression variability among multiple cells (meaning it is a better predictor to discriminate different cell types). It is different from asking which single cell has the highest variability among multiple gene expressions. For example, for the hypothetical case that you describe at 6:30, it is cell 1 that has high variability among its different genes' expression levels, and cell 2 expresses similar levels of different genes. So it sounds to me as if it is the variability among multiple genes within each cell that determines the principal component ranking, not the variability of a single gene among multiple cells. Maybe I am missing something here, but this seems counter-intuitive to me.
+Joshua Starmer, thank you for the video. It was indeed intuitive. Except that I also had the same confusion as these two here. When you started discussing the dimensions of the data set, you considered each cell as a dimension and each gene as an observation unit/data point. However, when you used PCA to reduce the dimensionality and rotate the data, you shifted and switched the two - the genes now are the dimensions and the cells are the units of observations/data points. From my understanding, PCA only rotates the data and potentially reduces dimensionality, it should not change what the points represent. Thank you!
Same confusion for me! Thanks for the video, but I got lost in the middle of it unfortunately. Shouldn't we use the genes as features of the cells (namely, the data points)? In other words, shouldn't each cell be represented by N genes and not the other way around?
I have the same confusion. I thought I understood PCA, now I am not sure.
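One way to look at the cells-versus-genes question raised in the comments above is that, for a single centered table, the two views are tied together. The sketch below is not the video's calculation: the counts are invented, the table is centered per gene, and the leading direction is found with plain power iteration. It checks that the per-cell coordinates obtained from gene weights point the same way (up to sign) as the leading direction computed with cells as the axes.

```python
import math

# Hypothetical toy matrix: rows are cells, columns are genes.
counts = [
    [10.0, 1.0, 5.0],
    [8.0, 2.0, 5.0],
    [2.0, 9.0, 5.0],
    [1.0, 10.0, 5.0],
]

def center_columns(rows):
    """Subtract each column's mean (here: each gene's mean across cells)."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]

def transpose(rows):
    return [list(col) for col in zip(*rows)]

def multiply(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def leading_eigenvector(matrix, iterations=100):
    """Power iteration; assumes the start vector is not orthogonal to the answer."""
    vector = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    return vector

x = center_columns(counts)
xt = transpose(x)

# Genes as axes: one weight per gene (same direction as the covariance eigenvector).
gene_weights = leading_eigenvector(multiply(xt, x))
# Cells as axes: one value per cell.
cell_direction = leading_eigenvector(multiply(x, xt))

# Coordinates of each cell along the gene-weight direction, rescaled to length 1.
projected = [sum(v * w for v, w in zip(row, gene_weights)) for row in x]
length = math.sqrt(sum(p * p for p in projected))
projected_unit = [p / length for p in projected]

# An absolute dot product of 1 means the two vectors agree up to sign.
agreement = abs(sum(a * b for a, b in zip(projected_unit, cell_direction)))
print(round(agreement, 6))
```

The caveat is the centering step: centering per gene and centering per cell give different tables, so the equivalence only holds once one centered table has been fixed.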
I am a very passionate follower of these classes. Thank you very much Dr. Josh
Thanks!
holy shit this video is amazing
+therealrictuar its viewership is hindered by its title and tag. searching "principle component analysis statquest" doesn't lead to this video in the first several results
wow i love revisiting this comment years later after statquest blew up. its so interesting how full circle my statistical learning journey has become. so glad statquest made so many more vids.
I just have to tell you, you save our lives. Whenever there is a hard concept, I can always find help here. I appreciate it.
Thank you! :)
well done mate!
If you like this video, you should check out the new, improved version of it: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
I don't think an explanation can get better than this.
Excellent
Amazing how for some subjects it looks like no one really understands how it works. You clearly do and that is why the video is good.
great video, but this is only for 2 genes, what do we do with PC3 and on??
If you have time, you should check out the updated and revised version of this video. It should clear up the questions you have. If not, please post more questions: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
I don't think I've ever understood PCA the way I just did. THANKS A LOT.
Do you have any videos on Pareto analysis?
"It will keep your head from exploding"
-Double BAM
:)
I've spent the better part of six hours trying to teach myself PCA for an assignment and completely failed to understand what the SPSS output I had actually meant until just now. Thank you so fucking much, I was really struggling to get this final piece of my assignment finished.
I didn't get the part about how to find the most influencing gene from checking the PC1 scores. Based on your video, the PC1 or PC2 scores are sum of counts of genes multiplied by their influence factor. I don't know how to find out a specific gene in this case.
where can I find the scores (influence factors) of a certain gene? I usually use an R script developed by my colleague to generate PCA plot for my RNA-seq data. But I don't know how to identify which genes are important and can give me best separation of my samples. All I get is calculated PC1, PC2, PC3 & PC4 values of each sample.
I sent you my email address through Google+ earlier. Thanks!
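For the questions above about where the per-gene influence factors come from, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the counts are made up, the counts are centered per gene before weighting, and the weights are found with power iteration on the covariance matrix rather than whatever a given script does. If an R script happens to use `prcomp`, the equivalent per-gene weights are stored in its `rotation` element and the per-sample values in `x`.

```python
import math

# Hypothetical toy data: rows are samples (cells), columns are genes.
genes = ["gene_a", "gene_b", "gene_c"]
counts = [
    [10.0, 1.0, 5.0],
    [8.0, 2.0, 5.0],
    [2.0, 9.0, 5.0],
    [1.0, 10.0, 5.0],
]

def center_columns(rows):
    """Subtract each gene's mean so that variation is measured around zero."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Gene-by-gene sample covariance of already centered data."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols] for ci in cols]

def leading_eigenvector(matrix, iterations=100):
    """Power iteration; assumes the start vector is not orthogonal to the answer."""
    vector = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    return vector

centered = center_columns(counts)
loadings = leading_eigenvector(covariance_matrix(centered))  # one weight per gene

# Genes with the largest absolute weight contribute most to PC1.
ranked = sorted(zip(genes, loadings), key=lambda pair: abs(pair[1]), reverse=True)

# Each sample's PC1 score is the weighted sum of its centered counts.
scores = [sum(v * w for v, w in zip(row, loadings)) for row in centered]

for gene, loading in ranked:
    print(f"{gene}: {loading:+.3f}")
print("PC1 scores:", [round(s, 2) for s in scores])
```

The sign of the weights is arbitrary, so the ranking uses absolute values. A gene that barely changes between samples (here `gene_c`) ends up with a weight near zero.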
Hello Joshua,
Thank you very much for your helpful video. I learned a lot from this, and I'd like to apply it to my study. Could you please also send me your R script that does all the PCA stuff? I would highly appreciate your help.
Best regards
Amber
Thank you in advance
Dear Joshua,
Could you teach me how to assign our samples to particular groups (by adding different colors or drawing circles; either way is fine for me)?
Thank you very much
Thanks again for the wonderful video about PCA.
For those who have the same confusion as me:
I tried to treat the Cell columns as features of a Gene, but it's actually the other way around.
I guess it would be helpful if the table was drawn with Cells as indexes and Genes as columns so that we know Genes are features (components) of any Cell. A data point is a cell with Genes as its vector.
Here is my understanding regarding PCA in general:
Intuitively, we plot a PCA to look for correlations: each data point gets a score that is the influence-weighted sum of all its components, and we compare those scores across data points.
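To make the orientation point concrete, here is a small sketch under stated assumptions: NumPy, and a made-up 3-gene by 6-cell table (the numbers are purely illustrative, not the video's data). It flips the table so each cell is one data point, then checks that a cell's PC1 score really is the sum of its centered gene values times the gene loadings.

```python
import numpy as np

# Hypothetical table laid out with genes as rows and cells as columns.
genes_by_cells = np.array([
    [10.0, 11.0, 8.0, 3.0, 2.0, 1.0],   # gene 1
    [6.0, 4.0, 5.0, 3.0, 2.8, 1.0],     # gene 2
    [12.0, 9.0, 10.0, 2.5, 1.3, 2.0],   # gene 3
])

# Transpose so each row is a cell (data point) and each column a gene (feature).
cells_by_genes = genes_by_cells.T
centered = cells_by_genes - cells_by_genes.mean(axis=0)

# Rows of vt hold the loadings (influence of each gene) for PC1, PC2, PC3.
_, _, vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ vt.T  # one row of PC scores per cell

# PC1 score of the first cell, written out as a weighted sum over genes.
pc1_cell1_by_hand = sum(centered[0, g] * vt[0, g] for g in range(3))
print(scores[0, 0], pc1_cell1_by_hand)  # the two values should match
```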
I have an updated version of this video that is easier to understand. See: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
@@statquest thank you, Josh.
This Is like heroin to my ears.
Great video!
It's really the best one. I have been searching for a long time for a good lesson about PCA, and I finally found it. I strongly recommend it.
Awesome, thank you!
Thank you for explaining PCA in such a simplified manner for biologists.
Glad you liked the video! If you have time, you should check out the new and revised version of this video: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
This is an excellent video describing PCA. Lots of others have said so, but I thought it was worth commenting again because it really did help me a ton. Thanks for taking the time to make it!
The TV / Movie analogy was a brilliant addition to this video. I've never heard anyone liken PCA to something like that but it will now forever be ingrained into my mind (which is an amazing achievement on your part really!). Really loved the teaching style.
Awesome!!! Thank you so much. I'm glad you liked that part of the video - it's my favorite part as well. If you have time, now that you understand what it means to reduce dimensions, you should check out the new and improved version of this video (the new version doesn't do as good a job explaining dimensions - but it does a better job explaining how PCA actually works): th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
Thank you! I will check it out.
Most excellent explanation of PCA I've ever seen (well, at least the first 11 minutes). A massive help for grasping the basics of PCA without mentioning eigenvectors.
I have not come across an easier explanation of PCA than this. Thank you very much.
Thanks! :)
Definitely the BEST 'Clearly Explained' video on PCA!!!
Great!!! Love the way you explain from zero. I have been reading and even went to a class trying to understand this concept but this video is the TOP! Thank you!
Thanks! If you have time, check out the updated version of this video: th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
There can't be any simpler explanation than this video.
Thank you, Josh
i really liked the way you used our elementary concept of plotting a graph to explain PCA !! it helped a lot to get the concept. Thank you (Y)
You're the best virtual teacher i've ever had, I felt smart for 20 minutes (lol). Regards from Mexico!
Whichever video I start, first I hit the like button, because I am fully confident that your video will give me enough understanding of the given title. You are just awesome😁Thank you.
Thank you very much! :)
2017 and it's still the best pca video
Thank you so much for the clear explanation. It helped me understand the main idea, which is mostly buried when reading about PCA. I hope other math concepts can be explained in a similarly simple way.
This was great! I just got my first RNA-seq data (which somebody analyzed with PCA) and now I can understand how it was generated and what it means! Thank you so much Joshua!
I can still learn even though I have to put up with the BAM, double BAM, triple BAM from time to time. Thank you. And I subscribed.
This is the ONLY video that made sense. I now understand what PCA is. Thank you so much.
This is the most awesome explanation of PCA I have ever seen.....I struggled with it for months !!! You are phenomenal :)
Learnt and grabbed something to incorporate into my seminar presentation. Before this video I had no clue. Thanks.
Bam! :)
Best video explanation on TH-cam... simple to catch the concept. Thanks a lot Joshua
Thank you so much! If you have time, you should check out the new and revised version of my PCA video. Believe it or not, it's even better! th-cam.com/video/FgakZw6K1QQ/w-d-xo.html
@@statquest ok sir
thank you for making a clear PCA video without all the jargon!!!
came here after listening to professor's lecture. Understood better here thank you.
Bam!
i think i just found a gem in youtube. wow. your statquest videos are awesome.
awesome raised to infinity Sir.. I'm doing masters in applied mathematics & this is our topic for seminar & it's really very helpful.. Thanks a ton!!!
Really Brilliant... Thank you very very much... I have not found a simpler way of explaining the concept anywhere. The best resource for understanding the concept.
Thaaank yoouu..I'm a 1st yr comp bio grad student! That was very intuitive and so relatable.
Some dimensions are more important than others: PCA. Great video.
Thanks, I'll
Carry on please, your videos are some of the most useful I have seen. In my case they make the connection between theoretical explanations and common sense so I can truly understand the topic and use the method in real-world applications. Once again, great job!
Awesome! I was having trouble with PCA. Your comprehensive explanation helped me to improve my understanding a lot!
one of the simplest explanations of PCA.. glad to have watched this video!! Thanks a lot Joshua.. looking fwd to more videos!!
Best explanation of PCA I have ever seen ! Thank you. Now it is very clear to me.
Hooray! :)
This video is amazing! Even if you have no idea about the topic, it explains everything very gently and makes you understand what PCA is!! Thank you so much
You're welcome!
this is the best video I have watched regarding PCA
This is the best video I have found to explain PCA. Please do one for PLS-DA!
Thanks!
thanks man, you saved my brain, I was close to giving up when I found this video
What a creative way to teach statistical concepts behind PCA !!.. So glad to know the meaning behind mathematical steps :)
You rock!!
This is a better PCA video than the others and later ones you have made.
That's why I left it up. It does some things very well. I'm glad you found it helpful! :)