Yes - it depends on how much variation there is in your data. If there is not much variation, then you don't need many observations. If there is a lot of variation, then you need a lot of observations.
First time hearing a female voice on your channel, and it's hilarious. Anyway, thanks for all of your videos, they are helping me survive my statistics course
Hi, one doubt. In practice, as you mentioned, the smallest p-value will have a high correlation, I agree. However, I'm confused if I go by the theoretical explanation of the p-value. As per my understanding from your p-value video, the p-value is the sum of the probabilities of 1. the chosen rare event occurring, 2. a similar rare event occurring, 3. any other rare event occurring. If this is the case, the p-value for a correlated value shouldn't be high, because the p-value tells us the probability of this event occurring, i.e. the event of a higher correlation. So it will tell us that there is a high probability that such a correlation will occur. I'm new to stats, so please bear with me if my understanding is wrong.
It's not true that a small p-value = high correlation. As illustrated in this video, high correlation is simply a function of how well a line fits the data, and a line fits any 2 random data points perfectly, and thus, will have the highest correlation, even though the points are random, and thus, will have a p-value of 1. To learn more about p-values, see: th-cam.com/video/vemZtEM63GY/w-d-xo.html and th-cam.com/video/nk2CQITm_eo/w-d-xo.html
Awesome video again! But just a question about 15:07 - 15:13, regarding "When the data all fall on a straight line with a positive or negative slope, then the covariance and the product of the square roots of the variance terms are the same and the division gives us 1 or -1, depending on the slope", I don't think I fully get it intuitively. So how could we know the absolute values of the numerator and denominator are the same without calculation?
Unfortunately the mathematics that show why correlation is limited to a maximum value of 1 and a minimum value of -1 are quite complicated, which is why I glossed over it in the video.
Hi Josh, as always, thank you for your great videos! Would you consider making a video to explain the relationship between correlation and R-squared? I've watched all the videos about these two terms, but still cannot figure out the relationship.
Presumably you want something other than, "take your correlation value, r, and square it, and that's r-squared". More like, "how does the square of this equation get transformed into this other equation"?
@@statquest So how? I'm also curious about how/why correlation value^2 = r-squared. The equations are so different. I'd appreciate it if you could kindly explain that! Thank you, Josh!
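One way to convince yourself that squaring the correlation gives R-squared is to check it numerically. The sketch below is not from the video: the data are made up, and the helper names `pearson_r` and `r_squared` are hypothetical. It fits a least-squares straight line and compares the two numbers.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation: covariance / (sd(x) * sd(y)).

    The 1/(n-1) factors cancel, so plain sums of squares are enough.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared(xs, ys):
    """R^2 of the least-squares line: 1 - SS(around the line) / SS(around the mean)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_fit = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_mean = sum((y - my) ** 2 for y in ys)
    return 1 - ss_fit / ss_mean

# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.4, 3.8, 5.1, 5.7]

r = pearson_r(xs, ys)
print(r ** 2, r_squared(xs, ys))  # the two values agree up to rounding error
```

The algebra behind the agreement is short: with slope = Sxy/Sxx, the sum of squares around the fitted line is Syy - Sxy^2/Sxx, so R^2 = Sxy^2 / (Sxx * Syy), which is exactly r^2. This only holds for a straight line fit by least squares with an intercept.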
The p-value is the probability that random noise could generate a relationship as strong or stronger than what you observed. A small p-value suggests that it is unlikely that random noise created the data that you observed. Thus, this gives us more confidence that our model is correct. Does that make sense?
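That definition can be turned almost literally into code with a permutation test. This is a swapped-in technique for illustration, not how the p-values in the video were calculated: shuffling the y-values breaks any real relationship, so the fraction of shuffles that give a correlation at least as strong as the observed one estimates the p-value. The function names, shuffle count, and data below are all assumptions.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Estimate how often unrelated (shuffled) data gives |r| >= the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Small tolerance so rounding error does not hide exact ties.
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add 1 to both counts so the observed arrangement itself is included.
    return (hits + 1) / (n_shuffles + 1)

# Two points: every arrangement lies on a line, so noise always "wins".
two_points = permutation_p_value([1.0, 2.0], [3.0, 5.0])

# Twenty points on a line: shuffles essentially never look this good.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
twenty_points = permutation_p_value(xs, ys)

print(two_points, twenty_points)
```

With only two points the estimate is 1, matching the video's point that a perfect correlation from two points tells us nothing. With twenty points on a line the estimate is tiny, limited only by the number of shuffles.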
Hi Josh, could you explain how you get the p-value? I have split this into 2 sub-questions. - What is the input used to compute the p-value? - Which probability density function are you using to calculate p? Are you using Pearson's chi-square?
There are a lot of ways to calculate p-values for Pearson's correlation coefficient. For details, see: en.wikipedia.org/wiki/Pearson_correlation_coefficient
@@statquest I can't believe u replied. I am pursuing an MS in Data Science. Your work really gives me a better understanding. I will pay ur tuition fee when I get a job. ✌🤟👆👍😎
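One of those ways, given here as a standard textbook result rather than anything shown in the video: under the null hypothesis of no correlation (and assuming the data are roughly bivariate normal), the statistic below follows a t-distribution with n - 2 degrees of freedom, where r is the sample correlation and n is the number of observations.

```latex
t = r \sqrt{\frac{n - 2}{1 - r^{2}}}, \qquad \text{degrees of freedom} = n - 2
```

The p-value is the two-sided tail area beyond |t|. So the inputs are just r and n, and because n sits under the square root, the same r gives a larger |t| (and a smaller p-value) as the sample grows.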
Thanks for your video! So covariance is just used to calculate correlations? What is the reason for having the term covariance if it is just being used as a stepping stone for calculating correlation?
It's used in other contexts as well (like in PCA or in longitudinal analysis). It's a useful intermediate step in a lot of ways, so it's good to give it its own name.
@@statquest Thanks, but why not simply use correlation as a stepping stone for other calculations, since it tells us whether the slope is +, -, or neutral (like covariance) as well as the closeness to the line?
Thank you very much for every video, you are awesome. At 16:26 I think there is a wording mistake; instead, that means that there is a 3% chance that random data could "not" produce a similarly strong relationship or stronger, am I right?
Correlation values of 1 and -1 represent situations when the data is all on the same straight line. However, when the data is not all on a straight line, then the correlation values get closer to 0.
You mention R^2 as a more intuitive / useful method for understanding goodness of fit than correlation, but doesn't R^2 require the assumption that the model is linear (so it can't be used for logistic regression and other non-linear models)? Does correlation have this same requirement too?
NOTE: Although I do not mention it by name in the video, this StatQuest covers Pearson's Correlation Coefficient. Unfortunately, this did not occur to me until after I posted the video, otherwise I would have mentioned it at least 20 times...so maybe it's better the way it turned out. ;)
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
Hi Josh, thanks a lot for the wonderful work. It helps learners a lot. My query: at 9:08, it is mentioned that p = 2.2 * 10^-16 means a low probability that a randomly selected point has a similarly strong relationship. Does that mean the hypothesis or prediction (the line through the data points) of the trend cannot generalize to a new, randomly selected data point? Is that what a low p means? At the same time, a low p means a high confidence level in the trend, which implies that a randomly selected point will have a similarly strong relationship? Please let me know if I am missing some point.
@@sunilkumarsamji8507 No. The p-value tells us the probability that random noise could create the relationship we observed, or a stronger relationship. When you have a small p-value, that means it is unlikely that random noise alone would create a relationship this strong. This means we can have confidence that new observations will behave similarly to what we have seen before, rather than completely randomly. Does that make sense?
@@statquest yes, that is the reason we keep the threshold to only 5% or 0.05.
@@statquest Thanks for your amazing videos. I am watching them all to try to catch up in statistics for my master's degree in geology.
In this video, I am unsure how you calculated the p-values. Can you please explain a little?
@@lorisbach9905 Unfortunately, I don't have a video that explains the p-values for Pearson's correlation coefficient in detail. However, I do have a video that explains the p-value for R-squared, which is very, very closely related (and is actually much more useful) here: th-cam.com/video/nk2CQITm_eo/w-d-xo.html
I am crying rn, Statistics was the one thing that scared me in high school, never studied it in engineering & after watching tons of videos & losing hope. I finally found your channel.
I am finally understanding bits and bytes of statistics & I owe everything to this beautiful pedagogy
Infinite BAM
Hooray! I'm glad my videos are helpful. :)
I am so thankful to you!!! I tried learning statistics multiple times in my life and never succeeded with any source. I discovered your StatQuests about a week ago and I already feel so comfortable with many concepts in statistics! Huge thanks.
That's awesome! I'm glad the videos are helpful. :)
My days spent on statistics before knowing statquest were so wasted
Wow! I'm glad you like my videos! :)
I can relate
Same
Same
You are a genius in pedagogy.
Thank you! :)
100 % agree! I love StatQuest with Josh Starmer!! ♥
@@anitapallenberg80 Me too. A lot
Simple, easy to understand.
As soon as I started the video, the differences between r-square, covariance and correlation were lingering in my mind. Glad you cleared them all!!
Glad it was helpful!
I just watched the Covariance and Correlation videos back to back. Very well put together and really easy to follow
Thank you! :)
Very much appreciate the crawl, walk, run approach with emphasis on conceptual understanding
Thank you!
Ohhh man!!! I'm instantly falling in love with this channel, definitely the best sense of humor to learn machine learning.
Thanks! :)
I've added this channel's videos to my Anki cards and every time I review them I get even deeper insights. Well done, StatQuest
bam! :)
All My Life I have been looking out for you, glad that I found you... BAM!!!
Hooray! :)
Bam Bam BAM...
Eventually, I've fallen in love with your BAMs :)
Addictive BAMs and gorgeously simple videos!
Thanks a lot!
Thanks!
I have yet to get into most of these concepts in my statistics major, but I am so thankful to have these bite-sized informational videos with lots of visual explanations to explain each concept so I can start practicing and studying machine learning early. Thank you so much for every single video you put out. Truly a blessing.
Thank you very much! :)
I find the best and non-boring stats explanations in this channel.
BAM! :)
This is why when drawing trend lines on stock charts they say you need at least 3 points/touches and not 2. Very helpful video!
Thanks!
Thank you! You actually help me to understand many basic concepts in a clear and easy-acceptable way, you are so smart and kind-hearted.
Thank you very much! :)
You just explained this better than I've ever heard. I'm a PhD student (who for some reason wasn't given a decent stats course during his master's degree in robotics engineering). Needless to say, statistics are good for science.
Thank you! :)
I'm studying for my exam in two days with your videos! You are awesome, man, keep going! Thank you!
Best of luck!
Thanks!
Great summary at 9:00
Correlation strength has nothing to do with slope, but with how close the points are to the line. Can have a correlation of 1 with a large slope or a small slope as long as the points lie on a line.
14:00 equation cov(x,y) in previous video
bam!
Appreciative bäm from Germany.
That's awesome! I'm glad Bam has an umlaut in German. ;) That makes it twice as cool. TWICE BÄM!
I'm very grateful for all of your videos. I want to support you, but I am a student in a 3rd world country. Once I'm capable enough, I'll surely contribute to this great project! Thank you
Thank you very much! BAM! :)
Dear Josh, This video made my endless nights trying to grasp on this topic.
:)
These are the best videos, which explain the concepts in a simple way. Thanks for making these videos.
Please upload AI and deep learning videos.
Your videos are way better than most of the paid courses.
:)
You have a knack for teaching... this was an amazing video, thank you!!
Thank you! :)
The best intro on correlation, thank you!
Thank you! :)
I cant believe how all your videos are so perfect !
Wow, thank you!
Stat quest is the best ..
Thanks! :)
Great rhyme!
Josh you are just a genius of Stat explanations, thank you.
Thank you very much! :)
Your videos make it really easy to understand (even though my English is not really strong, I can still understand almost all of them). Thank you from Thailand!
Hooray! I'm glad you like my videos. :)
Big thanks! I couldn't get any intuition from my school lecture, and it's lucky for me to find this video a day before my exam for this!
Good luck on your exam! :)
how to obtain the p-value from this data?
@@minhtoto1542 Had the same question. Found this video helpful: th-cam.com/video/8Aw45HN5lnA/w-d-xo.html
You might be referring to a t-test for slope. You would need to calculate a sample regression line using the data and then obtain a p value by performing a test on the data with some null hypothesis.
Very well explained. I like that you give lots of examples and answer many of the possible questions in advance. Thanks a lot!
Thank you very much! :)
You are doing a great job enabling us to learn many super tough concepts relatively easily, and free of cost too... thanks
Thank you very much! :)
How good you are at this. I tried really hard to understand what this is when I was in university, but failed, because there was no explanation of why we need it. Only the words that it is "how x is related to y"... I figured out what it actually is only 7 years later... Thanks a lot, man
Happy to help! :)
If anyone finds a better teacher than this guy on you tube, do let me know 😎😎
Bam....I started to think that statistics can be fun....Huge thanks from Korea
Hooray!!!
@Josh, it is great you actually put the text on the screen, I cannot play sound but I can still follow closely what you are saying. Great videos, I hope you will later dive into more advanced topics in time series analysis (unit roots, ARIMA, GARCH, etc). Pls keep it up!
I'm glad you like my style. :)
Guys like this help make the study world a better place!
Thank you!
I've learned so much from this channel. Thanks, Josh.
Awesome, thank you!
Extremely helpful and clear with good examples and explanation! Wonderful, thank you!
BAM!!!
Thanks!
Thank you from Indonesia, I love your videos!
Thank you! :)
Josh, you explain in such a way that even layman can understand easily.
A big shout out to all the hard work you put in for making these videos.👏👏
Thank you very much!!! :)
Josh, you're super great, man. I really enjoy listening to you.
Thank you!
Soooo thankful to have found this video. Why did it seem so hard to understand before?!
bam! :)
As a graduate level I-O Psychology student.... thank you... I watched the summary first and then went back to watch the entire video
bam!
Hi, great video. Can you please provide additional guidance on the following:
a. How do you quantitatively determine the P-value for a correlation?
b. What's the difference, both formulaically and conceptually between R2, Correlation, and Beta/coefficient in a regression?
For details on p-values and linear regression, see: th-cam.com/video/nk2CQITm_eo/w-d-xo.html
Thanks for the video.
And please make next video series on hypothesis testing (z test, t test, anova, chi square)
That is right!!!
If you want to have a super deep understanding on t-tests and ANOVA, you should check out my StatQuest videos on Linear Models: th-cam.com/play/PLblh5JKOoLUIzaEkCLIUxQFjPIlapw8nU.html
Sure I will check it and let you know if anything else is needed. Thank you very much. You are doing great man keep up the good work.
Thanks for your detailed and clear explanation. It saves me a lot of time reading books that are hard to understand.
Thanks! I'm glad the video is helpful.
StatQuest is really amazing for learning and understanding things very easily
Thanks!
Thank you for your amazing video!
Could you explain how to calculate the p-value in this video (such as at 12:30)? I have watched your p-value video, but still do not know how to use it in the calculations for this video's examples. 🙏🙏🙏
Unfortunately I can't explain it in a comment. Hopefully one day I'll make a video.
@@statquest Great😊😇🤓 I look forward to it😍😍. thank you very much!🙏🙏
Hi, Josh. Nice to meet you! I am Tai from Taipei, Taiwan. Regarding what you mentioned at 7:42, can we say that the probability of a random dot landing on a random line is equal to the proportion of the line to the 2-D plane, which is the area of a line / the area of a plane = 0/1? As we are interested in the probability of a random dot landing on a random line, it's actually the same as asking for the chance of the dot being on the line / the chance of the dot being on the whole plane. As a line is 1-D and the plane is 2-D, the proportion is 0. Hence, the probability of a random dot landing on a random line is equal to 0.
That might be a way to look at it. I've never thought of it that way.
@@statquest Thank you :)
Thank you, you are a distinguished, brilliant mind and a great teacher for many
Wow, thank you!
Thank you very much. You saved my day with (silly) songs and also my day, even my course :))))
Happy to help!
Great course. May I point out that at (17:38) it is better to say "correlation quantifies the strength of linear relationships"
True! :)
I am familiar with the concepts you talk about.
But I am a fan of your songs, so I am here to listen to the music.
BAM! :)
Triple Bam!! Thanks for the great lecture, although I think the p-value depends not only on the amount of data we have, but also on the strength of the relationship. For example, given the same amount of data, the chance of generating a relationship from random points is smaller for a higher correlation than for a lower correlation.
Yes, that's sometimes true, but not always (for example, if your sample size = 2), so I decided to focus on the things that are always true in my video: correlation is determined by the strength of the relationship, and p-values are determined by sample size. In other words, if the sample size is too small you will never have a small p-value, and if the sample size is huge, then it doesn't matter what the correlation is, the p-value will probably be significant. For example, if we have any 2 data points, we can draw a line through them, and correlation = 1, however, the p-value = 1. In contrast, if we have enough data, it doesn't matter how close the correlation is to 0, we can still have a significant p-value.
@@statquest You replied to my comment! Bam!!!!
@@yangyu5525 Corrected!
your video is so great and easy to understand!
Thanks! :)
Waiting for your videos is a cause worth waiting for 👍👍👍
Thanks! :)
It solves my confusion. Thanks a lot.
Bam! :)
Thank you for your time to explain and make this video!!!
Thank you very much! I really appreciate your feedback.
Dear Josh Starmer,
I am thankful to you for your wonderful videos.
May I know why the numerator of the correlation formula is never larger than the denominator?
That would take a whole StatQuest to explain. We'd have to go through the Cauchy-Schwarz inequality. However, it's on the to-do list. One day I will do it.
@@statquest It was a great clue. I've started reading about the Cauchy-Schwarz inequality and look forward to watching your lecture in the future.
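For anyone following the same clue, here is a short sketch of how the Cauchy-Schwarz inequality gives the bound. It is a standard argument, not something shown in the video. Write a_i for each x-value minus the mean of x, and b_i for each y-value minus the mean of y.

```latex
\left( \sum_{i=1}^{n} a_i b_i \right)^{2}
  \le \left( \sum_{i=1}^{n} a_i^{2} \right) \left( \sum_{i=1}^{n} b_i^{2} \right),
  \qquad a_i = x_i - \bar{x}, \quad b_i = y_i - \bar{y}

\Rightarrow \quad
\left| \operatorname{cov}(x, y) \right| \le \operatorname{sd}(x) \, \operatorname{sd}(y)
\quad \Rightarrow \quad
-1 \le \frac{\operatorname{cov}(x, y)}{\operatorname{sd}(x) \, \operatorname{sd}(y)} \le 1
```

The second line follows by dividing both sides of the first by (n - 1)^2 and taking square roots. Equality holds exactly when every b_i is the same multiple of a_i, which is another way of saying that all of the points fall on one straight line.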
Still getting this clear in my mind... At 13:11 you say that adding data (and a decreased p-value) increases our confidence in our guess. I think this may be misleading because it suggests that smaller p-values mean more accurate guesses. I would rather say that a smaller p-value means more confidence that we are accurately seeing the QUALITY of the guesses we can make (not the guess itself, which is indicated by the correlation value). So with a weak correlation, a smaller p-value means I am more certain that there is a weak relationship and that my guess will be poor.
I hope that makes sense. Thanks for a great series
What I was trying to say was in the picture on the left, we can't be sure if adding more data would give us a totally different correlation value, so we have low confidence in it. In the picture on the right, we have enough data to be confident that the correlation value will not change much with additional data.
Dear professor, at 12:57, with respect to the picture on the left, you said that increasing the sample size doesn't increase the correlation. I have a different opinion about that statement. If I start with two dots, then no doubt the correlation of the straight line is equal to 1 and the p-value = 1. Then, if I randomly add some dots to the graph, the correlation value will change, and so will the p-value. Thus, the p-value just tells us whether there is a trend or not; it doesn't tell us how large the difference is or how close the trend we find is to the truth. Alternatively, the accuracy of the trend or model we find depends not only on the number of dots, but also on the development of technology, right? @@statquest
The ultimate clear explanation
BAM! :)
Very good - I would have liked to see a p-value calculation also :)
th-cam.com/video/vemZtEM63GY/w-d-xo.html th-cam.com/video/5Z9OIYA8He8/w-d-xo.html Both answer this.... but I agree... a quick explanation of p values would be the only extra credit that I felt was missing from this video. Much the way he did variance recap at the beginning.
Hi. Your explanation was perfectly fine.
I have a doubt at 16:20, shouldn't it be "That means that there is 3% chance that random data could produce a weak relationship, or weaker".
or
"That means that there is 97% chance that random data could produce a strong relationship, or stronger".
Because the smaller the p-value, the stronger the correlation.
The video is correct. p-values are kind of tricky, and to learn more about how to interpret them, you can check out this video: th-cam.com/video/vemZtEM63GY/w-d-xo.html
Also, a small p-value doesn't mean a strong correlation. We could have a weak correlation, like 0.1, and still have a small p-value.
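To make the point above concrete, here is a minimal sketch. It assumes the Fisher z-transformation with a normal approximation for the p-value, which is only one of several ways to test a correlation and is not necessarily what the video used. The helper name `approx_p_value` and the sample sizes are made up for illustration.

```python
import math

def approx_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r from n points.

    Assumption: Fisher z-transformation plus a normal approximation.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Weak correlation, but lots of data: the p-value is tiny.
weak_but_lots_of_data = approx_p_value(0.1, 10_000)

# Strong correlation, but only 4 points: the p-value is not small.
strong_but_little_data = approx_p_value(0.9, 4)

print(weak_but_lots_of_data)
print(strong_but_little_data)
```

The size of the correlation and the size of the p-value are separate questions: the first describes how close the points are to a line, the second depends heavily on how much data there is.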
p-value superbly explained!
Thanks!
This was so incredibly helpful, thank you!
Thanks!
Thank you for making this great video!
My pleasure!
Hey nice video!
On Wikipedia there is also a "non-Pearson" correlation that aims to center the data points around the origin and calculates the correlation using the covariance in the form of the dot product relative to the vector norms of the data points.
Thanks for the info!
Best video ever seen on correlation👍😁
Thank you very much! :)
@@statquest Welcome and thank you for making these videos😁
I watched it as background music, so I'm not sure if this is already addressed: I think it might be worth mentioning that here "relationship" refers to "linear relationship". Otherwise, e.g., data generated by y=x^2 on (-1,1) will get 0 correlation but obviously have a relationship. "Relationship" sounds like it corresponds more to "(in)dependence".
Throughout the entire video I mention that we are using a straight line to define the relationship.
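The y=x^2 example from the question can be checked with a short sketch. The `pearson` helper and the evenly spaced grid of x values are assumptions made for illustration; the 1/(n-1) factors are left out because they cancel in the ratio.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation: covariance over the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Evenly spaced x values from -1 to 1 (a made-up grid).
xs = [i / 100 for i in range(-100, 101)]

# A perfect, but curved, relationship: essentially zero correlation.
r_parabola = pearson(xs, [x ** 2 for x in xs])

# A perfect straight-line relationship: correlation of 1.
r_line = pearson(xs, [2 * x + 1 for x in xs])

print(r_parabola, r_line)
```

The parabola is fully determined by x, yet a straight line cannot describe it, so the correlation is essentially 0.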
Man, your video is super clear without giving up any rigor! Thanks so much for the work!
Thank you very much!!!
Hi Josh. Thanks for the great video.
I have a question.
1) Why does the correlation have a bound of -1 to 1 when you divide the covariance by the product of the two standard deviations? Is the product of the standard deviations the maximum covariance the two random variables can have? If so, how do you show that?
2) And, how does the correlation of 1 tell you that the points lie on the straight line?
Unfortunately, showing how the limits of correlation are -1 and 1 isn't super easy. However, you're on the right track. When all of the points are on the same line, then the absolute value of the covariance = the product of the standard deviations.
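For readers who want the algebra, here is a sketch of the straight-line case. It is not from the video; it assumes every point satisfies $y_i = a + b x_i$ with $b \neq 0$, and writes $s_x$, $s_y$ for the sample standard deviations.

```latex
\begin{align*}
\bar{y} &= a + b\,\bar{x}
  \quad\Longrightarrow\quad y_i - \bar{y} = b\,(x_i - \bar{x}) \\
\operatorname{cov}(x, y)
  &= \frac{1}{n-1}\sum_i (x_i - \bar{x})(y_i - \bar{y})
   = \frac{b}{n-1}\sum_i (x_i - \bar{x})^2
   = b\, s_x^2 \\
s_y &= \sqrt{\frac{1}{n-1}\sum_i b^2 (x_i - \bar{x})^2} = |b|\, s_x \\
r &= \frac{\operatorname{cov}(x, y)}{s_x\, s_y}
   = \frac{b\, s_x^2}{s_x \cdot |b|\, s_x}
   = \frac{b}{|b|} = \pm 1
\end{align*}
```

For data that are not on a line, the general bound $|\operatorname{cov}(x, y)| \le s_x s_y$ follows from the Cauchy-Schwarz inequality applied to the centered values $x_i - \bar{x}$ and $y_i - \bar{y}$.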
Uncle Josh, you're the only one who answered my question about why a squiggly line can't be used. Thank you.
It can be, but it's not as easy (however, modern neural networks can fit a squiggly line to just about anything. For details, see: th-cam.com/video/zxagGtF9MeU/w-d-xo.html ). When we use squiggly lines, we use R^2 instead of Pearson's Correlation because Pearson's correlation is explicitly defined for straight lines.
@@statquest ok thanku.. It's entirely new for me
I love how you teach us like we're a bunch of 7-8 year old kids.
I just teach the way I teach myself.
Great video! Can you also explain the difference between Spearman and Pearson correlation? Thanks a million!
I'll keep that in mind.
That's simply amazing education...!! Just one question: What is "much" data? Doesn't it always depend on the context?
Yes - it depends on how much variation there is in your data. If there is not much variation, then you don't need many observations. If there is a lot of variation, then you need a lot of observations.
@@statquest thank you so much !!
First time hearing a female voice on your channel, and it's hilarious. Anyway, thanks for all of your videos, they help me survive my statistics course
Hooray! :)
Thanks
TRIPLE BAM!!! Thank you so much for supporting StatQuest!!! :)
This is much better than the class in uni..
Thank you! :)
Hi, one doubt: in practice, as you mentioned, the smallest p-value will have a high correlation, and I agree.
However, I'm confused if I go by the theoretical explanation of the p-value.
As per my understanding from your p-value video, the p-value is the sum of the probabilities of:
1. the chosen rare event occurring
2. a similar rare event occurring
3. any other rare event occurring
If this is the case, shouldn't the p-value for correlated values be high, because the p-value tells us the probability of this event occurring, i.e., the event of a higher correlation?
So it would tell us that there is a high probability that such a correlation will occur.
I'm new to stats, so please bear with me if my understanding is wrong.
It's not true that a small p-value = high correlation. As illustrated in this video, high correlation is simply a function of how well a line fits the data, and a line fits any 2 random data points perfectly, and thus, will have the highest correlation, even though the points are random, and thus, will have a p-value of 1. To learn more about p-values, see: th-cam.com/video/vemZtEM63GY/w-d-xo.html and th-cam.com/video/nk2CQITm_eo/w-d-xo.html
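The two-point case in the reply above is easy to try out. This is a minimal sketch with a hand-written `pearson` helper and randomly generated points (both assumptions for illustration): any two distinct random points give a correlation of exactly 1 or -1, even though they are pure noise.

```python
import math
import random

def pearson(xs, ys):
    """Pearson's correlation: covariance over the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

random.seed(0)  # fixed seed so the run is repeatable

correlations = []
for _ in range(5):
    # Two completely random points each time.
    xs = [random.random(), random.random()]
    ys = [random.random(), random.random()]
    correlations.append(pearson(xs, ys))

# Each value is 1 or -1 (up to floating-point rounding).
print(correlations)
```

A perfect correlation from two points says nothing about whether a real relationship exists, which is why the correlation value and the p-value have to be read together.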
Thank you so much! This was so helpful.
Thanks!
Awesome video again! But just a question about 15:07 - 15:13, regarding "When the data all fall on a straight line with a positive or negative slope, then the covariance and the product of the square roots of the variance terms are the same and the division gives us 1 or -1, depending on the slope": I don't think I fully get it intuitively. How can we know that the absolute values of the numerator and denominator are the same without calculation?
Unfortunately the mathematics that show why correlation is limited to a maximum value of 1 and a minimum value of -1 are quite complicated, which is why I glossed over it in the video.
@@statquest Thank you so much for your instant reply! Then without calculation, is there a possible way to just understand it intuitively?
@@JupiterChamsae991102 I did the best I could with this video.
@@statquest Ok~ Thank you so much as always ❤️
BAM !!
You are legend 😭👏
Thanks!
Hi Josh, as always, thank you for your great videos! Would you consider making a video to explain the relationship between correlation and R-squared? I've watched all the videos about these two terminologies, but still can not figure out the relationship.
Presumably you want something other than, "take your correlation value, r, and square it, and that's r-squared". More like, "how does the square of this equation get transformed into this other equation"?
@@statquest So how? I'm also curious about how/why the correlation value^2 = R-squared. The equations are so different. I'd appreciate it if you could kindly explain that! Thank you, Josh!
@@julieyananzhu1134 I'd have to make a whole video to go through that derivation. Maybe one day I will! :)
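While the full derivation is left for a future video, the equality can at least be checked numerically. The sketch below uses made-up data and assumes the usual definition R^2 = (SS(mean) - SS(fit)) / SS(mean) for a least-squares straight line; all variable names are illustrative.

```python
import math

# Made-up data that roughly follows a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 5.7]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross-products around the means.
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Pearson's correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Least-squares straight line.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# R^2 from the sums of squares around the mean and around the fitted line.
ss_mean = syy
ss_fit = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
r_squared = (ss_mean - ss_fit) / ss_mean

print(r ** 2, r_squared)  # the two values agree up to rounding
```

The reason they agree: for the least-squares line, SS(fit) = Syy - Sxy^2/Sxx, so (SS(mean) - SS(fit)) / SS(mean) = Sxy^2 / (Sxx * Syy), which is exactly r squared.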
BAM.. Get addicted to your video
Hooray! :)
at 8:47 why doesn't smaller p value lead to lesser confidence in predicting a random new point?
The p-value is the probability that random noise could generate a relationship as strong or stronger than what you observed. A small p-value suggests that it is unlikely that random noise created the data that you observed. Thus, this gives us more confidence that our model is correct. Does that make sense?
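One way to see the definition above in action is a permutation test: shuffle the y values to simulate "random noise" and count how often the shuffled data produce a correlation as strong or stronger than the observed one. This is a sketch under stated assumptions, not the method used in the video; `permutation_p_value`, the data, and the number of shuffles are all made up for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson's correlation: covariance over the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Fraction of shuffles whose |correlation| is at least the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly 0.
    return (hits + 1) / (n_shuffles + 1)

xs = list(range(20))

# Made-up data with a clear trend plus a small wiggle.
trend_ys = [2 * x + ((x * 7) % 5) for x in xs]
p_trend = permutation_p_value(xs, trend_ys)

# Made-up data that is pure noise.
noise_rng = random.Random(1)
noise_ys = [noise_rng.random() for _ in xs]
p_noise = permutation_p_value(xs, noise_ys)

print(p_trend, p_noise)
```

For the data with a trend, shuffling almost never reproduces a relationship that strong, so the estimated p-value is small. That is the sense in which a small p-value gives more confidence that the relationship is not just noise.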
always enjoys your song josh!
Thanks!
Can you please tell me how you calculated the p-value?
He did a video on P-Value
I didn't think that Machine Learning and humor were correlated but here we are...BAM!
bam! :)
Hi Josh,
Could you explain how you get the p-value? I have split this into the 2 sub-questions below.
- What is the input used to compute the p-value?
- Which probability density function are you using to calculate p? Are you using Pearson's chi-squared?
There are a lot of ways to calculate p-values for Pearson's correlation coefficient. For details, see: en.wikipedia.org/wiki/Pearson_correlation_coefficient
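One commonly used option from that page is a test based on Student's t-distribution. The video does not say which method it used, so treat this as one possibility. The inputs are just the correlation $r$ and the number of points $n$, and the test assumes the null hypothesis of zero correlation with approximately bivariate normal data.

```latex
\[
t = r \sqrt{\frac{n - 2}{1 - r^2}},
\qquad
p = 2 \, P\!\left( T_{n-2} \ge |t| \right)
\]
```

Here $T_{n-2}$ is a t-distributed random variable with $n - 2$ degrees of freedom. For example, $r = 0.9$ with $n = 4$ gives $t = 0.9\sqrt{2 / 0.19} \approx 2.92$.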
@@statquest thanks!
Thanks Josh!!!!!!!!!!!!!! Helps a lot.
Thank you! :)
@@statquest I can't believe you replied. I am pursuing an MS in Data Science. Your work really gives me a better understanding. I will pay your tuition fee when I get a job. ✌🤟👆👍😎
When Phoebe decides to sing stats... xD
Love the videos... lifesavers to sinking ships in the sea of numbers
Check out: th-cam.com/video/D0efHEJsfHo/w-d-xo.html
Omg xD best!
Thanks for your video! So covariance is just used to calculate correlations? What is the reason for having a separate term, covariance, if it is just being used as a stepping stone for calculating correlation?
It's used in other contexts as well (like in PCA or in longitudinal analysis). It's a useful intermediate step in a lot of ways, so it's good to give it its own name.
@@statquest Thanks, but why not simply use correlation as the stepping stone for other calculations, since it tells us whether the slope is +, -, or neutral (like covariance) as well as the slope and closeness to the line?
Thank you very much for every video, you are awesome.
16:26 I think there is a wording mistake; instead, that means that there is a 3% chance that random data could "not" produce a similarly strong relationship or stronger, am I right?
The wording in the video is correct. For more details on p-values, check out the 'Quest: th-cam.com/video/5Z9OIYA8He8/w-d-xo.html
you can find the answer for your question in @ali alqaraan comment below.
Just to confirm, this correlation coefficient is the R that we have to square to get R squared?
Yep
11:23 you mean the closer the correlation values get to 1 or -1, right?
Correlation values of 1 and -1 represent situations when the data is all on the same straight line. However, when the data is not all on a straight line, then the correlation values get closer to 0.
You mention R^2 as a more intuitive / useful method for understanding goodness of fit than correlation, but doesn't R^2 require the assumption that the model is linear (so it can't be used for logistic regression and other non-linear models)? Does correlation have this same requirement too?
Yes, they both share that requirement.
Hi Josh.. Very well explained... Thank you
Please do a video on ACF & PACF (Auto Correlation & Partial Auto Correlation)