IMHO, this episode is basically the entire reason why this series needed to exist. P-hacking continues to be one of the most detrimentally misunderstood concepts of my lifetime. It started getting talked about a few years back, but that hasn't stopped lazy science journalists from picking up the odd worthless, deceptive science press release and failing to scrutinize its validity.
p-values can be really hard to comprehend. There can be a lot of double negatives and false dichotomies. I think that's why it is important to use consistent language.
A great video. I'm a risk manager, so I work a lot with probability and hypothesis testing, and a lot of people in my field misinterpret p-values as the probability of rejecting a true null hypothesis given the data, which it absolutely isn't. It's a damn confusing thing to understand, but every student of statistics should learn that all a p-value tells you is how extreme your sample is given that the null is true. It's important because this type of misunderstanding then creeps into academic papers.
this explanation was so helpful! it really solidified the idea of p-values in my head. i've been struggling with the concept a LOT.
I'm currently taking an Experimental Methods course in Psycholinguistics for my Master's degree, and we're currently doing a TON of significance testing in R, so these videos are amazingly timely for me!
I would recommend that the courses have outlines written out or shown on screen; it would make it easier for us to catch the main points.
I'm a PhD student in economics, and my honest opinion is that forcing a policy of ever-lower p-value cutoffs just encourages researchers to get ever-larger data samples, which is not bad in itself. However, we run into the fact that with a large enough sample, the sampling distributions thin out and we become ever more likely to reject null hypotheses. The problem is that statistical significance can be meaningless when we fail to have economic significance, i.e. an effect that is large enough to actually matter. So p-values are important, but by no means should one ever take the result of any given statistical test in isolation too seriously.
When looking at differences in error statistics, say RMSE, I've taken to distinguishing the statistically significant from the practically significant. Sure, that 0.05 K improvement in the weather model might be statistically significant, but does it really impact the forecast meaningfully?
And to go further, we always have to keep spurious relationships in the back of our minds. Even when results are statistically significant, do we want to believe that we didn't just get a bad draw?
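To illustrate the statistical-vs-practical (or economic) significance point in this thread, here is a minimal, hypothetical sketch in Python: with a large enough sample, even a negligible true effect yields a tiny p-value. The effect size, sample size, and seed below are made up for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical example: the true mean is only 0.01 above the null value of 0,
# a difference far too small to matter in practice for most applications.
true_effect = 0.01
n = 1_000_000  # very large sample

sample = rng.normal(loc=true_effect, scale=1.0, size=n)

# One-sample t-test against the null hypothesis mean of 0.
t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)

print(f"observed mean: {sample.mean():.4f}")  # ~0.01, negligible in practical terms
print(f"p-value:       {p_value:.2e}")        # typically far below 0.05
```

The point of the sketch is only that "statistically significant" and "large enough to matter" are separate questions; the p-value answers the first, the effect size the second.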
Finally I understood what p-values are. Thank you!
Now I have to figure out how to internalize that knowledge and not make these errors in the future.
To be honest, I get lost a tad with this subject. Will review it again and again until I get it ;)
Did you get it?
That's truly the only and best thing to do.
The first time I watched the series, six months ago, I dropped off at the 10th episode. Now that I've let it sit long enough to renew my motivation, I've started over and am going all the way. Nothing wrong with going at your own pace.
Do one on permutation tests and bootstrapping. I'm loving it.
Can't wait for Bayesian statistics now!
My null hypothesis: the data is normally distributed with mean μ and standard deviation σ.
As I collect data, the sample has mean x_bar and standard deviation s. I know that the normal distribution is a conjugate prior to itself, so I update my distribution to be more consistent with the given data.
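A minimal sketch of the conjugate update described above, assuming the standard Normal-Normal case with a known data standard deviation sigma; the prior parameters and the data are hypothetical.

```python
import numpy as np

# Hypothetical prior belief about the mean: Normal(mu_0, tau_0^2)
mu_0, tau_0 = 0.0, 2.0
sigma = 1.0  # assumed-known standard deviation of the data

rng = np.random.default_rng(0)
data = rng.normal(loc=0.7, scale=sigma, size=50)
n, x_bar = len(data), data.mean()

# Standard Normal-Normal conjugate update (known sigma):
# posterior precision = prior precision + n / sigma^2
post_prec = 1 / tau_0**2 + n / sigma**2
post_var = 1 / post_prec
post_mean = post_var * (mu_0 / tau_0**2 + n * x_bar / sigma**2)

print(f"posterior mean ~ {post_mean:.3f}, posterior sd ~ {post_var**0.5:.3f}")
```

As more data comes in, the posterior mean is pulled toward x_bar and the posterior standard deviation shrinks.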
This is such helpful information. Just wish it had some think-time. The sentences are cognitively heavy. They need a few seconds to put into working memory and process. Did this very useful content have to be so densely packed and so very breathlessly delivered??
Thank you for the series. I find there are too many examples per video. Some, like the black swan example, are necessary to make the point, but I wonder if the others (cats' weights, bees, etc.) could be rolled into one?
6:25 best slide
Had to listen to 3:21-3:41 four times.
I really liked that the Fisher quote was included.
I didn't like this style of statistics in class. I liked priors so much more. They made much more sense.
99% good. Hypothetically.
Can someone explain to me where that super small p-value came from in the example at 7:37, please?
You mean 0.00036? It comes from calculating the p-value. Crash Course tries not to bog down their lessons with formulas, but for statistics that's hard to do. If you REALLY want to know, I can explain it to you comment-by-comment, but if you don't want that, the best I can tell you is to remember the idea of histograms and density curves.

For histograms, you plot data into bins of frequency. For example, if you want to figure out the average number of miles a person travels to school or work, you create bins of 0-4 miles, 5-9 miles, 10-14 miles, and so on. Certain histograms show distributions, and in the case of the normal distribution all the data is centered around the middle. So if you created a histogram of the average number of miles and it was normally distributed, an outline of the histogram would resemble a bell curve. If you made the bins smaller and smaller, the bell curve would become more apparent, and the histogram would eventually become a density curve.

This is the important part: a density curve is a graph based on probability; remember that 100% of the data you collected is under the curve. A normal curve is symmetric, so 50% of your data would be on one side and 50% on the other. So again, the main point is that the normal curve gives you probabilities, and you use those probabilities to calculate p-values.
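To make the density-curve explanation above concrete, here is a minimal sketch of reading a p-value off the standard normal curve. The z-score is hypothetical (the video's actual numbers aren't reproduced here), but a one-sided z of about 3.38 does land near the 0.00036 being asked about.

```python
from scipy import stats

# Hypothetical test statistic; a one-sided z-score around 3.38 gives a
# p-value close to 0.00036.
z = 3.38

# Area under the standard normal density curve beyond z:
p_one_sided = stats.norm.sf(z)           # sf = 1 - cdf, i.e. the upper tail
p_two_sided = 2 * stats.norm.sf(abs(z))  # both tails, for a two-sided test

print(f"one-sided p ~ {p_one_sided:.5f}")  # roughly 0.00036
print(f"two-sided p ~ {p_two_sided:.5f}")  # roughly 0.00072
```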
The central idea of this video is P(null | data) vs. P(data | null).
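A tiny numeric sketch of why those two conditional probabilities can be wildly different, using Bayes' rule and made-up numbers:

```python
# Hypothetical numbers, purely for illustration.
p_null = 0.9              # prior: P(null is true)
p_data_given_null = 0.04  # P(data at least this extreme | null) -- a "significant" p-value
p_data_given_alt = 0.50   # P(data at least this extreme | alternative)

# Bayes' rule: P(null | data)
p_data = p_data_given_null * p_null + p_data_given_alt * (1 - p_null)
p_null_given_data = p_data_given_null * p_null / p_data

print(f"P(data | null) = {p_data_given_null:.2f}")
print(f"P(null | data) = {p_null_given_data:.2f}")  # about 0.42 here, very different from 0.04
```

A small p-value (P(data | null)) does not by itself tell you that the null is unlikely (P(null | data)); the prior and the alternative both matter.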
Such a great video; why so few thumbs up? I would even pay for this...
Most of the errors referred to here in defining and limiting the p-value are surely due to sampling error...
I'm pretty sure that if better sampling techniques were adopted, there would be a lot more to explain beyond what has been explained here...
I really hate stats 😭
About problem #2 (that p-values assume the null hypothesis to be true):
Can we not say that this isn't really a problem, since the way we use the p-value is not just as a conditional probability? We use a "reductio ad absurdum" argument in relation to the p-value.
So to say that the fact that p-values assume the null hypothesis to be true means they can't help us judge whether the null is true or not, we would also have to say that "reductio ad absurdum" is a faulty argument in general. Which, I suppose, could be argued, but it would be a hard task.
Can't you look at the power to determine that you had a big enough sample size to pick up an effect, and use that to determine whether you have a type 2 error?
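One way to make this concrete: the power of a one-sample z-test for a hypothesized effect size can be computed directly, and 1 minus that power is the Type II error rate β for that effect size. A minimal sketch with made-up planning numbers:

```python
import numpy as np
from scipy import stats

# Hypothetical planning numbers.
effect_size = 0.2  # true mean shift in standard-deviation units
n = 100            # sample size
alpha = 0.05       # two-sided significance level

z_crit = stats.norm.ppf(1 - alpha / 2)
shift = effect_size * np.sqrt(n)

# Probability the test statistic lands in either rejection region
# when the true effect really is `effect_size`.
power = stats.norm.sf(z_crit - shift) + stats.norm.cdf(-z_crit - shift)
beta = 1 - power  # Type II error rate for this particular effect size

print(f"power ~ {power:.3f}, Type II error rate beta ~ {beta:.3f}")
```

The caveat is that power (and hence β) is always relative to an assumed effect size, so it tells you how likely you were to miss an effect of that size, not whether a specific non-significant result is a Type II error.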
Can someone explain the difference between "failing to reject" and "accepting" the null hypothesis? I didn't understand that part 😢.
Let's say your null hypothesis is "there are only white swans in the world". As soon as you find a black swan, you can reject the null hypothesis. But what happens if you find only white swans in your attempt to investigate your question? You have looked and looked and you have only found white swans. Does that mean that there are no black swans in the world (i.e. can you accept your null hypothesis)? No (this is a classic example of an incorrect inductive argument). It might be that you just weren't looking in the right place at the right time, and that is why you cannot accept your null hypothesis if you only find white swans. You merely failed to reject it because you did not find any black swans.
This is why we can only ever provide evidence that something is wrong, not proof that something is right. We can say that all our evidence hints at our hypothesis being right (but can't say that it is right), and we can work under that assumption until it gets disproven.
Kaminaji Thank you for the explanation. I do understand the case of the swans, but I don't understand the analogy: how does it translate to hypothesis testing?
@@GottsStr Do you understand the concept of "reductio ad absurdum"?
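To connect the swan analogy back to hypothesis testing, here is a minimal, hypothetical simulation: the null hypothesis (mean = 0) is actually false below, but with a small sample the test usually fails to reject it. That failure is like searching only a few lakes and seeing nothing but white swans: an absence of evidence against the null, not evidence that the null is true.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

true_mean = 0.2          # the null (mean = 0) is actually false here
n = 20                   # small sample, like searching only a few lakes
n_experiments = 10_000

fails_to_reject = 0
for _ in range(n_experiments):
    sample = rng.normal(loc=true_mean, scale=1.0, size=n)
    _, p = stats.ttest_1samp(sample, popmean=0.0)
    if p >= 0.05:
        fails_to_reject += 1

# With these numbers, most experiments fail to reject even though the null
# hypothesis is false: failing to reject is not the same as accepting.
print(f"failed to reject in {fails_to_reject / n_experiments:.0%} of experiments")
```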
Is this it for discussion of p-values, or is "p-hacking" going to be brought up at some point?
What building takes ten minutes to walk around!?
Math is Hard.
I think the problem with p-values could be solved by meta-studies.
What great material.
Both those psychology experiments actually show the opposite of what she said they showed. I know that they are just examples. But they are terrible examples. We cannot train our cognitive intellect.
Nice 'n analog chess clock.
I'm not convinced.
Is this series going to get to stuff like dimensionality reduction? Or is this more of an "intro" kind of thing?
There is a lack of resources for learning statistics on YouTube; I hope this series fixes that.
Sorry, but talking about numbers while showing fast clips of fish, deer, and chess is IN NO WAY better than the good old whiteboard. It really doesn't help with visualization and abstraction.
It's important to remember that statistical tests can become overly sensitive at very large sample sizes, since power increases. The alpha level then needs to be adjusted downward, possibly to 1%. It's also important to study the practical significance in these cases. For example, when testing whether data is normally distributed and normality is rejected, a graphical approach can be used to check whether the data is approximately normal for practical purposes.
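A minimal sketch of that normality example, with made-up data: at a very large sample size a formal normality test flags even a trivially small departure from normality, while a direct quantile comparison suggests the data is approximately normal for practical purposes.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical data: almost normal, with a small amount of skew mixed in.
n = 200_000
data = rng.normal(size=n) + 0.3 * rng.exponential(size=n)

# D'Agostino-Pearson normality test: with n this large, even the mild
# skew above is enough to produce a minuscule p-value.
stat, p = stats.normaltest(data)
print(f"normality test p-value: {p:.2e}")  # "significant" departure from normality

# Practical check: compare sample quantiles to fitted normal quantiles.
probs = np.linspace(0.01, 0.99, 99)
sample_q = np.quantile(data, probs)
normal_q = stats.norm.ppf(probs, loc=data.mean(), scale=data.std())
print(f"max quantile gap: {np.max(np.abs(sample_q - normal_q)):.3f}")
# The gap is typically small compared to the spread of the data, i.e.
# approximately normal for most practical purposes despite the tiny p-value.
```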
What
2
The swans in the pool at my university are black ^ ^
puppycat in the background ( * __ * )
So that's what it's called. I've been wondering since I saw it!
nice se
p-fishing is nonsense.
My neurons are confused.
Great video, she just talks too fast.
First