Everybody knows about Type 1 Errors and Type 2 Errors, but few know about Type 3 errors: confusing Type 1 Errors with Type 2 Errors
People should take a course in naming things before they establish new terms. Worst names ever. Names should be a few syllables and somewhat self-explanatory. E.g. good hit, bad hit, good miss, bad miss.
@bp56789 for real. I have no idea how "Type 1" and "Type 2" sounded like good, memorable, or intuitive names for these.
Oh, I would say type 3 is by far the most common error. 😅
It's also important to emphasize p-values require defining *a priori* exactly what a "more extreme" result is. If it can't be defined, or you define it afterward, you haven't actually generated a p-value.
Would you mind explaining? Newbie here
expound please.
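To illustrate the point (a minimal sketch of my own, not the commenter's example): for the same observed statistic, "more extreme" can mean the upper tail, the lower tail, or both tails, and each definition yields a different p-value, so the choice has to be fixed before looking at the data.

```python
# Hypothetical illustration: the definition of "more extreme" must be
# chosen a priori, since each choice gives a different p-value for the
# same observed z statistic (z = 1.8 is made up for this sketch).
from scipy import stats

z = 1.8  # observed test statistic, N(0, 1) under H0

p_upper = stats.norm.sf(z)               # extreme = larger values,  ~0.036
p_lower = stats.norm.cdf(z)              # extreme = smaller values, ~0.964
p_two_sided = 2 * stats.norm.sf(abs(z))  # extreme = either tail,    ~0.072

print(p_upper, p_lower, p_two_sided)
```

Picking whichever tail looks most favorable after seeing z = 1.8 would not produce a valid p-value.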
Great video idea! I do think, however, that since these concepts are all quite simple for someone with wide experience in statistics, it would be cool to see more versions of this video covering deeper concepts. As someone who deals with a huge amount of statistics, but, since it's under Econometrics, is mostly limited to Causal Inference, I'd love to see what I'm missing out on in the other subfields. Keep up the great work!
Yeah that’s the tricky thing about videos like this one. On one hand, my audience is full of people who do have deep stats experience, so it’s more of a quick check. But on the other, these are also ideas that I regularly have to teach to researchers during consults. I appreciate the input, I’ll try to think of ways to strike a balance here
Love that you care mate. It's always hard to balance complexity with educational themes, just know you make great videos!
@@very-normal I would love it if, instead, you made another one of these videos but "for advanced viewers", because I thought this was too perfect of an introduction to miss out on! Keep up the great work dude!!
@@very-normal just wanted to chime in and say the depth of this video was perfect for a stats beginner like me!
I finally figured out how to remember the difference after watching this video. "False positive" is pretty common language, so that one is type one, as opposed to a false negative, which I, at least, didn't hear as often before taking stats.
Thank you again, love the content, and the channel name.
Wow, great video. I felt really engaged by the idea of first checking whether I know the concept, and only later seeing the explanation. It's a much better format for feeling like I'm actually improving, by first establishing that I don't fully understand the topic.
Reading the other comments, I can say that I, at least, learned a lot. I had a couple of in-depth statistics courses, but I don't use them in my day to day (I'm a computer engineer), so the subjects weren't new, but the video definitely showed me I don't remember much about them.
Lastly, I think the best way for me to ingrain this type of knowledge is through practical examples (hopefully outside of the medical trial ones; they are all over the place and overused)
There are two types of statisticians: those who understand power, those who don't, and those who aren't sure
Really interesting to see some other ways to explain the same concepts I'm taught at uni. Even though I think I got all the questions right, I found it helpful to hear the ideas paraphrased and visualized. Good way to enhance intuition.
Hey mate, great work! Would love some videos on the difference between doing causal inference on observational vs experimental data, the pitfalls of linear regression, etc.; econometrics topics that aren't technically rigorous but form the foundations of model-based inference.
Yeah I think some causal questions would be good! They come up a lot in Biostat as well
@@very-normal Haha just noticed I had typed casual instead of causal; the exact opposite of what to do XD
8:50 that's so unhinged for you lol
10:35 that's obviously wrong!
Everybody knows the word »frequentist« comes from the word »freaqy« ( ͡° ͜ʖ ͡°)
oops fell for the classic pitfall
This way of teaching is really good - with questions leading into a full narrative. I think you're on to something. Thank you.
I think that a more general math formula for the p-value should be
2*min(P(T≥t|H0), P(T≤t|H0)), because, for example, when testing
H0: σ² = σ²_0 vs H1: σ² ≠ σ²_0 with a normal SRS, the usual test statistic has an asymmetric null distribution (it's a chi-squared with n−1 df).
It's not very common though, because most test statistics are either chi-squared, with bigger values the farther you get from the null, or normal/Student's t distributed under the null
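To make the formula concrete, here is a minimal sketch of that two-sided variance test; the sample values and hypothesized variance below are invented purely for illustration.

```python
# Sketch of the two-sided p-value 2*min(P(T >= t | H0), P(T <= t | H0))
# for H0: sigma^2 = sigma0^2, using T = (n-1) * S^2 / sigma0^2, which is
# chi-squared with n-1 df under H0. The data are made up.
import numpy as np
from scipy import stats

x = np.array([4.9, 5.6, 4.2, 6.1, 5.3, 4.8, 5.9, 5.1])  # hypothetical sample
sigma0_sq = 0.25                                         # hypothesized variance
n = len(x)

t = (n - 1) * x.var(ddof=1) / sigma0_sq  # observed chi-squared statistic

lower = stats.chi2.cdf(t, df=n - 1)  # P(T <= t | H0)
upper = stats.chi2.sf(t, df=n - 1)   # P(T >= t | H0)
p_value = 2 * min(lower, upper)      # double the smaller tail, since the
                                     # null distribution is asymmetric
print(f"T = {t:.3f}, two-sided p = {p_value:.4f}")
```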
I wonder how I have never heard of the long run idea...
Anyway, great video, I was able to answer questions 1-4! Looking forward to more question videos :)
Thank you for the video. Just a small remark: at 4:30, your text says "... the two have a similar effect." I am not a native speaker, but doesn't that imply that they are not equal, but have a small difference, i.e. a small effect size? In that case, the drug would indeed be different from the placebo, the null hypothesis WOULD be wrong, and you did NOT commit a type 1 error (which is a valid criticism of frequentism and NHST: in the real world, the null is never exactly true...).
more of these please
Super idea, go on please!
Very Nice 👏
One request: please make a more elaborate video on Type 1/2/3 errors (or anything similar, if it exists), and also on power, with a real use case in biology, since you are a biostatistician.
With question 5, would it be okay to say that there is a number L such that the probability that the distance between the sample proportion and L is arbitrarily small approaches one over many trials?
Yeahh, that’s about right. At that part, I made a vague reference to the Law of Large Numbers, which is similar to what you’re describing: convergence in probability (or almost sure convergence, depending on which law is used)
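For reference, the weak law of large numbers makes this precise for a sample proportion: writing p̂_n for the proportion after n trials and p for its long-run value (the commenter's L),

```latex
\[
  \forall \varepsilon > 0:\qquad
  \lim_{n \to \infty} P\left( \left| \hat{p}_n - p \right| < \varepsilon \right) = 1,
\]
```

i.e. p̂_n converges in probability to p.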
It is useful. Thanks 😊
It'd be nice if the terminology were more descriptive of what it's measuring, especially type 1/2 errors. Like when programming, you'd want good variable names. Instead of type 1, maybe "false alarm rate"/"cry wolf rate"; type 2: "overlooked rate"/"failed rejection rate"; power: "detection rate"/"bullseye rate"
lol yeah I agree. Unfortunately, statisticians are the worst at naming things. Don’t even get me started on stuff like “sufficiency”, “completeness”, or “almost sure convergence”.
But to be fair, statistics can be used in so many different contexts that it almost has to suffer from needing to use very vague, general terms
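To tie the suggested names to numbers, here is a minimal simulation sketch (my own illustration; the sample size, effect size, and alpha are arbitrary): the "false alarm rate" is how often a t-test rejects when there is no effect, and the "detection rate" is how often it rejects when a real effect exists.

```python
# Estimating the rates behind the proposed names by simulating a
# two-sample t-test at alpha = 0.05 (all settings are arbitrary).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, reps = 0.05, 30, 5000

def rejection_rate(effect):
    """Fraction of simulated studies in which H0 is rejected."""
    rejections = 0
    for _ in range(reps):
        a = rng.normal(0.0, 1.0, n)     # control group
        b = rng.normal(effect, 1.0, n)  # treatment group
        _, p = stats.ttest_ind(a, b)
        rejections += p < alpha
    return rejections / reps

print("false alarm rate (type 1, effect = 0):", rejection_rate(0.0))    # ~0.05
print("detection rate (power, effect = 0.8):", rejection_rate(0.8))     # ~0.87
```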
Simple questions really help me up my terminology game in English! So thx, I guess.
Nice
try to make a video where you question chatgpt's models on statistical data analysis and how it succeeds or fails?
Nice vid
🌱💨🟢
thank you
Baba Is You music in a statistics video 🤯
Omg baba is you music!!!
Technically, the standard p-value is not a conditional probability.
how would you describe it tho lol
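For readers wondering about the distinction (my gloss, not the commenter's wording): in the frequentist setup, H0 is not a random event with a probability, so there is nothing to condition on; the null simply fixes which distribution the tail probability is computed under, which is often written with a subscript or a semicolon instead of a conditioning bar:

```latex
\[
  p = P_{H_0}(T \geq t)
  \quad \text{or} \quad
  p = P(T \geq t;\, H_0),
  \qquad \text{rather than} \quad
  P(T \geq t \mid H_0).
\]
```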
These were too easy and basic.
You know your stuff!! If you have another question you get tripped up by, I’m game to try to help out
@@very-normal I would love to see more of these type of videos, but for more advanced topics.
E.g. pitfalls when using MCMC or pitfalls when doing logit, to name a few topics you've covered earlier
@@yoeri7004 Truuu
Let's say I have a Bayesian prediction system. I test it 1000000 times. I find that the 90% quantile is exceeded 20% of the time. So it's a fail, obviously. This illustrates that even if you are a Bayesian, you also have to be a frequentist, because not being a frequentist breaks the definition of probability and probability doesn't make sense any more. B and F are not opposites. They refer to different things. They can work hand in hand.
Not really. We can assign a probability to the explosion of a bomb, but once it happens it is no longer replicable. Sure, you can argue that the bomb could be rebuilt, but let's say it's unique for the sake of the argument. In this case there is no frequency.
You could also think of a locally probabilistic process that evolves over time: e.g. reproduction of organisms in a sequestered ecosystem (they do breed, but they go extinct after some, likely long, time)
@@the_multus Wrt the bomb, the explosion can't be repeated. But the model that you use to model the explosion can be tested as many times as you want, using random data based on the assumptions in the model. So let's say you test it, and you find that the 90% quantile is exceeded 20% of the time. That means that the model is inconsistent with its own assumptions. Folks are not going to accept a model like that, once that's been pointed out. The point is... even Bayesian predictions need to have good frequentist properties in this way (the modelling methodology has to have good frequentist properties), or the probabilities that come out are not plausible. Over your career as a bomb disposal expert, you would try to defuse 1000s of bombs. You'd hope that the models that you use would have the property that the 90% quantile is exceeded 10% of the time, over the course of your career.
I'm not sure I quite understand the second point. Is the point that it's non-stationary? If that's the point, then that's just like weather forecasts. And weather forecasts need to have the property that the daily 90% quantile is exceeded 10% of the time (over many repeats), otherwise they are not plausible (whether they come from a Bayesian analysis or something else).
There was a famous paper on this topic by Philip Dawid back in the 80s, which is a good starting point. And I just wrote a paper about it, but it's not published just yet...
The difference between what you are saying and what I am saying is very subtle, and I'm not sure I 100% understand it. I am making a point about predictions, that all predictions need to have good frequentist properties. I don't think anyone would really disagree with my point. With your bomb, you are making a point about the definition of probability for real events, and I don't think anyone would disagree with your point either. I think the resolution is that we're not really disagreeing because we are talking about different things.
@@simonpedley9729 oh, I see now. We were just talking about different things: predictions in a model environment vs. a real environment. I do agree that a model should stand up to frequentist analysis. I've just provided two examples of objects that don't demonstrate cyclical (in a sense) behaviour but could still be reasonably described by probabilistic methods.
That's all. Good point.
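A minimal sketch of the calibration check described in this thread (the data-generating process and forecasters below are invented): a 90% predictive quantile should be exceeded about 10% of the time in the long run, and an overconfident forecaster fails the check, much like the 20% exceedance in the opening comment.

```python
# Frequentist calibration check for predictive quantiles: outcomes should
# exceed a forecaster's 90% quantile ~10% of the time. All numbers invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
outcomes = rng.normal(0.0, 1.0, 100_000)  # true process: N(0, 1)

# Calibrated forecaster: predictive distribution matches the true process.
q90_calibrated = stats.norm.ppf(0.90, loc=0.0, scale=1.0)
# Overconfident forecaster: understates the spread (scale 0.5 instead of 1).
q90_overconfident = stats.norm.ppf(0.90, loc=0.0, scale=0.5)

print("exceedance, calibrated:   ", np.mean(outcomes > q90_calibrated))    # ~0.10
print("exceedance, overconfident:", np.mean(outcomes > q90_overconfident)) # ~0.26
```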