The full video describes a clean way to think about examples like this, using something called the "Bayes factor". The link is at the bottom of the screen.
Or, for reference: th-cam.com/video/lG4VkPoG3ko/w-d-xo.html
ROC curve gonna do some wonders here
Actually ROC doesn't help alleviate this because the issue isn't related to the test, it's related to the pre-test probability (the prior).
I discuss this at length in a lot of my videos, but it's a major reason why you shouldn't do "full body MRIs" or other such nonsense.
@@riverrockmedical Yeah mb, I didn't finish the video before typing that :), just saw what I assumed would be sensitivity and specificity and assumed there would be an ROC curve somewhere
My brain short-circuited when you introduced numbers
Do these numbers have anything to do with reality? Like 90/(90+89) would be ≈ 50% and 9/(1+9) = 90%. Absurdly, there are actual medical publications that suggest lowering the number of tests because of this. 😮
I took many statistics courses in college with a great professor who worked in fields including healthcare. When we began our segment on Bayes’s Rule, he told a story about the time his mother called panicking and scared that her cancer screening came back positive. He asked if she could get him information on the test, told her she didn’t have cancer and could stop worrying, and said he would call her back by the end of the day. He used the manufacturer’s information from the test, calculated the probabilities, and called her back and once again said she shouldn’t panic. He would be right, as further testing showed she did not have cancer. My professor stressed it’s better for people to get false positives than for the test to miss someone who does have cancer.
There was a study done where doctors were asked to assess this exact problem and 100% of them failed. That's why this example is standard in every textbook I've ever read when they explain Bayesian statistics.
@@abebuckingham8198 that's not great if they failed, because it is something we’re taught how to assess.
“it’s better for people to get false positives than for the test to miss someone who does have cancer”
That’s a brilliant quote. I couldn’t agree more. That changes my perspective of medical testing. Thanks for sharing this!
I heard about a woman who got a false positive on her cancer screening. When she got the news, she became so stressed that she had a heart attack and died. Doctors said she was one of the rare cases where she wasn’t better off with the false positive.
@@Jschmuck8987 situations like that highlight how crucial it is for people to have some understanding of testing statistics and for medical professionals to communicate clearly with patients. It’s rational to lose hope and negatively react to potentially facing cancer, and there have been instances of people committing suicide or acting recklessly after being exposed to that probability. My wife found a lump in her breast about a year ago - fortunately benign and will be routinely checked - and it was extremely stressful from the moment we discovered it. Initial testing indicated a moderate risk that it was cancerous, and I had real experience using Bayes and other statistical methods to demonstrate to her that tests were not a guarantee. While it may have helped ease her mind, it still is extremely stressful to find out that your chances of having cancer are >0.
Thanks for the mental Stroke...needed that at 2am...(thx yall for the likes)
Its actually 2:00 in the morning as I'm reading this omg
Aye timezone bros✌🏻
It’s 8:30 in the morning right now and I haven’t slept
@@Ambarrabma fellow European
Hawaii?
This is why they repeat tests or change methods to a more specific test.
this is also why we don’t just test everyone for everything all the time
I heard "picture a thousand women" and had a panic attack
Almost too scary too imagine
i nearly died. glad that they aren't real though
Lmaooo y’all that’s so dumb omg pls 😂
@@ilomilo850 literally 😭 they're acting like 17 yo boys and for what
..just messing around? Damn.
It's actually the case with every screening test that we have to make a compromise between the sensitivity of the test and accuracy (generally increasing sensitivity decreases the accuracy). That is why you hear doctors usually say we positively suspect that you got this, and to confirm you need this (a biopsy in most cases)
Me who had my yearly check-up in May last year and they told me I was pregnant based on my blood results but not pregnant based on my urine results, so they assumed I was pregnant. I am a virgin and have had my period every month, at the same time for 11 years. In fact I had my period while they took that blood test.
False positives are very depressing, imagine if it was a false positive for something I couldn't immediately disprove. Like cancer or an infection.
@@LycanFerret I am sorry to hear that. Blood-based pregnancy tests are somewhat unreliable, especially during periods, but I advise you to report this false positive to your hospital. I am only a med student at the moment, but trust me, there are things far worse than pregnancy that could show up in this way, so it's better to discuss it with your doctor.
@@LycanFerret I'm not a doctor or in any medical field, but I do statistical analysis on some medical datasets, there is a decent portion of the time I don't understand what it is I'm looking at.
That being said, there is something called "loss" in stats that's factored into most/every medical test (I can say that with pretty much 100% certainty). What it does is put a value on "what would happen if I'm wrong about this". So if you said somebody has cancer and they don't, you'd run all the further testing and x-rays etc., and it would suck for the person for a bit. But imagine the flip side, "this says you don't have cancer but you actually do": it's a lot worse, so there is a much bigger "risk" (in the statistical and real sense). If it was 50/50 whether somebody had cancer or not, it would say they do, just to be on the safe side.
Similar with pregnancy (with a slightly lower risk factor) but that means that if it's "ehhhh the tests say one thing but this test says another", a lot of the time it will be negative but they value spotting true pregnancies more.
Actually building these "loss" functions is a whole different story because you're trying to put a value on a potential life. Too high and you say "everybody has cancer", too low and you start to miss people because it's less than 50% chance.
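To make the "loss" idea concrete, here is a minimal sketch in Python; the probability roughly matches the video's 1-in-11 posterior, but both loss values are made-up illustrations, not real clinical costs:

```python
# Pick the decision with the smaller expected loss. The loss values are
# made-up illustrations, not real clinical costs.
p_disease = 0.09          # e.g. the posterior after a positive screen
loss_missed_case = 100.0  # cost of clearing someone who is actually sick
loss_false_alarm = 1.0    # cost of follow-up testing for a healthy person

expected_if_flag = (1 - p_disease) * loss_false_alarm  # loss only if healthy
expected_if_clear = p_disease * loss_missed_case       # loss only if sick
print("flag for follow-up" if expected_if_flag < expected_if_clear else "clear")
```

With costs this lopsided, flagging wins even at a 9% disease probability, which is exactly why screening errs toward false positives.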
That's Covid for you.
@@LycanFerret this balance between sensitivity and specificity is what allows people who do have cancer to detect it early.
It's better to assume one has cancer at the earliest time and turn out to be a false positive than to miss the cancer when they actually have it.
Plus usually it only takes another test to confirm it.
It sucks for the false positives, but it will suck more for the false negatives.
This is an EXCELLENT summary of "positive predictive value" and the importance of factoring in prevalence. Well done
Thank you for the TH-cam Shorts escape hatch!
I was in a spiral doom scroll and this made me think too much and now im out haha
I should fkin start making these. Thanks for the ideas! lol
I needed this
This guy always helps us out of the endless scrolling
hee ho
Normally I understand everything of this nature that would confuse the general population, but this is the first time something really “broke my brain”
One thing that could help: before you ran the test, if you had taken a random woman, you had a 1% chance of her having cancer (10 in 1000).
Now if you take one that was positive on this test, you have around a 9% chance of her having cancer.
The test was extremely good; the issue is that the thing you are trying to detect has really low probability
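For anyone who wants to check that arithmetic, here is a minimal Bayes-rule sketch in Python, assuming the example's numbers (1% prevalence, 90% sensitivity, 91% specificity):

```python
# P(cancer | positive test) via Bayes' rule, assuming the example's numbers.
prior = 0.01   # P(cancer): 10 women in 1,000
sens = 0.90    # P(positive | cancer)
spec = 0.91    # P(negative | no cancer)

p_positive = sens * prior + (1 - spec) * (1 - prior)  # total P(positive)
posterior = sens * prior / p_positive                 # P(cancer | positive)
print(f"{posterior:.3f}")  # ~0.092, i.e. roughly 1 in 11
```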
I'm going to appeal to human's physical and concrete nature, as well as being a bit savage in the context of current wars. It seems surprisingly relevant in our current context.
Suppose there is a hospital that has 1000 people, and you suspect that 10 enemies are hidden in the building. You want to neutralize as many enemies as possible with some military technique, and you kinda don't care too much about collateral damage.
Suppose you pick some kind of radar technology with an airstrike that guesses that the enemies are on a certain floor. The airstrike lands, and it turns out that 90% of the enemies were in the missile's landing zone. Unfortunately, the missile's blast radius caused 89 civilian casualties.
In the end, the missile strike was 90% successful, but at the same time, only 10% of its casualties were the intended targets. It was accurate but it had a lot of spread. You can invert the count and justify that the strike only affected 10% of the hospital population, but 90% of those were innocent. However, only 10% of the enemies dodged the attack.
It may be just me, but Grant's way of explaining things makes it hard for me to follow. I'm not sure if it's his voice, cadence, or content, but he somehow manages to make it harder for me to understand.
@@daysofend That makes so much more sense. The video made it seem like those percentages were just made up and I couldn't math it out for the life of me.
Learning and implementing Bayes and Naive Bayes Theorem has always been a pleasure.
"Picture a thousand women" no im scared
I'm more afraid of the fact this is a USMLE exam question 😅
Kmzo
i was about to say that
As a lesbian, I'm scared too 😭
NAME A WOMAN
Great video! This 'paradox' is the reason why we learn to use positive and negative predictive values in med school, which give you more information than just the specificity of the test.
I mean I get that it's an important example to help people understand Bayes theorem, but I feel like it's always put in a way where it's like "Oh my gosh what are we going to do if the number doesn't mean what we think it means." And as you point out, simply use the correct terminology. Just describe it better and in ways people will understand: "You tested positive, but just know this test is wrong ten times more than it's right." Or more or less technical depending on the audience.
That's why it's a screening test, and not a diagnostic test.
This! Bumping so more people see this.
@@VaryaEQ thank you for specifying what you were doing. I thought you were loitering, about to break and enter a different comment.
@@xinpingdonohoe3978 😂
@@xinpingdonohoe3978 well I am loitering
@@xinpingdonohoe3978lmfao
“sample of 0 women” literally me
lmao
Fax
My app had a glitch where it was just frozen on that frame and playing goofy music from another short and I was just like wow
@@AidanRahder Mine too. It happened way too many times on diff devices and I think someone at YT screwed up
@@AidanRahder mind blowing stats LMAO
"we are all Bayesians now'
I think this is a great concept to share. Thank you for doing so.
I updated my odds of correctly answering a Bayesian probability question 3 years ago when I first watched the full video
😂
A great way to visualize this is by splitting the groups along what you are studying- in this case the accuracy of the test. Visualize two groups, positives and negatives. We have 98 positives with 89 false positives, and 902 negatives with just one false negative. Now we can see that while we can almost definitely trust a negative, a positive result requires further testing.
I think the best value in this first test is that it is cheaper and less involved than subsequent testing. A very accurate test might have fewer false positives, but cost a lot more in money, time, and stress. This first test is absolutely not a waste, so long as it doesn't take so long that the result comes too late for effective treatment.
Thank you, that information is actually really calming to me. 😊
@@annana6098 this is exactly why many tests like these are used, and if you are ever unfortunate enough to have you or someone you know test positive for a disease like this, the next step in their treatment will almost always be further testing. These tests are either more invasive or expensive, as you said, but also sometimes more dangerous.
@@piiinkDeluxe of course. All medicine these days is heavily supported by people in the fields of statistics or data science. Your doctors will understand if they give you a test like this that negative results are almost always accurate, oftentimes upwards of 99.9%. They will also know and inform you that preliminary positive results aren’t always accurate, and they will give you further testing.
False positives are more desirable than false negatives for screenings. Worst case in a false positive is that they get additional testing and are found to be negative. Worst case in a false negative is someone continues to have an untreated condition and then any thoughts of it being what they were tested for are dismissed due to the negative test. Screenings are only really to determine who should get further testing. Using a screening test is often much cheaper and better than doing the full tests on everyone.
Tests are typically only good for proving you have or you don’t have something, rarely both. Like with an antibody test for Addison’s disease, if you test positive you definitely have Addison’s, but only 70-80 % of people with Addisons actually test positive. So the test can be used on its own to prove you have Addison’s but cannot on its own be used to prove you don’t have Addison’s.
In the same way this breast screening test can be used to prove with a lot of certainty that you don’t have the condition but can’t be used on its own to prove you do have the condition.
Ngl the beginning of this short is a perfect take for the whole "think of a woman" "no" meme
I've learned this a half-dozen times and it's still hard to keep my head wrapped around.
I used pictures of your "Who is Steve?" video to teach conditional probabilities (probabilities given some information).
It was truly an experience for my students.
So thank you!
There are 2 core concepts in the statistics of diagnostic tests. Sensitivity and Specificity.
Sensitivity is the priority for screening tests, to "catch as many as we can", while high-specificity tests such as confirmatory tests are there to say "yep, you have this disease".
Both have their uses.
You are forgetting positive predictive value and negative predictive value. This video demonstrates PPV specifically. Not really a paradox. As prevalence increases, PPV increases, while NPV decreases. PPV is the likelihood a positive test result represents actual disease, or true positives / (true positives + false positives)
That's exactly what irked me. Prevalence of the thing you're looking for has a major impact. If I'm testing a group of males for breast cancer I'm gonna have a much lower PPV when compared to a female test group.
@@wh4t3v3rrr I'm not a statistician but I did some basic biostats during medical school and residency (including research). I've never heard of this described as a paradox, ever. That irked me.
Here's a thought experiment to make it more clear:
suppose no one in this group of 1000 women has breast cancer. Then even if the test is 99.9% specific, it will still give ≈1 false positive (and no matter how sensitive it is, there will be no false negatives, as that would be impossible).
So if you knew a positive patient, with there being no breast cancer in this world, the chance of that positive being wrong would still be 100%.
And in another world, where every woman has breast cancer, a positive result would always be right, even if the sensitivity were only 0.01%.
So this strongly depends on how high the chance of the sickness is before taking the test.
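Those two extreme worlds sit on a continuum; a short sketch sweeping the prevalence, with sensitivity and false-positive rate assumed to match the example:

```python
# Same test, different worlds: positive predictive value vs. prevalence.
sens, fpr = 0.90, 0.09  # assumed to match the example's test
for prevalence in [0.0, 0.001, 0.01, 0.1, 0.5, 1.0]:
    true_pos = sens * prevalence
    false_pos = fpr * (1 - prevalence)
    total_pos = true_pos + false_pos
    ppv = true_pos / total_pos if total_pos else float("nan")  # undefined if no positives
    print(f"prevalence {prevalence:6.1%} -> P(sick | positive) = {ppv:6.1%}")
```

At prevalence 0 every positive is false; at prevalence 1 every positive is right, regardless of how weak the test is.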
This is one of my favourite episodes you’ve done! What a great opportunity to escape the youtube shorts doom scroll and revisit this classic
Thanks! I'm glad you liked it. The Bayes factor is one of those topics I wish I had developed a good intuition for earlier.
I come back to it often. I always understand as I watch the video, but incorporating Bayesian thinking into my assumptions at work has proven.. well it’s tricky. And I never know if I’m thinking straight
@@3blue1brown Would you mind creating a short for the Bayes factor specifically? I occasionally forget how to make and use it properly and keep referring back to the original video as a refresher.
I feel like in class when everyone is like "Ooh that's why" and I'm still sitting trying to understand the first 5 words the professor said.
Same, the false positive thingy confused me tbh
You have 20 apples. 2 of them are rotten. The apple test is done, and only one of the rotten apples is correctly identified as rotten. So the test has a 50% accuracy rating, which means that of the other 18 apples, 9 of them are "false positives" for being rotten.
So in all likelihood, if you're told your apple is rotten there's a 90% chance (out of the 1 actual rotten and 9 false positives) it's actually fine and the test is wrong.
I'll try to break it down another way. There are 1,000 patients. They all get screened for one symptom of cancer. 98 tests come back "yes they have symptom" and those 98 get sent for further testing to see if they have cancer or that one symptom for other reasons, and the other 902 who got "no they don't have this symptom" as a result go home. The test said 90.2% of the original group of 1000 patients don't need to be tested further.
Of the 98 (or 9.8% of the original 1000) in the "yes" category, 9 people (or 9.2% of that 98) end up having cancer, and the other 89 can go home fine.
1 of the 902 people who were originally flagged "no" does have cancer and was sent home by mistake. That's .1% of the "no" category being wrong.
People flagged "yes" who have cancer: 9
People flagged "yes" who don't: 89
People flagged "no" who have cancer: 1
People flagged "no" who don't: 901
Accurate readings: 91%
False positives: 8.9%
False negatives: .1%
Obviously this sucks for the 1 false negative, especially considering that .1% of the population is well over 7 million. But it's something to put people at ease when they get flagged "yes" initially, since 10 out of 11 don't have cancer. And it shows how the test is still considered very accurate with all these incorrect readings because the vast majority are accurate and it's better to have more false positives than false negatives.
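A compact way to double-check that breakdown is to tally the four cells directly; a sketch using the counts from the comment above:

```python
# Confusion-matrix tally of the 1,000-patient breakdown above.
tp, fp, fn, tn = 9, 89, 1, 901   # counts from the comment
total = tp + fp + fn + tn        # 1,000 patients

print(f"accuracy        {(tp + tn) / total:.1%}")  # 91.0%
print(f"false positives {fp / total:.1%}")         # 8.9% of all patients
print(f"false negatives {fn / total:.1%}")         # 0.1%
print(f"PPV  {tp / (tp + fp):.1%}")                # ~9.2%: most 'yes' flags are wrong
print(f"NPV  {tn / (tn + fn):.1%}")                # ~99.9%: a 'no' is nearly certain
```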
It really sucks for the person who actually has cancer but gets a "no" result from the test
@@Brack_86 lost me after the first word
"So do I have or not have cancer?"
"Yesn't 😀"
Incredible explanation honestly
This should be a mandatory presentation to juries that are asked to assess statistical information such as DNA evidence.
DNA evidence is not how they screen for breast cancer. Some look for internal chemical imbalances and some use mammograms.
DNA is cut with certain enzymes and then put in a gel electrophoresis to separate the sizes of the strands that are left, which gives a DNA fingerprint, due to the sizes being different for every person unless you are identical twins
@@afez2752 not identical twins, but the accuracy of such a test far exceeds 99%.
I was pretty sure, but I'm no expert, so I just looked up "false positive DNA test." The only way a DNA test is false is via samples getting mixed up, be it mistake or fraud. I get wanting to educate juries, but DNA tests for identity or paternity aren't statistical inferences. It's hard science. Again, I'm not an expert. I looked it up a minute ago to double check.
This is exactly the kind of fallacious conclusion that I was instantly afraid people would draw from this.
While you're _technically_ correct in that DNA evidence still is "only" evidence rather than conclusive proof, it is way more reliable than this video makes it seem.
That massive reduction of confidence to 9% comes from the premise that people were tested once and at random without individual indication, resulting in roughly ten times as many false positives as true positives. Do the same test just a second time, and if it still turns out positive, the probability of the patient actually having cancer already goes up to ~50% by this alone (and a third independent positive pushes it past 90%). Now factor in that usually people don't get tested at random in the first place, but due to some form of prior indication (which is also usually the case in court trials), and the rate of false positives goes down (and therefore the reliability of even a single positive test result goes up) significantly as well.
Back to the courtroom, the accuracy of DNA evidence by far exceeds 90% to begin with. Again, it's not a flawless mathematical proof, but taking everything into account, false positives are one in many millions.
What the jury has to be made aware of, because it's a way more severe cause of erroneous judgement, is the fact that DNA testing only tells you whether a DNA sample stems from a certain person, but not how that sample got where it was found.
On another note, if you want to make sure the jury is able to adequately evaluate evidence, the first thing you should do is tell them what psychology and neuroscience have to say about the reliability of even sincere eyewitness testimony - only you wouldn't want that, because you'd basically have to throw the "best" - if not only - evidence there is in countless trials right out of the window.
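The repeat-testing point is easy to sanity-check by iterating the Bayes update; a sketch assuming the example's numbers and fully independent repeats (a simplification, since real retests are rarely independent):

```python
# Posterior after consecutive positive results, assuming independent tests
# with the example's sensitivity (0.90) and specificity (0.91).
def after_positive(prior, sens=0.90, spec=0.91):
    """One Bayes update for a positive result."""
    p_pos = sens * prior + (1 - spec) * (1 - prior)
    return sens * prior / p_pos

p = 0.01                # 1% prevalence
p = after_positive(p)   # first positive:  ~0.092
p = after_positive(p)   # second positive: ~0.50
p = after_positive(p)   # third positive:  ~0.91
print(f"{p:.2f}")
```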
@@christianosminroden7878 It's not about the DNA evidence but statistical evidence in general.
I learned about this on my own through the covid lock downs. The amount of misunderstanding, even in mainstream media, was crazy.
Statistical illiteracy is ubiquitous.
Careful there Carl, that's dangerously close to misinformation. Not that people would ever take sides based on having differing political ideals in a medical matter.
Maybe someday we can get back to humanity over political power again.
The vast majority of people do not understand statistics and probability at all ....
Yet most people are positive they understand it properly
@@yopomdpin6285 this is why gambling is so widespread.
“picture 1000 women” and my immediate reaction was “nice” ngl
I don’t know much about you but I’m gonna guess you’re on the orange site a lot
@@JITCompilation hackernews?
@@JITCompilationis there a rainbow site?
bro missed the point
@@Laceration_Gravityyy grindr
AWESOME!
I’m a math teacher. I find this concept to be one of the hardest things for students to comprehend. (Adults often seem beyond hope with it.) I will be memorizing and refining this excellent explanation.
THANK YOU!!!
This is a frequently asked class 12 mathematics question in India. First the prior probabilities of actually being +ve and -ve are taken, i.e. P(E1) and P(E2) respectively. Then A is the event of testing +ve. P(A|E1)·P(E1) and P(A|E2)·P(E2) are calculated, and their sum gives P(A) by the law of total probability. The probability that a person who tested +ve is actually +ve is then P(E1|A) = P(A|E1)·P(E1)/P(A).
Was looking for this one 💯.
Actually it is also taught in 10th if you have taken the advanced batch ..😅
He made a mistake in 9/(9+89). The answer is not ≈1/11, it is =1/10. There's no reason for it to be ≈.
@@Creamin_All_Offensive it is 1/10.88888..., which is about 1/11; you are wrong
Roundin ic
@@dansda5174
“Picture a thousand women”
I didn’t even know there were that many
Specificity and sensitivity has entered the chat
Precision and recall
I’ve heard specificity and sensitivity used as terms in this specific scenario, but in something like Natural Language Processing when you’re evaluating how well a system classifies text, yeah we do indeed call it precision and recall!
@@--86-- yes precision and recall in data science, but specificity and sensitivity in medical studies. Potayto Potahto
@@sadas3190 sensitivity and specificity in statistical pattern recognition as well
Not only. What matters is these numbers compared to the *incidence rate* (or prevalence) of the disease you want to test for. Which drives up or down your specificity and sensitivity requirements for your test. That's the whole point of the video, a test is only good or bad depending on the context.
And that's why you don't panic right away when they ask you to come back for more testing!
i will kms bc i can’t have all of this health issues. i’m not who i am in my brain. i don’t have meaning in life, i am not in my body, my body will not serve natural purpose, i will die without completing great purposes of nature. i was born to work, fight and then die with some stupid machine crushing my bones
@@nooooheyyy you always have purpose
@@nooooheyyy man, I get you. I’ve tried typing this shit before and it’s hard
*yet*
"picture a thousand women"
..... *starts drooling*
Had this question once in my calculus exam. Fucked me up for a while. It still hurts
but also, if you get a negative result, there's a 901/902 chance you dont got it
Which is why such a test would be valuable as a screening tool particularly if the test was fast or inexpensive or non-invasive or any of the three.
@IsItOver-xhkx it's reliable for the people who HAVE it who tested positive. Which is the point. To very accurately pick up those that could have it, so people with it aren't missing out on treatment.
Then a further test confirms or denies.
Which is why they do a lot of tests for things only when they suspect the thing, to rule it out reasonably confidently. Because some things are basically a coin toss if you test everyone, but almost everyone with a condition will test positive.
@@FFKonoko >it's reliable for the people who HAVE it who tested positive.
This statement makes no sense.
@IsItOver-xhkx The probability of a person having the disease and testing positive is different from the probability of a person not having a disease and testing positive. What the other commenter was getting at is that the probability of a person having the disease and correctly being identified as having it is very high, which is desirable in the real world.
I’ve missed your videos so much. It’s crazy, but I’m not even the same person I was the last time I watched your vids. I went to therapy, stopped having panic attacks, and have started to learn how to drive. Weird how life just goes on like that lol
"picture a thousand women-" This is truly a paradox
You don't know what a paradox is
@@Leto_0 Well, "paradox"
@@Leto_0 THAT'S QUITE A PARADOX!
People in all walks of life should see this video, because this helps one understand how statistics can be used and/or misused to build any story you want. Amazing. Thank you!
I was just thinking about this! How did youtube know?
"picture a thousand women" anxiety overload
Lol
"No sir"
Learned that for the first time with covid tests
I never took a covid test. Why would I take a test for some random flu?
That's why the training data matters. The distribution should align with the distribution of the total population
Just so you can miss more people??
“Numbers don’t lie”
Thank you for explaining prior probability, hopefully some people get it
Accuracy versus precision, and in this case we want the screening to be geared towards minimizing false negatives, so the percent of false positives will be large
I invented a better test that tells you if you have cancer or not with a 99% correctness. The test simply always says you don't have cancer.
I heard this explained by eddie woo using the analogy of covid testing, its an interesting concept
I could listen to you talk about math forever
A thousand? Damn. I didn’t know there were that many… uh…
Good way to understand type I and type II errors
This is why one of the subjects I hate in math is statistics 😂, I would rather do calculus.
This is one of the most useful skills to master; it corrects for our faulty human intuition.
I'm the complete opposite personally. I never even passed Pre-Calculus but I did fine enough in AP Statistics. Statistics is just a lot easier for me to wrap my head around.
@@KingNedya calculus feels deterministic while statistics feels random for me, so I prefer calculus. Even if the rules might be a bit weird, at least the rules are rules
@@KingNedya Statistics can be more accessible, yes - to a certain extent. But in order to make use of its potential to an even mildly profound degree, you have to apply a virtually pathological, paranoia-grade caution to avoid a gazillion traps, only starting with the "correlation vs causation" distinction; maybe the most hideous of them is the strong human inclination to intuitively make subtle yet impactful assumptions without even realizing it.
It's these traps, which are really hard to avoid and in addition can be - and too often are being - exploited to manipulate a statistical analysis in favour of a predetermined "desired" result, that are responsible for the oftentimes bad reputation of statistics in the public perception.
There are cases in medical data where you can use Bayesian analysis to do a meta epistemology review of Fourier Transform sorted data in a sea of noise, with calculus used to give quantitative scores in dynamic vector rates.
Some med students really struggle with Fourier, and have trouble understanding why it's so important when industrial electronics noise infests so many hospitals that need to measure very weak signals.
“Picture a thousand women-“
*sighs*
*checks comments*
Yo XD
You're a true inspiration.
The best and most beautiful things in the world cannot be seen or even touched, they must be felt with heart.
Yo how do you edit so good
I’m pretty sure it’s programmed and scripted then rendered to a file
That’s why screening tests are usually chosen due to their high sensitivity (ability of the test to be positive for those who actually have the illness, ie true positives). Afterwards, if the screening test came out positive, other tests that are usually high in specificity (ability of a test to be negative for those who do not have the illness, ie true negatives) are performed, thus clearing out the false positives from the screening test.
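A rough Monte Carlo sketch of that screen-then-confirm pipeline; all test characteristics here are made-up illustrations, not real assay specs:

```python
# Stage 1: a sensitive screen for everyone; stage 2: a specific confirmatory
# test for screen-positives only. Numbers are illustrative assumptions.
import random

random.seed(0)
N = 100_000
prevalence = 0.01
screen_sens, screen_spec = 0.99, 0.90      # catches nearly everyone, many false alarms
confirm_sens, confirm_spec = 0.90, 0.999   # clears nearly all false alarms

true_pos = false_pos = 0
for _ in range(N):
    sick = random.random() < prevalence
    if random.random() >= (screen_sens if sick else 1 - screen_spec):
        continue  # screen negative: sent home
    if random.random() < (confirm_sens if sick else 1 - confirm_spec):
        true_pos += sick
        false_pos += not sick

print(true_pos, false_pos)  # roughly 890 vs 10: the confirm step does the clearing
```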
High incidence rate is also important to begin with.
Though this short is a point in statistics, it’s fair to note that doctors understand that tests could be falsely positive. In high-risk conditions like breast cancer screening, suspected patients undergo a process of triple screening: clinical, radiological and pathological.
Can’t imagine what I haven’t been near
This guy makes you question life😂
You had me at "Picture a thousand women..."
"Picture 1000 women, and your chance of getting a single girlfriend is 0"
Better to get a false positive that can be further assessed, retested and found negative than the opposite
Mindblown by your skills.
It’s funny how any single variable can be slightly altered to make wildly different outcomes. This is based purely on this one scenario, not actual events and statistics.
Apparently even most doctors get this wrong when asked.
Doctors, like most other people in most other professions, are often crap at their jobs.
Surprised?
I mean, as a mathematician, it would surprise me more if most non-mathematicians didn't get this wrong.
Sad thing is that most medical experts do not understand this
Doctors fail any questions where the answer is not intuitive; they are not educated very effectively
Source : your ass
Did you get it ?
There’s no paradox. What he described is the positive predictive value of the screening test. What he failed to mention is the negative predictive value of the test (true negatives / [true negatives + false negatives]), which is 99.9% (901/[901+1]). So you can have very high confidence that if you test negative you really are negative. For the ones who screen positive you can then go on to do a more definitive diagnostic test (i.e. biopsy). There’s absolutely no paradox here; it’s a screening test, not a diagnostic test.
Paradox in the original sense of "counter-intuitive", not the modern sense of "self-contradictory".
Thanks for that different perspective. I was really confused by the video, but your explanation clarified it for me.
It only seems counterintuitive at first because he didn’t fully explain the concept of screening tests… just clickbait, and actually harmful because it makes it seem like medical tests are unreliable and rampantly misdiagnose people.
I have been working on some part of this problem at the University of Pittsburgh for my entire 27.5-year career. Lots of reader studies and evaluation of new technologies to try to reduce the false positive rates and to better train radiologists, fellows and residents.
I first heard of this issue in a book called Steal this Urine Test by Abbie Hoffman.
just yesterday i watched a video by Medlife Crisis about the harm of doing screening just for the heck of it and this is exactly one of the points Rohin was bringing up (though not with mathematical rigour)
How one thing can mean something different to what you think it means.. lovely video 🫶
Not a paradox but still crazy to think about
Bayes’ Theorem is so iconic.
Bro, you had me at “picture women” I’m too lonely 😩
Have you never even spoken to a woman that's not related to you
Aaahh I see that dirty twist in perspective, that’s cool
Do the test once for each person and give each positive person another test, and each positive person from that another test.
I remember studying this in probability class when the covid tests were only 50% accurate. The college was quarantining everyone with a positive test….
I used to teach this to co-workers as part of signal detection theory & contingency matrices. It's not a paradox but rather a function of the incidence rate on the test results and properly interpreting the results. The biggest problem is many people in the medical field get it wrong and people getting tested get needlessly panicked.
Oh I had that in school at the beginning of the year.
This is why you follow up the positive screening test with a gold standard test.
Great explanation.
You had me at “breast (cancer)”
My boyfriend's aunt was a false positive. They gave her chemo and radiotherapy and shortened her lifespan and damaged her physical condition. She is suing
if only the field of probability was this easy, my life would've been much easier this semester
Excellent video
For some reason I read the title as medieval test paradox
That's so important to understand
I believe the issue is coming from having a 1% probability of having the disease, and a 10% probability of a failed screening.
It would be absolutely different if the probabilities of having the disease and of a screening failure were the same.
Doctors need to find a way to explain to people that a positive result sometimes just means that more testing is necessary. We're adding tons of unnecessary stress to people's lives by making it seem like they tested positive for cancer.
I appreciate putting paradox in quotations. People are too accustomed to calling counterintuitive things paradoxes. Just because it’s unexpected doesn’t mean it’s impossible.
Accuracy vs precision of a confusion matrix
had pretty much the exact same question on my probability and statistics midterm this semester
I swear that's the exact Dark Souls font being used
and that's why there are screening tests and confirmation tests. the difference between sensitive and specific
Wow that's kinda awesome perspective
I remember this was the reason why some people were so concerned with Covid tests
This has always been a "The proof is in front of me but I still can't believe it" thing for me
Saving this for later
Extremely simple concept