Quantiles and Percentiles, Clearly Explained!!!

  • Published on 9 Jul 2024
  • Although there is a ton of conflicting information about quantiles and percentiles on the internet, this StatQuest filters out the noise and focuses on the most important things you need to know about these two summary statistics.
    For a complete index of all the StatQuest videos, check out:
    statquest.org/video-index/
    If you'd like to support StatQuest, please consider...
    Buying The StatQuest Illustrated Guide to Machine Learning!!!
    PDF - statquest.gumroad.com/l/wvtmc
    Paperback - www.amazon.com/dp/B09ZCKR4H6
    Kindle eBook - www.amazon.com/dp/B09ZG79HXC
    Patreon: / statquest
    ...or...
    YouTube Membership: / @statquest
    ...a cool StatQuest t-shirt or sweatshirt:
    shop.spreadshirt.com/statques...
    ...buying one or two of my songs (or go large and get a whole album!)
    joshuastarmer.bandcamp.com/
    ...or just donating to StatQuest!
    www.paypal.me/statquest
    Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
    / joshuastarmer
    #statquest #quantile #percentile

Comments • 250

  • @statquest
    @statquest 2 years ago +11

    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

    • @DataNgineering
      @DataNgineering 8 months ago +1

      aaand Sold Out! JK :)

  • @kakusniper
    @kakusniper 6 years ago +1

    It's a joy watching your stats videos. Thanks a lot.

  • @tianleizhou3947
    @tianleizhou3947 4 years ago +13

    Hi dude, your last sentence saved the rest of my day! I was struggling for hours figuring out the results calculated by quantile() on a vector of only 10 entries!!!

    • @statquest
      @statquest 4 years ago +3

      Hooray! I'm glad the video helped you figure out what was going on. :)

  • @isaacfurlani8015
    @isaacfurlani8015 2 years ago +22

    Clearly explained for a novice. Thank you, Josh. I really appreciate the time you've put into creating these. They're very helpful.

    • @statquest
      @statquest 2 years ago

      Glad you like them!

  • @grantsmith3653
    @grantsmith3653 4 years ago +2

    I liked the level at which you explained this. It was easy enough for me to understand, but explained fully so I feel like I totally get it. Thank you!

    • @statquest
      @statquest 4 years ago +2

      Thank you very much! :)

  • @jackignatev
    @jackignatev 5 years ago +51

    Dude, your intros are incomparable!

    • @AboutOliver
      @AboutOliver 3 years ago +3

      I'm definitely in the minority, I know that. But I really hate them.

  • @nourajamal8166
    @nourajamal8166 1 year ago +1

    Thank you so much for this video. This was really clearly explained, as promised in the video's title. I watched so many videos before I found yours. None of the previous videos was as well explained as yours. You are totally right when you said that StatQuest is special. YES IT IS!

    • @statquest
      @statquest 1 year ago

      Great to hear!

  • @anaswahid8520
    @anaswahid8520 4 years ago +24

    You are technically sound and logically consistent
    Since you explain in depth
    Therefore I love watching your videos

    • @statquest
      @statquest 4 years ago +2

      Thanks! :)

  • @abnp434
    @abnp434 6 years ago +1

    Nicely explained!! Crystal clear!

  • @reytns1
    @reytns1 6 years ago

    Always clearly explaining!

  • @rahmanhi
    @rahmanhi 6 years ago

    I was struggling to find out what Quantile means & finally got it! thank you.

  • @naf7540
    @naf7540 1 year ago +1

    Hi Josh, as usual, a super video, taking into account all the subtleties of quantiles/percentiles, Thank you!!

  • @oleersoy6547
    @oleersoy6547 4 years ago +1

    Explanations are brilliant too!! NICE WORK!!

  • @charlottet7548
    @charlottet7548 1 year ago +1

    This is incredibly clear and well explained! Thank you!

  • @simpuruit5289
    @simpuruit5289 4 years ago +7

    Thanks for the hard work.
    I didn't expect that intro from a Statistician LOL

  • @tanjimrafi6211
    @tanjimrafi6211 4 years ago

    Love your videos!!

  • @niknoor4044
    @niknoor4044 6 years ago +9

    Hi Joshua. Once again, a very good video! Any plans on making videos about quantile regression?

  • @skartikey
    @skartikey 1 year ago +1

    Easy to understand and to the point. Thanks!

  • @carlosherrero4990
    @carlosherrero4990 1 year ago +1

    What a great and clear way to teach! congrats :)

    • @statquest
      @statquest 1 year ago

      Thank you! 😃

  • @shinhyelee3169
    @shinhyelee3169 5 years ago +3

    awesome explanation! Thanks a lot

  • @kennetharguedas2316
    @kennetharguedas2316 3 years ago +1

    Thank you! clear and to the point!

  • @stonededge
    @stonededge 5 years ago +10

    How do you have the first blue dot being a 0.25 quantile at 3:07 and then suddenly becoming a 0.20 quantile at 5:05? I am a little confused, if the technical definition of a quantile is the number of points less than it. Thanks!

    • @statquest
      @statquest 5 years ago +8

      The difference is that when we are explicitly trying to find quantiles that divide the data into 4 equal-sized bins, we have to round to either the 25th, 50th or 75th quantile. So that's what is going on at 3:07. Later, we are calculating percentiles (but calling them quantiles because that is what is commonly done, as I mention at 4:16) and we don't have to round, so we call the point the 20th quantile, because that is what it is without rounding.

    • @satyaprakash784
      @satyaprakash784 3 years ago

      @@statquest Thank you for clearing it. I also had the same doubt.
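
The two numbers in this thread (0.20 and 0.25) can be reproduced with a short sketch. It assumes 15 distinct stand-in values (not the measurements from the video) and the "fraction of points strictly below" convention described in the reply; `fraction_below` and `nearest_quartile` are hypothetical helper names used only for illustration.

```python
# Stand-in data: 15 distinct values, NOT the measurements from the video.
values = list(range(1, 16))


def fraction_below(data, x):
    """Fraction of the values in data that are strictly less than x."""
    return sum(v < x for v in data) / len(data)


def nearest_quartile(fraction):
    """Round a fraction to the closest of the three quartile levels."""
    return min((0.25, 0.50, 0.75), key=lambda q: abs(q - fraction))


fourth_from_bottom = sorted(values)[3]
exact = fraction_below(values, fourth_from_bottom)  # 3 of the 15 points lie below
rounded = nearest_quartile(exact)

print(f"exact: {exact:.2f}, rounded to a quartile level: {rounded:.2f}")
```

Without rounding, the fourth point sits at 3/15 = 0.20; when only the three quartile levels are allowed, the closest one is 0.25.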

  • @XY-yg1ci
    @XY-yg1ci 4 months ago +1

    very clear explanation!!! tks!

    • @statquest
      @statquest 3 months ago

      Thank you!

  • @troyliddell7373
    @troyliddell7373 5 years ago +2

    Super dude. Keep them coming!!!!

    • @statquest
      @statquest 5 years ago

      Thank you! :)

  • @DS-nr9zc
    @DS-nr9zc 6 years ago +4

    Can you go over hidden markov models? Love your videos btw.

  • @oleersoy6547
    @oleersoy6547 4 years ago +1

    AWESOME BEGINNING!!!

  • @IngridKen
    @IngridKen 5 years ago +1

    I'll be honest 50% of this video is what i need.

  • @Cowwy
    @Cowwy 1 year ago +1

    Thank you for the clear explanation. :D

    • @statquest
      @statquest 1 year ago

      You're welcome!

  • @user-yu1oz9rh8p
    @user-yu1oz9rh8p 2 years ago +3

    hahaha great lecture and you got good sense of humour which makes the whole video more entertaining :)

    • @statquest
      @statquest 2 years ago

      Glad you enjoyed it!

  • @Im-Assmaa
    @Im-Assmaa 1 year ago +1

    Thank you so much. Your explanation is Top notch👌

  • @jimhawkins300
    @jimhawkins300 5 years ago +4

    do we always need to arrange data to ascending order for ungrouped data?

    • @statquest
      @statquest 5 years ago +1

      In practice, you can just call a quantile function on your data without having to pre-sort it. The quantile function will sort it for you.
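
As a quick check of that reply, here is a minimal sketch using Python's standard-library `statistics.quantiles` as one example of a quantile function (the choice of Python is an assumption; the thread does not name a language here). The data values are hypothetical.

```python
import random
import statistics

# Hypothetical data, already in ascending order.
ordered = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

# The same values in a scrambled order (fixed seed so the run is repeatable).
scrambled = ordered[:]
random.Random(0).shuffle(scrambled)

# statistics.quantiles sorts a copy of its input internally,
# so the order in which the values arrive does not matter.
same_result = statistics.quantiles(scrambled, n=4) == statistics.quantiles(ordered, n=4)
print(same_result)
```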

  • @JoshKonoff1
    @JoshKonoff1 2 years ago +1

    Wow...This is literally the best movie I've ever seen. Thank you!

  • @fkhan4504
    @fkhan4504 6 years ago +6

    Love watching your videos

    • @statquest
      @statquest 6 years ago +1

      Thank you so much! I'm really glad to hear you like the videos :)

  • @peshalgoel7414
    @peshalgoel7414 4 years ago +3

    Stat Quest is special.....Yes it is!!

  • @pellurumanoj519
    @pellurumanoj519 5 years ago +4

    Wow!!!! no one can explain quantiles and percentiles better than this explanation, at least I feel this way.

    • @statquest
      @statquest 5 years ago

      I'm glad you like the video so much! :)

  • @wanhope3660
    @wanhope3660 6 years ago +1

    Awesome!

  • @stavalfi
    @stavalfi 3 years ago +2

    "Quantiles and percentiles are just a matter of finding out how many values are less than the value you are interested in". Interesting. Thanks!

  • @mosama22
    @mosama22 2 years ago +1

    Thank you for the beautiful video :-)

    • @statquest
      @statquest 2 years ago

      Glad you enjoyed it!

  • @pratapseshachalam2859
    @pratapseshachalam2859 5 years ago +1

    awesome video. It's made my day :)

    • @statquest
      @statquest 5 years ago

      Thank you! :)

  • @kimi708
    @kimi708 2 years ago +1

    Love your videos

  • @shubhamsharma14
    @shubhamsharma14 5 years ago +2

    very nice explanation

    • @statquest
      @statquest 5 years ago

      Thank you! :)

  • @jaysonklau3683
    @jaysonklau3683 4 years ago +1

    Thank you ^^

  • @bestest43
    @bestest43 4 years ago +2

    If you consider the first element as 0th quantile, then how do you get 100th as you get 14/15 for the last one?

    • @statquest
      @statquest 4 years ago

      It doesn't really make sense to call the first element the 0th quantile because that means 0% of the data is equal to or less than that quantile.

  • @parthmadan671
    @parthmadan671 2 years ago +1

    Thanks a lot.

    • @statquest
      @statquest 2 years ago

      Most welcome!

  • @LUSCIOUSDUNCAN
    @LUSCIOUSDUNCAN 1 year ago +1

    just saw someone post a Q-Q plot with regards to the price of an asset and was like, "what the hell is a Q-Q plot? and what the hell is a 'quantile'?" only to find out that quantiles were related to percentiles which were a mathematical concept that i had always struggled with. i love love love math but percentiles were one of those things i just couldn't fully grasp. bookmarking this to my lil "education" folder so i can come back to this when i need it. thanks!! :^)

  • @alexiasantos5526
    @alexiasantos5526 3 years ago

    Hi Josh. Is there a rule for deciding how many quantiles to split the data into? Or can I just pick any number, regardless of the characteristics of the data?

    • @statquest
      @statquest 3 years ago

      Generally speaking, the most commonly used are quartiles (dividing the data into 4 equally sized pieces) or percentiles.

  • @ashishtiwari1912
    @ashishtiwari1912 4 years ago

    I would like to see videos on time series- ARIMA Model,ACF and PACF plots

  • @SunSan1989
    @SunSan1989 10 months ago

    Dear Josh, why do I get residual plots in some software after fitting a line, where the X-axis is labeled 'Regular Residual' and the Y-axis is labeled 'percentile'? Is this a Q-Q plot of residuals?

    • @statquest
      @statquest 10 months ago

      I don't know. I've never seen one before.

  • @serdomal8796
    @serdomal8796 5 years ago +7

    5:44 i have the feeling that this kinda explains the central limit theorem, am i wrong?

    • @statquest
      @statquest 5 years ago +9

      It's a little different. For example, if you only had 3 points (point A, B and C), then the gaps between datapoints would be large and there would be a relatively big difference between the quantiles if one method said the first quantile was A and another method said the first quantile was B. But when there is tons of data, the gaps between datapoints will be small and the difference between A and B will be much smaller, so if one method says the first quantile is A and the other says it is B, those two values will be close to each other. Does that make sense?

    • @serdomal8796
      @serdomal8796 5 years ago +1

      @@statquest yeah totally, thanks
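
The point about gaps shrinking as the dataset grows can be sketched numerically. This example assumes evenly spaced hypothetical values on [0, 1] and compares the two interpolation rules offered by Python's `statistics.quantiles` (`"exclusive"` and `"inclusive"`); these two rules stand in for "method A" and "method B" and are not the methods shown in the video. `first_quartile_gap` is a hypothetical helper name.

```python
import statistics


def first_quartile_gap(n_points):
    """Absolute difference between two first-quartile estimates
    for n_points evenly spaced values on [0, 1]."""
    data = [i / (n_points - 1) for i in range(n_points)]
    exclusive = statistics.quantiles(data, n=4, method="exclusive")[0]
    inclusive = statistics.quantiles(data, n=4, method="inclusive")[0]
    return abs(exclusive - inclusive)


# The disagreement between the two rules shrinks as the points get denser.
for n_points in (5, 50, 5000):
    print(n_points, first_quartile_gap(n_points))
```

With only 5 points the two rules disagree by a sizeable fraction of the data range; with 5000 points the disagreement is tiny, because neighboring points are so close together.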

  • @singrevolution
    @singrevolution 6 years ago +5

    Thanks for the video! Can you make one on quantile regression, please?

    • @statquest
      @statquest 6 years ago

      That's on the to-do list, but it might be a while before I get to it.

    • @omarabdelrahman4454
      @omarabdelrahman4454 3 years ago +1

      @@statquest "Waah, waah, waah". :(

  • @piotrszocik7775
    @piotrszocik7775 4 years ago +1

    Great explanation, have a nice day :)

  • @tohabin5064
    @tohabin5064 6 years ago +2

    tnx

  • @conybabyyeah
    @conybabyyeah 1 year ago +1

    Thank you

  • @phucthinh3157
    @phucthinh3157 5 years ago +3

    Dear Josh, so the top dot is the 14/15 = 93% quantile? And we never have the 100% quantile?
    Suppose we have 1000 dots, the top is the 999/1000 = 99.9% quantile, could we round it to say it is the 100% quantile?

    • @statquest
      @statquest 5 years ago +2

      Remember, there are a lot of ways to define quantile and percentile. One way to define it, used in this video, is the percent of values below a specific value. However, it's also defined as the percent of values equal to or less than that value. In that case you'd have 100%.

    • @phucthinh3157
      @phucthinh3157 5 years ago +2

      @@statquest thank you so much for the detailed explanation

  • @agiledev5773
    @agiledev5773 1 year ago

    Hi Josh, I don't understand the graph. Since the y-axis is gene expression, what is the x-axis? Also, what do you mean by gene expression on the y-axis? Are these types of gene expression?

    • @statquest
      @statquest 1 year ago +1

      The data come from gene expression measurements made from mouse cells. So the y-axis is gene expression (how much each gene is transcribed) and the x-axis represents the specific mouse. If we had 2 mice, we'd have 2 columns of dots.

  • @harshmalik3470
    @harshmalik3470 5 months ago +1

    What would i even do without you Josh

    • @statquest
      @statquest 5 months ago

      :)

  • @mengfu6461
    @mengfu6461 4 years ago +1

    Is the blue point the 25th percentile (3:09) or the 20th (4:56)? Thanks for answering.

    • @statquest
      @statquest 4 years ago

      Both! Starting at 0:36 I talk about how quantile and percentile have multiple definitions and multiple ways to be calculated. The point is that with quantiles and percentiles, if you have a lot of data, details are not important, the bigger picture is important. If you don't have a lot of data, then be very cautious with your conclusions.

    • @mengfu6461
      @mengfu6461 4 years ago +1

      @@statquest Got it! thanks for helping out

  • @jenevavergara4125
    @jenevavergara4125 5 years ago +1

    Hi, thanks for the great video. Do you have an R tutorial on how to get the quantile from a fitted distribution?

    • @statquest
      @statquest 5 years ago

      I don't, but that's a great idea. :)

    • @jenevavergara4125
      @jenevavergara4125 5 years ago +1

      @@statquest would love to watch it from your channel soon

  • @giovanavieira7708
    @giovanavieira7708 4 years ago +1

    loved this! thanks from Belém do Pará, in Brazil!

    • @statquest
      @statquest 4 years ago +1

      Muito obrigado (thank you very much)! :)

  • @adhiyamaanpon4168
    @adhiyamaanpon4168 4 years ago

    hey josh..one doubt!! if someone says 1st quantile is 0.07..what should i interpret from that..does that mean below that value only 1 data point falls or something else?

    • @statquest
      @statquest 4 years ago

      It depends...however, usually that means 1% of the data are less than that point (0.07).

    • @adhiyamaanpon4168
      @adhiyamaanpon4168 4 years ago +1

      @@statquest thanks man!!

  • @swarnabandi7670
    @swarnabandi7670 4 years ago +1

    Superb

  • @salma-hh1sf
    @salma-hh1sf 3 years ago

    OMG so easy THANK YOU SO MUCH 😃😃❤️❤️❤️❤️❤️😍😍😍😍😍

    • @statquest
      @statquest 3 years ago

      You're welcome 😊

  • @IngridKen
    @IngridKen 5 years ago +4

    I have so much trouble with our teacher. He just introduced quartiles, percentiles and deciles and taught them all in one lesson, and now it's exam time and we're having trouble because they're included in the test and we barely get any of it 😣

  • @itsmichaelaforeal
    @itsmichaelaforeal 3 years ago +4

    I'm staying for the intros (and the content, of course!)

  • @venkatramanirajgopal7364
    @venkatramanirajgopal7364 4 years ago +2

    At 3:37 can be related to Box Plots.

  • @tanyasingh2781
    @tanyasingh2781 1 year ago

    Considering the point where you mentioned "the terms quantile and percentile are used when we divide each datapoint in its own group", what happens when we have, let's say, 200 datapoints? Do we have 200 percentiles? If yes, do we plot all 200 percentiles in a Q-Q plot? I am really stuck on this...

    • @statquest
      @statquest 1 year ago

      Usually we just use 100 percentiles.

  • @josevaldes7493
    @josevaldes7493 2 years ago +1

    Thanks

  • @derrickjator9282
    @derrickjator9282 3 years ago

    how are you getting the values of 2.5 & 7.3?

    • @statquest
      @statquest 3 years ago

      Those are just the y-axis values that correspond to the points at the different quantiles.

  • @osamahassan7029
    @osamahassan7029 2 years ago

    Hey, can you please recommend a book for practicing the concepts you teach?

    • @statquest
      @statquest 2 years ago

      Not yet. I'm writing one right now, though, and it should be out in early 2022.

  • @subjord
    @subjord 5 years ago +2

    Thanks for the explanation. By the way, there is no difference between 0.5 and 50% since 50%=0.5. It's mathematically exactly the same, so both notations can always be used.

    • @statquest
      @statquest 5 years ago

      That's exactly right! :)

  • @alexlee3511
    @alexlee3511 1 year ago

    say if today i have 1000 samples, can i refer the third data point (2/1000) as 0.2% quantile or 0.2th percentile

    • @statquest
      @statquest 1 year ago +1

      If you wanted to.

  • @rossxie9809
    @rossxie9809 1 year ago

    You said "quantiles are just the lines that divide data into equally sized groups". What equally sized groups does the 25% quantile split the data into?

    • @statquest
      @statquest 1 year ago +1

      When we look at all of the quantiles that we are going to use, so, in your case, you might look at the 25% 50% and 75%, you'll create 4 equally sized groups.

  • @conybabyyeah
    @conybabyyeah 2 years ago +1

    You save me thanks

  • @SirGeforce
    @SirGeforce 1 year ago

    I have a question. How can the blue point (4th from the bottom) be the 20th and 25th percentile at the same time?

    • @statquest
      @statquest 1 year ago +1

      Rounding. With more data points, we'd end up with finer, more precise quantiles and percentiles.

  • @nobiaaaa
    @nobiaaaa 1 year ago +1

    I never knew the meaning of quantile until today.

  • @Dominus_Ryder
    @Dominus_Ryder 4 years ago +1

    @ 3:25, why is the 0.75 quantile 7.3, instead of 7.5? The 0.25 quantile was 2.5...

    • @statquest
      @statquest 4 years ago +2

      The 75% quantile crosses the y-axis at 7.3.

  • @python4beginners164
    @python4beginners164 3 years ago

    Sir, I have a question.
    How do I find the interquartile range when the full dataset is not given? Instead, the Q1, Q3, min and max values are given.
    Please reply...

    • @statquest
      @statquest 3 years ago +1

      The interquartile range is the middle 50% of the data, so it spans the values between Q1 and Q3, and its width is Q3 - Q1. The min and max values are not needed.
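
As a small worked example of that answer, the sketch below plugs in the two quantile values quoted elsewhere in this comment section (2.5 for the 0.25 quantile and 7.3 for the 0.75 quantile). Treating them as Q1 and Q3 of one dataset is an assumption made only for illustration.

```python
# Values quoted in this comment section for the video's example:
# the 0.25 quantile is 2.5 and the 0.75 quantile is 7.3.
q1 = 2.5
q3 = 7.3

# The interquartile range is the width of the middle 50% of the data.
# The minimum and maximum play no part in it.
iqr = q3 - q1
print(f"IQR = {iqr:.1f}")
```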

  • @rodeooswald6940
    @rodeooswald6940 3 years ago +1

    lov it

  • @RebeliousSapien
    @RebeliousSapien 3 years ago

    I'm confused. If a quantile divides the data into groups with equal numbers of points, how is, for example, a .95 quantile achieving this?
    I get that at the .95 point it means 95% of my data points are below that "line", but where does the definition apply here? How are the datapoints divided into equal groups? Because below that "line" you have 95% of your data and above it 5%... the datapoints are not equally divided.

    • @statquest
      @statquest 3 years ago

      Depending on the number of values in your dataset, you end up with situations where it's not possible to form groups that are exactly equal. In order to deal with this problem, there are a ton of ways to calculate quantiles (as I mentioned at 0:44). However, that said, when the dataset is large enough, the differences don't matter any more.

    • @RebeliousSapien
      @RebeliousSapien 3 years ago

      @@statquest thank you for replying.
      I think it's more clear now what quantiles are all about. thanks for the help

  • @crystalss8354
    @crystalss8354 5 years ago +1

    Thanks 🙏🏻

  • @govamurali2309
    @govamurali2309 4 years ago

    At 2:05, how is the gene expression calculated as 4.5 and how is the scale for the axis chosen?

    • @statquest
      @statquest 4 years ago +1

      The scale is arbitrary. However, we pick 4.5 as the median because 50% of the measurements are below that value.

    • @govamurali2309
      @govamurali2309 4 years ago

      @@statquest thanks for your response, so we can pick 5 as the median as well and we can change the scale to 10 instead of how you chose the scale as 9 and the median as 4.5, am I correct?

    • @statquest
      @statquest 4 years ago +1

      @@govamurali2309 In this example, 5 is not a good value for the median because there are more observations with values < 5 than there are observations with values > 5. In contrast, 4.5 is a good value for the median because there are an equal number of observations with values < 4.5 and observations with values > 4.5.
      If you changed the scale, then you would change the median value. However, we still want a value that splits the data such that there is an equal number of observations with values > median and observations with values < median.

    • @govamurali2309
      @govamurali2309 4 years ago +1

      @@statquest Thanks got it now :)

    • @statquest
      @statquest 4 years ago +1

      @@govamurali2309 Hooray! :)
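
The "equal number of observations on each side" test from this thread can be sketched as follows. The seven measurements are hypothetical (chosen so that the median is 4.5, like the value discussed above) and are not the data from the video; `split_counts` is a hypothetical helper name.

```python
import statistics

# Hypothetical measurements, NOT the data from the video.
values = [1.2, 2.5, 3.1, 4.5, 5.0, 7.3, 8.8]


def split_counts(data, cut):
    """Return (number of values below cut, number of values above cut)."""
    below = sum(v < cut for v in data)
    above = sum(v > cut for v in data)
    return below, above


median = statistics.median(values)

# A good median leaves the same number of observations on each side.
print(median, split_counts(values, median))

# A candidate that is too high leaves more observations below than above.
print(5.0, split_counts(values, 5.0))
```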

  • @BeginnerVille
    @BeginnerVille 9 months ago

    So, there's no 100th percentile in the example, because the top point only has 14 points lower than it, giving 14/15, right?

    • @statquest
      @statquest 9 months ago +1

      What time point in the video, minutes and seconds, are you asking about?

    • @BeginnerVille
      @BeginnerVille 9 months ago

      @@statquest
      Thanks for notifying!
      At 5:03.

    • @statquest
      @statquest 9 months ago +1

      @@BeginnerVille It's a toss-up as to whether we have a 0th percentile or a 100th percentile. We can have either one, but not both. So you can include the point in the calculation (1/15 ≈ 7% or 15/15 = 100%) or not (0/15 = 0% or 14/15 ≈ 93%). In this example, we don't include the point. Note: We can do it either way because usually these sorts of divisions are only done with a lot of data, and one point doesn't make a big difference.

    • @BeginnerVille
      @BeginnerVille 9 months ago +1

      @@statquest
      I like how you referred back to the example shown in the video, it becomes much clearer!
      Thank you so much!
      (I felt like I heard a 'Bam!')

    • @statquest
      @statquest 9 months ago

      @@BeginnerVille bam!
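
The two conventions from this thread (leave the point itself out, or count it) can be put side by side in a minimal sketch. It assumes 15 distinct stand-in values rather than the video's measurements; `pct_below` and `pct_at_or_below` are hypothetical helper names.

```python
# Stand-in data: 15 distinct values, NOT the measurements from the video.
values = list(range(1, 16))


def pct_below(data, x):
    """Percent of values strictly less than x (the point itself is not counted)."""
    return 100 * sum(v < x for v in data) / len(data)


def pct_at_or_below(data, x):
    """Percent of values less than or equal to x (the point itself is counted)."""
    return 100 * sum(v <= x for v in data) / len(data)


lowest, highest = min(values), max(values)

# Not counting the point: the range starts at 0% and stops short of 100%.
print(f"{pct_below(values, lowest):.0f}% to {pct_below(values, highest):.0f}%")

# Counting the point: the range starts above 0% and reaches 100%.
print(f"{pct_at_or_below(values, lowest):.0f}% to {pct_at_or_below(values, highest):.0f}%")
```

Either convention gives one endpoint but not the other, which is the toss-up described in the reply.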

  • @bikashchandragupta6333
    @bikashchandragupta6333 5 years ago +2

    For the 4th blue dot, which you are calling the 25th percentile, there are 3 dots below it, so the percentage becomes 3/15*100 = 20%. If I count that dot as well, then I have 4/15*100 = 26.67%. So why are you calling that dot the 25th percentile? I can't figure it out, since you are saying the 25th percentile means 25% of the data is equal to or less than that value.

    • @statquest
      @statquest 5 years ago +1

      In this part of the example, I call this the 25th percentile or 25% quantile, because the lines have divided the data into 4 equal portions. This makes the lowest line the 25th percentile, the middle line the 50th percentile and the highest line the 75th percentile. Dividing the data into four equally sized groups is just one way to determine quantiles. When you do it this way, you have to do some rounding because, as you noticed, there is no specific point that is exactly above 25% of the data. However, since each group of points is equally sized, we still call it the 25th percentile. Later, when I show how each data point can be considered its own quantile, you can be much more precise in defining the quantiles for each point. Does that make sense? The important thing to remember is that you should really only trust quantiles when there is a lot of data. When you have a lot of data, the differences among ways of determining quantiles are insignificant.

    • @bikashchandragupta6333
      @bikashchandragupta6333 5 years ago

      @@statquest For the data: 5, 10, 15, 20, 25, 30, 35, 40, 45, 50.
      I calculated Q1 = 13.75, Q2 = 27.5, Q3 = 41.25 from the percentile formula P=(N+1)/100, and Excel gives the same results.
      But by observing the data, it is 15, 27.5 and 40 that divide the data into 4 equal parts, so Q1 = 15, Q2 = 27.5, Q3 = 40, and my CASIO fx-991 EX calculator gives the same results.
      So, can you tell me why there is this discrepancy, and which answer I should take?
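
Both sets of numbers in this comment can be reproduced from the same ten values by changing only the quartile rule. The sketch below uses Python's `statistics.quantiles` with its `"exclusive"` and `"inclusive"` methods, plus a hand-written "median of each half" rule. `median_of_halves` is a hypothetical helper; it is one rule that yields 15 / 27.5 / 40 and is not a claim about how any particular calculator or spreadsheet is implemented.

```python
import statistics

data = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]

# Rule 1: interpolate at position p * (N + 1) in the sorted data.
exclusive = statistics.quantiles(data, n=4, method="exclusive")

# Rule 2: interpolate at position p * (N - 1), treating the min and max
# as the 0th and 100th percentiles.
inclusive = statistics.quantiles(data, n=4, method="inclusive")


def median_of_halves(values):
    """Rule 3 (illustrative): Q2 is the median; Q1 and Q3 are the medians
    of the lower and upper halves (the middle value is dropped when N is odd)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[len(ordered) - half:]
    return [statistics.median(lower), statistics.median(ordered), statistics.median(upper)]


print("exclusive:       ", exclusive)
print("inclusive:       ", inclusive)
print("median of halves:", median_of_halves(data))
```

All three rules agree on the median and differ only in Q1 and Q3, which is the small-sample disagreement discussed throughout this comment section; with only ten values, none of the answers is wrong, they just follow different definitions.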

  • @lilyha2470
    @lilyha2470 4 years ago

    Hi Josh, do you have videos on Rstudio?

    • @statquest
      @statquest 4 years ago

      I don't, but maybe one day I will.

  • @roxyzhang6293
    @roxyzhang6293 3 years ago +1

    this intro is bomb

  • @uiru2900
    @uiru2900 3 years ago

    So the median has 7/15ths of the observations below it. How is it then the .5 quantile?

    • @statquest
      @statquest 3 years ago

      Because 7/15ths of the observations are below it and 7/15ths are above it, the median is right in the middle, and thus, the 0.5 quantile.

    • @uiru2900
      @uiru2900 3 years ago

      @@statquest The first observation marks the 0% quantile, as there are 0 observations below it.
      The second observation marks the 7% quantile, because 1/15th (0.0666...) of the observations are below it.
      Following this logic...
      The eighth has 7/15 of the observations below it, and would be about the 47% quantile.
      I obviously understand that it is "in the middle" but I thought you were defining quantiles by what percentage of the observations are below them.

    • @statquest
      @statquest 3 years ago

      @@uiru2900 Unfortunately there are a ton of ways to define "quantile", however, one common way is

  • @KhushiKumari-ef4cg
    @KhushiKumari-ef4cg 4 years ago

    How to calculate???

  • @canernm
    @canernm 3 years ago +2

    Hey and thanks a lot for your amazing videos, they've helped me a lot. One question regarding this one: "quantiles are just the lines that divide data into equally sized groups". Isn't that true only for the median? For example, the 75% quantile in your video splits the data into 2 groups, one with 3 observations larger than the 75% quantile and 11 observations smaller than it.

    • @statquest
      @statquest 3 years ago +2

      You are correct in that individual quantiles do not all separate the data into equally sized groups - however, all of the quantiles, taken together, divide the data into equally sized groups. So if someone said, "I divided the data with 4 quantiles" you would know that there were 5 equally sized groups.

    • @canernm
      @canernm 3 years ago +2

      @@statquest I see, thank you for the clarification! Have a great day.

    • @xoda345
      @xoda345 1 year ago

      @@statquest Hi Josh, lets say that we divide the data in 4 quantiles, i,e 25 percentile,50 percentile, 75 percentile and 100 percentile. How can there be 5 regions? Should not there be 4 regions?

    • @statquest
      @statquest 1 year ago

      @@xoda345 I guess you could debate whether or not the 100 percentile is actually a quartile or not.
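
The counting question in this thread (how many cut points versus how many groups) can be checked with a short sketch. It assumes 100 distinct hypothetical values and uses Python's `statistics.quantiles`, which returns only the interior cut points; `group_sizes` is a hypothetical helper name.

```python
import statistics
from bisect import bisect_right


def group_sizes(data, n_groups):
    """Cut data into n_groups with statistics.quantiles and count each group."""
    cuts = statistics.quantiles(data, n=n_groups)  # returns n_groups - 1 cut points
    sizes = [0] * n_groups
    for v in data:
        # bisect_right gives the index of the group that v falls into.
        sizes[bisect_right(cuts, v)] += 1
    return cuts, sizes


data = list(range(1, 101))  # hypothetical: 100 distinct values

quartile_cuts, quartile_sizes = group_sizes(data, 4)
quintile_cuts, quintile_sizes = group_sizes(data, 5)

print(len(quartile_cuts), "cut points ->", quartile_sizes)
print(len(quintile_cuts), "cut points ->", quintile_sizes)
```

Three interior cut points (25%, 50%, 75%) give four equal groups, and four interior cut points give five. Counting the 100th percentile as a cut point adds a label but not another group, since nothing lies above it.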

  • @witnessa7x
    @witnessa7x 3 years ago

    I feel like I understood the video, but I feel like I'm missing a logical jump. We said that in a sample with fifteen data points the 50% quantile would have seven points below it, and seven points above it. Fair enough. 15/2=7.5, and perhaps that 0.5 comes from the line going through the median point itself. But this doesn't really seem to generalize well in the scheme you present at the beginning of this video.
    Perhaps it's best shown by saying that at 2:56 you highlight a purple point as being the 25% quantile because you've bisected twice. However, at 4:52 you refer to that same point as the 20% quantile, because three of the fifteen points are below it. Both approaches make some intuitive sense to me, but they give notably different results for quantile measurement.

    • @statquest
      @statquest 3 years ago

      Unfortunately, one of the annoying things about quantiles is that there are a ton of ways to calculate them (as I mentioned at 0:44). Depending on the number of values in your dataset, you end up with situations where it's not possible to form groups that are exactly equal, so we have a lot of different formulas to deal with this, and that means we get small differences in the results. However, that said, when the dataset is large enough, the differences don't matter any more.

  • @pjosiahjoseph6730
    @pjosiahjoseph6730 1 year ago

    Anyone know what percentile errors are? Thank you

    • @statquest
      @statquest 1 year ago

      I'm not familiar with it. :(

    • @pjosiahjoseph6730
      @pjosiahjoseph6730 1 year ago +1

      @@statquest No problem, thanks for all the great videos!

  • @angelawang3670
    @angelawang3670 5 years ago

    is a quartile just a 0.25 quantile ?

    • @SergeySenigov
      @SergeySenigov 7 months ago

      I guess the first quartile (or Q1) is a 0.25 quantile. The second (Q2) is 0.5 quantile, the 3rd (Q3) is 0.75 quantile.

  • @sepideh1111
    @sepideh1111 1 year ago

    Great explanation, but how is the 7% quantile equal to the 7th percentile? It would make sense if they were different. Can you please explain? Thanks

    • @statquest
      @statquest 1 year ago

      I'm not sure I understand the question. This video talks about how there is a strict definition of quantile, which is one thing, and then there is how the term is used in practice, which is different. In practice, the terms quantile and percentile are interchangeable.

  • @GauravPadawe
    @GauravPadawe 4 years ago +2

    First we need to sort the data in this case.

    • @statquest
      @statquest 4 years ago +2

      My intention was that sorting would be implied by the way the data is put on the graph. However, you are correct, I probably should have stated it explicitly.

    • @GauravPadawe
      @GauravPadawe 4 years ago +2

      @@statquest No worries Sir. Very well explained tho.

  • @danzinde
    @danzinde 1 year ago

    How is percentage different from percentile?

    • @statquest
      @statquest 1 year ago +1

      A percentile implies that a percentage of data has smaller values. For example, the 6th percentile implies that 6% of the data has lower values. In contrast, 6% simply means 6% of the data share some feature. In other words, percentile has a narrower definition, and is a specific case of a percentage.

  • @rikoimade7042
    @rikoimade7042 3 years ago

    this is the first video without "BAM!!"

    • @statquest
      @statquest 3 years ago

      It's an old one, pre-bam.

  • @xiangyuanli1849
    @xiangyuanli1849 2 years ago

    can I ask why 75% quantile is 7.3 instead of 7.5 ?

    • @statquest
      @statquest 2 years ago

      Because the value of the point, such that 75% of the data is below it, is 7.3.

  • @milkywayandbeyond
    @milkywayandbeyond 4 years ago +1

    Is it possible for someone to score in the 100th percentile of a standardized test?

    • @statquest
      @statquest 4 years ago +1

      It depends on how, exactly, you define percentiles. In this video I demonstrate one of many methods. When there is a lot of data, all of the methods are going to give you very similar results, so it's no big deal. However, if you only have a small amount of data, it's worth trying different approaches.

    • @milkywayandbeyond
      @milkywayandbeyond 4 years ago

      @@statquest Thanks! So when someone says they scored in the 100th percentile of a large standardized test, did they actually score in a percentile labelled "the 100th percentile" or did they really score in the 99.6th and have it rounded up to 100?

  • @userjieranli
    @userjieranli 5 years ago +2

    cannot be more clear

  • @darkmatter4768
    @darkmatter4768 4 years ago

    can someone explain the last part 1/15 part how it is 1 percentile

    • @statquest
      @statquest 4 years ago

      At 4:36 I say that 1/15 is the 7% quantile or 7th percentile. Is that what you are asking about?

    • @darkmatter4768
      @darkmatter4768 4 years ago

      @@statquest Yes, I understood that for quantiles you split the values into 4 equal parts, but for percentiles you divided them into 15 parts. How?
      Thanks for uploading this video!

    • @statquest
      @statquest 4 years ago

      There are 15 data points, and each one represents a different percentile.

  • @siddharthgurav6407
    @siddharthgurav6407 1 year ago

    Then what is the difference between a decile and a quantile?

    • @statquest
      @statquest 1 year ago +1

      Technically a decile divides the data into 10 parts. But practically speaking, people just use quantiles and percentiles.

    • @siddharthgurav6407
      @siddharthgurav6407 1 year ago

      @@statquest So if I have a data set with a series and its respective frequencies and probabilities:
      1.) How can I find the worst 5% tail data points?
      2.) Best method or technique: how can I create class / categorical slabs for the worst-event data points?