The Coupon Collector's Problem (with Geoff Marshall)

  • Published on 20 Nov 2024

Comments • 894

  • @standupmaths
    @standupmaths 2 years ago +1085

    Ok, many are suggesting I should have stood up to reveal an even bigger table next to me. Great concept, but ideas like that require some serious resources. *cough* patreon.com/standupmaths

    • @johnchessant3012
      @johnchessant3012 2 years ago +7

      Hi

    • @_wetmath_
      @_wetmath_ 2 years ago +1

      second

    • @ScientiaHistoria
      @ScientiaHistoria 2 years ago +29

      …and there was the recursive “first a sense-check” before we start the sense-check. As usual, I wish I had undertaken another layer of sense-check before watching a Matt video.

    • @Eli-su6ql
      @Eli-su6ql 2 years ago +17

      Nobody noticed the "diverges" was fixed in post, Matt. Good job.

    • @ScientiaHistoria
      @ScientiaHistoria 2 years ago +5

      @Eli-su6ql I did but figured it was his math autocorrect tool.

  • @ryanparker260
    @ryanparker260 2 years ago +1264

    You were right, we all knew there was a second even SMALLER miniature table prop

    • @b0nce
      @b0nce 2 years ago +51

      And that makes us very happy :)

    • @darkshoxx
      @darkshoxx 2 years ago +46

      I was kinda expecting him to go out a layer as well, and standing up from the table between a giant clock and calendar prop

    • @bl4cksp1d3r
      @bl4cksp1d3r 2 years ago +20

      I was thinking, he wouldn't have stopped with one 1/10 scale model, and I knew it, I was very happy to see that

    • @Avodroc42
      @Avodroc42 2 years ago +5

      and it was absolutely worth it

    • @tandemcart1234
      @tandemcart1234 2 years ago +24

      I legitimately laughed out loud with relief when the smaller one came out. The pause where he should have got it was just a smidgen too long. Perfection!

  • @IMacar
    @IMacar 2 years ago +1409

    Recursive tables was definitely the pro-YouTuber move.

    • @Anonymous-df8it
      @Anonymous-df8it 2 years ago +2

      I would like this, but it's at 420 likes so...

    • @Anonymous-df8it
      @Anonymous-df8it 2 years ago +3

      Guess I'll have to wait until 669 likes!

    • @mattduffyw99
      @mattduffyw99 2 years ago +4

      The second layer got me. Earned the thumbs up

    • @koenschaper8821
      @koenschaper8821 2 years ago +1

      It reminded me of something Vsauce would do. Who, by all means, is a certified pro-YouTuber.

    • @joshuascholar3220
      @joshuascholar3220 2 years ago +2

      The third table got an instant up-vote!

  • @itsmattnelson
    @itsmattnelson 2 years ago +816

    Thank you for having me as a guest!
    My official parkrun time was confirmed to be still *one* second out 😭

    • @wordzmyth
      @wordzmyth 2 years ago +14

      Thank you for sharing this! A little shame you couldn't have texted him on the day. Statistically, even sandbagging it should take a few attempts, so you prove the point

    • @chonchjohnch
      @chonchjohnch 2 years ago +2

      Subbed, I need motivation to get back into cardio

    • @monkeycigs4762
      @monkeycigs4762 2 years ago

      It's been a few months, have you gotten your time?? Fingers crossed for you!

  • @djadj_
    @djadj_ 2 years ago +292

    showcasing your prop ability whilst explaining probability, what a beautiful moment

  • @wishiwasabear
    @wishiwasabear 2 years ago +301

    The way Matt could read our minds with the third level of recursion was a very neat trick.

    • @vigilantcosmicpenguin8721
      @vigilantcosmicpenguin8721 2 years ago +9

      Recursive patterns are predictable, but not as predictable as people making jokes about recursive patterns.

    • @michaeldirmeyer11
      @michaeldirmeyer11 2 years ago +2

      @vigilantcosmicpenguin8721 People making jokes about recursive patterns are predictable, but not as predictable as people making jokes about people making jokes about recursive patterns.

  • @RolandWolf
    @RolandWolf 2 years ago +217

    A park run special, as opposed to a Parker run, where you give running a go, but don't really get the result you wanted.

    • @plaguey23
      @plaguey23 2 years ago +10

      I was going to make a similar joke but take my like instead.

    • @unvergebeneid
      @unvergebeneid 2 years ago +14

      It's sad Matt doesn't run anymore. He could've earned himself the nickname "Park Run Parker"! You know, basically the opposite of "Run, Forrest, run!"

    • @SpassNVDR
      @SpassNVDR 2 years ago +2

      @unvergebeneid Wow, I got to laugh three times at this, understanding one little detail at a time :D

    • @unvergebeneid
      @unvergebeneid 2 years ago

      @SpassNVDR 😄😄😄

    • @pmoncr
      @pmoncr 2 years ago

      @unvergebeneid Is a parkrun parker someone who turns up at parkruns and doesn't get out of their car?
      Matt could then be the parkrun Parker^2, rearranging would make him the park^3 runerer.

  • @SellusionStar
    @SellusionStar 2 years ago +526

    This recursion joke was no joke. It's a nerd's duty.

    • @nitehawk86
      @nitehawk86 2 years ago +15

      This recursion joke was no joke. It's a nerd's duty.

    • @MartinJab
      @MartinJab 2 years ago +6

      This recursion joke was no joke. It's a nerd's duty.

    • @TlalocTemporal
      @TlalocTemporal 2 years ago +10

      I hate to break it to you guys, but due to how YT comments work, you can only do one recursion. All the rest would be iteration jokes.
      This technically correct joke was no joke. It's a nerd's duty.

    • @zyaicob
      @zyaicob 2 years ago +1

      @TlalocTemporal thank you, I knew something was off

  • @clarencelam1907
    @clarencelam1907 2 years ago +255

    You can't conclude that the harmonic series diverges just because the expected time goes to infinity. The expected time reaches infinity because n goes to infinity. n being finite doesn't mean that the harmonic series goes to infinity; it just so happens that both n and nth harmonic number go to infinity. If the n out front were replaced with a *constant*, then you could conclude that.
    As an example, consider the function f(n) = n(1+1/2+1/4+...+1/2^i+...+1/2^n). f(n) reaches infinity as n goes to infinity, but clearly (1+1/2+1/4+...+1/2^i+...+1/2^n) doesn't diverge; it's always less than 2. So the argument here doesn't work.
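
A minimal sketch of the counterexample above, written as an illustration rather than as anything shown in the video. The helper names `harmonic` and `geometric` are made up for this example. It uses exact fractions so that the "always less than 2" bound is not blurred by floating-point rounding.

```python
from fractions import Fraction


def harmonic(n):
    """Exact n-th harmonic number 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def geometric(n):
    """Exact partial sum 1 + 1/2 + 1/4 + ... + 1/2^n."""
    return sum(Fraction(1, 2 ** i) for i in range(n + 1))


if __name__ == "__main__":
    for n in (10, 100, 500):
        g = geometric(n)
        # n * g grows without bound even though g itself stays below 2,
        # so "n times a sum goes to infinity" says nothing about the sum.
        print(n, float(n * g), float(g), float(harmonic(n)))
```

Both `n * geometric(n)` and `n * harmonic(n)` grow without bound, but only the harmonic factor is itself unbounded, which is why the growth of the product cannot be used as a divergence proof.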

    • @hOREP245
      @hOREP245 2 years ago +71

      Parker divergence of a series

    • @jfb-
      @jfb- 2 years ago +42

      Parker proof

    • @standupmaths
      @standupmaths 2 years ago +144

      I think you’re right: that lead n breaks my divergent observation. I suspect the result may be salvageable but not in any intuitive way.

    • @fejfo6559
      @fejfo6559 2 years ago +22

      I think the argument can be saved if you observe the average time needed to collect a coupon ( n(1+1/2+...+1/n)/n ) diverges as the number of coupons goes to infinity.

    • @jordanlinus6178
      @jordanlinus6178 2 years ago +22

      @fejfo6559 The problem is, that is not intuitive. The first coupon always takes one try, the one in the middle on average 2. Sure, the last one takes on average n, but that might be negligible among the n coupons. It's not that hard to prove that the harmonic series diverges, but I don't think the park runs can give an easier explanation.

  • @ALMX5DP
    @ALMX5DP 2 years ago +59

    I was so pumped to start this challenge, knowing I had a 60/60 chance of getting my first 'coupon.' Little did I know that you actually had to finish the run to do so...

  • @karl9840
    @karl9840 2 years ago +131

    As someone writing my Bachelor's on this exact problem (and the Poisson Process) this was a gem to watch.

    • @viniciusfriasaleite8016
      @viniciusfriasaleite8016 2 years ago +3

      Luckier than all those runners!

    • @ajschlem
      @ajschlem 2 years ago +1

      What are you majoring in?

    • @DonReba
      @DonReba 2 years ago +20

      By "this exact problem" you mean tables with unnecessary props, right?

    • @karl9840
      @karl9840 2 years ago +3

      @DonReba I wish!

    • @karl9840
      @karl9840 2 years ago +2

      @ajschlem Well, technically I'll be a maths and physics teacher, but I do get the Swedish equivalent of a bachelor's in maths (and physics if I just write the thesis, since I'm eligible for it).

  • @Kaepsele337
    @Kaepsele337 2 years ago +238

    I don't think the seconds would be uniformly distributed even when you're not trying. That would require your time to fluctuate much more than a minute and I think most people run more consistent times. Also, while training you gradually increase your time and might "scan" through a minute, so that way you'd need less runs than if it was randomly distributed.

    • @tth-2507
      @tth-2507 2 years ago +69

      Hi, runner here.
      Of course I run a consistent time (when lucky, even slightly increasing), but not that consistent. A variation of +/-1min is to be expected - at least in my case. Additionally one has to take different terrain features across locations into account.

    • @alimanski7941
      @alimanski7941 2 years ago +17

      If you scan over a minute, there's not an insignificant chance of missing a seconds value. If you converge on a run time, which is a reasonable assumption for most runners, then your chances of achieving previously skipped times are much, much lower, thereby increasing the number of runs necessary. So, even though I agree with your modelling, I think a uniformity assumption is still a safe approximation.

    • @Kaepsele337
      @Kaepsele337 2 years ago +18

      @tth-2507 Yeah I was thinking about time per kilometer, which is pretty consistent for me (basically between 4min 20 and 4min 30 every time). I forgot that you have to multiply the spread by 5 for 5km obviously. It would still cluster, but less than I had in mind.

    • @viniciusfriasaleite8016
      @viniciusfriasaleite8016 2 years ago +4

      It would be cool to see the time distribution of a runner on the park run

    • @kane2742
      @kane2742 2 years ago +8

      Matt's time (Runderground Matt, not Matt Parker) was around 22 minutes. At that pace, a variation of a minute is less than 5%. That seems reasonable, especially given variable weather and terrain - some parks are going to be hillier than others, for example.
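
The disagreement in this thread (how much run-to-run spread is needed before the seconds digit behaves like a fair 60-sided die) can be explored with a small simulation. This is only a sketch under loudly stated assumptions: finishing times are modelled as a normal distribution around 22 minutes, the spread values are invented for illustration, and real parkrun data may look quite different. The function names are hypothetical.

```python
import math
import random


def runs_to_fill_bingo(mean_s, sd_s, rng):
    """Count runs until every seconds value 00..59 has appeared.

    ASSUMPTION: each finishing time (in seconds) is drawn independently
    from a normal distribution with the given mean and standard deviation.
    """
    seen = set()
    runs = 0
    while len(seen) < 60:
        finish = rng.gauss(mean_s, sd_s)
        seen.add(math.floor(finish) % 60)  # keep only the seconds part
        runs += 1
    return runs


def average_runs(sd_s, trials=200, mean_s=22 * 60, seed=0):
    """Average number of runs over several simulated runners."""
    rng = random.Random(seed)
    total = sum(runs_to_fill_bingo(mean_s, sd_s, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for sd in (10, 20, 30, 60):  # hypothetical spreads, in seconds
        print(f"sd = {sd:2d} s: about {average_runs(sd):.0f} runs on average")
```

Under this model a very consistent runner needs far more runs than the uniform estimate, while a spread of around a minute is close to uniform. Training effects and deliberate pacing are not modelled.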

  • @johnchessant3012
    @johnchessant3012 2 years ago +90

    There's actually a recursive solution to this problem. Let f(n) be the answer for n coupons.
    Your first coupon is guaranteed to be a new one, after which you're left with n-1 coupons to collect, except, you have probability 1/n of getting your first coupon again so only (n-1)/n of your attempts matter. So f(n) = 1 + n*f(n-1)/(n-1). Divide both sides by n to get f(n)/n = f(n-1)/(n-1) + 1/n. Thus f(n)/n is the harmonic series up to 1/n, as expected.
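
The recursion above can be checked against the closed form with exact arithmetic. This is a minimal sketch with made-up function names; it assumes the base case f(1) = 1, which the comment implies but does not state.

```python
from fractions import Fraction


def expected_draws_recursive(n):
    """Apply f(m) = 1 + m * f(m-1) / (m-1), starting from f(1) = 1."""
    f = Fraction(1)
    for m in range(2, n + 1):
        f = 1 + Fraction(m, m - 1) * f
    return f


def expected_draws_harmonic(n):
    """Closed form n * (1 + 1/2 + ... + 1/n)."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 2, 6, 52, 60):
        assert expected_draws_recursive(n) == expected_draws_harmonic(n)
        print(n, float(expected_draws_recursive(n)))
```

Because both functions return `Fraction` values, the equality check is exact rather than approximate.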

  • @bigmoneysam8820
    @bigmoneysam8820 2 years ago +59

    The recursive tables gag really put the 'stand-up' in 'Stand-up Maths'.

    • @joelluber
      @joelluber 2 years ago +7

      Puts the sit down in stand-up math. Lol

    • @SomeRandomDevOpsGuy
      @SomeRandomDevOpsGuy 2 years ago

      Is 2 layers even enough to deduce recursion?

  • @tymo7777
    @tymo7777 2 years ago +143

    Really upset you missed the “run the numbers” pun!

    • @pembrokeshiredan
      @pembrokeshiredan 2 years ago +18

      Not to mention the Parker Run pun

    • @bill_and_amanda
      @bill_and_amanda 2 years ago

      I came here to say this

    • @thegreatmup
      @thegreatmup 2 years ago +11

      No he didn't, 1:18

    • @Whatwhat3434
      @Whatwhat3434 2 years ago +9

      He says it 10:57 as well

  • @onebronx
    @onebronx 2 years ago +9

    16:03 Matt -single-handedly- bi-pedally saved the narrative of this video.

  • @lunasophia9002
    @lunasophia9002 2 years ago +12

    4:59 I love you, Matt. I was hoping for it, wishing in my heart, and you did it!

  • @Illumas
    @Illumas 2 years ago +15

    Me, "But you didn't make a tinier table prop for your tiny table prop." Matt, "You know I did!" Me, "Yay"

  • @PsiVolt
    @PsiVolt 2 years ago +12

    The recursion bit was incredible, I might have to use that! This video is giving me discrete math flashbacks

  • @TheInternetHelpdeskPlays
    @TheInternetHelpdeskPlays 2 years ago +9

    This reminds me of the old seaside Fascination games where you had to sink balls in holes, 1 in each. At the start you'd get loads but as you get closer to the end it'd get harder and harder to get the final ones.

  • @charliedobbie8916
    @charliedobbie8916 2 years ago +58

    Let me tell you a joke about recursion: two people were sitting at a table, and one turned to the other and said "let me tell you a joke about recursion:"

    • @VAXHeadroom
      @VAXHeadroom 2 years ago +8

      In one of the early copies of the VRTX operating system documentation there were two entries:
      Recursion: see Hofstadter, Douglas
      Hofstadter, Douglas: see Recursion
      It made the nerd in me laugh out loud...unfortunately nobody else in the room got the joke...

    • @Pseudomous
      @Pseudomous 2 years ago +1

      Pete and repeat were sitting on a bridge. Pete fell off. Who was left?

    • @nathankarn5557
      @nathankarn5557 2 years ago

      @Pseudomous Repeat?

  • @mathmachine4266
    @mathmachine4266 2 years ago +21

    The mean value would be n*(1+1/2+1/3+...+1/n). In that case, that would be 60*(1+1/2+1/3+...+1/60), or 280.7922. As you already mentioned.
    The variance, however, would be n²(1+1/2²+1/3²+...+1/n²) minus the mean. In this case, that would be 60²(1+1/4+1/9+1/16+...+1/60²) - 280.7922, or 5581.4676. That means the standard deviation is the square root of that, or 74.7092. So, for him to get so far under the expected value is not really that out of the ordinary.
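
The mean and variance formulas quoted above are easy to evaluate directly. A minimal sketch, with a hypothetical function name, that reproduces the figures in the comment for n = 60:

```python
import math


def coupon_stats(n):
    """Mean, variance and standard deviation of the number of draws
    needed to collect all n equally likely coupons."""
    mean = n * sum(1 / k for k in range(1, n + 1))
    # Variance = n^2 * (1 + 1/2^2 + ... + 1/n^2) - mean
    variance = n * n * sum(1 / k ** 2 for k in range(1, n + 1)) - mean
    return mean, variance, math.sqrt(variance)


if __name__ == "__main__":
    mean, variance, sd = coupon_stats(60)
    print(f"mean = {mean:.4f}, variance = {variance:.4f}, sd = {sd:.4f}")
```

The variance formula follows from treating each new coupon as an independent geometric wait with success probability k/n and summing the individual variances.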

    • @gmalivuk
      @gmalivuk 2 years ago +2

      Yeah, I just ran a bunch of simulations, and the complete set occurs by run 229 a bit under 28% of the time.

    • @driwen
      @driwen 2 years ago

      Isn't it that the average value is n*(1+1/2+1/3+...+1/n), but the mean value should be lower, shouldn't it? The distribution for a 1 out of 60 chance runs from 1 to infinity, which pulls the average number of tries needed to a higher number than the mean.
      edit: sorry, got confused with median. But I'm curious whether the average (the mean) is the value people are really interested in, or the value at which 50% of the people would have completed it

    • @gmalivuk
      @gmalivuk 2 years ago +1

      @driwen The mean is exactly the expected value calculation done in the video. That's usually what we mean by average.
      The median is more complicated to calculate, but ends up being 267.5.
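
Completion probabilities and the median can also be computed without simulation, by tracking the probability of holding exactly k distinct coupons after each draw. This is a sketch of that standard Markov-chain calculation with hypothetical function names, not the method used by any commenter. Here the median is taken as the smallest number of draws with completion probability of at least 50%, which is one of several possible conventions.

```python
def completion_cdf(n, max_draws):
    """Return a list cdf where cdf[t] = P(all n coupons seen within t draws)."""
    state = [0.0] * (n + 1)  # state[k] = P(exactly k distinct coupons so far)
    state[0] = 1.0
    cdf = [state[n]]
    for _ in range(max_draws):
        new_state = [0.0] * (n + 1)
        for k, p in enumerate(state):
            if p == 0.0:
                continue
            new_state[k] += p * k / n  # drew a coupon already held
            if k < n:
                new_state[k + 1] += p * (n - k) / n  # drew a new coupon
        state = new_state
        cdf.append(state[n])
    return cdf


def median_draws(n, max_draws=5000):
    """Smallest t with P(complete within t draws) >= 0.5."""
    for t, p in enumerate(completion_cdf(n, max_draws)):
        if p >= 0.5:
            return t
    raise ValueError("max_draws too small")


if __name__ == "__main__":
    cdf = completion_cdf(60, 300)
    print("P(complete within 229 runs) =", round(cdf[229], 4))
    print("median number of runs =", median_draws(60))
```

Comparing the printed values with the simulated figures in this thread is a quick sanity check on both.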

    • @TheMetallerik
      @TheMetallerik 2 years ago

      So I've run 1 million loops (simulations).
      Average got pretty close: 281.78,
      min: 103
      max: 1146

    • @driwen
      @driwen 2 years ago

      @gmalivuk yeah, as I said after my edit, I got the median and mean confused.
      But this shows that we won't see a bell curve around 281 but before 267.

  • @ulriksteenandersen4215
    @ulriksteenandersen4215 2 years ago +39

    Love the jokes and props; never stop, Matt : )

  • @anfanta2010
    @anfanta2010 2 years ago +7

    I just want to validate that the extra effort to build out the props was absolutely worth it. I was laughing out loud by myself 🤣

  • @DaTux91
    @DaTux91 2 years ago +2

    Matt was like "if I can find them" and I looked at the remaining duration of the video and I was like "he couldn't find them". And that made me sad.

  • @Anonymous-df8it
    @Anonymous-df8it 2 years ago +3

    I was kinda expecting him to go out a layer as well, and standing up from the table between a giant clock and calendar prop.

  • @melglobus
    @melglobus 2 years ago +1

    Two of my favourite YouTubers together again! The platform 0 video made me subscribe here. Loved the Choose Corrour T-shirt too!!

  • @samp-w7439
    @samp-w7439 2 years ago +1

    I'm very excited because after Matt stated the problem, I figured out the formula for myself and calculated, got 281, and was very happy when I skipped to the reveal, and he had the same answer!

  • @zachrodan7543
    @zachrodan7543 2 years ago +6

    I feel like a more modern name for this problem might be the (unweighted) lootbox completion problem...
    (the weighted lootbox problem would be where different outcomes have different probabilities)

  • @smor729
    @smor729 2 years ago +78

    So what you are saying is that to run every single possible trailing decimal amount of seconds, all I have to do is run 1/12th of one park run backwards? This should be easy!

    • @mijkolsmith
      @mijkolsmith 2 years ago +6

      -60/12

    • @ghislainbugnicourt3709
      @ghislainbugnicourt3709 2 years ago +13

      I might have missed something, but the -1/12 or -60/12 joke would have worked only if there was the (1+2+3+...) series instead of the harmonic one, right ?

    • @David94spc
      @David94spc 2 years ago

      @ghislainbugnicourt3709 joke worked fine since you got it 😘

  • @LukeSumIpsePatremTe
    @LukeSumIpsePatremTe 2 years ago +1

    I love the 10:45
    "We've managed to prove that harmonic series -converges- *DIVERGES* "

  • @_wetmath_
    @_wetmath_ 2 years ago +20

    11:40 the camera man awkwardly walking past the two other guys talking was hilarious but completely relatable

  • @CR0SBO
    @CR0SBO 2 years ago +3

    The initial Matt Parker comparison to the props seemed perfectly proportionally sized, but the Matt Parker that we had for the prop set of props was at least an order of magnitude too large, never mind the Matt Parker that was presenting the prop set of prop props!

  • @Schlups
    @Schlups 2 years ago +25

    Next challenge: Do the run when a leap second is introduced to tick off the number 60.

    • @nathanrcoe1132
      @nathanrcoe1132 2 years ago +8

      that is possible with an absolute position in time, but never with a duration, I think

    • @jurjenbos228
      @jurjenbos228 2 years ago +4

      If the stopwatch is coded by an average programmer, yes.

    • @henrym5034
      @henrym5034 2 years ago

      @jurjenbos228 but how should the result be displayed for the 61s minute case?

    • @jazzabighits4473
      @jazzabighits4473 2 years ago

      @henrym5034 61s in minutes and seconds is 1 min 01 seconds, so 01 I guess?

    • @henrym5034
      @henrym5034 2 years ago

      @jazzabighits4473 I mean it’s definitely correct to say 2017/01/01 00:00:00 is 61 seconds past 2016/12/31 23:59:00. It’s also correct to say it’s 1 minute past that (that minute has 61 seconds).
      That makes me wonder if it’s okay to say it’s “1 minute and 1 second” though.

  • @sbyrstall
    @sbyrstall 2 years ago

    Thanks for giving the parkrun a shout out. I now have to cross post this in the Global Running Channel. They would probably get a kick out of it. I didn't know that there was a Parkrun Bingo.....I do now.

  • @BobberWCC
    @BobberWCC 2 years ago +5

    Harmonic series discovered from park runners. Amazing.

  • @Adrianmk2208
    @Adrianmk2208 2 years ago +1

    A park run in which you almost finish, but not quite, is known as a Parker run.

  • @robertaries2974
    @robertaries2974 2 years ago +3

    Geoff Marshall Collab. Gonna be a great video

  • @celestialtree8602
    @celestialtree8602 2 years ago +1

    I was hoping for the third recursion level, but didn't expect you to do it.
    And I was very pleasantly surprised.

  • @ARKGAMING
    @ARKGAMING 2 years ago +1

    I was waiting for the second prop table
    Glad you didn't disappoint

  • @jerry3790
    @jerry3790 2 years ago +2

    15:25 “I used to be a runner like you, but then I took an arrow to the knee”

  • @user-bl9of5qe7h
    @user-bl9of5qe7h 2 years ago

    Absolutely love how this is your typical intro-to-probability problem but solved completely using intuition. So stripped down from bulky theory and just beautiful

  • @gnfnrf
    @gnfnrf 2 years ago +4

    All of this was interesting, but I was expecting an entirely different set of math about the odds of completing a 1/n task in n attempts, which is not 50%. If I remember correctly, as n increases, those odds converge on 1-1/e, and its fun to see how the formula to calculate it resembles (one of) the formulae for e.
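
The limit mentioned above is quick to check numerically. A minimal sketch with a made-up function name: the chance of at least one success in n independent attempts, each with probability 1/n, is 1 - (1 - 1/n)^n, and (1 - 1/n)^n tends to 1/e.

```python
import math


def success_within_n(n):
    """P(at least one success in n independent tries, each with chance 1/n)."""
    return 1 - (1 - 1 / n) ** n


if __name__ == "__main__":
    for n in (2, 6, 60, 1000, 10 ** 6):
        print(n, success_within_n(n))
    print("1 - 1/e =", 1 - 1 / math.e)
```

For n = 60 the value is already close to the limit of roughly 63%.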

    • @jeffkaylin892
      @jeffkaylin892 2 years ago

      Yeah, I was pondering this instead of sleeping...
      If I were to catch a bus, which comes once an hour, my expectation is to not wait for more than half an hour. If the bus were there I'd say it was a miracle. If I had to wait 59 minutes I'd say I was jinxed. But if I waited over an hour I'd say I wasn't paying attention. This probability starts as 1 / 60. So 59 / 60 it wasn't there. Then next minute would be multiplied by 58 / 59, and the next 57 / 58. Hmm... I could multiply that all out... hmm cancel the 59s, then cancel the 58s... so at 30 minutes I have 30 / 60 just as one would expect.
      BUT, if the bus doesn't come once an hour, but has a 1 / 60 chance of having left the depot, then there is a string of 59 / 60 multiplied together. That would make my expected wait longer. And by "expected" I mean the "life is fair" type of expectation where half the time I'm pleasantly surprised and half the time I'm a little disappointed, and very rarely see miracles or damnations.

  • @tassiehandyman3090
    @tassiehandyman3090 2 years ago +2

    All hail, the Amazing Mark - he who digs Matt Parker out of a hole of his own making, by simply being a Thoroughly Decent Chap. Thank you, Mark - you're a good egg!

  • @j.rodolfoprz7713
    @j.rodolfoprz7713 2 years ago +1

    this ‘fun activity’ will be in my personal purgatory

  • @anuzis
    @anuzis 2 years ago

    This analysis makes the unsafe assumption of a uniform distribution of finishing times across the seconds. Over long enough distances this assumption is likely increasingly safe, but consider the distribution of finishing times you'd see for a 50 meter dash: probably a Gaussian distribution (AKA "bell curve"; with Olympic sprinters at one end, couch potatoes at the other, and most of us around the middle). Even at longer distances like 500 meters - 1k meters you likely still don't see a uniform distribution across the seconds. Not to nit-pick: I loved the video and really appreciated seeing the approach taken, just trying to participate as a supportive YouTube collaborator thinking about other ways to refine the theoretical analysis. Looking forward to future episodes!

  • @WDCallahan
    @WDCallahan 2 years ago +5

    60 is a bigger number than 52 😲
    You just never know what you're going to learn about math when you watch this channel!

  • @luca6819
    @luca6819 2 years ago

    Recently I saw a rerun of an old TV show where scientists were rating crazy inventions or builds made by people (usually using YouTube videos), and you were there! Didn't remember that; that was a nice surprise!

  • @sorenwestrey4925
    @sorenwestrey4925 2 years ago +5

    Legendary crossover

  • @ChristianNiederhuber
    @ChristianNiederhuber 2 years ago +2

    encouraged by this video I did a little experiment: I implemented a program to challenge your calculation experimentally ...
    I ran 1,000 instances of this coupon collector's game from 00 to 59 and my experiment came up with a mean value of 280.946054 tries on average - so this seems to be an experimental confirmation of your calculation ...
    BUT: the median value in this sample was only 266 tries (minimum value 147 and maximum value 684)
    so the distribution of the result values is quite right-skewed, because of a few values pretty far out on the long right end of the scale ...
    now what always fascinates me the most in such cases of right-skewed distributions is that if you just take any logarithm of the values instead of the original values, then you immediately get almost perfectly normally distributed log-values!
    and if you take the mean value of these transformed log-values and transform it back to the original scale, then you get a value that is very near to the median of the original distribution! (in my case 269)
    how does this "trick" work and where exactly does this relationship come from?
    maybe you could also make a video about this kind of transformation once in a while ?
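
An experiment along these lines can be sketched in a few lines. This is not the commenter's program: the function names, seed and trial count are assumptions made for illustration. On the question raised above, one relevant fact is that for an exactly lognormal distribution the exponential of the mean of the logs (the geometric mean) equals the median, so the two being close is what one would expect to the extent that the samples are roughly lognormal. The code does not test whether they actually are.

```python
import math
import random
import statistics


def draws_to_complete(n, rng):
    """Draw uniformly from n values until each has appeared at least once."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws


def summarize(n=60, trials=1000, seed=0):
    """Simulate the game `trials` times and report summary statistics."""
    rng = random.Random(seed)
    samples = [draws_to_complete(n, rng) for _ in range(trials)]
    log_mean = statistics.mean(math.log(x) for x in samples)
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "geometric_mean": math.exp(log_mean),  # back-transformed mean of logs
        "stdev": statistics.stdev(samples),
        "min": min(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

Results vary with the seed, so individual numbers will not match the comment exactly.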

    • @Cannongabang
      @Cannongabang 2 years ago

      What about the standard deviation? A naive calculation of mine results in ~83. Let me know when you have time !

    • @ChristianNiederhuber
      @ChristianNiederhuber 2 years ago

      @Cannongabang ~74.64

  • @tuliosabatino
    @tuliosabatino 2 years ago

    The props were definitely helpful to demonstrate your point, Matt. Time well spent indeed

  • @trigonzobob
    @trigonzobob 2 years ago +2

    Now that's what I call running the numbers.

  • @okRegan
    @okRegan 2 years ago

    that recursion gag is the reason no matter how uninterested i am in the title, i will watch any video you put out, you're awesome!

  • @TSutton
    @TSutton 2 years ago +2

    This video is a perfect explanation of predicting fossil collecting in Animal Crossing!

    • @Jonny_Marko
      @Jonny_Marko 2 years ago

      I had the same thought but with collecting all the DIY recipes! My odds are not looking so great to find the one I am endlessly searching for :D

  • @AfonsoCL
    @AfonsoCL 2 years ago +1

    Matt is such a fun guy. Spending an afternoon guinea-pigging for his experiments while listening to his passion for maths would be one of my ideal days.

  • @morscoronam3779
    @morscoronam3779 2 years ago +9

    10:48 Sounds like editing Matt had to edit the right word in. 🤔 Why do I notice these things...

    • @anthonydillon2969
      @anthonydillon2969 2 years ago

      Can anyone read lips to see what he really said?

  • @seanc6128
    @seanc6128 2 years ago +1

    I appreciate the gift of laughter in addition to the gift of knowledge.

  • @Cr42yguy
    @Cr42yguy 2 years ago

    I was waiting for the prop on a prop table. Thanks for not letting me down, Matt.

  • @GabeUnger
    @GabeUnger 2 years ago

    Getting so close to 1 mil Matt! Hope you have a good video idea to celebrate:)

  • @pyglik2296
    @pyglik2296 2 years ago

    The average time to get k out of n "coupons" is a harmonic series which can be approximated by logarithms and inverting the question to "What's the average number of unique coupons after time t?" gives us k = n(1-e^(-t/n)) which fits nicely to the graph.
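
The approximation k = n(1 - e^(-t/n)) can be compared with the exact expectation, which for uniform draws is n(1 - (1 - 1/n)^t), since each coupon is missed by a single draw with probability 1 - 1/n. A minimal sketch with hypothetical function names:

```python
import math


def expected_unique_exact(n, t):
    """Exact expected number of distinct coupons after t uniform draws."""
    return n * (1 - (1 - 1 / n) ** t)


def expected_unique_approx(n, t):
    """Exponential approximation n * (1 - e^(-t/n)) from the comment."""
    return n * (1 - math.exp(-t / n))


if __name__ == "__main__":
    for t in (1, 30, 60, 120, 229, 281):
        exact = expected_unique_exact(60, t)
        approx = expected_unique_approx(60, t)
        print(f"t = {t:3d}: exact = {exact:.3f}, approx = {approx:.3f}")
```

The two curves stay close for n = 60, which is consistent with the comment that the formula fits the graph.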

  • @mustafakalaycioglu9613
    @mustafakalaycioglu9613 2 years ago

    The profound knowledge shared to us by Matt that 60>52. I didn't know that before :) Great video mate!

  • @NPDGX
    @NPDGX 2 years ago

    To add to this from a computer science graduate:
    The harmonic series, in asymptotic analysis, is Θ(log n). Because of that, the cleanest way to write the solution is as E[X] = Θ(n log n), where X is our random variable for collecting n coupons. Just some food for thought :)
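
To make the Θ(n log n) statement concrete, the exact expectation can be compared with the standard asymptotic estimate n ln n + γn + 1/2, where γ ≈ 0.5772 is the Euler-Mascheroni constant. This is a sketch with made-up names. The constant is hard-coded here because Python's `math` module does not provide it.

```python
import math

# Euler-Mascheroni constant, hard-coded for this illustration.
EULER_GAMMA = 0.5772156649015329


def expected_draws_exact(n):
    """n * H_n, the exact expected number of draws."""
    return n * sum(1 / k for k in range(1, n + 1))


def expected_draws_asymptotic(n):
    """Leading terms of the expansion: n ln n + gamma * n + 1/2."""
    return n * math.log(n) + EULER_GAMMA * n + 0.5


if __name__ == "__main__":
    for n in (6, 52, 60, 1000):
        print(n, expected_draws_exact(n), expected_draws_asymptotic(n))
```

The n ln n term dominates for large n, which is what the Θ(n log n) notation captures. The lower-order terms matter for getting a figure like 281 right at n = 60.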

  • @shershahdrimighdelih
    @shershahdrimighdelih 2 years ago +10

    When Matt was talking about "I used to be a runner....", I was half expecting him to follow up with, "until I took an arrow to the knee"

  • @findlaysmith6280
    @findlaysmith6280 2 years ago +4

    Nice save at 10:48 🤣

  • @chrisengland5523
    @chrisengland5523 2 years ago

    The fact that the seconds in a person's park run time is more or less random reminds me of one way to get truly random numbers in a computer. You simply time the intervals between successive user key strokes in microseconds, then throw away everything except the right-most digit. You can do that as often as necessary to obtain a sequence of truly random numbers.

  • @Sodalis_
    @Sodalis_ 2 years ago +6

    I've been dealing with this for years. When you started to explain the concept, I realised it's exactly the same as games with RNG (random number generation) loot tables. Just look at the Warframe wiki, which has to show not only the drop chance, but also the number of tries required for a 95% success chance, and the average number of runs expected. Funny how the same concepts come up in such different situations

    • @aonodensetsu
      @aonodensetsu 2 years ago +2

      that's actually slightly different, it is calculated as 1-(1-c)^n where c is the drop chance and n is number of tries and it gives you an "accumulated" chance of getting the item in no more than n tries
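
The single-item calculation described here can be written out as follows. A minimal sketch with hypothetical function names, assuming independent tries that each succeed with the same chance c. It says nothing about how any particular wiki computes its tables.

```python
import math


def chance_within(c, n):
    """Accumulated chance 1 - (1 - c)^n of at least one drop in n tries."""
    return 1 - (1 - c) ** n


def tries_for_confidence(c, confidence=0.95):
    """Smallest n with chance_within(c, n) >= confidence, for 0 < c < 1."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - c))


def expected_tries(c):
    """Average number of tries until the first drop (geometric distribution)."""
    return 1 / c


if __name__ == "__main__":
    for c in (0.5, 0.1, 1 / 60):
        n95 = tries_for_confidence(c)
        print(f"c = {c:.4f}: 95% within {n95} tries, "
              f"{expected_tries(c):.1f} tries on average")
```

This covers waiting for one specific item. Collecting a whole set of items is the coupon collector calculation from the video.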

  • @jakebradley3998
    @jakebradley3998 2 years ago

    Holy crap man you're so close to the big milli! Good Luck!

  • @Marconius6
    @Marconius6 2 years ago

    This thing is actually pretty useful for game development; I've run into several cases where I had to calculate like, how long it would take for a player to collect everything from a set or something, and it's not that obvious.

  • @ronnytm
    @ronnytm 2 years ago

    Call me shallow, but the lighting, colours, and exposure of the video look really good for a cloudy day in a park. Props to the cinematographer. I'm sure the content of the video is great too.

  • @Hakasedess
    @Hakasedess 2 years ago +18

    My one criticism is about the last bit where you hypothesize that you're going to see two clusters, one for sandbaggers at below 281 somewhere, and one for 'the rest' at roughly 281.
    I dispute this, and would willingly bet money that the actual average of 'attempts' for people who don't know about the bingo at all is going to be significantly higher than 281.
    The 281 figure relies on a randomized finishing time, and that's simply not going to be the case for... well, anyone who's going to be running exactly 5km regularly enough to even reach 281 runs in the first place. Their time from run to run is simply going to be too consistent to be compared to a 60-sided die.

    • @dflosounds
      @dflosounds 2 years ago +10

      Fair point, though I don't think you would see THAT much consistency in the seconds value unless you're talking professional or short-distance runners. In my experience of running casual 5 or 10k races (like the one in this video), there is likely to be consistency in minutes, but not so much for seconds. All you need to do is deviate from your average time up to +/- 30 seconds, and any of those 59 values are game. When you consider variables like the weather, how much sleep you got, your general energy level that day, etc, I don't think it's that far-fetched to get (almost) effectively-random seconds values. So while I agree that the value would certainly be higher than 281 (because you're right, it's not like rolling a 60-sided die), I don't think it would be particularly significant.

    • @Hakasedess
      @Hakasedess 2 years ago +3

      @dflosounds I guess it's possible it won't be a significant deviation, though I still imagine it would be when averaged across a large sample.
      It'd definitely be very interesting to see data on it though.

    • @ps.2
      @ps.2 2 years ago +1

      @dflosounds Fair enough, but it's not really accurate to say you can deviate by ±30 seconds and expect a flat distribution. It's probably more Gaussian, so you need that 60-second interval to be well inside the meaty part of the bell curve. Not way out at the edges of it.

    • @dflosounds
      @dflosounds 2 years ago +1

      @Hakasedess Would definitely be interesting!

    • @eekee6034
      @eekee6034 2 years ago +1

      Hakkapeele, it looks like you've missed something, but if we take your objection with the bit it looks like you've missed, we get an interesting question. We're dropping the minutes value of the time; we're only taking the seconds value. Now, I don't know how long a 5km run could be in minutes; I'm deliberately not looking at that. Instead, what I'm thinking is, _if the number of minutes is large enough,_ even the most consistent runner will only be consistent to the nearest minute. Then things get interesting. If a runner has a 1-minute range, would the distribution of times be a bell curve? What if he has a 90-second range of times? I think it gets way more complicated than I can work out in the middle of the night with a headache, at least. :)

  • @BradleyGordon42
    @BradleyGordon42 2 years ago

    That recursion joke. That's the kind of quality joke I love you for.

  • @mumblbeebee6546
    @mumblbeebee6546 2 years ago

    Great video as always, but for me the highlight was to see Geoff smile and laugh so much 😎

  • @svibhavm
    @svibhavm 2 years ago

    The props were DEFINITELY worth the extra effort. Totally agreed, Matt.

  • @hebl47
    @hebl47 2 years ago

    You had me there for a second. I was starting to worry you didn't make a second recursive miniature table.

  • @sbartdbarcelona44
    @sbartdbarcelona44 2 years ago

    The miniatures were definitely worth the extra effort. Thx for the fun.

  • @bobd2659
    @bobd2659 2 years ago

    As soon as the small table came out, I said to myself..."Wait for it...WAIT for it!" You did not disappoint!

  • @Veptis
    @Veptis 2 years ago

    The birthday "paradox" is a great example. How many people do you need in the audience so that everyone shares a birthday with at least one other?
    Plotting those distributions would have been lovely. But as I am meant to be correcting homework for a statistics course instead of watching YouTube anyway - I might just do it myself

  • @Wordsnwood
    @Wordsnwood 2 years ago

    appropriately majestic closing music.

  • @camerongray7767
    @camerongray7767 2 years ago +1

    I’ve done the maths for this before! Did parkrun here in Australia for a few years, and this was one of the first things I worked out haha.

  • @adamplace1414
    @adamplace1414 2 years ago +9

    Working outside with electronics in Britain seems like its own level of challenge: Can Matt demo the math before the inevitable rain comes?
    And he tempts fate further by taking extra time to do a recursive prop joke. Living on the edge, maths style.

  • @Marronii
    @Marronii 2 years ago +1

    I was actually just waiting for the third table

  • @Jonny_Marko
    @Jonny_Marko 2 years ago

    This applies to the game Animal Crossing too, in the endless search to collect all the available DIY recipes that you get mostly through random chance, but there are hundreds and hundreds of them. Explains why it seems impossible to get some of the recipes I've been hoping to find for a long time.

  • @MazerTime
    @MazerTime 2 years ago

    I really love the recursion joke, definitely worth the extra effort

  • @MrTurboTash
    @MrTurboTash 2 years ago

    4:50 You had me going. Good on ya Matt, did not disappoint. :P
    11:24 ... Assuming a random distribution.
    13:58 Of course he was :D

  • @grapesofwraith1066
    @grapesofwraith1066 2 years ago +1

    I love British YouTuber crossovers and recursion jokes!

  • @oddysee3030
    @oddysee3030 2 years ago

    For the record, I really appreciated the recursion bit :)

  • @gamekiller0123
    @gamekiller0123 2 years ago +8

    You haven't actually proven that the harmonic series diverges. The argument says that as n approaches infinity the number of runs also approaches infinity, but n is not bounded as it approaches infinity. We could have a situation where n(H_n) only diverges because n diverges.
    EDIT: diverges, not converges.

    • @entropie-3622
      @entropie-3622 2 years ago +1

      At least we know it does not converge to 0 faster than 1/n sooo that is something XD (especially for a sequence with all positive terms)
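For reference, the usual direct argument that the harmonic series diverges does not involve the factor of n at all. It is the classic grouping argument (not the one used in the video), which bounds the partial sums at powers of two:

```latex
\begin{align*}
H_{2^k} &= 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right)
           + \left(\frac{1}{5} + \cdots + \frac{1}{8}\right) + \cdots
           + \left(\frac{1}{2^{k-1}+1} + \cdots + \frac{1}{2^{k}}\right) \\
        &\ge 1 + \frac{1}{2} + 2\cdot\frac{1}{4} + 4\cdot\frac{1}{8} + \cdots
           + 2^{k-1}\cdot\frac{1}{2^{k}} \\
        &= 1 + \frac{k}{2}.
\end{align*}
```

Each bracketed group is at least 1/2, so H_n is unbounded on its own, independently of how n(H_n) behaves.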

  • @EER0000
    @EER0000 2 years ago

    This morning volunteered at my local park run, this evening watched a math video about Park run, a very recursive Saturday))

  • @bijova
    @bijova 2 years ago +1

    The recursion joke is why I liked.

  • @tylerm8128
    @tylerm8128 2 years ago

    Great video Matt! I was just solving this problem myself, the other day. I'm trying to collect one of every pokemon card in the latest set. I calculated it to be a LOT more packs of random cards than I'm willing to buy, so I'll just buy my remaining cards individually ;)

  • @Vares65
    @Vares65 2 years ago

    4:57 LOL - I DID know! I was literally sitting there waiting for it.

  • @jacob416
    @jacob416 2 years ago

    I gotta give you props, that recursion joke was great.

  • @stephenbenner4353
    @stephenbenner4353 2 years ago

    This may be my favorite Matt Parker recursion.

  • @phoenixdragon5154
    @phoenixdragon5154 2 years ago

    I suspect that Geoff started his running career with these park runs, which made his results so good.

  • @madlad255
    @madlad255 2 years ago +1

    The Parker Run

  • @happyestus6688
    @happyestus6688 2 years ago

    You were correct: that recursion bit was 100% worth the effort. 10/10

    • @nicholasvinen
      @nicholasvinen 2 years ago

      Actually I'd say it was 50% + 25% + 12.5% ... worth the effort.

  • @Lykrast
    @Lykrast 2 years ago +1

    Ooh that's like a similar problem to those people in digital card games trying to calculate how many packs do you need to buy to get the full collection (it's usually "too much").

  • @Smithers888
    @Smithers888 2 years ago

    14:45 [Matt expects a bimodal distribution with the "blissfully unaware" peak at 281] The peak wouldn't be at 281; you calculated the _mean_ time to be 281, but the peak is at the _mode_, which would be earlier.
    To evidence my point without having to run the stats for n=60, consider n=2. The probabilities are: P(2 tries) = 1/2, P(3 tries) = 1/4, P(4 tries) = 1/8, etc., with an expected # of tries of 2(1/1 + 1/2) = 3, while the peak of this distribution is clearly at 2.
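The n=2 reasoning above can be extended to n=60 without simulation. The sketch below computes the exact distribution of the completion time by tracking the probability of having seen k distinct values after each draw, assuming every value is equally likely. The function names and the 3000-draw cutoff are illustrative choices, not anything from the video.

```python
def collection_time_pmf(n, max_draws):
    """Return pmf where pmf[t] = P(the set of n values is completed on draw t).

    Assumes each of the n values is equally likely on every draw.
    The distribution is truncated at max_draws.
    """
    # state[k] = P(exactly k distinct values seen so far, set not yet complete)
    state = [0.0] * n
    state[0] = 1.0
    pmf = [0.0] * (max_draws + 1)
    for t in range(1, max_draws + 1):
        nxt = [0.0] * n
        for k, p in enumerate(state):
            if p == 0.0:
                continue
            nxt[k] += p * k / n                # repeat of a value already seen
            if k + 1 < n:
                nxt[k + 1] += p * (n - k) / n  # a new value, set still incomplete
            else:
                pmf[t] += p * (n - k) / n      # the last missing value arrives
        state = nxt
    return pmf


def mode_and_mean(pmf):
    """Most likely completion time and (truncated) expected completion time."""
    mode = max(range(len(pmf)), key=pmf.__getitem__)
    mean = sum(t * p for t, p in enumerate(pmf))
    return mode, mean


if __name__ == "__main__":
    for n, cutoff in ((2, 200), (60, 3000)):
        mode, mean = mode_and_mean(collection_time_pmf(n, cutoff))
        print(f"n = {n}: mode = {mode}, mean = {mean:.2f}")
```

For n=2 this reproduces the 1/2, 1/4, 1/8 sequence, and for n=60 the mean matches 60 times the 60th harmonic number while the mode falls below it.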

  • @andreasthaler7068
    @andreasthaler7068 2 years ago

    For stats: use a D6 die to explain. That is easier for anyone to recalculate. Thank you for the vid!

  • @jnaoe
    @jnaoe 2 years ago

    I would have been disappointed if there was not a second miniature table.
    Thank you :D