CppCon 2016: Cheinan Marks "I Just Wanted a Random Integer!"

  • Published on 26 Aug 2024

Comments • 29

  • @narnbrez
    @narnbrez 4 years ago +21

    Just in case anyone can't follow, he's referencing Stephan T. Lavavej's "rand() Considered Harmful" (2013) talk.

    • @Peregringlk
      @Peregringlk 2 years ago

      Thank you :)

    • @KroltanMG
      @KroltanMG 1 year ago

      Yeah, I was very confused, "why is he addressing the Standard Template Library as a person?"

  • @disk0__
    @disk0__ 6 years ago +18

    I swear to god there's a con curse: NEVER comment on the determinism of the presenter's computer.

  • @funkmasterhexbyte1684
    @funkmasterhexbyte1684 8 years ago +7

    Great talk, looking forward to his analysis on the loop-constructor anomaly!

  • @MarekKnapek
    @MarekKnapek 7 years ago +8

    Almost everyone initializes their engine with a single integer from a random generator. The Mersenne Twister has a huge state, and almost everybody initializes it with a single integer, which gives you only 2^32 different sequences. The correct usage is to gather enough "true" random bits into a vector or array and pass that data through a seed_seq-like interface into the Mersenne Twister constructor.

    • @UCH6H9FiXnPsuMhyIKDOlsZA
      @UCH6H9FiXnPsuMhyIKDOlsZA 5 years ago +9

      Most of the time you don't need more than four billion or so unique sequences. Correct usage initializes what you need and doesn't waste time initializing what you don't.

    • @davidjohnston4240
      @davidjohnston4240 4 years ago

      Why not execute the RdRand instruction and get a 64-bit random number directly from the hardware, bypassing libraries and the OS, delivered into a register of your running application in a few nanoseconds?

    • @mytech6779
      @mytech6779 2 years ago +1

      @UCH6H9FiXnPsuMhyIKDOlsZA Assuming the single initializing integer is simply padded with zeros, the available count is not the problem; the issue is that it is always the exact same 2^32 subset every time.
      Changing any 4 bits within an int produces a set of 16 different integers. If those 4 bit positions are always the last 4, then the output will always be the same set of 16 ints. If the 4 bit positions are selected at random across the full width of the int, then you can have 2^28 (268M) different sets of 16 integers.
      In both cases you have 16 unique items, but one method has its coverage limited to a small fixed corner of the total potential test space, while the other method spreads its coverage over the entire test space.
      A more concrete example: a farmer can test 100 soil samples every year from his farm.
      Method one is that they are always taken from an even random distribution constrained to a 200-foot circle just behind the house. A 200-foot circle has room for about 225,000 potential sample points, and one could naively say that a pool of 225k is more than sufficient for this task.
      Method two is to take the 100 samples in an even random distribution across the entire 600-acre farm. Now the 600 acres have 1.9G potential sample points, but 225k is sufficient, so this alone is not relevant to the choice of method. What is relevant is that the 100 samples represent the entire farm, not just the 200-foot circle behind the house.
      Put another way, in method one the remaining 599.995 acres are zero padding.

    • @gianni50725
      @gianni50725 2 years ago

      @davidjohnston4240 rdrand is slow and only gets slower as CPUs get faster, so you should only use it to seed. But yeah, there's no better way to seed than the rdrand instruction.
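
A minimal sketch of the seeding approach described in this thread, assuming std::random_device is an acceptable entropy source on the target platform (its quality is implementation-defined). The helper names make_seeded_engine and random_int are hypothetical and do not come from the talk. It fills one seed word per word of Mersenne Twister state and hands them to the engine through std::seed_seq, instead of passing a single 32-bit integer.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <functional>
#include <random>

// Hypothetical helper (not from the talk): build a std::mt19937 whose seed
// material covers the whole 624-word state instead of one 32-bit integer.
inline std::mt19937 make_seeded_engine()
{
    std::random_device source;  // entropy quality is implementation-defined

    // One seed word per word of engine state.
    std::array<std::random_device::result_type, std::mt19937::state_size> words{};
    // random_device is not copyable, so pass it by reference wrapper.
    std::generate(words.begin(), words.end(), std::ref(source));

    std::seed_seq seq(words.begin(), words.end());
    return std::mt19937(seq);
}

// Hypothetical helper: draw a value from the closed range [low, high].
inline int random_int(std::mt19937& engine, int low, int high)
{
    assert(low <= high);
    std::uniform_int_distribution<int> dist(low, high);
    return dist(engine);
}
```

Typical use would be to call make_seeded_engine() once, keep the engine around, and call random_int(engine, 1, 6) for each draw. Whether the extra seed material matters depends on the application, as the replies above point out.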

  • @VoidloniXaarii
    @VoidloniXaarii 8 months ago

    Amazing, thank you very much ❤😊

  • @JohnDlugosz
    @JohnDlugosz 7 years ago +1

    At 43:00, I suspect that the speed killer is the DIVIDE operation. Moving that out as a compile-time constant will avoid doing that. Division not only takes a lot of cycles, but it also ties up a lot of the internal queues and resources, so it inhibits superscalar execution. On my blog I have a post where I showed how replacing a divide with a bunch of other code made it faster.

  • @suzannesmith5821
    @suzannesmith5821 4 years ago

    I think minstd_rand uses a modulo operation to generate numbers, which might be why it’s slower than the Mersenne Twister. The Mersenne Twister does not use a modulo operation.
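
For reference, a small sketch of the recurrence behind std::minstd_rand, which the standard defines as a linear congruential engine with multiplier 48271, increment 0, and modulus 2147483647. The function name minstd_step is hypothetical. Whether the modulo actually makes it slower than the Mersenne Twister is not settled here: compilers usually turn a modulo by a compile-time constant into multiplies and shifts, so the cost depends on the compiler and hardware.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hand-written version of the std::minstd_rand recurrence, for illustration:
//   next = (48271 * current) mod 2147483647
// The 64-bit intermediate keeps the product from overflowing.
inline std::uint32_t minstd_step(std::uint32_t current)
{
    // Valid states lie strictly between 0 and the modulus.
    assert(current > 0 && current < 2147483647u);
    const std::uint64_t product = 48271ULL * current;
    return static_cast<std::uint32_t>(product % 2147483647ULL);
}
```

Starting from the default seed of 1, repeated calls to minstd_step should track the values produced by a default-constructed std::minstd_rand.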

  • @chonnyung5084
    @chonnyung5084 6 years ago

    Very informative and entertaining 🙏🏻

  • @pablo_brianese
    @pablo_brianese 4 years ago

    I can't find the blog post he mentions at 4:45. It sounds like the author of the blog is Ben Deen (perhaps I am wrong, please correct me). Please, if you know the reference, give me a hand. Many thanks.

    • @farway-417
      @farway-417 3 years ago

      That would be Ben Dean. You can find his blog here: www.elbeno.com/blog/ and he has written a few articles on <random>: www.elbeno.com/blog/?s=random

    • @danadam0
      @danadam0 3 years ago

      It is Ben Deane and the blog posts are probably:
      Rules for using <random>: www.elbeno.com/blog/?p=1081
      Lameness Explained: www.elbeno.com/blog/?p=1325

  • @Fetrovsky
    @Fetrovsky 6 years ago +6

    I really hate it when people show code in a variable-width font.

  • @rbledsaw3
    @rbledsaw3 7 years ago +3

    Entropy is not the energy/work you can get out of a system. That's enthalpy. Entropy is the amount of disorder/chaos in a system that is responsible for the energy/data you cannot get out of a system. This is why the term is used when it comes to randomness, as entropic chaos is responsible for unpredictability in a system.

    • @rbledsaw3
      @rbledsaw3 7 years ago +4

      Never mind. He corrected himself later.

  • @WitherBossEntity
    @WitherBossEntity 1 year ago

    Please don't avoid re-initializing an rng by making it static, now your code suddenly isn't thread safe. Either put it in a thread-local variable, or if you need more control, pass references and be careful (or use Rust). Also, +1 for PCG instead of MersenneTwister; xoshiro is also great.

    • @the_number_e
      @the_number_e 1 year ago

      static initialization is thread-safe
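
A minimal sketch of the thread-local option mentioned in this thread; the function name thread_safe_random_int is hypothetical. Both comments have a point: since C++11 the initialization of a function-local static is thread-safe, but later calls that advance a shared engine are not synchronized. Giving each thread its own engine sidesteps that without a lock.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: each thread gets its own engine, seeded once on the
// first call made from that thread, so no engine state is shared.
inline int thread_safe_random_int(int low, int high)
{
    assert(low <= high);
    // Seeded with a single random_device value to keep the sketch short;
    // this is an assumption, not a recommendation from the talk.
    thread_local std::mt19937 engine{std::random_device{}()};
    std::uniform_int_distribution<int> dist(low, high);
    return dist(engine);
}
```

The trade-off is one engine's worth of state per thread, and each thread produces its own independent sequence rather than sharing one.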

  • @n3whous3
    @n3whous3 8 years ago

    Hopefully nobody will use the inner loop example. It was just an example, but he did not mention what its pitfall is. It was only there for the perf test.

  • @PaulFisher
    @PaulFisher 4 years ago +1

    This unfortunately repeats some myths about /dev/urandom. It is not meaningfully “less random” than /dev/random from a cryptographic perspective, so the fact that /dev/random blocks is only a disadvantage.
    As a corollary, essentially any modern non-embedded system will never “run out” of randomness, and especially not to the point where it should throw. In my experience, and in nearly all the code that I’ve seen, that should be treated as a catastrophic, unrecoverable failure.
    See: www.2uo.de/myths-about-urandom/

  • @jackgenewtf
    @jackgenewtf 3 years ago

    Here's another hint: If you need randomness in unit testing, you're probably doing property-based testing, and should just use a property-based testing framework. Interesting talk though.

  • @lunedefroid8817
    @lunedefroid8817 6 years ago +1

    DuckDuckGo is still being used, confirmed.

  • @ArthurSchoppenweghauer
    @ArthurSchoppenweghauer 6 months ago

    Why is random number generation in C++ such a shit show?