Why Information Theory is Important - Computerphile

  • Published 24 May 2022
  • Zip files & error correction depend on information theory. Tim Muller takes us through how Claude Shannon's early Computer Science work is still essential today!
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 147

  • @mba4677
    @mba4677 ปีที่แล้ว +310

    "a bit"
    "a bit more"
    after years living with the pack of geniuses, he had slowly become one

    • @laurenpinschannels
      @laurenpinschannels ปีที่แล้ว +5

      ah yes I recognize this sense of genius. it's the same one people use when I say that doors can be opened. "thanks genius" I am so helpful

  • @Ziferten
    @Ziferten 2 ปีที่แล้ว +293

    EE chiming in: you stopped as soon as you got to the good part! Shannon channel capacity, equalization, error correction, and modulation are my jam. I'd love to see more communications theory on Computerphile!

    • @Mark-dc1su
      @Mark-dc1su 2 ปีที่แล้ว +11

      If anyone wants an extremely accessible intro to these ideas, Ashby's Introduction to Cybernetics is the gold standard.

    • @hellowill
      @hellowill 2 ปีที่แล้ว +4

      Yeah feels like this video was a very simple starter

    • @travelthetropics6190
      @travelthetropics6190 ปีที่แล้ว +1

      Greetings EE! those are the first topics on our "communications theory" subject back at Uni.

    • @OnionKnight541
      @OnionKnight541 ปีที่แล้ว +1

      Hey! What channel is that stuff on ? I'm still a bit confused by IT

    • @mokovec
      @mokovec ปีที่แล้ว +2

      Look at the older videos on this channel - Prof. Brailsford already covered a lot of the details and history.

  • @louisnemzer6801
    @louisnemzer6801 2 ปีที่แล้ว +204

    This is the best unscripted math joke I can remember!
    How surprised are you?
    >A bit
    One bit?

    • @JavierSalcedoC
      @JavierSalcedoC 2 ปีที่แล้ว +69

      _Flips 2 coins_ "And now, how surprised are you?"
      "A bit more"
      *exactly*

    • @068LAICEPS
      @068LAICEPS ปีที่แล้ว +1

      I noticed during the video but after reading here now I am laughing

  • @LostTheGame6
    @LostTheGame6 2 ปีที่แล้ว +94

    The way I like to reach that conclusion: describe a population where everyone plays once (see the sketch after this thread).
    In the case of the coin flip, if a million people play, you need, on average, to give the names of about 500k people who got tails (or heads). Otherwise your description is incomplete.
    In the case of the lottery, you can just say "no one won", or just give the name of the winner. So you can clearly see how much more information is needed in the first case.

    • @MrKohlenstoff
      @MrKohlenstoff ปีที่แล้ว +2

      That's a nice explanation!

    • @sanferrera
      @sanferrera ปีที่แล้ว

      Very nice, indeed!

    • @NathanY0ung
      @NathanY0ung ปีที่แล้ว +1

      This makes me think of it as the ability to guess correctly: it's harder to guess the outcome of a coin flip, which carries more information, than whether someone won the lottery.
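
A minimal Python sketch of the description-length argument in the thread above (the 1-in-14,000,000 jackpot probability and the million-player population are assumptions for illustration):

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)), skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Fair coin: two equally likely outcomes -> exactly 1 bit per flip.
coin = entropy_bits([0.5, 0.5])

# Lottery win/lose for one player, assuming a hypothetical 1-in-14,000,000 jackpot.
p_win = 1 / 14_000_000
lottery = entropy_bits([p_win, 1 - p_win])

players = 1_000_000
print(f"coin flip: {coin:.3f} bits/flip -> {coin * players:,.0f} bits for {players:,} players")
print(f"lottery:   {lottery:.2e} bits/ticket -> {lottery * players:.1f} bits for {players:,} players")
# ~1,000,000 bits to record everyone's coin flips, but only a couple of bits on average
# to record the lottery outcome for the whole population ("nobody won" / name the winner).
```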

  • @roninpawn
    @roninpawn 2 ปีที่แล้ว +77

    Nice. This explanation ties so elegantly to the hierarchy of text compression. While I've been told many times that it's mathematically provable that there is no more efficient method, this relatively simple explanation leaves me feeling like I understand HOW it is mathematically provable.

  • @gaptastic
    @gaptastic ปีที่แล้ว +19

    I'm not gonna lie, I didn't think this video was going to be interesting, but man, it's making me think about other applications. Thank you!

  • @Double-Negative
    @Double-Negative 2 ปีที่แล้ว +55

    The reason we use the logarithm is because it turns multiplication into addition.
    The chances of 2 independent events X and Y happening is P(X)*P(Y)
    if entropy(X) = -log(P(X))
    entropy(X and Y) = -log(P(X)*P(Y)) = -log(P(X))-log(P(Y)) = entropy(X) + entropy(Y)

    • @PetrSojnek
      @PetrSojnek 2 ปีที่แล้ว +22

      isn't that more a result of using the logarithm than the reason for using it? It feels like using the logarithm for better scaling was still the primary factor.

    • @entropie-3622
      @entropie-3622 ปีที่แล้ว +7

      @@PetrSojnek There are lots and lots of choices for functions that model diminishing returns, but only the log functions will turn multiplication into addition.
      Considering how often independent events show up in probability theory, it makes a lot of sense to use the log function for this specific property, and it yields all kinds of nice results that you would not see if you were to use another diminishing-returns model.
      If we go by the heuristic of it representing information, this property is fairly integral, because you would expect the total information for multiple independent events to come out as the sum of the information about the individual events.

    • @GustavoOliveira-gp6nr
      @GustavoOliveira-gp6nr ปีที่แล้ว +1

      Exactly, the choice of the log function is more due to the addition property than to diminishing returns.
      Also, it relates directly to the number of binary digits needed to code a sequence of fair coin flips: one more digit changes the sequence's probability by a factor of 2 while adding exactly 1 more bit of information, which matches the logarithm formula.

    • @temperedwell6295
      @temperedwell6295 ปีที่แล้ว +1

      The reason for using the logarithm to base 2 is that there are 2^N different words of length N formed from the alphabet {H,T}; i.e., length of word = log_2(number of words). The reason for the minus sign is so that N gives a measure of the amount of information.
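
Following the thread above, a quick numeric check that surprisal -log2(p) adds for independent events while the probabilities multiply (the coin and the 1-in-14,000,000 lottery are hypothetical example numbers):

```python
import math

def surprisal_bits(p):
    """Self-information of an event with probability p, in bits: -log2(p)."""
    return -math.log2(p)

p_heads = 0.5            # fair coin
p_win = 1 / 14_000_000   # hypothetical jackpot probability, for illustration only

# Independent events multiply probabilities...
p_both = p_heads * p_win
# ...and the log turns that product into a sum of surprisals.
print(surprisal_bits(p_both))                           # ~24.74 bits
print(surprisal_bits(p_heads) + surprisal_bits(p_win))  # same value, up to float rounding
```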

  • @CristobalRuiz
    @CristobalRuiz 2 ปีที่แล้ว +4

    Been seeing lots of documentary videos about Shannon lately. Thanks for sharing.

  • @elimgarak3597
    @elimgarak3597 2 ปีที่แล้ว +41

    I believe Popper made this connection between probability and information a bit earlier in his Logik der Forschung (1934; Shannon's landmark paper came in 1948). That's why he says that we ought to search for "bold" theories, that is, theories with low probability and thus more content. Except, at first, he used a simpler formula: Content(H) = 1 - P(H), where H is a scientific hypothesis.
    Philosophers' role in the history of logic and computer science is a bit underrated and obscured imo (see, for example, Russell's type theory).
    Btw, excellent explanation. Please bring this guy on more often.

    • @yash1152
      @yash1152 ปีที่แล้ว +3

      thanks a lot for bringing philosophy up in here 😇

    • @Rudxain
      @Rudxain ปีที่แล้ว

      This reminds me of quantum superposition

  • @agma
    @agma ปีที่แล้ว +8

    The bit puns totally got me 🤣

  • @travelthetropics6190
    @travelthetropics6190 ปีที่แล้ว +10

    This and the Nyquist-Shannon sampling theorem are two of the building blocks of communication as we know it today. So we can say even this video is brought to us by those two :D

  • @Jader7777
    @Jader7777 2 ปีที่แล้ว +8

    Coffee machine right next to computer speaks louder than any theory in this video.

  • @scitortubeyou
    @scitortubeyou 2 ปีที่แล้ว +35

    "million-to-one chances happen nine times out of ten" - Terry Pratchett

    • @-eurosplitsofficalclanchan6057
      @-eurosplitsofficalclanchan6057 2 ปีที่แล้ว +2

      how does that work?

    • @AntonoirJacques
      @AntonoirJacques 2 ปีที่แล้ว +6

      @@-eurosplitsofficalclanchan6057 By being a joke?

    • @IceMetalPunk
      @IceMetalPunk 2 ปีที่แล้ว +5

      "Thinking your one-in-a-million chance event is a miracle is underestimating the sheer number of things.... that there are...." -Tim Minchin

    • @davidsmind
      @davidsmind 2 ปีที่แล้ว +2

      Given enough time and iterations million to one chances happen 100% of the time

    • @hhurtta
      @hhurtta ปีที่แล้ว +4

      @@-eurosplitsofficalclanchan6057 Terry Pratchett knew human behavior and reasoning really well. We tend to exaggerate a lot, we have trouble comprehending large numbers, and we are usually very bad at calculating probabilities. Hence we often say "one-in-a-million chance" when the real chance is nothing of the sort. On the other hand, one-in-a-million events do occur much more often than we intuitively expect when we iterate enough, like brute-force guessing a 5-letter password (about 1 in 12 million per guess).

  • @drskelebone
    @drskelebone 2 ปีที่แล้ว +6

    Either I missed a note, there's a note upcoming, or there is no note stating that these are log_2 logarithms, not natural or common logarithms.
    @5:08: "upcoming" is the winner, giving me -log_2(1/3) = log_2(3) ≈ 1.585 bits of information.
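
For reference, a minimal check of that surprisal in different log bases, showing why the answer is 1.585 only when the base is 2:

```python
import math

p = 1 / 3
print(math.log2(1 / p))   # ~1.585 bits     (base 2)
print(math.log(1 / p))    # ~1.099 nats     (base e)
print(math.log10(1 / p))  # ~0.477 hartleys (base 10)
```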

  • @TheFuktastic
    @TheFuktastic ปีที่แล้ว

    Beautiful explanation!

  • @DeanHorak
    @DeanHorak 2 ปีที่แล้ว +3

    Greenbar! Haven’t seen that kind of paper used in years.

  • @elixpo
    @elixpo ปีที่แล้ว

    This explanation was really awesome

  • @gdclemo
    @gdclemo ปีที่แล้ว +5

    You really need to cover arithmetic coding, as this makes the relationship between Shannon entropy and compression limits much more obvious. I'm guessing this will be in a followup video?

  • @clearz3600
    @clearz3600 ปีที่แล้ว +1

    Alice and Bob are sitting at a bar when Alice pulls out a coin, flips it and says heads or tails.
    Bob calls out heads while looking on in anticipation.
    Alice reveals the coin to be indeed heads and asks how surprised are you.
    A bit proclaims Bob.

  • @adzmarsh
    @adzmarsh ปีที่แล้ว

    I listened to it all. I hit the like button.
    I did not understand it.
    I loved it

  • @MrVontar
    @MrVontar ปีที่แล้ว

    stanford has a page about the entropy in the english language, it is interesting as well

  • @sean_vikoren
    @sean_vikoren 2 ปีที่แล้ว +1

    I find my best intuition of Shannon Entropy flows from Chaos Math.
    Plus I get to stare at clouds while pretending to work.

  • @tlrndk123
    @tlrndk123 7 หลายเดือนก่อน

    the comments in this video are surprisingly informative

  • @CarlJohnson-jj9ic
    @CarlJohnson-jj9ic ปีที่แล้ว

    Boolean algebra is awesome!!!: Person(Flip(2), Coin(Heads,Tails)) = Event(Choice1, Choice2) == (H+T)^2 == (H+T)(H+T) == H^2 + 2HT + T^2 (notice the coefficient orderings), where the constant coefficient is the frequency of the outcome and the exponent or order is the number of times the identity is present in the outcome. This preserves lots of the algebraic axioms which are largely present in expanding operations.
    If you try to separate out the objects and states from agents using denomination of any one of the elements, you can start to combine relationships and quantities with standard algebra words with positional notation (I like abstraction to be used as the second quadrant, like exponents are in the first, to resolve differences of range in reduction operations from derivatives and such) and polynomial equations to develop rich descriptions of the real world, and thus we may characterize geometrically the natural paths of systems and their components.
    These become extraordinarily useful when you consider quantum states and number generators, which basically describe the probability of events in a world space, which allows one to rationally derive the required relationships elsewhere (events or agents involved) by starting with a probability based on seemingly disjoint phenomena, i.e. coincidence; and if we employ a sophisticated field ordering, we can look at velocities of gravity to discern what the future will bring. Boolean algebra is awesome! Right up there with the placeholder-value string system using classification of identities.

  • @Mark-dc1su
    @Mark-dc1su 2 ปีที่แล้ว +2

    I'm reading Ashby at the moment and we recently covered Entropy. He was very heavy handed with making sure we understood that the measure of Entropy is only applicable when the states are Markovian, or that the state the system is currently in is only influenced by the state immediately preceding it. Does this still hold?

    • @ConnorMcCormick
      @ConnorMcCormick 2 ปีที่แล้ว +2

      You can relax the markovian assumption if you know more about your environment. You can still compute the entropy of a POMDP, it just requires guesses at the underlying generative models + your confidence in those models

  • @DrewNorthup
    @DrewNorthup 2 ปีที่แล้ว

    The DFB penny is a great touch

  • @TheArrogantMonk
    @TheArrogantMonk 2 ปีที่แล้ว +2

    Extremely clever bit on such a fascinating subject!

  • @David-id6jw
    @David-id6jw 2 ปีที่แล้ว

    How much information/entropy is needed to encode the position of an electron in quantum theory (either before or after measurement)? What about the rest of its properties? More generally, how much information is necessary to describe any given object? And what impact does that information have on the rest of the universe?

    • @ANSIcode
      @ANSIcode ปีที่แล้ว +1

      Surely, you don't expect to get an answer to that here in a YouTube comment? Maybe start with the wiki article on "Quantum Information"...

  • @Juurus
    @Juurus ปีที่แล้ว +1

    I like how there's almost every source of caffeine on the same computer desk.

  • @assepa
    @assepa 2 ปีที่แล้ว

    Nice workplace setup, having a coffee machine next to your screen 😀

  • @068LAICEPS
    @068LAICEPS ปีที่แล้ว

    Information Theory and Claude Shannon 😍

  • @danielg9275
    @danielg9275 2 ปีที่แล้ว +2

    It is indeed

  • @YouPlague
    @YouPlague ปีที่แล้ว +1

    I already knew everything he talked about, but boy this was such a nice concise way of presenting it to laymen!

  • @TheNitramlxl
    @TheNitramlxl ปีที่แล้ว +1

    A coffee machine on the desk 🤯this is end level stuff

  • @user-fd9rx8dh9b
    @user-fd9rx8dh9b 9 หลายเดือนก่อน

    Hey, I wrote an article using information theory, I was hoping I could share it and receive some feedback?

  • @sanderbos4243
    @sanderbos4243 ปีที่แล้ว

    I loved this

  • @Lokesh-ct8vt
    @Lokesh-ct8vt ปีที่แล้ว +3

    Question: is this entropy in any way related to the thermodynamic one?

    • @temperedwell6295
      @temperedwell6295 ปีที่แล้ว +3

      I am no expert, so please correct me if I am wrong. As I understand it, entropy was first introduced by Carnot, Clausius, and Kelvin as a macroscopic quantity whose differential, integrated against temperature, gives heat energy (dQ = T dS). Boltzmann was the first to relate the macroscopic quantities of thermodynamics, i.e. heat and entropy, to what is happening at the molecular level. He discovered that entropy is related to the number of microstates associated with a macrostate, and as such is a measure of the disorder of the system of molecules. Nyquist, Hartley, and Shannon extended Boltzmann's work by replacing statistics on systems of molecules with statistics on messages formed from a finite set of symbols.

    • @danielbrockerttravel
      @danielbrockerttravel 2 หลายเดือนก่อน

      Related but not identical because the thermodynamic one still hasn't been worked out and because Shannon never defined meaning. I strongly suspect that solving those two will allow for a unification.
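
A minimal sketch of the Boltzmann/Shannon connection described in the reply above, restricted to the special case of W equally likely microstates (the W = 2^100 system is purely illustrative):

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, in J/K

def boltzmann_entropy(W):
    """Thermodynamic entropy S = k_B * ln(W) for W equally likely microstates."""
    return k_B * math.log(W)

def shannon_entropy_bits(W):
    """Shannon entropy H = log2(W) bits for a uniform distribution over W outcomes."""
    return math.log2(W)

W = 2 ** 100  # a purely illustrative system with 2^100 microstates
print(boltzmann_entropy(W))     # ~9.6e-22 J/K
print(shannon_entropy_bits(W))  # exactly 100.0 bits
# Same log-of-the-number-of-states idea; the units differ by a factor of k_B * ln(2).
```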

  • @laurenpinschannels
    @laurenpinschannels ปีที่แล้ว +1

    if you don't specify what base of log you mean, it's base NaN

  • @Veptis
    @Veptis ปีที่แล้ว

    Variance, as the deviation from the expected value, is the interesting concept of statistics; entropy, as the amount of information, is the interesting concept of information theory.
    But I feel like they kind of do the same thing.

  • @oussamalaouadi8521
    @oussamalaouadi8521 2 ปีที่แล้ว +8

    I guess information theory is - historically - a subset of communications theory which is a subset of EE.

    • @sean_vikoren
      @sean_vikoren 2 ปีที่แล้ว +8

      Nice try. Alert! Electrical Engineer in building, get him!

    • @eastasiansarewhitesbutduet9825
      @eastasiansarewhitesbutduet9825 ปีที่แล้ว +2

      Not really. Well, EE is a subset of Physics.

    • @oussamalaouadi8521
      @oussamalaouadi8521 ปีที่แล้ว

      @@eastasiansarewhitesbutduet9825
      Yes, EE is a subset of physics.
      Information theory was born solving EE problems (transmission of information, communication channel characterisation and capacity, the minimum compression limit, theoretical models for transmission, etc.), and Shannon himself was an EE.
      Despite the extended use of information theory in many fields such as computer science, statistics, and physics, it's historically an EE thing.

    • @nHans
      @nHans ปีที่แล้ว +2

      ​@@oussamalaouadi8521 Dude! Engineering is nobody's subset! It's an independent and highly rewarding profession, and it predates science by several millennia.
      Engineering *_uses_* science. It also uses modern management, finance, economics, market research, law, insurance, math, computing and other fields. That doesn't make it a "subset" of any of those fields.

  • @nathanbrader7591
    @nathanbrader7591 2 ปีที่แล้ว +12

    3:41 "So 1 in 2 is an odds of 2, 1 in 10 is an odds of 10" That's not right: If the probability is 1 in x then the odds is (1/x)/(1-(1/x)). So, 1 in 2 is an odds of 1 and 1 in 10 is an odds of 1/9.

    • @patrolin
      @patrolin 2 ปีที่แล้ว +2

      yes, probability 1/10 = odds 1:9

    • @BergenVestHK
      @BergenVestHK 2 ปีที่แล้ว +4

      Depends on the system, I guess. Where I am from, we would say that the odds are 10, when the probability is 1/10. I know you could also call it "one-to-nine" (1:9), but that's not in common use here. Odds of 10 would be correct here.

    • @nathanbrader7591
      @nathanbrader7591 ปีที่แล้ว

      @@BergenVestHK Interesting. Where are you from?

    • @BergenVestHK
      @BergenVestHK ปีที่แล้ว

      @@nathanbrader7591 I'm from Norway. I just googled "odds systems", and found that there are supposedly three main types of odds: "fractional (British) odds, decimal (European) odds, and moneyline (American) odds".
      I must say, that seeing as Computerphile is UK based, I do agree with you. I am a little surprised that they didn't use the fractional system in this video.
      However, I see that Tim, the talker in this video, previously studied in Luxembourg and the Netherlands, so perhaps he imported the European decimal odds systems from there. :-)

    • @nathanbrader7591
      @nathanbrader7591 ปีที่แล้ว +2

      @@BergenVestHK Thanks for this. That explains his usage which I take to be intentionally informal for an audience perhaps more familiar with gambling lingo. I'd expect (hope) that with a more formal discussion, the term "odds" would be reserved for the fractional form as it is used in statistics.
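
A small sketch of the two conventions being contrasted in this thread: fractional odds p/(1-p) versus the decimal-style "payout multiplier" 1/p that matches the video's usage (both helper functions are just for illustration):

```python
def fractional_odds(p):
    """Odds in favour as a single number p / (1 - p), e.g. p = 1/10 -> 1/9 (i.e. 1:9)."""
    return p / (1 - p)

def decimal_odds(p):
    """Decimal-style odds: the payout multiplier 1 / p, e.g. p = 1/10 -> 10.0."""
    return 1 / p

for p in (1 / 2, 1 / 10):
    print(f"p = {p}: fractional = {fractional_odds(p):.4f}, decimal = {decimal_odds(p):.1f}")
# p = 0.5 -> fractional 1.0000 (evens, 1:1), decimal 2.0
# p = 0.1 -> fractional 0.1111 (1:9),        decimal 10.0
```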

  • @arinc9
    @arinc9 2 ปีที่แล้ว

    I didn't understand much because of my bad math, but this was fun to watch.

  • @pedro_8240
    @pedro_8240 ปีที่แล้ว

    6:58 in absolute terms, no, not really, but when you start taking into consideration the chances of just randomly getting your hands on a winning ticket, without actively looking for a ticket, any ticket, that's a whole other story.

  • @dixztube
    @dixztube ปีที่แล้ว

    I got the tails-tails one on a guess, and now I understand the allure of gambling and casinos; it's fun psychologically.

  • @johnhammer8668
    @johnhammer8668 2 ปีที่แล้ว

    how can a bit be floating point

  • @h0w1347
    @h0w1347 2 ปีที่แล้ว

    thanks

  • @filipo4114
    @filipo4114 ปีที่แล้ว

    1:54 - "A bit more." - "That's right - one bit more" ;D

  • @sdutta8
    @sdutta8 14 วันที่ผ่านมา

    We claim Shannon as a communication theorist, rather than a computer theorist, but concede with Shakespeare: what’s in a name.

  • @inuwara6293
    @inuwara6293 2 ปีที่แล้ว

    Wow 👍Very interesting

  • @jimjackson4256
    @jimjackson4256 8 หลายเดือนก่อน

    Actually, I wouldn't be surprised at any combination of heads and tails. If it was purely random, why would any combination be surprising?

  • @desmondbrown5508
    @desmondbrown5508 2 ปีที่แล้ว

    What is the known compression minimum size for things like RAW text or RAW image files? I'm very curious. I wish they'd have given some examples of known quantities of common file types.

    • @damianocaprari6991
      @damianocaprari6991 ปีที่แล้ว +4

      It's not really the file type, but rather the file contents, that determines its ideal minimum size.
      At the end of the day, files are simply a collection of bits, whether they represent text, images, video or more.

    • @Madsy9
      @Madsy9 ปีที่แล้ว

      @@damianocaprari6991 The content *and* the compressor and decompressor. Different file formats use different compression algorithms or different combinations of them. And lossy compression algorithms often care a great deal about the structure of the data (image, audio, ..).

  • @retropaganda8442
    @retropaganda8442 2 ปีที่แล้ว +1

    4:02 Surprise, the paper has changed! ;p

  • @abiabi6733
    @abiabi6733 ปีที่แล้ว

    wait, so this is based on probability?

  • @AntiWanted
    @AntiWanted ปีที่แล้ว

    Nice

  • @CalvinHikes
    @CalvinHikes ปีที่แล้ว

    I'm just good enough at math to not play the lottery.

  • @sedrickalcantara9588
    @sedrickalcantara9588 2 ปีที่แล้ว

    Shoutout to Thanos and Nebula in the thumbnail

  • @blayral
    @blayral ปีที่แล้ว

    i said head for the first throw, tail-tail for the second. i'm 3 bits surprised...

  • @Andrewsarcus
    @Andrewsarcus ปีที่แล้ว

    Explain TLA+

  • @pedropeixoto5532
    @pedropeixoto5532 ปีที่แล้ว

    It is really maddening when someone calls Shannon a Computer Scientist. It would be a terrible anachronism if Electrical Engineering didn't exist!
    He was really (a mathematician and) an Electrical Engineer, and not only The father of Information Theory but The father of Computer Engineering (as a subarea of Electronics Engineering), i.e., the first to systematize the analysis of logic circuits for implementing computers in his famous master's thesis, "A Symbolic Analysis of Relay and Switching Circuits", before gifting us with Information Theory.
    CS diverges from EE in the sense that EE cares about the computing "primitives". Quoting Brian Harvey:
    "Computer Science is not about computers and it is not a science [...] a more appropriate term would be 'Software Engineering'".
    Finally, I think CS is beautiful and has a father that is below no one, Turing.

  • @user-js5tk2xz6v
    @user-js5tk2xz6v 2 ปีที่แล้ว

    So there is one arbitrary equation, and I don't understand where it came from or what its purpose is.
    And at one point he said that 0.0000000X is the minimal amount of bits, but then he says he needs 1 bit for information about winning and 0 for losing, so it seems the minimal amount of bits to store information is always 1, so how can it be smaller than 1?

    • @shigotoh
      @shigotoh 2 ปีที่แล้ว +1

      A value of 0.01 means that you can store on average 100 instances of such information in 1 bit. It is true that when storing only one piece of information it cannot use less than one bit.

    • @hhill5489
      @hhill5489 2 ปีที่แล้ว

      You typically take the ceiling of that function's output when thinking practically about it, or for computers. Essentially, the information contained was that minuscule number, but realistically you still need 1 bit to represent it. For an event that is guaranteed, i.e. probability 100% / 1.0, there is 0 information gained by observing it... therefore it takes zero bits to represent that sort of event.

    • @codegeek98
      @codegeek98 ปีที่แล้ว

      You only have fractional bits in _practice_ with amortization (or reliably if the draws are batched).
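
As the replies above suggest, a fraction of a bit only pays off when many outcomes are encoded together; a minimal sketch, assuming a hypothetical 1-in-a-million win probability:

```python
import math

def entropy_bits(p):
    """Entropy of a single win/lose event with win probability p, in bits."""
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

p_win = 1e-6              # hypothetical win probability, for illustration
h = entropy_bits(p_win)   # ~2.1e-05 bits per draw

n = 1_000_000
print(f"{h:.2e} bits per draw")
print(f"{h * n:.1f} bits to encode {n:,} independent draws (vs {n:,} bits at 1 bit each)")
# A single draw still costs at least one whole bit in practice, but an ideal coder
# (e.g. an arithmetic coder) approaches h bits *per draw* when many draws are
# compressed together -- that is where the fractional bits come from.
```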

  • @joey199412
    @joey199412 ปีที่แล้ว +1

    Amazing video, title should have been something else because I was expecting something mundane, not to have my mind blown and look at computation differently forever.

  • @levmarcus8198
    @levmarcus8198 ปีที่แล้ว

    I want an espresso machine right on my desk.

  • @juliennapoli
    @juliennapoli ปีที่แล้ว +1

    Can we imagine a binary lottery where you bet on a 16-bit sequence of 0s and 1s?

  • @GordonjSmith1
    @GordonjSmith1 2 ปีที่แล้ว +2

    I am not sure that the understanding of 'information theory' has been moved forward by this vlog, which is unusual for Computerphile. In 'digital terms' it might have been better to explain Claude Shannon's paper first, but from an 'Information professional's perspective' this was not an easy watch.

  • @Maynard0504
    @Maynard0504 ปีที่แล้ว

    I have the same coffee machine

  • @Wyvernnnn
    @Wyvernnnn 2 ปีที่แล้ว +15

    The formula log(1/p(n)) was explained as if it was arbitrary, it’s not

    • @OffTheWeb
      @OffTheWeb 2 ปีที่แล้ว +2

      experiment with it yourself.

  • @anorak9383
    @anorak9383 2 ปีที่แล้ว +2

    Eighth

  • @liambarber9050
    @liambarber9050 ปีที่แล้ว

    My surprisal was very high @4:58

  • @KX36
    @KX36 ปีที่แล้ว

    after all that you could have at least given us some lottery numbers at the end

  • @ilovedatfruitybooty9546
    @ilovedatfruitybooty9546 2 ปีที่แล้ว

    7:11

  • @eliavrad2845
    @eliavrad2845 2 ปีที่แล้ว

    The "reasonable intuition" about this formula is that, if there are two independent things, such as a coin flip and a lottery ticket, the information about them should be a sort of sum
    H(surprise about a coinflip and a lottery result)=H(surprise about coinflip result)+H(surprise about lottery result)
    but the probabilities should be multiplication
    p(head and win lottery)=p(head)p(win)
    and the best way to get from multiplication to addition is a log
    Log(p(head)p(win))=Log(p(head)) + Log(p(win))

  • @hypothebai4634
    @hypothebai4634 ปีที่แล้ว

    So, Claude Shannon was a figure in communications electronics - not computer science. And, in fact, the main use of the Shannon Limit was in RF modulation (which is not part of computer science).

  • @hypothebai4634
    @hypothebai4634 ปีที่แล้ว

    The logs that Shannon originally used were natural logs (base e) for obvious reasons.

  • @jamsenbanch
    @jamsenbanch ปีที่แล้ว

    It makes me uncomfortable when people flip coins and don’t catch them

  • @GordonjSmith1
    @GordonjSmith1 2 ปีที่แล้ว +3

    Let me add a 'thought experiment'. Some people spend money every week on the Lottery, their chance of winning is very small. So what is the difference between a 'smart' investment strategy' and an 'information' based strategy? Answer: Rational investors will consider their chances of winning and conclude that for every dollar extra they invest (say from one dollar to two dollars) their chance will increase proportionally. An 'Information engaged' person will see that the chance of winning is entirely remote, and increasing the investment hardly improves the chances, in this case they know that in order to 'win' they need to be 'in', but even the smallest amount spent is nearly as likely to win as those who place more bets. No !! Scream the 'numbers' people, but 'Yes'!!! scream anyone who has considered the opposite case. The chance of winning is so small that the increase in paying for more Lotto numbers really does not do that much to improve the payback from entering, better to be 'just in' than 'in for a lot'...

  • @filda2005
    @filda2005 ปีที่แล้ว

    8:34 No one, really no one, has been rolling on the floor?
    LOL, and on top of that the stone-cold face. It's like the Visa card: you can't buy that with money.

  • @rmsgrey
    @rmsgrey 2 ปีที่แล้ว +2

    "We will talk about the lottery in one minute".
    Three minutes and 50 seconds later...

  • @karavanidet
    @karavanidet 10 หลายเดือนก่อน

    Very difficult :)

  • @danielbrockerttravel
    @danielbrockerttravel 2 หลายเดือนก่อน

    I cannot believe that philosophers, who always annoyingly go on about what stuff 'really means', never thought to try to update Shannon's theory to include meaning. Shannon very purposefully excludes meaning from his analysis of information, which means it provides an incomplete picture.
    In order for information to be surprising, it has to say something about a system that a recipient doesn't know. This provides a clue as to what meaning is- a configuration of a system. If a system configuration is already known, then no information about it will be surprising to the recipient. If the system configuration changes, then the amount of surprise the information contains will increase in proportion.
    In order for information to be informative there must be meanings to communicate, which means that meaning is ontologically prior to information.
    All of reality is composed of networks and these networks exhibit patterns. In networks with enough variety of patterns to be codable, you create the preconditions for information.

  • @mcjgenius
    @mcjgenius ปีที่แล้ว

    wow ty🦩

  • @pgriggs2112
    @pgriggs2112 2 ปีที่แล้ว

    Lies! I zip my zip files to save even more space!

  • @thomassylvester9484
    @thomassylvester9484 ปีที่แล้ว

    “Expected amount of surprisal” seems like quite an oxymoron.

  • @atrus3823
    @atrus3823 ปีที่แล้ว

    This explains why they don't announce the losers!

  • @TheCellarGuardian
    @TheCellarGuardian ปีที่แล้ว +1

    Great video! But terrible title... Of course it's important!

  • @zxuiji
    @zxuiji 2 ปีที่แล้ว +1

    Hate to be pedantic, but a coin flip has more than 2 possible outcomes; there's the edge, after all. It's the reason why getting either side is not a flat 50% (see the sketch after this thread).
    Likewise with dice: they have edges and corners, which can also be an outcome; it's just made rather unlikely due to the air circulation and the lack of resistance vs the full drag of the landing zone. By full drag I mean the earth dragging it along while rotating, and by lack of resistance I mean that not enough air molecules slam into it through their own drag state, thereby allowing it to just roll over/under the few that do.

    • @galliman123
      @galliman123 2 ปีที่แล้ว +1

      Except you just rule those out and skew the probability 🙃

    • @roninpawn
      @roninpawn 2 ปีที่แล้ว +1

      There is no indication, whatsoever, that you "hate to be pedantic" about this. ;)

    • @zxuiji
      @zxuiji 2 ปีที่แล้ว

      @@roninpawn ever heard of OCD, it's similar, I couldn't ignore the compulsion to correct the info

    • @zxuiji
      @zxuiji 2 ปีที่แล้ว

      @@galliman123 except that gives erroneous results, the bane of experiments and utilization

    • @JansthcirlU
      @JansthcirlU ปีที่แล้ว

      doing statistics is all about confidence intervals, the reason why you're allowed to ignore those edge cases is that they only negligibly affect the odds of those events you are interested in
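
Following the exchange above, a small sketch of how little a rare third outcome changes the entropy of a coin flip (the 1-in-6000 edge probability is a made-up figure for illustration):

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits over a full outcome distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

fair_coin = entropy_bits([0.5, 0.5])

# Hypothetical three-outcome coin: heads, tails, and a made-up 1-in-6000 chance of landing on its edge.
p_edge = 1 / 6000
edge_coin = entropy_bits([(1 - p_edge) / 2, (1 - p_edge) / 2, p_edge])

print(fair_coin)  # 1.0 bit
print(edge_coin)  # ~1.002 bits: the rare third outcome barely moves the entropy
```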

  • @BAMBAMBAMBAMBAM-
    @BAMBAMBAMBAMBAM- 6 หลายเดือนก่อน

    A bit 😂

  • @artic0203
    @artic0203 ปีที่แล้ว

    i solved AI join me now before we run out of time

  • @ThomasSirianniEsq
    @ThomasSirianniEsq 7 หลายเดือนก่อน

    Wow. Reminds me how stupid I am

  • @kofiamoako3098
    @kofiamoako3098 ปีที่แล้ว

    So no jokes in the comments??

  • @elijahromer6544
    @elijahromer6544 2 ปีที่แล้ว

    IN FIRST

  • @atsourno
    @atsourno 2 ปีที่แล้ว +3

    First 🤓

    • @Ellipsis115
      @Ellipsis115 2 ปีที่แล้ว +1

      @@takotime NEEEEEEEEEEEEEEEEEERDS

    • @atsourno
      @atsourno 2 ปีที่แล้ว

      @Rubi ❤️

  • @muskduh
    @muskduh ปีที่แล้ว

    thanks