What is a bit? (Information Theory)

แชร์
ฝัง
  • เผยแพร่เมื่อ 1 ต.ค. 2024

ความคิดเห็น • 55

  • @ArtOfTheProblem
    @ArtOfTheProblem  3 ปีที่แล้ว

    Link to series playlist: th-cam.com/play/PLbg3ZX2pWlgKDVFNwn9B63UhYJVIerzHL.html

  • @specialkender
    @specialkender 5 ปีที่แล้ว +8

    This series is GOLD. GOLD I'M SAYING!

    • @ugestacoolie5998
      @ugestacoolie5998 9 หลายเดือนก่อน

      it really is, I wished I known more interesting playlists like these

  • @davidmuller7274
    @davidmuller7274 9 ปีที่แล้ว +7

    Btw. For anyone wondering, why it is H:
    I believe it's not "H" but originally a capital greek "Eta" η for "Entropy",... which is written as H but is still a "capital Eta". :D

  • @ustpCZ
    @ustpCZ 9 ปีที่แล้ว +8

    5:56 "at most 29 binary digits" is incorrect. If you are extremely unlucky you will have to send 30 digits (message "fffff") if you are lucky, you will have to send only 24 digits ("ddddd"). 29 is average number of digits.

  • @Boxcow45
    @Boxcow45 8 ปีที่แล้ว +8

    Hey, that's a binary search!

  • @orhantorunn
    @orhantorunn หลายเดือนก่อน +1

    I encountered the term 'bit' and information theory a lot as a musical composition major when reading empiricial musicology research papers; this video explains it like from zero. That's really great, thank you.

    • @ArtOfTheProblem
      @ArtOfTheProblem  หลายเดือนก่อน +1

      glad you found this, thanks for sharing

  • @alonsorobots
    @alonsorobots 2 ปีที่แล้ว +1

    What an amazing series. All things we learn through education (ideas) are just tools in our tool belt to know how to influence the world around us. Grounding information theory literally in TOOLS that shaped history is a good way to get us to care about them. Following a semi chronological order of how things were invented is such a natural way to slowly follow along. Thank you for allowing more to compress the world around them.

  • @ArtOfTheProblem
    @ArtOfTheProblem  11 ปีที่แล้ว

    Very, very good catch. I should have mentioned that the cards are chosen with replacement...which means this isn't a "true" poker hand - just 5 selections from a set of 52.

  • @IdleGod
    @IdleGod 11 ปีที่แล้ว +2

    I love how the W and M are switched in the tiles. Also, great video; these go a long way to explaining these complex topics.

  • @joshuaronisjr
    @joshuaronisjr 6 ปีที่แล้ว +1

    The average calculation isn't correct though. Let's say we have three letters
    'A' 'B' 'C'
    And each message is only one letter long. So let's say we represent them like this:
    A:0
    B:00
    C:01
    We need 2 bits for B, 2 bits for C, and only 1 bit for A. Therefore the average # of bits is (1 + 2 + 2) / 3choices, or 1.666666
    bits per letter on average.
    Not log2(3), which gives 1.585....
    ...?

    • @meribold
      @meribold 5 ปีที่แล้ว

      I think 1.585 bits is the average we can achieve as the message length approaches infinity. If we send a message of length 10000 composed of the letters A, B, and C, there are 3^10000 possible messages. The base 2 logarithm of 3^10000 is
      log2(3^10000) ~= 15849.6
      We can round this up to 15850 so we definitely have enough bits for all possible messages (2^15850 > 3^10000). On average, this means 15850/10000 = 1.5850 bits per letter.
      Maybe it's more intuitive to think about a message of length 2. We can already get a better per-character average than ~1.667 here by optimally encoding all 9 two-letter combinations.

  • @brianjblairdotcom
    @brianjblairdotcom 11 ปีที่แล้ว

    Whoops got cut of. Video. The answer is sending words and maybe phrases instead of letters.

  • @plekkchand
    @plekkchand 6 ปีที่แล้ว +1

    ET- cetera, not ECK-cetera.

  • @CoreyOgburn
    @CoreyOgburn 11 ปีที่แล้ว +1

    This sounds like it's warming up to a lesson on entropy.

  • @StephenRoseDuo
    @StephenRoseDuo 8 ปีที่แล้ว +4

    CANADA MAKING US PROUD!

  • @MrFischvogel
    @MrFischvogel 10 หลายเดือนก่อน

    This is so good content. Unbelievable. I am very, very grateful to you for this traceable explanation of complex concepts.

    • @ArtOfTheProblem
      @ArtOfTheProblem  10 หลายเดือนก่อน

      thanks! was just thinking about this series as the video i'm releasing next week will mention it (it's on LLM's) stay tuned

  • @87knox
    @87knox 11 ปีที่แล้ว

    Is 4.7 the minimum number of bits per letter, or the average number of bits per letter? You seem to change your mind at least twice every sentence, on average.

    • @ZonkoKongo
      @ZonkoKongo 7 ปีที่แล้ว

      87knox average

  • @cocothetimeless8382
    @cocothetimeless8382 3 ปีที่แล้ว +1

    dude you deserve more views

  • @SOSTacoJohnson
    @SOSTacoJohnson 11 ปีที่แล้ว

    At the end you posed a question. I'm guessing the answer was to compile all the words in the english language and make an algorhythm that would know the entire set of possible subsequent letters based upon the beginning pattern. Is there another way to limit the letters to less than 4-5? Also, I was thinking the number of signals is important, u can use a pause to end the letter, but isnt the speed also important? & if so, the wouldn't you need a number to declare the end of the letter/unit?

  • @tsunamio7750
    @tsunamio7750 ปีที่แล้ว

    5:47 Error detected:
    Worst case, every letter is 5 questions: 6letters x 5questions = 30symbols
    Best case, every letter is 4 questions: 6letters x 4 questions = 24symbols
    Average case, 4.7 letters per question: 6letters x 4.7 questions = 28.2 symbols

  • @ericX97
    @ericX97 11 ปีที่แล้ว

    In the poker hand example you did not account for the fact that when you pick a card from a deck, it is no longer there, thus reducing the message space and consequently the average number of questions.

  • @malchar2
    @malchar2 11 ปีที่แล้ว

    If the message is known to be an English word, and you receive the letter "T" then it's more likely that the next letter will be "H", which might spell the common word "THE". At each node in the decision tree, the letters should be arranged between the two branches so that you have a 50% chance of ending up on either branch. Having an equal quantity of letters on each branch doesn't really make sense, it should be based on how likely they are to follow the previous letter or even previous words.

  • @distantuncertain630
    @distantuncertain630 7 ปีที่แล้ว

    I'm sorry, three things in this video don't make any sense to me.
    In the example with the six-letter word, Brit shows the most efficient way to determine the letter (eliminating about half of the letters with every question). With this method you will never need to ask more than five questions, and sometimes four will be enough. There is no better way of doing it. Based on that he then goes on to calculate the average amount of questions you'd need to ask for one letter - and here's my first point: If you use the described method and you go threw every possbile letter, you will find that for exactly 6 of them you'll only need to ask four questions ( for example a, d, g, n, q, t, if you always choose the smaller one of a pair of three). The 20 remaining letters require five questions. Now if you calculate the average you get (6*4 + 20*5)/26 = 4.769... questions per letter. But Brit does the following and gets a different answer: He uses the formula to calculate the message space based on the differences per symbol and the symbol rate. Message space = (differences per symbol) to the power of (symbol rate). Basically what he calculates is "How many times do I have to sent a bit (two differences per symbol) to be able to construct 26 different messages (i.e. 26 different combinations of those bits) ?" And his answer is log base 2 of 26 = 7.004... . But then he says: "So, on average, approximately 4.7 questions will be needed per letter at minimum [...]" But besides the first two digits his number is different than what I just calculated. Now, did I make a mistake in calculating the average number of questions per letter, or did he simply calculate a different thing?
    Also, does saying "on average at minimum" make any sense? It's either the average or the minimum, isn't it? And he then goes on to calculate 6*4.7 = 28.2 and states that "Bob can expect to ask, at minimum, 28.2 questions, which means Alice will need to send, at most, 29 binary digits." But clearly, the minimum number of questions would be 24 if you happen to only need four questions for every letter. And Alice will need to sent, at most, 30 bits, if all the letters require five questions. So this is the second thing I don't understand.
    And finally, does using the formula for the message space in this case make any sense at all? Because you can't sent a bit 4.7 times. Either 4 times or 5 times. And if it's not the average amount of bits, then what is the point of this calculation?
    I really hope some of you can understand my confusion and maybe help me out ':)
    (Btw it's also possible that I just didnn't understand what he says correctly because english is not my native language, but I don't think that's the case)

  • @vazixLT
    @vazixLT 11 ปีที่แล้ว

    There should be a base fee included, because you need to ask a few questions first to know how long will be the message that comes (if you don't have an agreement of how long the pauses have to be) and also what type of questions you need to ask. At first you would have needed to ask 2 questions do know if the information sent will represent H/T, letters or cards. These points are a bit advanced, if most people would learn and understand at least the concept presented here, it would be very cool

  • @DanCrystalis
    @DanCrystalis 11 ปีที่แล้ว

    I would assume, when sending words, that for the larger ones you would get to a point at which the combination of bits could only represent that word and that specific word alone. You could then save on info by 'shortening' the amount of bits that the word would otherwise consists of, so you could save quite some money and information if your message is long enough.

  • @rotorickle
    @rotorickle 11 ปีที่แล้ว

    At 9:32 the 'M' tile and the 'W' tile are switched. That 'Z' tile also looks suspiciously like another 'N' tile. However, at 1:21, the tiles all look normal. What happened?
    Also, I love this series and am looking forward to the next installment.

  • @henriquedeoliveira7835
    @henriquedeoliveira7835 11 ปีที่แล้ว

    Also, the calculation you presented is the required one to transmit both the cards in the hand AND the order that the cards appear in the hand. If you don't care about the order, as you normally wouldn't in a true poker hand, the number of possibilities is smaller.

  • @nubSawace
    @nubSawace 11 ปีที่แล้ว

    wait... isn't the code with which you decipher the message considered "information"? shouldn't she be charging for that?

  • @DamesPoker
    @DamesPoker 11 ปีที่แล้ว

    Never mind, I just noticed the date of the upload, I will be eagerly anticipating the last 2 episodes :). I have a feeling they have something to do with entropy...

  • @StefanTheFink
    @StefanTheFink 5 ปีที่แล้ว

    My english is not perfect, what exactly does "message space" mean?

  • @ThanadejR
    @ThanadejR 3 ปีที่แล้ว

    This channel is real gem. Thank you so much.

  • @korth66
    @korth66 11 ปีที่แล้ว

    Will you upload the next 2 videos? Very interesting topic.

  • @ericX97
    @ericX97 11 ปีที่แล้ว

    It's the principle behind it that counts.

  • @ericX97
    @ericX97 11 ปีที่แล้ว

    Thank you. I'm surprised I caught it myself.

  • @DamesPoker
    @DamesPoker 11 ปีที่แล้ว

    Where is 11 and 12????????????????????

  • @16D
    @16D 11 ปีที่แล้ว

    I'm guessing transfer protocols?

  • @brianjblairdotcom
    @brianjblairdotcom 11 ปีที่แล้ว

    First view.Great

  • @naboogom2733
    @naboogom2733 7 ปีที่แล้ว

    this was just great

  • @glenrigby5675
    @glenrigby5675 11 ปีที่แล้ว

    fees in canadian!!!!!

  • @Ryanwazhere4490
    @Ryanwazhere4490 11 ปีที่แล้ว

    Khhhhhhhaaaaaaann!!

  • @mijnmuziek454
    @mijnmuziek454 7 ปีที่แล้ว

    Stolen from Khan academy?

  • @ssfox8860
    @ssfox8860 4 ปีที่แล้ว +1

    WHO THE HELL DISLIKED THIS VIDEO. Its a great video

  • @AnnanFay
    @AnnanFay 8 ปีที่แล้ว +2

    I'm pretty sure you only need 22 bits to send a poker hand?
    You cannot get duplicate cards which reduces the amount of data you need to send!

    • @anuarandom
      @anuarandom 8 ปีที่แล้ว +1

      Can you show us how did you get the 22 bits for a poker hand, please?

    • @meribold
      @meribold 5 ปีที่แล้ว +1

      That's a good point, but how do you arrive at 22 bits? Since log2(52)+log2(51)+log2(50)+log2(49)+log2(48) ~= 28.2, we only reduce the number of bits from 28.5 to 28.2.

  • @NiftyFingers
    @NiftyFingers 11 ปีที่แล้ว

    Everything in information boils down to the bit though. You can have nats and hartleys as well. Nats are when you use euler's number instead of 2, and hartleys are when you use 10 instead of 2. If you give me a quantity of information, say 5 nats, that's (1 over ln2) bits, or about 1.44 bits.
    Words and phrases are multiple bits, so there's really no difference. How do you send a phrase over a wire by the way? 1V = "to", 1.1V = "the", 1.2V = "an"... engineers would say what a pain that would be.