How File Compression Works (The Basics)

  • Published on Jul 5, 2016
  • Lossless file compression might seem complex, mysterious, or hard to implement. My goal with this video is to demystify it. After watching this, you should have a better idea of what happens whenever you zip a file, browse the web, or upload documents to the cloud.
  • Science & Technology

Comments • 80

  • @TheProgRock 7 years ago +30

    Nobody has been able to explain file compression to me....
    ....until this moment.
    Thanx buddy

  • @monkeyjuju7441 7 years ago +61

    for a brief moment I really thought this guy was Bighead from Silicon Valley lol

    • @sitalsitoula6536 4 years ago +2

      Channel Name is also machead..maybe his brother

    • @kingcoding3587 3 years ago

      😂Hhh me too

    • @facundopitton9011 3 years ago

      I just finished Silicon Valley S2 and decided to investigate compression.

  • @christopherauito7262 10 months ago +1

    Better than any instructor I’ve ever had at explaining this.

  • @noonansean1979 7 years ago

    I really enjoy your channel, you should really make more videos, the world needs them.

  • @colewilkes7149 1 year ago +1

    What an excellent example. Very easy to understand, thank you

  • @MDweller 4 years ago +4

    Thank you, man. I've been thinking about how file compression works for almost twelve years and couldn't figure it out. Thought to look it up today and understood everything in one video here. 👍

  • @Darieee 6 years ago +1

    That red wall is awesome! Nice explanation!

  • @pouyahosseinzadeh985 7 years ago

    I really enjoyed it and fully understood it. Thank you!

  • @FreezeFrame175 5 years ago +1

    quite good video and very easy to follow. thx

  • @tomas3399 4 years ago +4

    If DreamWorks is ever looking to make a live-action How to Train Your Dragon, give them a call

  • @NiceGuyShaders 2 years ago

    Thank you man! You made it really clear.

  • @mitchwar2065 5 years ago

    Great video, thanks for your time!

  • @Keyfaze 8 years ago +3

    Very helpful, I wonder why this doesn't have more views.

    • @LeSaboteur3981 3 years ago

      so true, i am happy to have found this video (4 years later)

  • @fruitenjoyer4248 1 year ago

    Great video, you're good at explaining stuff. Thank you

  • @prodengineer 8 years ago +4

    Good video mate

  • @ragrazila 4 years ago

    Thanks, it was very helpful.

  • @elliotth6042 4 years ago

    Very helpful video!

  • @Christopher_Cole 2 years ago

    Awesome! Thank you

  • @LeSaboteur3981 3 years ago +2

    very well done! this was so helpful! but watching this in 2021, i just hope you have a better camera by now 😉

  • @luisalves1706 3 years ago

    Thank you.
    Good idea.

  • @ammaribrahim5756 5 years ago

    Amazing bro......Thank you from Arabia

  • @adboshop 2 years ago

    Very good. Thx.

  • @georgensa3942 5 years ago

    that's very awesome and nice explanation..... like that

  • @nconrad4504 2 years ago

    Thank you so much.

  • @mc4444 8 years ago +2

    I always like to think about how entropy kinda flows from one thing to another. In the first example the uncertainty about the next character was maximal, so the code had to carry all of the information. As soon as we find out, through some other channel, that some characters are more likely, the average code gets smaller because it needs to carry less information. Of course there's also entropy in the rules for decoding the stream.

    • @macheads101 8 years ago +1

      I am glad that you mentioned entropy (technically cross-entropy in this case). I am considering making a follow up video where I discuss entropy in more detail. My motivation is that information entropy is extremely important for many sub-fields of machine learning, so I'd like to have something to refer people to if I ever do an ML series.
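
The entropy and cross-entropy mentioned in this thread can be made concrete with a minimal sketch. The skewed probabilities below are an assumption chosen for illustration (the thread does not list the video's exact numbers), and the function names are hypothetical.

```python
import math

def entropy_bits(p):
    """Shannon entropy of distribution p, in bits per symbol."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy_bits(p, q):
    """Average bits per symbol when data drawn from p is coded
    with a code that is ideal for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # every symbol equally likely
skewed = [0.5, 0.25, 0.125, 0.125]   # assumed "lumpy" example distribution

print(entropy_bits(uniform))                # 2.0
print(entropy_bits(skewed))                 # 1.75
print(cross_entropy_bits(skewed, uniform))  # 2.0
```

With no knowledge of the skew, a fixed 2-bit code is the best available; once the decoder knows the skew, the average can drop to 1.75 bits per symbol for this assumed distribution.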

  • @haaey1197 6 years ago

    Thanks, Nice wallpaper btw

  • @darksideofthetube 4 years ago

    thanks man, really good explanation

  • @MattsgotaMac 7 years ago

    Great Video! Thank you!

  • @JuckReis 3 years ago

    Thanks!

  • @stephenm6309 7 years ago +10

    So you're saying Scrabble is basically compression

    • @ElPsyKongroo 7 years ago +1

      Stephen Miller actually decent analogy and joke, you would make a good teacher

  • @keghnfeem4154 8 years ago

    Wow, this is great, because I have been working with nibbles or a two-bit number
    system for a long time to test my compression algorithms.
    Nice Huffman encoding of DNA. I just realized that I get the best compression
    in the two-bit number system. Maybe that's why Mother Nature uses it?
    And it is good for testing chaos and randomness theories.

    • @macheads101 8 years ago +1

      I loved the two-bit example for its simplicity, and I chose the "lumpy" probabilities specifically to ensure that the Huffman coding was optimal.
      Hmm, it's funny, nature doesn't really utilize DNA efficiently. My understanding is that the four letters in DNA actually encode twenty amino acids (three letters of DNA => 1 amino acid) so the code is actually quite suboptimal (multiple 3-letter codes create the same amino acids; the code is degenerate).
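
As a rough illustration of the Huffman coding discussed in this thread, here is a minimal sketch that computes code lengths for four DNA letters. The probabilities are an assumption (powers of 1/2, picked because that is the case in which Huffman coding reaches the entropy exactly); the video's actual numbers are not given in the comments, and `huffman_code_lengths` is a hypothetical helper name.

```python
import heapq
from itertools import count

def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code over probs."""
    tiebreak = count()  # unique counter so tied weights never compare dicts
    heap = [(p, next(tiebreak), {sym: 0}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least likely subtrees; every symbol inside them
        # moves one level deeper, so its code grows by one bit.
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: n + 1 for s, n in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

# Assumed probabilities, for illustration only.
probs = {"G": 0.5, "C": 0.25, "A": 0.125, "T": 0.125}
lengths = huffman_code_lengths(probs)
average = sum(probs[s] * lengths[s] for s in probs)

print(lengths["G"], lengths["C"], lengths["A"], lengths["T"])  # 1 2 3 3
print(average)  # 1.75
```

For these assumed probabilities the average of 1.75 bits per letter equals the entropy, which is what "optimal" means here; with probabilities that are not powers of 1/2, Huffman coding generally lands slightly above the entropy.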

  • @IllumTheMessage 8 years ago

    good stuff

  • @obasaoluwaseun5415 6 years ago

    Nice and explanatory video, but I have a question. Is it possible to compress 15 kB of data to 5 bits?

    • @Mark-kt5mh 6 months ago

      The minimum amount of data (in total bits) required to represent an arbitrary piece of information depends entirely on the entropy (or randomness) of the source information. In your question, the minimum entropy in a 15 kB source of information would be 1 or 0 repeated 120,000 times. At least 19 bits are required to represent those two example pieces of information.
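
A simple counting argument complements the reply above. This is a minimal sketch, assuming 15 kB means 15,000 bytes (which matches the 120,000-bit figure in the reply); `distinct_outputs` is a hypothetical helper name.

```python
def distinct_outputs(n_bits):
    """Number of different values an n-bit output can take."""
    return 2 ** n_bits

input_bits = 15_000 * 8  # assumption: 15 kB = 15,000 bytes

print(input_bits)           # 120000
print(distinct_outputs(5))  # 32
```

A lossless decompressor maps each compressed output back to exactly one input, so a 5-bit format can represent at most 32 different files. Squeezing one particular 15 kB file into 5 bits is only possible if the compressor and decompressor have agreed in advance that it is one of those 32.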

  • @MrJustletmejoin 4 years ago +1

    Can someone explain to me why you can't just make everything smaller? He said at 10:36 that you can't just make everything smaller. Is there some rule/law that won't let you compress everything as small as possible?

    • @taheralipatrawala7300 1 year ago

      If you make everything smaller, chances are that some of the longer encodings end up being represented by the same bits as the shorter ones. If you think about it carefully, it makes sense.
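
One way to make the "rule" asked about in this thread precise is the Kraft inequality: a binary prefix-free code with codeword lengths l_1, ..., l_n exists only if the sum of 2^(-l_i) is at most 1. The sketch below checks that condition; the length lists are illustrative assumptions and the function names are hypothetical.

```python
from fractions import Fraction

def kraft_sum(code_lengths):
    """Sum of 2^(-length) over all codewords, computed exactly."""
    return sum(Fraction(1, 2 ** n) for n in code_lengths)

def can_be_prefix_free(code_lengths):
    """True if some binary prefix-free code has these codeword lengths."""
    return kraft_sum(code_lengths) <= 1

print(can_be_prefix_free([2, 2, 2, 2]))  # True  (fixed 2-bit code)
print(can_be_prefix_free([1, 2, 3, 3]))  # True  (one short code, two long ones)
print(can_be_prefix_free([1, 1, 2, 2]))  # False (too many short codes)
```

Shortening one codeword uses up part of a fixed budget, so some other codeword has to get longer. That is why a compressor gives short codes only to the most common symbols.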

  • @altobyy4855 8 years ago +4

    you are awesome..

  • @pedro.britto 7 years ago +1

    Awesome Explanation! Thanks a lot! Can you recommend a book or article on compression algorithms?

    • @macheads101 7 years ago +2

      I don't know about a book, but a few wikipedia pages will probably help you out. I'd look up Huffman coding and information entropy to get started.

    • @pedro.britto 7 years ago

      Will do. Thanks again!

  • @tompov227 8 years ago +1

    Incredible ^_^

  • @aliemad322 4 years ago

    thnx

  • @prakh1250 2 years ago

    this is fucking awesome.

  • @misnad 8 months ago

    Thanks for the video. : )
    I wonder why you stopped making videos. Hope you are doing good.

  • @moritzbraun5034 6 years ago

    the towel box in the background XD

  • @Sabiancym 7 years ago +6

    Wow, it's the only Apple user in the world who isn't completely computer illiterate.

    • @MC4K 7 years ago

      Sabiancym lol I want to say it's not true. But you're right!😂

    • @ElPsyKongroo 7 years ago

      Sabiancym surprisingly the computational chemistry research team at my college all used macs. They didn't use gui either, but with Linux

  • @jitsusingh1601 8 years ago

    Are you in college nowadays? I've seen all your terminal videos with the girly voice.
    I was discouraged from starting to learn computers at 22 because I thought I was too old to master it now. But I started anyway from your terminal series because I was always a bit curious about that; now I'm 23 and I have been making constant headway in learning this stuff.

    • @macheads101 8 years ago +1

      I love hearing stories like yours! I am currently 19 and in college. When I made those terminal videos, I was probably 11 or 12. Good times.

  • @alibarsgultekin8748 7 years ago

    Steve?

  • @joshuafishman9002 8 years ago

    So like are you a programmer or something?

    • @macheads101 8 years ago +1

      haha i guess you wouldn't know from this video

  • @WayneModz 6 years ago

    I get what you're saying, but how would we then know when 110 is in fact A and not CG? Clearly we have to maintain a bit length (8-bit, 16-bit, etc.): 000 010 110 111. Maybe I'm just being extra technical, but I'm sure this is how you expected us to interpret it.

    • @rsmith155 5 years ago

      Wayne Modz I'm interested in the answer to this also. It would be a much bigger issue with the whole alphabet too

    • @ferrari884 5 years ago +2

      I’m confused where you are seeing CG in the 110 sequence. In his example, CG should be coded as 100.
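
The ambiguity question in this thread comes down to the prefix property: if no codeword is the start of another codeword, the decoder can read bits left to right without any fixed bit length. The sketch below assumes the table G=0, C=10, A=110, T=111, which is consistent with the two facts quoted in this thread (110 is A, CG is 100) but may not match the video exactly; `CODE`, `encode`, and `decode` are hypothetical names.

```python
# Assumed code table, for illustration only; no codeword is a prefix of another.
CODE = {"G": "0", "C": "10", "A": "110", "T": "111"}
DECODE = {bits: sym for sym, bits in CODE.items()}

def encode(seq):
    """Concatenate the codeword of each letter."""
    return "".join(CODE[s] for s in seq)

def decode(bits):
    """Read bits left to right, emitting a letter as soon as the
    buffered bits match a codeword."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in DECODE:
            out.append(DECODE[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(out)

print(encode("CG"))    # 100
print(decode("110"))   # A
print(decode("100"))   # CG
print(decode(encode("GATTACA")))  # GATTACA
```

Under this assumed table, 110 can only be read as A: after reading "1" and "11" the decoder has not yet matched any codeword, so it never stops early and mistakes the bits for something else.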

  • @Yauton 4 years ago

    I still don't understand how this applies to the binary contents of a computer file.... I will just live with this idea for the next few years and try to crack it myself hahahah

  • @olee_7277 5 years ago +9

    I know this is sexist but kleenex box always look suspicious in a dude's room lol

  • @slaviboy 6 years ago +1

    Steve Jobs? :D

  • @lewys9204 3 years ago

    my laptop has 108gb spare out of 360gb... after compressing my data I've now got 190gb 🤯🤯 spare but also buying another SSD 250gb - shitty laptops hey lol

  • @ellamarie2258 7 years ago

    This is complicated, not for people without experience.

    • @ElPsyKongroo 7 years ago

      Ella Marie lol what how?

    • @minglee9288 5 years ago +1

      not sure if an inexperienced person would/should look up stuff like this in the first place

    • @user-pl7tf9gv8e 3 years ago

      Just stop cursing if you wanna learn

  • @christopherdalyii381 7 years ago

    Why is this titled "How File Compression Works" when you're speaking of DNA->Binary encoding?
    Maybe you should re-title it to "How DNA Is Used to Encode Binary Data."
    Because this doesn't seem to have anything to do with file compression (like WinZip, WinRAR, or 7-Zip).

    • @pedro.britto 7 years ago +2

      At the lowest level, data is stored using only 1s and 0s. Creating libraries and substituting one value for a certain arrangement of bits, with the most common ones using fewer bits than the less common ones, can, and most likely will, reduce the number of bits. This is how file compression works.

    • @christopherdalyii381 7 years ago

      Pedro Britto Right.
      But why does the video include DNA, which is technically a different subject from a different field?

    • @msctbeats 7 years ago +1

      Because he's using an example so that you can understand a practical application of compression. The field of file compression is not limited to WinZip and WinRAR universal compression freeware.
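
To connect the DNA example to everyday file compression, here is a minimal sketch using Python's standard `zlib` module. It implements DEFLATE, the method commonly used inside ZIP archives, which combines LZ77 matching of repeated substrings with Huffman coding. The input bytes are made up for illustration.

```python
import zlib

# Made-up, highly repetitive input so the effect is easy to see.
data = b"GATTACA" * 1000

packed = zlib.compress(data)        # lossless compression
restored = zlib.decompress(packed)  # exact reconstruction

print(len(data), len(packed))  # original size vs. compressed size
print(restored == data)        # True
```

The letters here are stored as ordinary bytes, just like the contents of any file. The compressor does not know or care that they spell DNA; it only exploits the fact that some byte patterns occur far more often than others.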