Thanks a lot for taking the time to make this very informative (and entertaining) video! When I count I personally find it useful to count like this (either in my head or aloud): 1, 2, 3, 4, 5, 6, 7, 8, 9, *10,* 1, 2, 3, 4, 5, 6, 7, 8, 9, *20,* 1, 2, 3, 4, 5, 6, 7, 8, 9, *30* ... Maybe I'm just weird, but I find it easier to pronounce and keep track of my counting when I do it like that :)
amlivinginhell our data is = THE ESSENTIAL FEATURE 8 represents the number of bits to symbolize ASCII codes. ASCII codes are 8 bits in the memory for example "A" is ASCII = "65" it equals "0110 0101" = total 8 bits. For every letters you need 21*8 bits to save data. Open notepad and try Alt+(NumPad 6 + 5) you will see A in the text area. If you want to code all alphabet and code your data follow this--> You have 26 letters + space = 27 letters in English alphabet. You need 5 bits to represent 27 letters (2^5=32>27) minimum 5 bits required. If you want to code only the letters (the worst case) in your data (THE ESSENTIAL FEATURE), you have T-H-E-S-N-I-A-L-F-U-R- space = total unique 12 letters you need at least 4 bits ( 2^4=16) .
Nice. But how do you decode this thing if you don't know how does the original tree look like? How do you go from a string of numbers like 001100101 to real letters, asumming that the string is all you have? And... if you need to include the tree at the beginnning in order to make the decoding possible - do you still save some space?
Missing a '0' in the spot where you messed up the 'A' currently reads 1010 = I and then '000' which isn't anything should be '0000'. Thanks tho! Great video! so 67 is the total length
One question I have is - how do you transmit the Trie? It seems necessary to have that for decoding, and since the Trie only applies to the message being sent, shouldn't that count in the overall representation. The coded bits alone are not enough to transmit the code and have it understood by the receiver.
+Aaron Harris Have a try at downloading the free pdf of Sedgewick's Algorithms in C. It has a good section on Huffman and some sample code with a very efficient representation of the trie.
+Hunter Johnson Same difficulty here! In order to be decoded, the encoded string would have to inform the other PC how it was encoded, right? At least, how many times a character appears in the tree? Thank you so much again for the video and for the explanation, it really helped!!
The overhead of the tree is compensated for by encoding larger data. The tree will not be larger (or not much) with larger data set encoded. But you are right, for small examples, the encoded data will be larger than the original. Which is why you don't zip (or gzip) small files...
+Hunter Johnson Great explanation, thank you so much!! Now, what if I want to perform a compression in a string and send it to another terminal? In order to be decoded, this string would have to inform the other PC how it was encoded, right? At least, how many times a character appears in the tree? Could you help me to figure it out? Thank you so much again!!
+Ulisses Piassa Thanks for liking the video! The receiver needs a copy of the data structure we build to do the encoding, namely the trie (en.wikipedia.org/wiki/Trie). You could use one of several different trie representations.
Ulisses I developed a software based on Huffman Coding. Here's the C code: github.com/RaisinTen/HuffCrypt If you find it interesting, give it a star! Hope this helps! :)
I'm confused about the decoding? When reconstructing the message from the tree (or trie) and you read the Huffman code, how do you know to read 001 rather than 0011 or 00110 to retrieve that character for each character?
Great explanation buddy! It wud b better if u cud elaborate the decoding section too comfortably but anyway great work! thanks a ton for teachin huffman coding.. :)
Firstly, there is an error in this video, the right code is 001,1001,01,1111,01,0001,0001,01,1100,001,1010,0000,1011,1111,1000,01,0000,001,1110,1101,01.Your code is "THE ESSTISE NTATURE"Another question is that you should write the trie in code.
all that is simple and easy, but what if you transmit this code, but the system you transmit it does not know how the tree was made? it would be just a pile of ... numbers. Do people embed the information of the tree into the file/array? or is this coding can only be program/codec/library specific which has all the trees and it decodes it as their own default values?
You send it with the encoded data. So the total data sent will be larger then the original for smal files. Which is consistent with gzip and zip, if you try. But for larger data encoded, the overhead of the tree will not grow, and thus the data will still be compressed. So, don't compress small files.
But is this really correct ? I see sometimes you put higher numbers on left sometimes on the right traversal. In my opinion this is incorrect to huffman algorithm.
Hi Sir one question how you decide that A and S to put at in left side not in right side of space please let me know this requires for my exam on 29 nov thanks in advance
what file type could i save this as if it is saved in a text document each 1 or 0 is encoded into ascii xD I tried saving it as a .bin but it still took a byte per digit?
The file system are encoded in bytes, so you have to shift the bits together so you can write them in groups of 8 bits = 1 byte. As the encoded data may not end up evenly divided by 8, you have to tell how many bits you are using. And of course, you need to send over the tree (or a table of the code for each encoded byte).
Hunter, couldn't you have done that before you started the video? That was the most boring thing I've ever seen in my entire life. Haha. That was pretty interesting. Thanks!
Hi Mateusz Picheta ... You need to pair r and w first! Then combine this tree with w, assuming that there are no more elements of frequency 1 to consider. The tree must be binary...no node can have 3 children.
Thank you for taking the time to make this video. Really helped with my algorithms class!
Thanks a lot for taking the time to make this very informative (and entertaining) video! When I count I personally find it useful to count like this (either in my head or aloud): 1, 2, 3, 4, 5, 6, 7, 8, 9, *10,* 1, 2, 3, 4, 5, 6, 7, 8, 9, *20,* 1, 2, 3, 4, 5, 6, 7, 8, 9, *30* ...
Maybe I'm just weird, but I find it easier to pronounce and keep track of my counting when I do it like that :)
Totally new to Huffman code. Pointing out that the highest frequency letters have the shortest code was a very nice touch - I would have missed it.
Thanks a lot for a very clear and easy to digest explanation of this topic! :)
"You can fast forward this video if you want" ... don't worry, you've been talking at X2 speed the whole time lol
"Huffman is whipping it" :D wonderful explanation btw!! thanx
Well done. This is the best Huffman video.
The way you explained!!! xDD lol I'm in love!!!
RIP Headphone users! Your "S" Killed my ears... But thanks for the video!
+Elia Oggian I didn't notice before reading your comment. Thank you
Very true. I had to readjust my equalizer and went for all bass up plus all treble down. :P Thanks for the video BTW.
Thanks for pointing out, I hate it.
There are desser plugins out there :)
Good job thank you! But where is the "N" of the "ESSENTIAL" while coding at time 13:40. Total length should be 70 not 66.
Abdullah Erhan Akkaya oh, good call! I missed it!
amlivinginhell our data is = THE ESSENTIAL FEATURE
8 represents the number of bits to symbolize ASCII codes. ASCII codes are 8 bits in the memory for example "A" is ASCII = "65" it equals "0110 0101" = total 8 bits. For every letters you need 21*8 bits to save data. Open notepad and try Alt+(NumPad 6 + 5) you will see A in the text area.
If you want to code all alphabet and code your data follow this--> You have 26 letters + space = 27 letters in English alphabet. You need 5 bits to represent 27 letters (2^5=32>27) minimum 5 bits required.
If you want to code only the letters (the worst case) in your data (THE ESSENTIAL FEATURE), you have
T-H-E-S-N-I-A-L-F-U-R- space = total unique 12 letters you need at least 4 bits ( 2^4=16) .
How much of a spike do you see during exam season in views? xD
very nice video! the prefix free thing feels like magic indeed.
thanx hunter great job...really AWSOME video...
+pallavi gupta hi
Thank you so much i've seen alot of Videos but your was the most obvious
Really a good job on explaining! Thanks a lot!
thanks for the video, really helped me out. Your voice is very nice
Thank you Hunter. You explain it very well.
Thanks a lot. This was life saver for my exam :P
great explanation, thanks a lot!! It really helps with my algorithm studies :)
Great video for my Colorado Technical University "Computer Algorithms" course. But didn't you miss the 2nd occurence of the apace in your tree?
4:20 Shouldn't the A-4-S be on the right side? I mean 2
Excellent explanation!
Excellent video! Thank you!
Nice. But how do you decode this thing if you don't know how does the original tree look like? How do you go from a string of numbers like 001100101 to real letters, asumming that the string is all you have? And... if you need to include the tree at the beginnning in order to make the decoding possible - do you still save some space?
Thank you Hunter)the best explanation.
greatest explanation ever!
Good Job. Thanks for the explanation
Thank you so much! This helped a lot!
Great video!! You made it easy to understand, even though this Huffman being the most boring thing I've ever seen. hehe Thank you very much.
Nice. Well done, mate.
Pretty cool stuff, thanks !
Missing a '0' in the spot where you messed up the 'A' currently reads 1010 = I and then '000' which isn't anything should be '0000'. Thanks tho! Great video! so 67 is the total length
One question I have is - how do you transmit the Trie? It seems necessary to have that for decoding, and since the Trie only applies to the message being sent, shouldn't that count in the overall representation. The coded bits alone are not enough to transmit the code and have it understood by the receiver.
+Aaron Harris Have a try at downloading the free pdf of Sedgewick's Algorithms in C. It has a good section on Huffman and some sample code with a very efficient representation of the trie.
+Hunter Johnson Same difficulty here! In order to be decoded, the encoded string would have to inform the other PC how it was encoded, right? At least, how many times a character appears in the tree? Thank you so much again for the video and for the explanation, it really helped!!
The overhead of the tree is compensated for by encoding larger data. The tree will not be larger (or not much) with larger data set encoded.
But you are right, for small examples, the encoded data will be larger than the original. Which is why you don't zip (or gzip) small files...
beautiful
I really appreciate this.. thank you very much
+Hunter Johnson Great explanation, thank you so much!! Now, what if I want to perform a compression in a string and send it to another terminal? In order to be decoded, this string would have to inform the other PC how it was encoded, right? At least, how many times a character appears in the tree? Could you help me to figure it out? Thank you so much again!!
+Ulisses Piassa
Thanks for liking the video! The receiver needs a copy of the data structure we build to do the encoding, namely the trie (en.wikipedia.org/wiki/Trie). You could use one of several different trie representations.
+Hunter Johnson Great, I'm gonna take a look in it. Thanks again!
Ulisses I developed a software based on Huffman Coding. Here's the C code: github.com/RaisinTen/HuffCrypt
If you find it interesting, give it a star! Hope this helps! :)
nice clear example, cheers
you can share coding about huffman
at 16:20 were did you get the times 8
I'm confused about the decoding? When reconstructing the message from the tree (or trie) and you read the Huffman code, how do you know to read 001 rather than 0011 or 00110 to retrieve that character for each character?
it would be really nice if you just tried to explain what you were trying to do in advance dude
Why in this case do you not end up with the highest frequency character being a single bit to encode? I seem to often get that when I do encoding...
Great video!
Great explanation buddy! It wud b better if u cud elaborate the decoding section too comfortably but anyway great work! thanks a ton for teachin huffman coding.. :)
Firstly, there is an error in this video, the right code is 001,1001,01,1111,01,0001,0001,01,1100,001,1010,0000,1011,1111,1000,01,0000,001,1110,1101,01.Your code is "THE ESSTISE NTATURE"Another question is that you should write the trie in code.
i like how you count. i do the same
all that is simple and easy, but what if you transmit this code, but the system you transmit it does not know how the tree was made? it would be just a pile of ... numbers. Do people embed the information of the tree into the file/array? or is this coding can only be program/codec/library specific which has all the trees and it decodes it as their own default values?
Besides, thanks for your share.
AWESOME MAN THANK YOU SO MUCH
господи почему ты объясняешь так хорошо капец лол
Very good! Thanks a lot!
mast bhai
well explained. thank you!
Thanks for the video!
Nice work man , keep it up (Y)
Thanks for the video.
The total should be 71 bits, i think you miss 4 bits on N letter and 1 zero bit on first A.
how does decoder know of the tree (trie)?
This is not encryption, this is coding.Even when u ASCII code it, to decode you need to have the ASCII table with you :)
You send it with the encoded data. So the total data sent will be larger then the original for smal files. Which is consistent with gzip and zip, if you try.
But for larger data encoded, the overhead of the tree will not grow, and thus the data will still be compressed.
So, don't compress small files.
might sound like a dumb question but how do you save a tree in a file so that when you read it the next time, the tree is loaded properly?
But is this really correct ? I see sometimes you put higher numbers on left sometimes on the right traversal. In my opinion this is incorrect to huffman algorithm.
Hi Sir one question how you decide that A and S to put at in left side not in right side of space please let me know this requires for my exam on 29 nov thanks in advance
+Dipak Borse you can put it either way ;)
what file type could i save this as if it is saved in a text document each 1 or 0 is encoded into ascii xD
I tried saving it as a .bin but it still took a byte per digit?
The file system are encoded in bytes, so you have to shift the bits together so you can write them in groups of 8 bits = 1 byte.
As the encoded data may not end up evenly divided by 8, you have to tell how many bits you are using.
And of course, you need to send over the tree (or a table of the code for each encoded byte).
Hunter, couldn't you have done that before you started the video? That was the most boring thing I've ever seen in my entire life.
Haha. That was pretty interesting. Thanks!
Thanks Buddy!
awesome
nice one man!
thx
There's a mistake in the video... when I press 1 while it's playing, you say "three". (PS. Thanks for the vid)
i got it now thank you!
Nice!
Big man-sized 5
Hi Mateusz Picheta ... You need to pair r and w first! Then combine this tree with w, assuming that there are no more elements of frequency 1 to consider. The tree must be binary...no node can have 3 children.
Hunter Johnson please can you help me
.?
If you have Facebook
Nice!! Thank you!.
thank you man. I got it.
Thanks a lot !!
thank you
Thanks alot :)
Ok.. why does it work?
scratch that, you can kinda see it
tanks sir
Man it was helpful but i gota say that you are funny though :D
was here because of my last minute study for tomorrow paper. wish me luck everybody.
THX!
Naiiicee!!
Excellent! P.S. You are far from stupid. :)
Pls use language indo.
Very boring lol haha
Thanks!