This was exactly the baby step I needed to get me on my way with entropy. Far too many people try to explain it by going straight to the equation. There's no intuition in that. Brilliant explanation. I finally understand it.
@Sean Walsh I feel the same way.
an amazing teacher is an invaluable thing
how does one make something so complicated into something so intuitive that others can finally see the picture. your explanation itself is an amazing feat.
With great knowledge comes low entropy
Hahaaa, love it!!!
lol
@@SerranoAcademy Repetition (redundancy) is dual to variation -- music.
Certainty is dual to uncertainty -- the Heisenberg certainty/uncertainty principle.
Syntropy (prediction) is dual to increasing entropy -- the 4th law of thermodynamics.
Randomness (entropy) is dual to order (predictability) -- "Always two there are" -- Yoda.
And low entropy is easier to rig
You win
Luis, you are such an incredibly gifted teacher and so meticulous in your explanations. Thank you for your hard work.
Good video! Minor correction of calculations: at 5:50, the probability of getting the same configuration is 0.25. This is because there are only 4 possible configurations of the balls (there is only one blue ball, and only four slots, so only 4 places the blue ball can be). This can also be calculated by selecting the red balls first and multiplying: 0.75 * 0.66667 * 0.5 = 0.25.
Similarly, at 6:58, the probability is 1/6 because there are 6 possible configurations. We can calculate the probability by multiplying (2/4) * (1/3) = (2/12) = (1/6) ~= 0.166667.
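For anyone who wants to verify these corrections, here is a quick sketch (my own, not from the video) that counts the configurations with binomial coefficients and checks them against the sequential-draw products:

```python
from math import comb, isclose

# Case 1: three red balls and one blue ball in four slots.
configs_1 = comb(4, 1)          # 4 places the blue ball can go
p_same_1 = 1 / configs_1        # 0.25
assert isclose(p_same_1, (3/4) * (2/3) * (1/2))  # drawing the reds first

# Case 2: two red and two blue balls in four slots.
configs_2 = comb(4, 2)          # 6 distinct configurations
p_same_2 = 1 / configs_2        # 1/6, about 0.1667
assert isclose(p_same_2, (2/4) * (1/3))

print(p_same_1, p_same_2)
```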
I have been scared of delving into entropy in detail for so long because the first time I studied it, it wasn’t a good experience. All I want to say is THANK YOU!!!!!! I should have been supplementing the udacity ND lesson videos with these since the beginning.
Excellent explanation, very clear and concise! I have always pondered the significance of the log in cross-entropy loss function. The explanation (particularly: "products are small and volatile, sums are good") completely clears this up.
Excellent! Great explanation. Enjoyable video (except YT’s endless, annoying ads). Thank you for composing and posting.
Great work. Compared to my textbook you explained it 100 times better, Thank you.
Thank you so much. This was the only video on YouTube that clarified all my doubts regarding the topic of entropy.
I wish I had this lecture during my college examinations... still, it's nice to finally understand the intuition behind the formulas I already knew.
Teaching should be like this, from practice to theory - not the other way around!
2nd time I found this video and loved it both times. Much better description than the prof at the uni I am at!!!
I'm studying Decision Tree (Machine Learning Algorithm) and it uses Entropy to efficiently build the tree. I finally understand the details. Thank you!!
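Since decision trees come up here: below is a minimal sketch of how a tree uses entropy to score a split. The function names `entropy` and `information_gain` are mine for illustration, not from any particular library or from the video:

```python
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    return -sum(p * log2(p) for p in probs)

def information_gain(parent, left, right):
    """Entropy reduction from splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# A perfectly separating split removes all uncertainty: gain = 1 bit.
parent = ['a', 'a', 'b', 'b']
print(information_gain(parent, ['a', 'a'], ['b', 'b']))  # 1.0
```

A tree-building algorithm simply tries candidate splits and keeps the one with the highest gain.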
Great clarity. Have never got this idea about the Shannon Entropy. Thank you. Great work!
I needed this video to get me up to speed on entropy. Great job Luis!
Thanks for the relationship between knowledge and entropy, that was very helpful. Your explanation of statistics is also good! Though, I am only half way through the video at this point, I will finish it!
Thanks
Luis, you have a great way of explaining. At times, I like your videos more than even some highly rated professors.
This was one of the best explanations on entropy. Thanks
At 13:44 it's not 0.000488 but 0.00006103515! There is a computation error. The entropy is correct: 1.75.
Thank you for the correction! Yes, you're right.
Can you make a part 2 with the full proof, not just the intuition behind the formula? Your explanation's amazing & would love to see a part 2.
Confession: I was a math kiddy; I know to use it but I often missed the deeper meaning and intuition. Your videos are turning me into a math hacker.
Wow! Awesome, so many books and encyclopedias and biographies of Shannon to understand what you just clearly explained! Thank you!
Easy and Great explanation! Thank you very much, Luis
What a great explanation! I wish I had a teacher like you, Luis; everything would be way easier! Thanks a lot.
The best explanation about Shannon entropy that I have ever heard. Thanks!
I find sum(p*log(p^-1)) more intuitive.
Inverse p (i.e. 1/P) is the ratio of total samples to this sample. If you ask perfect questions you'll ask log(1/p) questions. Entropy is then the sum of these values, each multiplied by the probability of each, which is how much it contributes to the total entropy.
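This intuition is easy to check numerically. The sketch below (my own) shows that the "questions" form sum(p * log2(1/p)) and the textbook form -sum(p * log2(p)) agree:

```python
from math import log2, isclose

def entropy_neg_sum(probs):
    # The textbook form: -sum(p * log2(p))
    return -sum(p * log2(p) for p in probs)

def entropy_questions(probs):
    # The "questions" form: sum(p * log2(1/p)), where log2(1/p) is the
    # number of yes/no questions an outcome of probability p deserves.
    return sum(p * log2(1 / p) for p in probs)

probs = [1/2, 1/4, 1/8, 1/8]
assert isclose(entropy_neg_sum(probs), entropy_questions(probs))
print(entropy_questions(probs))  # 1.75 bits
```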
This video is helping to keep me floating in my Data Science course; thank you so much for your time!
Thanks a lot Luis, just had an exam about this Wednesday and your video helped me a lot to understand the whole concept.
Great video! Now I understand what Claude Shannon discovered and how useful and essential maths are in Computer Science.
Thank you. Very clear, sir. I have been struggling to wrap my head around this and you just made it easy. Thank you.
I love the explanation of the negative sign in the entropy equation, which many people wonder about.
You are the best. Such a great explanation. Better than lots of textbooks.
Hi Luis, nice to meet you. I am reading the Deep Learning book by Ian Goodfellow, and I needed to watch your video to understand chapter 3.13, information theory. Thanks very much.
Very good explanation - hope to hear more of your videos
Your explanation is perfect. Even though I am not good at listening to English, I can understand everything :)
Hi Luis,
stupendous, spectacular, excellent!
That moment when you realize you don't need to search for another video because you got it from the first time.
What I'm trying to say is Thank You!
Wow, thank you, man.
I needed that information!
There are many ways to teach the same stuff!
That number of question stuff is great! It's good to have more than one way to measure something!
Wow... I wish more people could teach like you. This is so insightful.
It's very helpful for me to introduce the concept of entropy to students. Thank you for your clear presentation of entropy.
Thank you for explaining the entropy concept first, then reaching the final equation step by step... it is a really good and simple approach.
Excellent presentation for an otherwise complex concept.
It's always hard to understand the equations but you made it so simple :-)
Luis, you really are a great communicator. Looking forward to your other explanations.
Great video! I really liked the intuitive approach. My professor's was waaaay messier.
I watched it straight through. Very good.
Great explanation, greetings from Brazil!
Best explanation I found so far
So, after watching the video, the entropy for giving you a thumbs up and subscribing to your channel was 0 - i.e. great explanation!
Great explanation. But I think what's still missing is an explanation of why we use log base 2... didn't quite get that.
In the last minute of the video, he explains that using Log base 2 corresponds to the level of a decision tree, which is the number of questions you'd have to ask to determine a value.
You made a mistake/approximation by saying the entropy is equal to the number of questions needed to find out which letter it is. If I do a scenario with only three letters, all equiprobable, the entropy is about 1.59 but the average number of questions needed to find the correct letter is about 1.67.
Your presentation gives a great way to gain an intuitive feeling for entropy, but maybe you should include a small disclaimer on this point.
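The gap the commenter describes is easy to reproduce. A small sketch (my own, with the optimal yes/no strategy for three letters hard-coded) shows entropy is a lower bound on the average question count, not always an exact match:

```python
from math import log2

# Three equiprobable letters.
probs = [1/3, 1/3, 1/3]
entropy = -sum(p * log2(p) for p in probs)

# Best yes/no strategy: ask "is it A?" (resolves 1/3 of cases in 1 question);
# otherwise one more question distinguishes B from C (2 questions, 2/3 of cases).
avg_questions = (1/3) * 1 + (2/3) * 2

print(round(entropy, 2), round(avg_questions, 2))  # 1.58 vs 1.67
```

Shannon's source coding theorem says the gap shrinks toward zero if you encode long blocks of letters at once rather than one letter at a time.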
What a great explanation! And so i subscribed😊
Really, you have given us outstanding information.
Brilliant lecture! I learned so much from this explanation. Thanks from Brazil :)
Superb step by step explanation
This is the best explanation I have come across in a long time. Can you please explain how we can use entropy to find the uncertainty of a naive Bayes classifier with, let's say, 4 feature variables and a binomial class variable?
Very nice video. Insightful, intuitive and very well explained. Thank you!
this is a really great explanation, thanks so much for sharing mate!
Best instructor there is! Thanks
Actually, there is something wrong here. Entropy and information in information theory represent the same thing: how much information we get after decoding the random message. So in the case of the balls in the box, if all are the same color, we get no information after decoding the message, as its probability of being red = 1, hence low entropy and low information.
Luis, Thank you so much for this brilliant elucidation of information theory & entropy. Merely as an avocation, I have been toying around with a pet evolutionary theory about belief systems and societies. In order to test it - if that is even possible - I felt I needed to develop some sort of computer program as a model. Since I have very little programming experience and only mediocre math skills, I have been teaching myself both (with a lot of help from the web). It was purely by accident that I stumbled upon Claude Shannon and information theory, and I immediately became fascinated with the topic, and have a hunch that it may somehow be relevant to my own research. Regardless, I am now interested in it for its own sake. I had an ephemeral understanding of how all the facets (probability, logs, choices, etc.) were all related mathematically, but it wasn't until after watching your video that I believe I fully grok the concept. At one point early on, I found myself shouting, "if he brings up yes/no questions, I know I understand this!" And then you did. It was such a wonderful moment for someone who finds math so challenging, and it is greatly appreciated! I shall check out your other videos later. You're a very good teacher!
For your work, I would look into some of the work by Loet Leydesdorff.
@@Faustus_de_Reiz Thank you! I shall.
Thank you for the very good video. Easiest to understand so far.
Lovely explanation...Superb
Thank you so much for such an easy explanation... respect from India...
best, as always ❤️ thank you Luis❤️
Thank you very much for this beautiful and clear explanation!
That was highly intuitive, thank you, sir, I appreciate the effort behind this.
Thanks... got the intuition behind entropy.
Great vid, I definitely learned something. It could be improved by shortening it a bit by not going so deeply into the details of the "dummy" examples. Like, for the example with four red balls at 6:25, just say the probability of guessing all four balls correctly is 100%. No one needs an explanation of why that is. Another example is the tree of questions. Saying "now, only one more question is needed" at 16:23 is enough; we don't need to go into details about those last questions and their outcomes.
But the approach was very nice, I liked it. Just some tips for improvement for future vids.
Wow, another great and insightful presentation. Really helps to build intuition.
Superb explanation. I like your teaching style. Thank you very much :-)
very lucid explanation - excellent, intuitive build-up to Shannon's theorem from scratch
Very clever explanation of mighty ENTROPY.
Easy and excellent explanation. Please do one for loss and cost functions as well (convexity).
Lovely job Luis! Very very good!
Hi Serrano, do you have a complete playlist on information theory?
I think that the Huffman compression you use at the end of the video gets near the entropy value but is not exactly the same.
Your explanation was crystal clear. If possible, share some real-world examples of data mining where entropy and the Gini index are used.
Thanks for this perfect explanation 👏👏👏👍
At 13:34 the product does not equal 0.000488. It is approximately 0.000061035. You are missing the last 1/8 factor.
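A quick check of this correction. The distribution {1/2, 1/4, 1/8, 1/8} and the sequence AAAABBCD below are my assumptions, chosen because they reproduce both the 1.75-bit entropy and the corrected 2^-14 product:

```python
from math import log2, isclose

# Hypothetical reconstruction: letter frequencies A:4, B:2, C:1, D:1
# out of 8, giving probabilities 1/2, 1/4, 1/8, 1/8.
probs = {'A': 1/2, 'B': 1/4, 'C': 1/8, 'D': 1/8}
sequence = "AAAABBCD"

# Probability of the whole sequence: product of per-letter probabilities.
product = 1.0
for letter in sequence:
    product *= probs[letter]
print(product)   # 2**-14, about 0.000061035; dropping one 1/8 gives 2**-11 = 0.000488

# The per-letter entropy is 1.75 bits either way.
entropy = -sum(p * log2(p) for p in probs.values())
assert isclose(entropy, 1.75)
```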
Hi. Thanks a million times for simplifying a very complicated topic. Kindly find time and post a simplified tutorial on MCMC (Markov chain Monte Carlo). I am overwhelmed by your unique communication skills. God bless you.
Help us smash Markov chain Monte Carlo
You killed it. Great video.
Great video!! Thank you. It would be great to add some explanation of information gain (as used, for example, in feature selection).
552 secs: sequencing and entropy; milk that is perfectly "random" in the coffee vs. separated milk and coffee. Remember, the average number is the hardest to get due to movement or variance.
So the average person is the hardest thing to be.
I have learned so much from your teaching. Thank you.
Syntropy is dual to increasing entropy -- The 4th law of thermodynamics!
Thesis is dual to anti-thesis -- The time independent Hegelian dialectic.
Schrodinger's cat: Alive (thesis, being) is dual to not alive (anti-thesis, non being) -- Hegel's cat.
Syntropy is the process of optimizing your predictions to track targets or teleological physics.
Teleological physics (syntropy) is dual to non teleological physics (entropy, information).
You are a gem of a teacher, sir... you can teach any student, in any university across the world, with these skills💯💯
If Shannon were alive, he would enjoy seeing such a perfect explanation for his theory. Many thanks.
Wow. Amazing video.
Great Intuition Luis
Thank you so much. You are such a good teacher, really :D :D :D
For the first time in my life I understand the real meaning of entropy.
Although a good description of information-theoretic entropy, the phase-change analogy used at the beginning doesn't describe thermodynamic entropy very well. The reason ice melting constitutes an increase in entropy in this case is that it is in an open thermodynamic system with its environment. Heat has been transferred from the room (a closed system) to the ice. It is this irreversible movement of heat from the room that constitutes the increase in entropy, since the average temperature of the room and the ice has decreased and will continue decreasing until it reaches a stable equilibrium. Indeed, we would not arrive at gas if there were not sufficient potential energy in the room. While Boltzmann entropy is similar, the similarity lies in the fact that this transfer of heat, understood on a macro level, is a translation of the probability of this energy distribution on a micro level. Entropy is then a measure of the extent to which the particles are in a probable microstate.
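The "probable microstate" point can be illustrated with a toy two-compartment model (my own sketch, using Boltzmann's S = k_B ln W with W counted by a binomial coefficient):

```python
from math import comb, log

K_B = 1.380649e-23  # Boltzmann constant, J/K

def boltzmann_entropy(n_particles, n_left):
    """S = k_B ln W, where W counts microstates with n_left particles
    in the left half of a box (a toy two-compartment model)."""
    W = comb(n_particles, n_left)
    return K_B * log(W)

N = 100
# The evenly mixed macrostate has vastly more microstates than the
# fully separated one, hence higher entropy: systems drift toward it.
print(boltzmann_entropy(N, 50) > boltzmann_entropy(N, 0))  # True
```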
You are awesome! I hope you will make more videos like this.
Best video this week
Thank you so much! Thank you very much, Luis.
Best explanation of entropy. Thanks.
Mr. Luis Serrano III, great job on Neural Networks and Claude Shannon Entropy.
Excellent explanation! Thank you for sharing it.