First the Hopfield Network video and now this?! And only a month apart? I cannot thank you enough for the value that you've added to this platform
Thank you!
AI's not the only one hallucinating, can't believe the rate and quality at which Artem is publishing these videos, thank you so much!
i've never seen a well-worded explanation of temperature (as a casual ML enjoyer) but seeing the sigmoid morph with temperature and the relationship between stochastic and deterministic was such an awesome learning moment, thank you!
Hi Artem,
I just want to say that in 3 weeks I begin my graduate degree in neuroscience, and it was your channel that inspired me to begin this journey two years ago. Keep up the good work, and I look forward to the inspiration for years to come.
What classes are you taking right now?
@@joeybasile545 it’s summer holiday
Wow, congrats!!
Good Luck. And when things get tough, and they will… endeavor to persevere.
@@joeybasile545 I haven't started yet, but the program is non-traditional, where instead of registering for classes, there is a dedicated period for lectures every day that will cover all aspects of neuroscience, followed by lab rotations and research training. Subjects included are neuroanatomy, computational modeling, molecular biology and neurogenetics, vision, audition, and then for the labs, there are courses in EEG, microscopy, and cytochemistry, and this is about half of all the subjects covered. It truly is a comprehensive program, which upon completion will feed me right into a PhD track depending on what areas I have excelled in. My background is in math and computer science, so I am hoping to focus on the computational side of things, but who knows where I will eventually end up!
Quick change of name! For a while, I thought you knew of the Prize beforehand when I scrolled through my list of "saved for later". And watched it now, awesome work!
Watching your "AI & Machine Learning" playlist feels like binge watching my favorite show. Hope you continue them. You are an amazing teacher
With the 2024 Nobel Prize in physics awarded to the pioneering works introducing the Hopfield Network and Boltzmann Machines, your latest videos explaining exactly these topics were just timely enough to help us build a great understanding of such important tools :)
I never knew my background in Physics would make understanding this topic such a breeze. It is bizarre how in this world areas that look so different can be so close.
I get chills every time someone tries to explain the differences between data's states and generator's states. The former is surface level while the latter is highly abstracted. It says something about the many redundancies of the reality we live in and how there exists a general abstraction (math formalisms for example), or maybe that's just how we observe reality while being part of reality.
theory of constructed emotion / active inference ;)
@@judehammoud5959 I would argue a better conceptualization is to consider platonic ideals. Underlying distributions are like the platonic ideals of high dimensional manifolds, but we only ever have access to their projections onto our reality.
Always happy to watch your uploads. The Boltzmann distribution is something that I think is often misunderstood. So thank you for this video!
Wow, you explained really nicely what Boltzmann machines are and where they come from, and the animation is super pretty! Thank you Mr Kirsanov
This video was one of the bright spots of my day. It was well-crafted, reminded me of my work on ladder RBMs long, long ago, and got me thinking about how modern machines could build on these methods, and vice versa.
Okay, you taught me about the Boltzmann distribution better than my school physics teacher and it wasn't even the main point of what you were trying to do
Fantastic video. A small typo however at 08:41. There you denote -ln[p]/epsilon = T. It should be: -epsilon/ln[p] = T.
Thanks! Good catch!
Is this the relationship that relates temperature with differentials of energy and entropy?
You have a knack for explaining things in an understandable way without dumbing them down too much, thanks! Finally I know what the temperature setting actually does in a neural network, funny how analogous it is to physical temperature :)
This guy is awesome. I can't explain how much more intelligent I feel after watching your video. Thank you so much for taking out time to educate people like us.
This is incredible stuff once again. You have pretty much covered everything I’m interested in within neuroscience, with insight I never possessed. I researched brain criticality and modeling but am now in a boring day job. Glad we have people like you!
Thank you!!
These are the first neural networks I learned. Fascinating and deep connections to statistical mechanics/thermal physics. Best class I ever took in uni.
It is really great to visually explain such complex and valuable information in such an understandable way!
Great video! Amazing visualizations and clarity of explanation!
My sleep-deprived layperson brain is so engrossed in the high level concepts that I got hung up on 32 × 32 = 1024
The trouble with true creativity is intention. It's easy for humans to recognize things that we ourselves can produce and extrapolate patterns and impose experience and emotion on them but fundamentally if Randomness is the only thing driving the adaptation rather than transitive expression it is no more creative than a wind chime. You can think of it as the training data representing the tuning of each resonator and though we might FIND beauty in the emergent patterns, it is no more creative than its design and tuning, both requiring explicit human intervention. These models fed their own results very quickly deform into incomprehensible static
To add to this: the false equivalency and underemphasis of the human involvement in tuning is a large proponent of the demonstrably harmful supposition of replacing humans with machines, ignoring the value judgment that is imposed at every level of refinement. I implore you to refrain from such false equivalencies, as they are currently being used in attempts to undermine just about every creative field from engineering to writing to graphic design; these models would better be described as a sampling tool. These misconceptions have real world implications that are doing demonstrable societal harm. Take for example that even now I am fighting with the predictive text elements attempting to re-orchestrate my unorthodox sentence structure and subsequently undermining the intent of my writing, which such a machine would have no insight into. It cannot understand meaning outside of association and lacks any capability of truly understanding the emergent contradictions of language. So please stop describing these slot machines as creatives when their success is fundamentally built on confirmation bias.
😮
Hilariously, your writing style is awkward and unnecessarily formal rather than creative. One would think computers would be just fine with such a style.
@@conduit242 for real, it's hard enough being autistic without my computer trying to fuck with me. We're both on the outside here, you'd think we'd be working together 🤣 but it's not the formality, it's the variance that tends to fuck with predictive text. The tone was just to assert a sincere formality. Like, the larger issue of mechanization in creative fields is a serious problem, full stop; and I think the language we choose when we're talking about it is important
I love the wind chime analogy, that's a really cool conceptual analogy.
I disagree with the basic premise that creativity requires intention, for example I'd say evolution is very creative but has no underlying "intention".
It's also very well understood that human consciousness (and creativity) are fundamentally built on bias, indeed one can only learn if there is a bias to exploit. A very simple example of this is w.r.t identifying similarity of objects, we say a red cup is more similar to a blue cup than it is to a chair, however this requires a bias towards human every-day items.
What I mean is that if we had to put a number on the similarity of blue cup and red cup, we could say they are 90% similar, while a chair is only 10%. Soon you run into trouble with this method, because how do you quantify how different a chair and a cup are from the ocean? What about a crimson cup? What about bacteria? What about a black hole? What about a ceramic red cup?
What you see is that you need ever increasing detail, and your metric of similarity simply explodes or collapses to nonsense.
Humans exploit bias to be able to think, to be able to logically classify items and objects and produce creative solutions.
Fantastic. Absolutely phenomenal work here.
This is insane! I love your videos on this channel! I’m just waiting for your channel to exponentially boom to a million subscribers.
So glad for another upload! You have no idea how fast I clicked!
I recently stumbled upon your channel, and it's absolutely fascinating! It ignites my curiosity and explains things in a way that awakens my inner child. Keep up the fantastic work!
Wow, thank you so much!
I eventually grasped the notion of RBM. Thx
happy to see you in the US!! Hope you thrive here
Good time to rename this video to "The generative model that won nobel prize in physics 2024"
ahaha, good point!
beautiful explanation.
Wonderful animations and impeccable explanations. Thank you so much.
Great content, dark background and graphics !! Keep it coming
amazing, detailed and easy to understand. thank you so much
22:42 it seems like the contrastive Hebbian is about rewarding true positives while also punishing false positives to allow more generalization without necessarily over fitting. 😎
Awesome video! I love this channel! I have a question, which I hope someone will clarify for me: if Boltzmann Machines are unsupervised, how do we know what data is meaningful (like number digits) and what data is just noise, so that we sculpt valleys around the meaningful patterns in the energy landscape? Similarly, in the weight update rule: updating iteratively works on maximizing the probability of the training data, equivalent to minimizing the energy of patterns, but the rule itself assumes we have to know beforehand what the patterns are (because of data - model). Can anyone help with an answer?
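A possible way to see the answer to the question above: the rule never needs labels. The "data" term is just pairwise co-activation measured on the training examples themselves, and the "model" term is the same statistic under whatever the network currently generates. Here is a minimal sketch under stated assumptions: a tiny fully visible network with 0/1 units, no biases, temperature fixed at 1, and the model term computed exactly by enumerating all states (only feasible for a handful of units; real training estimates it by sampling). All function names are hypothetical and are not taken from the video.

```python
import itertools
import math


def energy(state, weights):
    """E(s) = -sum over pairs i<j of w_ij * s_i * s_j, with units s_i in {0, 1}."""
    n = len(state)
    return -sum(weights[i][j] * state[i] * state[j]
                for i in range(n) for j in range(i + 1, n))


def data_correlations(data):
    """Average of s_i * s_j over the training examples (the 'data' term)."""
    n = len(data[0])
    return [[sum(s[i] * s[j] for s in data) / len(data) for j in range(n)]
            for i in range(n)]


def model_correlations(weights, temperature=1.0):
    """Exact average of s_i * s_j under the Boltzmann distribution (the 'model' term)."""
    n = len(weights)
    states = list(itertools.product((0, 1), repeat=n))
    factors = [math.exp(-energy(s, weights) / temperature) for s in states]
    z = sum(factors)  # partition function
    corr = [[0.0] * n for _ in range(n)]
    for s, f in zip(states, factors):
        for i in range(n):
            for j in range(n):
                corr[i][j] += (f / z) * s[i] * s[j]
    return corr


def train(data, steps=500, lr=0.1):
    """Gradient ascent on the log-likelihood: dw_ij = lr * (data term - model term)."""
    n = len(data[0])
    weights = [[0.0] * n for _ in range(n)]
    positive = data_correlations(data)  # fixed: depends only on the examples
    for _ in range(steps):
        negative = model_correlations(weights)  # changes as the weights change
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += lr * (positive[i][j] - negative[i][j])
    return weights


if __name__ == "__main__":
    # Unlabeled examples: units 0 and 1 tend to fire together, unit 2 fires alone.
    examples = [(1, 1, 0), (1, 1, 0), (0, 0, 1)]
    w = train(examples)
    print("w01 =", w[0][1], " w02 =", w[0][2])
    print("E(1,1,0) =", energy((1, 1, 0), w), " E(1,0,1) =", energy((1, 0, 1), w))
```

In this sketch "meaningful" just means "frequent in the training set": patterns that occur often get valleys, and whatever the model produces on its own that the data does not support gets pushed up.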
I suspect the evolved culture
into which we all were born
was perhaps entirely responsible
for impressing meaning onto
the buzzing blooming confusion of our earliest months
as growing neural networks. ish
My 2nd physics class adjunct prof told me his fave subject was statistical physics, now I get it 🙏
Best explainers, hands down
Marvelous. Thank you. I had almost forgotten how delicious mathematics is.
This video is amazing man 🔥
It is highly intuitive that the average kinetic energy in a Boltzmann machine is realised as a scaling of probabilistic triggering, and that the shape of that sigmoid curve, an anti-Fermi-level statistic, appears as pattern synthesis.
No doubt your excellent presentation and clear demonstration make you a giant among learning channels on the NOBEL DECISION.
Good channel, thank you.
0:23 oh god. Reading that chatGPT answer hurts. That is equivalent to asking for a pasta recipe and seeing the answer starting with
1) start a greasefire in the kettle
2) for eight to ten minutes, pour water on it
Nice animation and love the first generative models!
Can you do a course on Markov / Semi Markov / Hidden Markov / Semi Hidden Markov models please.
What you could do for visualization is plot a distribution of x for the digits above them like a mountain that looks like an 8 is different from that of a 2 etc
Great video. There's a small typo around 9:15. ln(1/p)/epsilon would rather be 1/T.
I like the passion I feel from you in your videos! I just wanted to inform you that there is a small typo at 15:06 in the bottom right corner
Super great explanation !
I've learned today, as many times before: always finish the video before leaving an angry comment.
I’m heavy into physics.
I recall reading something about how it’s possible to be moving very fast and never age.
I believe if we speed up our PCs to be able to process data on chips that allow these physics to happen within a PC… I wonder… just what can happen.
Great videos
Man your videos are just awesome, and I finally understood the Boltzmann formula xD
the 3b1b of neuroscience and ML, thx for the videos!
Thank you❤such an extraordinary presentation with relations and simplicity
Given our current understanding of Quantum Mechanics and energy levels being quantized, is the statement @8:08 true (is it constant with the same amount)?
I was worried I was missing something at 8:43. Nevertheless, great vid, gonna continue watching now :) Thank you for making these explanations
PS: appreciate the 3b1b music and style;)
Thank you, this is very interesting. Keep up the good work.
Thanks amazing work I love this topic :)
Awesome video, thanks!
Could the stochastic “hallucination” phase be related to hippocampal replay training cortical networks (“hidden” layer) during sleep?
Our brains activate neurons based on probabilities. Those are created by particles that follow laws pretty close to what is explored in thermodynamics and statistical mechanics. There is nothing more fitting than creating models that tend to mimic those aspects. Our computers are absolutely better than humans for problems we already know the equations. Because we know the uncertainty of every number in a computer. But for new problems, a probabilistic approach is very good.
Your video was great. Very clear explanation. Would have liked you to discuss training algos like grad descent or sim annealing. Next video?
Giving the physics nobel prize for this is the equivalent of giving a nobel prize to someone for failing statmech. This is just an ad hoc application of an incorrect statistical distribution due to incorrect choice of algebra but compensating for the problems this introduces by throwing extra dimensions and parameters at the problem... it's basically the same flawed thought process that brought us string theory. Too bad anyone left in the academy that knew this is emeritus AF. ...smh.
9:05 I guess you meant -ln p / epsilon = 1/T. But that is minor, I like the video
ive been searching for this comment, was wondering if im missing something. Thank you kind stranger
Sweet Vid… Rock On!
Just came across this. Top tier content
yea you are absolutely goated
Insanely high quality content
Wow, so fascinating 😍
Great video!
great video
Tbf we have autonomous driving, we have cars that can pilot themselves without incident.
And that was achieved well before generative art models came along.
The issue is not a self driving car, it is a self driving car in an environment heavily populated with people who are driving, walking, and changing the environment, as well as natural phenomena changing it spontaneously at nearly every moment.
That is a much more difficult problem to solve than a vehicle piloting itself without incident once those factors have been controlled for as much as possible.
Amazing keep going 👍🏼
great stuff, keep up your good work
Good work.
What temperature optimizes for the highest range of perplexity values?
Thank you I learned something
great content❤❤❤
14:00 This is strongly related to the simulated annealing optimization method
Thank you
How do you create your animations? This is awesome.
After Effects + Python + Blender :)
I have a video about it that might help: th-cam.com/video/yaa13eehgzo/w-d-xo.htmlsi=EcoTIRW9Qhnnb9xS
13:14 Softmax wasn’t mentioned?
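On the softmax question: one way to read the connection is that the Boltzmann distribution is a softmax applied to the negative energies divided by the temperature. A minimal sketch, assuming a short list of made-up energies; the function name `boltzmann_probs` is hypothetical and not from the video.

```python
import math


def boltzmann_probs(energies, temperature=1.0):
    """Return p_i proportional to exp(-E_i / T), i.e. softmax of -E / T."""
    logits = [-e / temperature for e in energies]
    top = max(logits)  # subtracting the max avoids overflow and leaves p unchanged
    exps = [math.exp(l - top) for l in logits]
    z = sum(exps)  # partition function, up to the constant factor exp(top)
    return [x / z for x in exps]


if __name__ == "__main__":
    energies = [0.0, 1.0, 2.0]  # made-up values for illustration
    for t in (0.1, 1.0, 10.0):
        print(t, boltzmann_probs(energies, t))
```

Low temperatures concentrate the probability on the lowest-energy state, and high temperatures flatten the distribution toward uniform, which is the same role the temperature setting plays when sampling from a softmax output.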
Thanks ❤️
🙂 Thank you!
🫶
lol the new title made me watch it again on accident😊
A physicist comment: if I understand your presentation correctly, the original Hopfield algorithm is the zero temperature limit of the Boltzmann Machine.
The hidden levels, I would guess, are just an efficiency enhancement. i.e., there would be a large enough No-hidden-layers network of equivalent performance to any network with hidden layers. Most likely someone proved such theorem already.
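The zero-temperature remark above can be checked numerically. The sketch below assumes 0/1 units, symmetric weights, no biases, and the usual stochastic rule where a unit turns on with probability sigmoid(h / T), with h the summed input to that unit. The function names and the example weights are hypothetical. As T shrinks, the stochastic update should agree with the deterministic threshold rule whenever h is nonzero (at exactly h = 0 the probability stays at 0.5 for every T, so the limit is ambiguous there).

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def local_field(state, weights, i):
    """Summed input to unit i from all other units."""
    return sum(weights[i][j] * state[j] for j in range(len(state)) if j != i)


def hopfield_update(state, weights, i):
    """Deterministic threshold rule: unit i turns on iff its input is positive."""
    return 1 if local_field(state, weights, i) > 0 else 0


def boltzmann_update(state, weights, i, temperature, rng=random):
    """Stochastic rule: unit i turns on with probability sigmoid(h / T)."""
    p_on = sigmoid(local_field(state, weights, i) / temperature)
    return 1 if rng.random() < p_on else 0


if __name__ == "__main__":
    weights = [[0, 1, -2], [1, 0, 1], [-2, 1, 0]]  # made-up symmetric weights
    state = [1, 1, 0]
    rng = random.Random(0)
    for t in (10.0, 1.0, 0.1, 0.001):
        agree = sum(
            boltzmann_update(state, weights, i, t, rng) == hopfield_update(state, weights, i)
            for i in range(3) for _ in range(1000)
        )
        print("T =", t, "agreement with threshold rule:", agree / 3000)
```

This only illustrates the update rule; it says nothing about the second claim in the comment, that a large enough network without hidden units could match one with hidden units.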
This hypnotic video rendered me briefly unconscious several times
so I'll have to watch again but
the impression I got from this first viewing, in regard to hidden-layers,
was that they maintain memories in a kind of holographic way that
might not be available in a no-hidden-layers network.
Yes many thanks.
At 2x speed it sounded like you said "what sparked this sh*t" 😂
AI is even winning our awards.
Wonderful❤
And Hinton said that this was the wrong path and is now only a historical curiosity. Good for him that he used the name of a physicist to baptize these models.
As a Boltzmann Brain in a fever dream, I found this video very insightful into my waking nightmare.
I fn Love ur channel buddie
20:47 spelling error. Thank you
I was expecting a Brilliant Sponsorship 😂
As a (struggling) physics major, I feel uncomfortable at being given so much explanation of why the partition function is written that way
This is a good video, but the history given in the first few minutes is completely hallucinated. Associative memories are as old as von Neumann architectures, and thinking like humans has always been the first goal of researchers. Calculating exact trajectories was a useful stepping stone.
Here after nobel prize winners announcement)
Videos are great. My attention span is just 10 minutes.
Dr. Chandra. Will I dream?😁
Can You Share Video Code?
The math doesn't add up at 8:48. If -ln[p]/ε = T, then e^(ln[p] * ΔE / ε) = e^(ln[p] / ε * ΔE) = e^(-(-ln[p] / ε) * ΔE) = e^(-T * ΔE), and not e^(-ΔE / T)
@ArtemKirsanov
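For anyone checking the algebra in the comment above, here is the same step written out, assuming (as the comments quote it) that the on-screen expression is e^(ln[p] * ΔE / ε) with 0 < p < 1. The definition of T used here is the corrected one suggested elsewhere in this thread, not the one shown in the video.

```latex
\begin{align*}
p^{\Delta E/\varepsilon}
  &= e^{\ln(p)\,\Delta E/\varepsilon}, \\
T := -\frac{\varepsilon}{\ln p}
  &\quad\Longrightarrow\quad
  \frac{\ln p}{\varepsilon} = -\frac{1}{T}, \\
p^{\Delta E/\varepsilon}
  &= e^{-\Delta E/T}.
\end{align*}
```

Since 0 < p < 1 gives ln p < 0, this T is positive. The forms -ln[p]/ε = 1/T and ln(1/p)/ε = 1/T suggested in other comments are the same statement rearranged.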