Hey, I've been working on this for several years now; it's one of my favorite research areas, and one I hope to see applied for real soon. Glad to see your interest!
Hi, are you working on the problem or swarm algorithms?
I'm curious about your general experience/takes about these.
This is so-called ant colony optimization (ACO) or particle swarm optimization (PSO) from computational intelligence. Artificial neural networks and deep learning also belong to computational intelligence, but ACO and PSO are not popular anymore.
It's a really interesting topic, and one more great example of swarm intelligence is slime mold, which was used to design a more efficient subway network in Tokyo.
Is there any reason why it is not popular anymore?
@@mrguiltyfool Probably for the same reason as most techniques that fall out of favor in their respective fields.
@@mrguiltyfool It's worse than other methods; we can literally approximate any behaviour with enough data using more modern (and relatively efficient) techniques (i.e. backpropagation). This one is reliant on the granularity of the swarm (inefficient). It's like using apples to describe addition: we can do it without the apples now, lol. Well, that's somewhat of an unfair comparison; there are still use cases for it (lack of data, a natural abstraction for a problem), but it's not popular.
Training neural networks involves solving an optimization problem. Early neural network researchers chose gradient descent to solve this optimization problem, and everyone since has continued to use gradient descent because it's very simple and it gets the job done. The optimization problem in neural networks has many local minima that are all roughly as good as one another, so there hasn't been any reason to adopt more complicated or robust optimization techniques like swarms. Neural networks don't need to find the best solution. Any local minimum will do.
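For anyone who wants to see that concretely, here is a minimal sketch of plain gradient descent on a toy one-dimensional loss with many local minima; the loss function, learning rate, and iteration count are illustrative choices, not anything from the video:

```python
import numpy as np

def loss(w):
    # Toy non-convex loss: many local minima, all fairly shallow.
    return np.sin(3 * w) + 0.5 * w ** 2

def grad(w, eps=1e-6):
    # Numerical gradient for the sketch; real frameworks use backpropagation.
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)

w = np.random.uniform(-3, 3)  # random start
lr = 0.05                     # learning rate
for _ in range(200):
    w -= lr * grad(w)         # plain gradient descent step

print(w, loss(w))  # lands in *some* local minimum, which is usually good enough
```

Run it a few times: different starts land in different minima with similar loss values, which is the "any local minimum will do" point.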
crazy good content and animations. thank you
Such an awesome concept, thanks for sharing!
Great video. (Kinda just commenting to boost your videos in the algorithm, because these videos deserve more views)
Excellent. As an old philosopher of this business, I'd say you managed to dodge every common conceptual error in a field which swarms with misunderstandings.
That was a much more interesting way to relate the problem than I've seen done before. I also like the nature shots as a lead up. Really breaks the tedium of an otherwise boring subject.
without a doubt one of the best channels on youtube. This is premium quality content.
are you kidding me? I was hoping for some real world examples, instead I got a tedious logical explanation with no soul. SO LAME. THANK YOU FOR YOUR HARD WORK.
When I sign up to Brilliant I’ll use your link, you’ve earned it
How does this only have 2000 views? This is such a high quality video.
This sounds like doing gradient descent multiple times just with extra steps
You explain better than my lecturer. Thanks 🎉
Still waiting to know what color theme he is using, it looks incredible
Synthwave’84, no glow
@@b001 Thank you so much !!! and congrats on the video btw, such a great topic and great animations, keep going !!
The reason it is so interesting is not that it's better than Adam, GD, etc.; it's interesting because it massively parallelizes the search with "low energy" expenditure. There are much more efficient algorithms for high-dimensional spaces though, far better than Adam or GD.
Somehow it reminded me of grid search, random search and Bayesian search
BOOL FINALLY DROPPED!
Sounds somewhat similar to a KNN calculation
Great video, it makes hard things simple!
Really hoping this video fully addresses its title, cuz I spent all the time learning to implement this sh** and I’m struggling to find applications other than bragging about how 1337 I am
Cool video, thank you ❤
The inertia + memory vector makes no sense. Not only do they cancel each other out, they also won't make an agent revisit the original area. It just makes agents slower to converge on the global best position.
He mentions that those vectors can have different weights. So you can tweak the algorithm to favor either the inertia, best social score or the memory. So there are versions of the algorithm where the inertia and memory vectors don't cancel out.
They only cancel out on the first step away from the personal best. If the particle has travelled away since then, the inertia and memory vectors can have different angles too.
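To make the weighting concrete, here is a minimal sketch of the standard PSO velocity update; the values of w, c1, and c2 below are common illustrative defaults, not necessarily the ones used in the video:

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """One swarm update: x, v, pbest are (n_particles, n_dims); gbest is (n_dims,)."""
    r1 = rng.random(x.shape)           # fresh randomness each step
    r2 = rng.random(x.shape)
    v_new = (w * v                     # inertia: keep the current heading
             + c1 * r1 * (pbest - x)   # memory: pull back toward each personal best
             + c2 * r2 * (gbest - x))  # social: pull toward the swarm's best
    return x + v_new, v_new
```

With c1 = 0 the memory pull disappears entirely, which is why tweaking the weights changes whether the inertia and memory terms can cancel.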
Hi, what did you use to make the video, or the animations in this video?
Swarm School
Swarm optimization is unfortunately not feasible for very large search spaces
Brilliant
oh i get it, pascals triangle numbers represent 2/3 = 6666.... but it also represents 1/2 of a circumference, but it also represents a whole number = one= 1 universe... yeah i know this is beyond crazy math scientists, but it is accurate...
can swarms solve the problem of odd perfect numbers? (OPNs)
What BGM did you use?
how does this method compare to something like a genetic algorithm?
Under what assumptions would this outperform (converge faster) than a genetic algorithm?
Being the closest to the green squares in the given examples also means being the farthest away from them. Was that intentional? 😂
5:30 - 5:48 I thought: could this run on a neural network?
Does this really scale? Rather than 3 warehouses in 2D, what if it was W warehouses in N dimensions? Like 100 in 100? It seems like there's a lot of arbitrary choices in the fitness function, or is there theoretical grounding?
From what I've gathered in my limited experience, these swarm algorithms can be amazing on complex optimization problems (rather than just finding the minimum, it's finding minimums, maximums, midpoints, etc.), but their scaling is pretty poor. Backpropagation is just insanely efficient, while this basically calculates pairwise distances, then uses those to create vectors, then has a memory component, plus a global memory component, for multiple points. The multiples multiply quickly.
As for the fitness function: you must define the actual optimization more explicitly than in most ML applications, so it's theory-based. Weighting the vectors is similar to the learning rate in gradient descent. There's no one-size-fits-all answer, but there are rules of thumb that are generally good.
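To illustrate the "define the optimization explicitly" point, here is a hypothetical fitness function for the warehouse example; it assumes the goal is to minimize the total distance from each customer to its nearest warehouse, which is one plausible reading of the setup rather than the video's exact objective:

```python
import numpy as np

rng = np.random.default_rng(1)
customers = rng.random((50, 2))  # hypothetical delivery points in the unit square

def fitness(particle, n_warehouses=3):
    # A particle encodes all warehouse coordinates, flattened to shape (n_warehouses * 2,).
    warehouses = particle.reshape(n_warehouses, 2)
    # Distance from every customer to every warehouse...
    d = np.linalg.norm(customers[:, None, :] - warehouses[None, :, :], axis=-1)
    # ...then each customer is served by its nearest warehouse; lower total is better.
    return d.min(axis=1).sum()

print(fitness(rng.random(6)))  # score one random 3-warehouse layout
```

Scaling to W warehouses in N dimensions just changes the reshape, but the number of particles needed to search the space well grows much faster.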
@@jeffreychandler8418 Thanks for explaining. I looked a little more into it too, and even the trade-offs involved in nearest neighbour search are quite nuanced: figuring out, for a given problem, how much to invest in precomputing a graph/tree/reduced-dimensionality approximation first, or just doing N comparisons every step for each particle.
@@luke.perkin.inventor That is the fun part of optimization: it's an endless rabbit hole of odd nuances. I've worked on computing nearest neighbors to predict stuff and used a lot of little tricks to avoid expensive pairwise calculations, sorts, etc.
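For anyone curious, here is a small sketch of the trade-off being discussed, brute force versus a precomputed tree, using SciPy's KD-tree (the sizes and data here are made up):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
pts = rng.random((10_000, 3))   # e.g. particle positions
queries = rng.random((100, 3))  # points that each need a nearest neighbour

# Option 1: brute force -- N comparisons per query, zero precomputation.
d = np.linalg.norm(pts[None, :, :] - queries[:, None, :], axis=-1)
brute_idx = d.argmin(axis=1)

# Option 2: pay once to build the tree, then each query is roughly O(log N).
tree = cKDTree(pts)
_, tree_idx = tree.query(queries, k=1)

assert (brute_idx == tree_idx).all()  # identical answers, very different cost profiles
```

In low dimensions the tree wins easily; in high dimensions its advantage erodes, which is part of the nuance being described.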
Wow
please make the background music at least half as loud for future videos, or even quieter. i want to watch the video, but it's too draining to try to hear (and understand/process) your narration from underneath that background music, so i quit watching the video.
Noted. After all these years I'm still learning and struggling to find the right audio levels and video ambience. Thanks for the feedback!
Fwiw, I personally didn’t mind this level of background audio
@@iamtraditi4075 I did find it quite distracting, probably not as much as OP though.
@@b001 a swarm of watchers nudging you towards an answer!
I suggest looking into adding a bit of sidechain compression. It would make the music move aside in response to your voice, increasing the narration's prominence and focus while leaving the ambience untouched.
I agree. However, it is a difficult problem for the creator since it is so dependent on the listener's ears. It is bizarre that after 19 years of YouTube they still don't allow posting multiple audio tracks so the listener can adjust the background music themselves.
first!
second!
Enough AIs and you can generate a realistic chunk of a 3-dimensional object in a simulation.
The logic is flawed from oversimplification! The principle does not consider the evolutionary and biological factors shaping this behavior. We still do not understand this behavior well enough to apply policy optimization toward AI/ML.