Yes, it's hokey. Who cares. You are doing a great job. Good to hear that this will be your focus over the next year.
I like how single order memory was able to remember a promise made to him earlier (memory of a PREVIOUS event)😂😂
...but couldn't remember a single PREVIOUS note
😂😂
Amazing video series - good visualizations to show the concepts and with self humor. Please keep on the good work!
This just gets better and better! Really finding it very easy to keep going as its not just dry lectures.
You sir deserve a hat tip "what a great explanation".
Funny video but very educative! Great job!
This is great looking forward to more. In the meantime I think I may now have a shot at following the papers.
That was the purpose! Good luck with the papers. More videos coming about Cortical Columns in a month or so.
Matt, you are doing a great job! Keep going, please.
I have been waiting so long for this!
very simple explanation of a complex concept.
+HTM School One thing I've noticed that is absent from the theory is attention. How is attention reflected in the theory? The neuroscience seems to indicate that it's resource allocation, is that reflected here?
We have not tackled attention yet. It is a key component of intelligence theory, and Jeff mentions it in a lot of his presentations. But we don't fully understand the mechanisms of attention, and we honestly do not require them at this point. So we put that off. I don't think the implementation details of attention will affect current HTM Theory, but in the future it will need to be addressed.
SO GREAT!!!
When is the new episode coming? I am working in Network Security Domain and trying to incorporate HTM there for better security perspectives. Any possible support in this context is more than welcome. Thanks. Your videos are a great effort towards focused learning.
Coming in December. Been working on conference talks, interviews, and animations lately. You can see some of the content I'll be presenting here: th-cam.com/video/-h-cz7yY-G8/w-d-xo.html. Otherwise, join our forums! discourse.numenta.org/
Can anyone please explain how choosing the winner cell in a bursting column works biologically? In the case that there is a previously active cell connected to the winner cell, I suppose that the previously active cell must have somehow primed the winning cell to fire (is there some inhibition on the other cells in the bursting column going on?), but how is that accomplished if the permanence wasn't high enough?
And even more confusing to me is the case where there are no segments connecting the bursting column to a previously active column. How is the winner cell with the fewest segments found?
Then the video says that we increase the permanence values on the segments connected to the previously active column. But there were none! So we grow new synapses? OK, but then, is it biologically feasible to grow a synapse to an arbitrary minicolumn within a layer, without taking into account the physical distance between the minicolumns?
I checked the paper "A Theory of Sequence Memory in Neocortex" and there are no citations or justifications when describing these particular aspects of TM. It would be nice to have the correspondence to neurobiology documented somewhere - it makes the theory much more credible, especially to neuroscience newbies.
Great questions! Too much detail to answer in a TH-cam comment. Please join our forums, where you'll find a lot of pre-existing resources on bursting: discourse.numenta.org/search?q=bursting
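For readers who want something concrete before heading to the forum: below is a minimal Python sketch of how a winner cell might be chosen when a column bursts, loosely following the description in this thread and Numenta's published temporal memory pseudocode. The function name, the `segments` layout, and the tie-breaking rule are illustrative assumptions, not NuPIC code.

```python
import random

def choose_winner_cell(column_cells, segments, prev_winner_cells):
    """Illustrative sketch: pick the winner cell for a bursting minicolumn.

    column_cells      -- cell ids in the bursting column
    segments          -- dict: cell id -> list of segments, where a segment
                         is a set of presynaptic cell ids (hypothetical layout)
    prev_winner_cells -- set of cell ids that were winner cells at t-1
    """
    # 1. Prefer the cell whose segment best matches the previous winners,
    #    even if the match was too weak to make the cell predictive.
    best_cell, best_overlap = None, 0
    for cell in column_cells:
        for seg in segments.get(cell, []):
            overlap = len(seg & prev_winner_cells)
            if overlap > best_overlap:
                best_cell, best_overlap = cell, overlap
    if best_cell is not None:
        return best_cell

    # 2. No segment in this column reaches back to the previous winners:
    #    pick among the least-used cells (fewest segments), at random, so
    #    the new segment grown later disturbs the least prior learning.
    fewest = min(len(segments.get(c, [])) for c in column_cells)
    return random.choice([c for c in column_cells
                          if len(segments.get(c, [])) == fewest])
```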
Could it be used in generative mode too? E.g. to replay a previously learned sequence with some variations that are still similar to the original learned ones?
Yes, but the trick is knowing what sequences are currently matching the input, and how well they are matching. It is tricky to identify and label these sequences. In the brain, we do this in a different layer of cortex in a pooling operation that resolves objects over time. This is still a research area, but you'll see more about it in later episodes (cortical circuitry).
so good!
Great video, just like the previous ones, but I didn't get something: if more than one cell was predicted in a column (like predicting one spatial pattern in two different contexts), which one would be the winner cell?
They could both be winner cells.
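In code form, the usual handling (a hedged sketch with illustrative names, not NuPIC code) is that every correctly-predicted cell in the active column fires and each one counts as a winner cell, so no tie-break is needed:

```python
def winner_cells_for_active_column(column_cells, predictive_cells):
    """If several cells in this column predicted the input (the same spatial
    pattern seen in different temporal contexts), they all become active,
    and each one is treated as a winner cell."""
    return [cell for cell in column_cells if cell in predictive_cells]
```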
Neocortex comprises millions (?) of columns. Each column comprises 7 (?) layers.
Additionally neocortex has regions (like stains on a napkin). (Although I have no idea how signals get from one region to another -- does the brain make use of the fact that the neocortex is squished up to pass signals between distant locations?).
These cortical columns ... How do they compare with the 'bursting' columns in the HTM-school vids? Do the vids focus on modelling just one particular layer?
You are getting "Cortical Columns" and "minicolumns" mixed up. Cortical Columns are a large structure that includes all the layers in your cortex. Minicolumns are small structures that exist within the layers, allowing temporal memory to occur. I'll have a lot more details about "Cortical Columns" coming in the next episode.
How do we get output from the columns? Is bursting related to the p300?
The active and predictive cells are the output of a layer. This is the data that passes up to higher layers of cortex. I don't know what P300 is.
en.wikipedia.org/wiki/P300_(neuroscience) It fires when you see something new, or see something similar. It looks similar to how bursting fires.
Cool, I was not aware of that (I am not a neuroscientist!). So I started a thread on HTM Forum to ask the experts: discourse.numenta.org/t/is-bursting-related-to-p300/2649
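Back to the output question above: here is a hedged sketch of reading a layer's output as the union of its active and predictive cells. The representation (plain integer cell ids) is my own illustration, not a NuPIC data structure.

```python
def layer_output(active_cells, predictive_cells):
    """The layer's outward-facing state: cells that are firing now plus
    cells that are depolarized (predicted to fire next)."""
    return sorted(set(active_cells) | set(predictive_cells))

# Example: cells 3 and 17 are active, cells 17 and 42 are predictive.
print(layer_output({3, 17}, {17, 42}))   # -> [3, 17, 42]
```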
Thanks Matt, it was wonderful as always. Unfortunately I have not understood the winner cell concept completely. Is there any other source that explains its process more deeply?
Try HTM Forum: discourse.numenta.org/search?q=Temporal%20memory%20winner%20cell
Thank you
Just out of curiosity, what's the point of some columns not sharing any connections to other columns? I see how this would be the case in our brains just by the somewhat arbitrary nature of the neurons' dendrites, but in terms of software, why not just assume that all distal connections can go to every other column? Is there a benefit to having the neurons "grow" new connections? Thanks.
+Benjamin Jordan Yes, there is a huge benefit. It means you can forget old things and learn new things without retraining. It enables unsupervised learning.
Hey,
I got lost at the end...around minute 6:30. What's the purpose of choosing a winner cell if you're going to strengthen all the connections of all the cells in the column? Does the chosen winner get any special treatment as far as strengthening connections is concerned?
We strengthen the synapses between all the cells in the column toward previous winner cells because we don't yet have a prediction for this sequence. We have not learned it. The transition could potentially be any cell in the column, because none of them currently have strong enough permanences to previous winner cells to become predictive. As we continue to see this sequence repeated (or even very similar sequences repeated), eventually one of the cells' synapses will become connected, forcing one cell to become predictive and thus breaking out of the bursting for that sequence.
HTM School, then why choose a winner cell in that column if you are going to strengthen all segments? Because say on the next time step a different cell almost predicts the last time step... wouldn't that then be the winner?
Yes, that cell is the winner, and it gets strengthened, maybe enough to become connected and predictive on the next time step. Remember that spatial patterns can be subtly different, but still semantically similar. These patterns might be noisy, so this mechanism cannot go chasing after noise. Using the winner cells this way, which might eventually turn into synaptic connections, allows temporal generalization.
HTM School, so if no cells in a bursting column have connections to the winner cells in the previous time step, we choose the winner cell with the fewest connections. But then what? What do we then do to this winner cell? We are going to grow the segments of all the cells in the bursting column, so what makes this one special?
Judging from your detailed questions, you might want some answers more detailed than I can give you here in TH-cam comments. Here are some papers linked from numenta.com/temporal-memory-algorithm/ you should read.
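One concrete pointer before the papers: as I read the temporal memory pseudocode they link to (treat this as my interpretation, not the definitive algorithm), what makes the winner cell special is that it is the cell on which a new distal segment is grown back to the previous winner cells when nothing in the column matched. A minimal sketch, with illustrative increments, thresholds, and names:

```python
PERMANENCE_INC = 0.10        # illustrative learning increment
CONNECTED_THRESHOLD = 0.50   # illustrative connection threshold

def learn_on_burst(winner_cell, segments, permanences, prev_winner_cells):
    """Sketch of learning for the winner cell of a bursting column.

    segments    -- dict: cell id -> list of segments (lists of presynaptic ids)
    permanences -- dict: (cell, presynaptic) -> permanence in [0, 1]
    """
    # Reinforce existing synapses from the winner cell to previous winners.
    reinforced = False
    for seg in segments.get(winner_cell, []):
        for pre in seg:
            if pre in prev_winner_cells:
                key = (winner_cell, pre)
                permanences[key] = min(1.0, permanences.get(key, 0.0) + PERMANENCE_INC)
                reinforced = True

    # If no segment reached back to the previous winners, grow a new one.
    if not reinforced and prev_winner_cells:
        new_segment = list(prev_winner_cells)
        segments.setdefault(winner_cell, []).append(new_segment)
        for pre in new_segment:
            # New synapses start below the connected threshold; repetition
            # of the sequence pushes them over it and ends the bursting.
            permanences[(winner_cell, pre)] = CONNECTED_THRESHOLD - 0.1
    return segments, permanences
```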
How many cells within a minicolumn must be activated to consider that the minicolumn is active? Btw, really funny video ;)
So when a column is activated (determined by the overlap_threshold), only one cell within that column is activated. Could you please explain a little more the difference between minicolumn activation and cell activation?
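A hedged two-stage sketch of the distinction being asked about, with toy names and thresholds of my own choosing: minicolumn activation comes from the spatial pooler (proximal overlap with the input compared against a threshold, plus inhibition), while cell activation inside an active minicolumn comes from the temporal memory (predicted cells fire; with no prediction, the whole column bursts).

```python
def active_columns(overlaps, overlap_threshold, num_active):
    """Spatial-pooler side: each column scores its proximal overlap with the
    input; here we simply keep the top `num_active` columns above threshold."""
    eligible = [c for c, o in enumerate(overlaps) if o >= overlap_threshold]
    return sorted(eligible, key=lambda c: overlaps[c], reverse=True)[:num_active]

def cells_for_active_column(column, cells_per_column, predictive_cells):
    """Temporal-memory side: inside an active column, predicted cells fire;
    with no prediction, every cell in the column bursts."""
    cells = range(column * cells_per_column, (column + 1) * cells_per_column)
    predicted = [c for c in cells if c in predictive_cells]
    return predicted if predicted else list(cells)
```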
Is it understood how this sequence memory turns into "higher order sequence memory"? The feedback neurons? If so, can you give me some description or some links on it?
I'm not sure I understand. Having multiple cells in each minicolumn makes this a high order memory. There is no apical feedback in this example.
I meant a sequence inside a sequence. As I understood it, the number of contexts this algorithm can remember is of linear order in the number of cells per minicolumn. And if 32 cells make 32 different contexts possible, then in a problem like natural language that covers only one or two previous words, which definitely doesn't get the job done. So don't we need to recognize where we are in the larger sequence? I remember a paper by Numenta saying that it can predict not only the next letter, but to some extent, the next syllable and the next word.
It is not limited in the way you are describing. Remember that there are potentially thousands of minicolumns, each looking at different spatial aspects of the input. They all have different receptive fields. Each one is looking at a specific aspect of the input space and recognizing those spatial patterns temporally. Each column is limited in the number of temporal contexts in which one input can be recognized, but working together they put together a much richer picture of the spatio-temporal space.
Yes, but considering the language example again, is the spatial pattern of 'A' any different from the spatial pattern of 'A''? And if 32 contexts cover... one letter (that's the exact number in my mother tongue :) ), then what can we do with that? Does it give us anything other than a one-letter context? And besides that, what are the feedback segments for? :) Is that understood?
A and A' spatial patterns will be the same. BUT there are not only 32 contexts for this spatial pattern. Each minicolumn will see a different part of "A" because they have different receptive fields for the input space. Each will create different temporal contexts for the receptive field of "A" that it sees. One column might recognize the bar across the letter. Other letters will also have a bar (like H). "A" is only recognized when many minicolumns predict that A is coming next, each looking at a different receptive field of the spatial input space.
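To make the receptive-field argument concrete, here is a toy sketch (the sizes and the random potential pools are illustrative assumptions, not NuPIC parameters): each minicolumn watches only a subset of the encoded input bits, so the same 'A' produces different local evidence in different columns, and the letter is only "recognized" when many columns agree.

```python
import random

INPUT_BITS = 1024    # size of the encoded input SDR (illustrative)
NUM_COLUMNS = 2048   # minicolumns in the layer
POOL_SIZE = 64       # how many input bits each column can see

random.seed(0)
# Each minicolumn gets its own receptive field: a random subset of input bits.
receptive_fields = [set(random.sample(range(INPUT_BITS), POOL_SIZE))
                    for _ in range(NUM_COLUMNS)]

def column_overlaps(active_input_bits):
    """Overlap of each column's receptive field with the active input bits.
    Different columns literally see different pieces of the same 'A'."""
    return [len(rf & active_input_bits) for rf in receptive_fields]
```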
02:58 Wait, what do you mean by "before", Single-Order Memory? :q
yes!
Woo!
It's a tad echo-y.
Thanks for the feedback. I'm working with new tools (Adobe) and figuring out how to use them.
That pun: echo -> feedback. :D But yes, the solution is to use a headset or dampen the room acoustics.
Neural networks look like a crappy solution (even though they work beautifully for many use cases) compared to HTM.
anyone else have a lightbulb moment for CNC sensorimotor feedback? lol
Did you get to the end where I talked about sensorimotor integration? Watch the video I linked to for more info. It is fascinating what your brain can do with unions of SDRs. Btw what is CNC?
Thanks! I missed the videos; I will check them out! Oh, and Computer Numerical Control, the system for controlling mills, lathes, 3D printers, everything. People are still the best bet for getting nigh-impossible tolerances, due to their feel for their machinery.
Cool, that is what I thought. You'll like the upcoming SMI videos! I think there will be a lot of robotics applications at some point in the future.
That must be very computationally expensive.
It can be, depending on how many neurons are in the structure. We typically create a layer with 2048 minicolumns with 32 cells per column (65,536 cells). And we get 20ms per time cycle on our development laptops consistently.
This computation looks highly parallelizable to me...I suspect computation time could be brought down extensively with GPUs. Imagine assigning each cell a GPU core with a specified function to perform at each time step, perhaps.
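To illustrate why this maps well onto parallel hardware, here is a hedged toy sketch in NumPy (sizes scaled down so it runs in memory, one segment per cell as a simplification, density and threshold values chosen arbitrarily): the whole prediction step collapses into a single matrix-vector product, exactly the kind of bulk operation a GPU handles well.

```python
import numpy as np

# Toy sizes; the layer quoted above would be 2048 minicolumns x 32 cells.
NUM_COLUMNS, CELLS_PER_COLUMN = 256, 8
NUM_CELLS = NUM_COLUMNS * CELLS_PER_COLUMN

rng = np.random.default_rng(0)
# Simplification: one distal segment per cell, stored as a 0/1 matrix of
# connected synapses (postsynaptic cell x presynaptic cell).
connections = (rng.random((NUM_CELLS, NUM_CELLS)) < 0.05).astype(np.float32)

ACTIVATION_THRESHOLD = 4   # illustrative: active synapses needed to predict

def predictive_cells(active_mask):
    """One prediction step as a single matrix-vector product: every cell's
    count of active connected synapses is computed at once."""
    support = connections @ active_mask.astype(np.float32)
    return support >= ACTIVATION_THRESHOLD

# Exercise the math with a random set of 40 active cells.
active = np.zeros(NUM_CELLS, dtype=bool)
active[rng.choice(NUM_CELLS, 40, replace=False)] = True
print(int(predictive_cells(active).sum()), "cells become predictive")
```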
lol