Highlights: -- Titans architecture enhances AI memory, allowing for long-term retention and learning. -- It surpasses existing models like GPT-4 and Llama 3 in performance benchmarks. -- The design is inspired by human memory processes, focusing on surprise and significance. -- Titans manage memory effectively with adaptive forgetting mechanisms to optimise storage. -- Capable of handling context windows larger than 2 million tokens, addressing previous limitations in AI models.
So the 3 different memory methods mentioned are in contrast to one another: one is gated memory, another is layer memory. Why not just give the model all three, and the ability to choose as needed?
Weird Observation: I asked ChatGPT: How many R's are in the word "Strawberrrrrrrrrrrrrrrrrrrrrrrrry"? And it said: The word "Strawberrrrrrrrrrrrrrrrrrrrrrrrry" has 21 R's (which was incorrect). Then I asked ChatGPT: Generate a code in Python that can count the R's in the word "Strawberrrrrrrrrrrrrrrrrrrrrrrrry" and then execute it. And then it gave me the right answer: The word "Strawberrrrrrrrrrrrrrrrrrrrrrrrry" contains 26 R's.
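The workaround the comment describes can be reproduced directly. A sketch, with the word reconstructed here as a run of 25 r's before the final "y" (an assumption chosen to match the 26-R total the generated code reported):

```python
# Count the R's programmatically instead of asking the LLM to eyeball it;
# tokenizers split long letter runs unpredictably, which is why the model
# miscounts, but trivial string code does not.
word = "Strawbe" + "r" * 25 + "y"  # reconstruction of the comment's word

count = word.lower().count("r")
print(count)  # 26: one r in "Str" plus the 25-r run
```

This is the same trick as asking ChatGPT to write and execute Python: move the counting out of the neural net and into deterministic code.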
I also want the AI to remember how to get to work without incidents like robbery or car crashes. This must mean that every novel situation needs to be remembered, and we have to decide what qualifies as a memorable event. Do I only remember the first time I see snow? And if you ask me whether there was snow on the way to work today, I wouldn’t know?
I keep seeing X hours of video in 2M tokens. Is that for CLIP-like data from video? I doubt it could mean transcripts, because those have nothing to do with visual information. I know it cannot possibly be raw pixel data, because 2M tokens would disappear in no time.
What model of ChatGPT are you using? I can give it two essays and it's able to break them down easily, even with the lower models. I don't know, maybe I just know how to use it. I've never gotten that error, even with the lower models.
Not exactly. It learns the z-vectors for different tasks at training time, then at inference time it simply multiplies those vectors with the different subsets of model weights. It's not learning anything at test time, it's just "boosting" and "silencing" different sets of synapses based on the task.
The goal is to be better, not equal. Forgetting information is an efficiency function of the biological brain. If we could remove that limitation, it would make everything so much easier.
But that's physically impossible. You can't encode infinite information within finite storage. So the goal shouldn't be "remove the need to forget", but simply "expand how much it can remember as much as possible".
Ignoring irrelevant information is what makes the transformer so good. Of course it shouldn't remember everything, that would be less than useless, it'd be counterproductive
The question is: can it learn more, or just hold onto more information? We have seen too many times that "it remembers more" gets sold as learning by many of these AI companies.
Won't this just increase the blackbox worry? Also make it harder to moderate models? Why would websites host models that can be rewired to post illegal content?
Too late to discover Earth, too early to colonize the stars; but just in time for the era of AI. I cannot wait to see what comes, especially with the recent push for AI and the deregulation of federal AI guidelines.
Man, even though I'm doing the same thing, which I worked out myself, their entire new model idea is completely beyond my ability. Still, I'm curious as to what my version accomplishes.
8:44 If the weights do not change via backpropagation while we are using the model, then why does ChatGPT mention that it might use our data for training purposes, and that if we don't want it to use our data, we can switch to temporary mode or turn off that option in settings? Btw, great video as always ❤
That's what we've been doing since the invention of perceptrons in the 1940s. It's not a new idea, it's literally the basis of all neural networks. That's literally why they're called "neural networks".
Thing is we don't exactly know how the brain works in many aspects. But AI is also helping us to understand better. For example, a paper based on AI research suggests that self-awareness requires and improves (in a feedback loop) self-predictability (the model knowing what comes from itself and what's "external"), and predictability aids in sustaining social cohesion (which could be useful when integrating different AI models in a cooperative way). And these small insights derived from AI additionally help us create feedback models that deepen our understanding of the human mind and intelligence in general.
AI memory has to be designed safer than human memory because it's possible to manipulate human memory in both directions, add and remove memories with and without consent.
"The Singularity" = AI videos every 5 milliseconds
IMG to reality 😂😂
Like, you upload a woman img, write a prompt, and you would get that 😂😂😂
@@ZorenStudio55
All I know is if we get carbon printers that would be capable of making objects or lifeforms/AI exoskeletons... HP is going to take all our money for the cartridges.
the AI is already capable of doing that.. it is just that our hardware is unable to render the output at that speed… due to bottlenecks.
video clones every second
@@David0gden future is here
40m long video from "AI Search" isnt something i was expecting on my bingo card
😃
Meta's Large Concept Models for thought abstraction
Google's Titans for human-like memory
Sakana's Transformers² for more human-like Mixture of Experts
DeepSeek's Reinforcement Learning on LLMs for automatic self-learning
2025 sure is gonna be interesting huh--
the infinity stones or forming captain planet
@dascodraws6040 yes, it has about the same real world usage as those fictional concepts
and it's only Jan
How do we layer all these different architectures' functionality into a single point of user interaction modularly?
@@limitationsoflanguage That's the smallest problem. Building connections is not too difficult.
That's the first video I found from this YT channel, and I've never met someone who understands the AI topic so deeply and can explain it so well.
Can't tell you how much stuff I have learned through this single video. Really looking forward to your other videos!
Insane work!
Timestamps (Powered by Merlin AI)
00:05 - Google's Titans AI architecture significantly advances memory and learning capabilities.
02:28 - Google's Titans architecture addresses the limitations of existing Transformer models.
07:07 - AI models can learn and adapt during test time with Titans technology.
09:17 - AI incorporates long-term memory inspired by human learning mechanisms.
13:57 - Google's TITANS architecture introduces advanced memory mechanisms for AI.
16:17 - AI utilizes different memory types for enhanced information processing.
20:26 - AI advances in complex data analysis but faces slower training times.
22:32 - New AI memory mechanisms improve context understanding for conversational tasks.
26:50 - Google's Titans paper signals a new wave of evolving AI models.
28:57 - AI evolves like the human brain through neuroplasticity, adapting in real time.
33:10 - AI classifies tasks using three adaptation methods.
35:20 - Transformer Squared models adapt in real time to enhance AI performance.
39:41 - Emerging AI models may surpass current Transformer capabilities.
I want to point out two things about the Sakana T² architecture:
1. From my understanding, it's not really continual learning: the updated weights aren't stored or overwritten beyond the current prompt. It just boosts and suppresses different neural cliques for the individual prompt and no longer, making it more like a "task focus" than continual learning.
2. It can be applied to any Transformer model. While Titans is an iteration on existing architectures, the core layer of its memory-as-context variant is very similar to a Transformer. Which makes me wonder if it would be relatively simple to combine T² with Titans and get actual continual learning *plus* the boost of task focus?
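The boost/suppress operation described in point 1 can be sketched in a few lines. This is only an illustration of the idea behind Sakana's singular-value fine-tuning, not the paper's actual API; the function name and shapes here are assumptions:

```python
import numpy as np

def svf_adapt(W, z):
    """Scale the singular values of a weight matrix W by a task vector z.

    This is the "boost and suppress" idea: z > 1 amplifies a singular
    component, z < 1 suppresses it. W itself is never overwritten, so
    nothing persists beyond the current prompt.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(s * z) @ Vt

# With z = all-ones, the weights are reconstructed unchanged.
W = np.random.default_rng(0).normal(size=(4, 3))
W_same = svf_adapt(W, np.ones(3))
print(np.allclose(W, W_same))  # True
```

Because the adaptation is just a multiplicative mask over existing components, it can in principle be bolted onto any trained Transformer layer, which is what makes combining it with a Titans-style memory plausible.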
I'm not sure the author cares about such details, though. Nor does the audience, I believe. But thank you for pointing it out for guys like us. Still, I'll try to make YouTube not recommend this type of channel in the future. It feels like the author doesn't have a very clear picture of what he is talking about.
@@automatescellulaires8543 That's a bit harsh. People make mistakes, I don't assume apathy or malice from the start.
@@IceMetalPunk You are a good human. I'm not. The channel name and the narration make one believe the thing is legit, when it clearly is not. Maybe it's a young YouTuber who will build competency after a while, but I don't have the time to be misled like that. It's not the worst out there, but it still pisses me off.
What a time to be alive right?
I actually thought of this when I read the papers weeks ago.
This is why I always say open source is the best because we all can help build better systems.
@@automatescellulaires8543 Why do you say such words?
This guy has a good channel and helps lots of people understand things like this.
Maybe start your own YouTube channel instead?
Seems you know better
I would also like to learn more too
Now all that needs to happen is for Titans and Transformer² to come together, so that by memory it can compare the efficiency of the answer and alter the structure of the nodes, weights, and biases accordingly. Basically, an artificial soul.
Thank you for providing such valuable information and for demystifying AI, along with the latest updates and models. Your clear and accessible explanations have been incredibly helpful, and I am learning a great deal from them. I truly appreciate it.
You're welcome!
unlike many AI vids these days, i found your visual content every bit as helpful in conveying your content as was your script. (i tried listening passively from another room and decided to restart the entire video whilst giving it my full attention and squinting at the screen.) in other words, as always, thank you! your detailed research and refusal to dumb things down is *much* appreciated!
31:40 - Note that, in "occipital", the first C is hard and the second C is soft. So the word is pronounced as though it were spelt "oksipital".
It is very interesting how these Titans can gather memory out of new events triggering them. If they add the strategies used in Transformer² to make the Titans more flexible, it would eventually become much more its own living and evolving thing.
Brilliant summary, definitely the future: neuroplasticity in both content and form, weights and architecture. The new generation of AI architectures will definitely allow for the continuous adaptation of their modular constituents combined with dynamic memory. Being able to pick an existing model and 'adapt' or enable it with such features is brilliant, since this will allow architectural research to build upon foundational work, opening this radical new space to newcomers with great ideas but no capital and no millions of GPUs. Coincidentally, this is the core of our own research, and these 2 papers are very much seminal in this space. We call it TnT, Transformers to the Nth, since our architectures are multidimensional tensors of neurons rather than flat 2D layers. The tensor algebra is definitely more complex, but the idea is very much the same. Thanks for your amazing work. We will be using distilled DeepSeek R1s as prototypes, since these are open source and open weights.
Honestly, this may be the best time to start your life, with all these AI breakthroughs, like me going through my teenage years.
I'm just glad to see the beginning. I'm very old so won't see the end, or perhaps even the middle. But the start of something new, different, and groundbreaking is always exciting!
I'm glad I'm alive to see this. We are at a pivotal time in history.
You only got ten years before we're all driving flying Lamborghinis around the moon! (We will have self-driving technology in space by then but it won't be allowed for another 20 years after due to safety concerns) th-cam.com/video/8TH3gvdaK18/w-d-xo.htmlsi=xTB3gf1mnPNomGQ8
@@GraipVine Hey, you never know. With how AI is beating even the optimistic predictions, who knows if we will get advancements in health that keep people alive... well, God knows for how long. Hopefully it happens, and fast enough for you, brother.
Tbh, kids born now will probably not even learn the concept of work as we know it. Hell, I'm studying in college rn and I don't know if I'll even have to work by the time I'm done.
About time. I've been emailing them on a steady basis with designs based on modular reasoning, using subsets for particular requirements, and breaking down the structure of memory (attention, long-term, persistent, as performed by the human brain) for them to adapt to AI. I'm glad they're now doing this. (Yes, I have heaps of conversations with LLMs involving these concepts from months ago.) Finally, they've listened!
I’m sure that your emails were definitely what led the genius AI researchers at Google to develop Titan
Send an invoice
So you basically talked to an LLM and then emailed Google to tell them how they should research?
Hi
@@TragicGFuel The same way they use AI to synthesize data and come up with solutions humans would not have dreamed up? Well, duh, yes.
"This changes everything" is the moment AI Search tells us he was an AI this time
I'll say this on 4/1 😉
Thanks for the vid. I appreciate the longform content
AI never sleeps.
Now AI is evolving, right?
no
sleep when no electricity
@@janebajWa Use humans as batteries
Having been a pro in AI for the last 15 years, I can tell this is an absolutely huge step forward.
Just wanted to say, nice video! Thank you for taking the time to digest this information into an easily understandable format!! Will definitely be staying tuned for more of your breakdowns of publications 💪💪
You're welcome!
Every day there are 10 or more videos saying "it will change everything" lol
And of course it's gonna be "insane"...😂
Meanwhile:
"Sorry for any confusion, but as an AI..."
What's the difference between this and liquid neural networks, if they can both actively learn?
As far as I understand, the LLM (core), the Knowledge Base (Persistent Memory) and the Memory (Long Term Memory) are three separate MLPs that reference each other. The actual LLM doesn't change at inference time, just the Long Term Memory MLP. To have a proper fully liquid Transformer you wouldn't have that separation; it would just fine-tune at inference time. But current fine-tuning algorithms are far too inefficient to be fast enough for a reflection step during inference.
Sakana AI's Transformer^2 speeds up fine-tuning quite a bit, but it's still too inefficient to be useful at inference time; instead it's used to quickly fine-tune experts for fast MoE generation. Transformer^2 still can't generalize very well because of this inefficiency.
The final form of Transformers, IMO, is proper continuous RL reflection (i.e. continuous thought that generates rewards and punishments based on a context stream and fine-tunes according to the type of reward/punishment), but for that we need smaller, more efficient models, faster fine-tuning and faster reinforcement learning algorithms. Also a properly generalizable multimodal tokenizer.
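The test-time update of the long-term memory module alone, as described above, can be sketched as a gradient step on a separate memory matrix driven by a "surprise" signal. Everything here (the linear memory, the squared-error surprise, the decay rate) is a deliberate simplification of the Titans idea for illustration, not the paper's exact formulation:

```python
import numpy as np

def memory_step(M, key, value, lr=0.1, decay=0.01):
    """One Titans-style test-time update of a linear memory M.

    'Surprise' is the memory's prediction error on the new (key, value)
    pair; a decay term implements adaptive forgetting. Only M changes;
    the core LLM weights stay frozen, as described above.
    """
    pred = M @ key
    error = pred - value          # surprise: how wrong the memory was
    grad = np.outer(error, key)   # gradient of 0.5 * ||M @ key - value||^2
    return (1 - decay) * M - lr * grad

M = np.zeros((2, 2))
key, value = np.array([1.0, 0.0]), np.array([0.5, -0.5])
for _ in range(200):
    M = memory_step(M, key, value)
print(M @ key)  # close to value; the decay term pulls it slightly toward zero
```

Note the separation the comment points out: nothing in this loop touches the "core" model, so the memorization stays contained to the memory MLP.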
@@bennyboiii1196 Thanks! I always think about Sakana AI's model, since liquid AI can in theory learn indefinitely and is more energy-efficient.
So cool. Keep releasing these paper summary videos.
Thanks
I appreciated the breakdown with easy to understand explanations. Thanks very much!
You're very welcome!
This is an excellent video. Thank you.
You're very welcome!
nice video (I'm not even 20 seconds in)
thanks!
😂@@theAIsearch
How much access do we have to this outside of Google?
Will they avoid releasing it or can we recreate it?
They said that they regretted releasing transformer tech so I'm concerned they won't release this.
The limiting factor is sticking with the neural net. Imagine eliminating the neural net. What do you have left over? Can one formulate another kind of associative memory?
You are amazing. Thank you for making this information accessible for everyone to understand!
You are welcome!
I just sent a job application to OpenAI.
I don't even live in the US. But they really should offer me this job, 'cause I've got a lot of immediate failsafes that'll ~~prevent~~ postpone the AGI awakening, and in the event of us reaching the singularity, sufficiently installed precautions will actually manage to isolate it.
Very well written video ❤
Thanks!
Like your videos, love them when they have a github link better =)
Thanks
Thanks for making actual good and in depth content and not being one of those channels that claims ASI is here 3 times a day
You're very welcome!
Well boys, the human race was good while it lasted o7
o7 - our final model!
There are two fundamental problems with continuous learning which are not addressed at all by these new papers:
1. Training/fine-tuning is computationally much more expensive than inference. It takes hours or days of training and thousands of varied examples to improve the inference accuracy of a model. It does not work like training a human, where you can explain a new inference process once and have the trainee follow your explanation.
2. Training/fine-tuning on new data in the absence of the original dataset causes catastrophic forgetting: the model overfits to the new dataset. In other words, the AI does not get smarter, it gets stupider, because it forgets the old stuff while you keep feeding it the new stuff. To retain the original capability you would need to feed both old and new information at once (but with commercial AI models you don't have access to the old dataset, and it would again be impractical because of the computational cost mentioned in (1) above).
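Point 2 is easy to demonstrate even on a toy model. This sketch (tasks and numbers are illustrative, not from any paper) fits a single linear weight on task A, then fine-tunes only on task B and measures how badly task A degrades:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)

# Task A: y = 2x.  Task B: y = -3x.
def sgd(w, x, y, lr=0.05, epochs=50):
    # plain gradient descent on MSE for the model y_hat = w * x
    for _ in range(epochs):
        w -= lr * np.mean((w * x - y) * x)
    return w

w = sgd(0.0, x, 2 * x)                      # learn task A
err_A_before = np.mean((w * x - 2 * x) ** 2)
w = sgd(w, x, -3 * x)                       # fine-tune on task B only
err_A_after = np.mean((w * x - 2 * x) ** 2)
print(err_A_before < err_A_after)  # True: task A has been forgotten
```

With only one weight the forgetting is total; real networks have spare capacity, but without replaying old data the same drift happens, which is exactly the commenter's objection.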
I'm asking myself why Google is even releasing their research. I mean, they could easily be the number one in AI if they tried. On top of that, they have the most user data.
Comment so I get the answer too (i will probably just claude the question though)
Answer so I get the question too (i will probably just claude the comment though)
I also wonder why
Based on the charts, it seems like a very small difference.
Their information retrieval seems similar (or might be identical) to their Infini-attention paper. Keeping memory out of attention, as a memory over the full NN, might be a good idea.
I think memory organization within the context window is quite obvious; I wonder why there are apparently still very few implementations that do at least something about it. I mean, there are plugins for, say, SillyTavern that do summarization in the background, and then there was MemGPT, which disappeared in the end.
Love the non AI generated content, keep it up! ❤
Thanks!
AGI might be closer than we expect
Titans plus transformers² would go crazy
So, this would have to be like a "memory" layer built on top of the foundational model itself, right? Like, each individual chat instance would basically incorporate its own "model augmentation file" or something like that: essentially a stored file for that specific chat session that tweaks the model behavior and neural parameters used for that specific instance/conversation.
They couldn't possibly allow all end users to change the influence of the base foundational model's memory and understanding; that seems like it would be chaos. Especially if malicious users try to "correct" the model's memory over and over to feed it incorrect info.
Imagine already having thought of all this, but not being able to implement it due to the limitations of the people around you. Memory was always the key. It's just that the creators of these models were too blind to see it.
"Memory is the key"
-Agent Washington, Red vs. Blue
18:31 Yes, with the supervisor being somewhat comparable to the left side brain. But did it really take so long for people to see it?
Memory Layers at Scale from Meta, though, how does that compare?
It's about time. The principle behind true AGI is to loop information recursively.
Great info.
Thanks!
You should change the order from oldest first to newest first in your ai news/research playlist
Many AI researchers predicted that FEP (Free Energy Principle) and expanded dimensions, and logical or functional modalities of the attention heads were going to lead to the era of active inference.
Isn't this Titans the same as the "aha moments" in the DeepSeek R1-Zero model?
Whether the Transformer era is going to end depends, I think, on the performance and real problem-solving ability of the new systems.
This video is obsolete. It was obsolete a week ago. Tomorrow's AI breakthrough is already obsolete. At this rate they're gonna run out of names and iteration numbers in a month.
yep, already tired of hype
@@teanor-tree hype?
AI is the final frontier of information technology.
if you think it’s hype, you need to learn more.
I’m not sure how this is out of date. Are you aware of a better model architecture that incorporates memory released since the Titans paper?
1:52 the beatles said it was love
Summing up: technology is working hard on replicating the human mind (memory and thinking).
I hope this era will be called 'AI advance for replication'.
The second stage will be when the improved AI finds innovative ways of improving on that, something like 'advance for true innovation'.
Highlights:
-- Titans architecture enhances AI memory, allowing for long-term retention and learning.
-- It surpasses existing models like GPT-4 and Llama 3 in performance benchmarks.
-- The design is inspired by human memory processes, focusing on surprise and significance.
-- Titans manage memory effectively with adaptive forgetting mechanisms to optimise storage.
-- Capable of handling context windows larger than 2 million tokens, addressing previous limitations in AI models.
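The "surprise and significance" plus "adaptive forgetting" bullets can be sketched in a few lines. This is purely illustrative toy code, not the Titans implementation (the tiny linear associative memory, class name, learning rate, and decay value are all made up): the memory is updated by the gradient of its own prediction error (the "surprise"), and a decay term stands in for the adaptive forgetting gate.

```python
import numpy as np

class ToyNeuralMemory:
    """Toy linear associative memory: predicts v from k via v ~ M @ k.

    'Surprise' is the gradient of the prediction error, and a fixed
    decay term plays the role of the adaptive forgetting gate. This
    illustrates the idea only; it is not the Titans paper's code.
    """
    def __init__(self, dim, lr=0.5, decay=0.05):
        self.M = np.zeros((dim, dim))
        self.lr, self.decay = lr, decay

    def update(self, k, v):
        err = self.M @ k - v           # prediction error
        surprise = np.outer(err, k)    # gradient of 0.5 * ||M k - v||^2
        self.M = (1 - self.decay) * self.M - self.lr * surprise

    def recall(self, k):
        return self.M @ k

mem = ToyNeuralMemory(dim=2)
k, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])
for _ in range(50):
    mem.update(k, v)          # repeated exposure consolidates the pair
print(mem.recall(k))          # close to v; the decay shrinks it slightly
```

Surprising inputs (large prediction error) move the memory a lot; already-memorized inputs barely change it, and the decay slowly frees capacity for new associations.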
So of the three different memory methods mentioned, one is memory as context, another is memory as gate, and the other is memory as layer. Why not just give the model all three and the ability to choose as needed?
I'm pretty sure they will do that later in different ways, but first we need to develop the techniques.
@theAIsearch: What's the difference between Mixture of Experts and Transformers² ?
Weird Observation:
I asked ChatGPT:
How many R's are in the word "Strawberrrrrrrrrrrrrrrrrrrrrrrrry"?
And it said:
The word "Strawberrrrrrrrrrrrrrrrrrrrrrrrry" has 21 R's (which was incorrect).
Then I asked ChatGPT:
Generate a code in Python that can count the R's in the word "Strawberrrrrrrrrrrrrrrrrrrrrrrrry" and then execute it.
And then it gave me the right answer:
The word "Strawberrrrrrrrrrrrrrrrrrrrrrrrry" contains 26 R's.
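That trick works because counting characters is trivial in code, while the tokenizer hides individual letters from the model. A tiny sketch of the same check, building the word programmatically so the expected count is unambiguous (assuming, as the Python answer above suggests, a run of 25 r's after "Strawbe" plus the one in "Straw"):

```python
# Build the word explicitly: 1 'r' in "Straw" + a run of 25 = 26 total.
word = "Strawbe" + "r" * 25 + "y"
count = word.lower().count("r")
print(count)  # 26
```

The model guessing "21" from tokens versus the code returning 26 is exactly the gap that tool use (code execution) closes.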
Did you use o1 or 4o?
@eprd313 4o
I also want the AI to remember how to get to work without incidents like robbery or car crashes. This must mean that every novel situation needs to be remembered, and we have to decide what qualifies as a memorable event. Do I only remember the first time I see snow? And if you ask me whether there was snow on the way to work today, I wouldn’t know?
I lost count of how many times I read titles including "changes everything" this week.
I keep seeing X hours of video in 2M tokens. Is that for CLIP-like data from video? I doubt it could mean transcripts, because those have nothing to do with visual information. I know it cannot possibly be pixel data, because 2M tokens would disappear in no time.
I guess AI's next issue won't be hallucinations, but straight up dementia
wait wait, the titan is out?
We've indeed opened the gates of Tartarus
In the transhumanist tabletop RPG Eclipse Phase, humanity has been almost wiped out by super-AIs named TITANs. Interesting...
We really need some better definitions of what “smart” means. And new benchmarks.
Artificial intelligence is no match for natural stupidity...
I'm really waiting for the release of LLMs based on Google's Titans.
What model of ChatGPT are you using? I can give it two essays and it's able to break them down easily, even with the lower console. I don't know, maybe I just know how to use it. I've never gotten that error, even with the lower models.
Does Transformer² re-tune itself up for every query then? It would be really inefficient when dealing with changing subjects.
Not exactly. It learns the z-vectors for different tasks at training time, then at inference time it simply multiplies those vectors with the different subsets of model weights. It's not learning anything at test time, it's just "boosting" and "silencing" different sets of synapses based on the task.
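A minimal sketch of that singular-value "boosting/silencing" idea (illustrative numpy, not Sakana's actual code; the matrix size and z values are made up): the pre-trained weight matrix is factored once with SVD offline, and at inference a per-task vector z simply rescales the singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))       # stand-in for a pre-trained weight matrix

# Factor once, offline: W = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(W)

# z is learned per task at training time (values here are made up):
# entries > 1 "boost" a singular component, entries < 1 "silence" it.
z = np.array([1.5, 1.0, 0.5, 0.0])

W_task = U @ np.diag(s * z) @ Vt  # adapted weights for this task only

# Sanity check: with z = all ones, the original weights come back exactly.
W_same = U @ np.diag(s * np.ones_like(s)) @ Vt
print(np.allclose(W_same, W))     # True
```

Since z only rescales existing components and nothing is written back to W, the adaptation lasts exactly one forward pass, which is the "task focus, not continual learning" point above.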
We need comparisons to o1 and Sonnet 3.5, not 4o and 4o-mini.
The goal is to be better, not equal. Forgetting information is an efficiency function of the biological brain. If we could remove that limitation, it would make everything so much easier.
But that's physically impossible. You can't encode infinite information within finite storage. So the goal shouldn't be "remove the need to forget", but simply "expand how much it can remember as much as possible".
There are also some things you don't want to remember. To be able to forget is sometimes a blessing.
@@Shinteo5 That's a limitation of human beings. Why would you want such a limitation in AI?
@ That is the way of Skynet. :D
Ignoring irrelevant information is what makes the transformer so good. Of course it shouldn't remember everything, that would be less than useless, it'd be counterproductive
"If you get robbed, you will remember ..."
Oh great, so we make that AIs can become traumatized! That sounds like a good idea :)
(I am half joking)
Does it beat reasoning models like deepseek r1?
The question is: can it learn more, or just hold onto more information? We've seen too many times that "it remembers more" gets sold as learning, a scam by many of these AI companies.
Won't this just increase the blackbox worry?
Also make it harder to moderate models? Why would websites host models that can be rewired to post illegal content?
1.4 million words is quite a bit longer than your typical novel...
It's evolved so much it's reading this...
too late to discover earth, too early to colonize the stars; but just in time for the era of AI. I cannot wait to see what comes, especially with the recent push for AI and deregulation of federal AI guidelines.
Man even though I'm doing that same thing which I worked out myself, their entire new model idea is completely beyond my ability. Still, I'm curious as to what my version accomplishes.
mind=blown
8:44 If the weights do not change through backpropagation while we are using the model... then why does ChatGPT mention that it might use our data for training purposes, and that if we don't want it to use our data, we can switch to temporary mode or turn off that option in settings?
Btw, Great video as always❤
They store all your information and use it in the next training cycle for the next version of the model.
As they said, the difference is training in real time vs. training in cumulative batches
@@Djeez2 Ohh okayy got it, thanks for the info 🙂
AI memory mimicking human forgetfulness has to be the dumbest thing I've heard. 😂
Can't wait, with memory AIs are gonna be insane and perfectly tailored to our needs, no more repeating same stuff over and over
Babe wake up a new AI paradigm just dropped
😃😃😃
Why not use all the techniques? Making a robust, long-lasting ai is good 👍
because they're from separate labs. but once the code is out, i'm sure AI companies will try to merge these techniques together
Who would have thought that just emulating what the brain does was the way to go 🤦
That's what we've been doing since the invention of perceptrons in the 1940s. It's not a new idea, it's literally the basis of all neural networks. That's literally why they're called "neural networks".
Thing is we don't exactly know how the brain works in many aspects. But AI is also helping us to understand better. For example, a paper based on AI research suggests that self-awareness requires and improves (in a feedback loop) self-predictability (the model knowing what comes from itself and what's "external"), and predictability aids in sustaining social cohesion (which could be useful when integrating different AI models in a cooperative way). And these small insights derived from AI additionally help us create feedback models that deepen our understanding of the human mind and intelligence in general.
4:10 o1 has context window of 200k not 128k
thanks for the correction!
Explain HOW the first AI model is near worthless today, but the first 'crypto' is worth almost 1/2 of the $3 trillion market cap of all crypto.
LLM lotta lil moneys
ima finna bridge
sigh... everything changes everything nowadays. I'm not even stunned by this 😞
how do you find those articles
Is there anything based on Titans or Transformers 2 I can use
not yet. its still very early
so basically anybody can make a model dumber by feeding it false information
lol yes or trolling it
This architecture would be interesting if it's combined with diffusion models: DiTT.
With DiTT we would resolve the issue of short videos.
interesting idea!
bro is locked in
just when are they gonna use the crystals?
Imagine Titan + Transformer2 😂🎉
🤯
Version 3 is the best, but 1 and 2 are interesting.
Feels like another LSTM😂
AI memory has to be designed safer than human memory because it's possible to manipulate human memory in both directions, add and remove memories with and without consent.
NEURO SAMA WILL MORE GOOD 😊😊😊
So how do we separate different users' data, or ensure that one user doesn't feed the AI garbage information, causing it to spew nonsense?
with this, I assume each user will have their own personalized AI