4 Reasons AI in 2024 is On An Exponential: Data, Mamba, and More
- Published 9 Jun 2024
- 4 Reasons I am convinced AI is still on an exponential in 2024: from the crazy potential of data quality to the dramatic possibilities of new architectures like Mamba (summarised here in around 5 minutes!), hybrid architectures, prompt optimisation and more. And regardless of all of that, photo-realism is coming to video.
I'll also explore a 100-year-old prediction on this 2024 New Year in AI...
/ aiexplained
Fan Tweet: / 1741499431474934165
Mamba: arxiv.org/abs/2312.00752
Mamba Tri Dao Interview: • Interviewing Tri Dao a...
Albert Gu: / 1731727682814050570
AI Improvement Paper: arxiv.org/abs/2312.07413
Mistral Interview: • Safety in Numbers: Kee...
Lukasz Kaiser on Inference Time Compute: • Deep Learning Decade a...
W.A.L.T Video: / 1740788207967736047
aicountdown.com/
AI Explained Orca: • Orca: The Model Few Sa...
Let’s Verify: • 'Show Your Working': C...
Q-Star: • Q* - Clues to the Puzzle?
1923 Cartoon: / svpino_a-1923-cartoon-...
Quadratic Complexity: / how-to-apply-transform...
Striped Hyena: www.together.ai/blog/stripedh...
Illustrated Transformer: jalammar.github.io/illustrate...
Language Models as Optimizers: arxiv.org/abs/2309.03409
www.etched.com/
W.A.L.T www.tomsguide.com/news/walt-i...
S4: hazyresearch.stanford.edu/blo...
/ aiexplained Non-Hype, Free Newsletter: signaltonoise.beehiiv.com/ - Science & Technology
The idea of “having a chatbot that remembers your conversations from years ago” is 100% gonna be a thing - having personalized AI ‘friends’ you can talk to from any device is incredibly commercializable, especially when you realize we can have these ‘personalities’ talk to and learn from one another.
incredibly *downloadable for free
The "commercializable" part may come only around the presentation (graphics) - an actual AI capable of such things may very well be available from the open source community, just as llama-based, locally running models that seemed fictional a year ago can be found today.
I would rather play an MMORPG with a bunch of friendly and kind AIs than human players.
@@juandesalgado I hope so. I would never want to have an AI companion managed by a corporation.
I was yammering to people last night about the concept, selling them on GPT, and surprisingly they haven't even used it yet
-What will Mamba call their model when they add memory to it?
-Rememba
By that logic Mamba should be called iForgor if it was made by Apple 💀
I feel like it's 1985 and I just got my first high-speed connection to the internet. I keep hearing about the AI bubble that will burst, the funny part is they have no idea that it's not a bubble, it's a freight train that will not stop. Thanks for bringing in 2024 with more uplifting news, have a good one!
I So Love Your Analogy "Freight Train That Will Not Stop" PERFECT 💯"Bingo"
Thanks! @@MichaelErnest666
You're one of the few YouTubers where I actually click on the new notification popups instantly. Thanks for all your hard work and top-quality research!
Agree. He is so far also the only one where I watch the whole video every time.
Thanks Josh
@@aiexplained-official seriously, your content is really really good. you know what you are talking about! keep it up please, i liked and subscribed
The only difference between the Elevenlabs voice and you was audio quality. Actually bonkers. This is a godsend for my line of work, dialogue and voice acting is always a major hurdle for anyone looking to do solo short films.
I thought the synthetic voice rushed a little bit.
I certainly hope 2024 is gonna make 2023 look like a "sleepy" year.
I don’t think 2024 will be that big. 2023 won’t be replicated.
@imperfectmammal2566 It won't be replicated, it will be exceeded. The improvement rate of AI will be much quicker than we think.
@@Daniel-xh9ot That's what we all hope.
@@imperfectmammal2566 Do you live under a rock or what?
@@imperfectmammal2566 Why? there is no indication for this to be true, if anything there is reason to believe in exponential growth, as we have seen over the last two years.
Your AI-generated audio is scary good and someone will surely make a YouTube channel that posts 100 explainer videos per day with this.
The future is generated, simply because it will become a thousand times cheaper, better and faster than any human.
Sounds awful
@@aiexplained-official Well, with the right audio gen AI it'll sound great.
100%
@@aiexplained-official which AI audio tool did you choose for this video?
It's a bit like a Space Shuttle launch: Tremendously exciting, but you can't help worrying that the whole thing could blow up. Thanks for all the insights into ways to leverage greater performance, but as you say nothing's more important than data. I still wonder about how quality data can be strained from the much-polluted ocean of the internet. Happy New Year, Phillip
And this is when bias started creating the dystopia we are worried about... (the "quality" defined by young white Silicon Valley bros... you know the kind, real mature, with in-depth understanding of society and human nature/relationships, with altruistic motives... #not)
I'm worried that no one pays attention to the repercussions this will have.
@@georgesos Still better than woke AI.
As for understanding human nature, those who run social media platforms likely know the most about it.
I'm not saying we are all good, but it could be far worse.
@@RedScotland and the sky is green
The human progress graph is from Wait But Why's AI article from almost a decade ago.
This was the article that made me realize the AI revolution is really happening.
I love waitbutwhy! The vision and the graphics! Along with the AI post, my favourites are the one about Elon Musk and the one about procrastination.
Yeah, that was a great post :)
same, that really woke me up to it, instead of it being an abstract thing
Media creation will be crazy with AI-generated videos, images and music. Even YouTube is starting to roll out generative AI.
YouTube (Shorts) is also getting overrun by AI-generated audio lmao
We need a prominent "generated by AI" tag on YouTube and a button to filter them out.
Yep. And it lets you do things you couldn't before. For instance, the biggest stumbling block to designing a game on my own was the availability of art; now I can get past that and get started thanks to Stable Diffusion. Midjourney and DALL-E 3 are even better, but they are less customizable and more censored than SD. Still, if one works for you, you can do even more with it in many respects.
I love when you talk about scientific papers, specially the most impactful recently released ones, it's just way cooler to know what SCIENCE is saying about these technologies than just hear random predictions.
The data quality improvement is definitely real. I believe it can go way beyond the "Textbooks Are All You Need" paper, which is already awesome. Some other clever strategies will definitely appear in 2024.
Synthetic data seems to be the path as of right now.
Thank you for your amazing work Phillip, you can count on me here watching all your videos this year. I'm already looking forward to your GPT-5 analysis when it comes out!
Haha thank you Gabriel
The background noise on AI-you sounds different from your normal ambient noise (and is a little distracting). The inflection is closer than I was expecting, though. Slightly different; the upper range seemed bounded. But like, getting very close.
And MJ6 is stunning. Thanks for the video as always! Can't wait to see the blastoff that is 2024.
It's amazing how useful these AI tools are. I've been using them to help me program microcontrollers with very little C experience. I've gained so much confidence in new project ideas without having people belittle me on Stack Overflow.
Neckbeard mods ruined it
A fantastic video, as always. Looking forward to hearing more about Mamba as it’s scaled up.
Thanks trenton!
- Recognize the importance of data quality for AI models (Start: 1:10)
- Understand that models like Mamba can improve language modeling efficiency (Start: 1:39)
- Consider the potential of hardware-aware state expansion in AI architecture (Start: 6:30)
- Explore the benefits of allowing AI models to allocate compute dynamically (Start: 10:42)
- Investigate how AI models can optimize their own prompts for better performance (Start: 15:41)
- Stay informed about the rapid advancements in multimodal AI capabilities (Start: 17:06)
Thank you
here it comes...
I've recently been devouring 'The Expanse' book series, immersing myself in its world that, despite being set several hundred years in the future, feels incredibly tangible. Given the rapid advancements in technology, it seems the author may have been conservative in his timeline predictions. Technologies like fusion, lunar bases, and asteroid mining, once deemed futuristic, now appear to be on the horizon much sooner than anticipated.
In The Expanse, human computer technology is primitive, like in retro-future movies...
@@mirek190 It often has to be, in science fiction: an AGI+ is either the center of the story or is not there; we couldn't relate to the situations otherwise. The whole story in The Expanse, for example, is incompatible with a widespread AGI...
They still have AI to help with all the crazy calculations and correlations they need to make. It’s just not AGI.
I only saw the first two seasons of the TV show. It's cool, because it's very realistic apart from a few necessary cheats. But the economics of it makes it hard to enjoy. For starters, in that world poverty and oppression should not exist, and most conflicts revolve around those issues. Even today, poverty and oppression are just legacy issues, forced by people and ideologies stuck in the deep past.
Of course there will always be conflicts, and even wars, but not for the same reasons as today.
@@andrasbiro3007 In the books, there are 30 Billion people on planet earth and just one government, the UN. There is the concept of UBI, but it's a very small amount and very hard to live on. I don't think we will ever be at a point where there is no more poverty or oppression, especially oppression. I think the series is more grounded in reality than people would like to think.
"The steep part of the exponential" is kind of an inherently ridiculous phrase. *All* of an exponential curve is the steep part, the joint in the graph is purely a matter of what scale you choose. If tech growth is on an exponential curve, then we've been on "the steep part" since day 1.
Fair point!
Doesn't have to have started as an exponential. Plus, exponential graphs can start off slower than a linear graph and get steeper and steeper. "Steep" is a matter of semantics, and is absolutely not "inherently" ridiculous.
From a purely mathematical standpoint, the concept of a singular 'steep part' of an exponential curve is incorrect. But the phrase is still meaningful in a practical sense in various contexts. It captures the experience of nothing happening over an extended period of time and then suddenly the perception of rapidly accelerating growth. Since there is practically no infinite growth, it is a sigmoid curve. A sigmoid has clearly defined points of maximum curvature. In this case it is quite appropriate to say that we are approaching the "steep part" of the curve.
@@minimal3734 Why on earth would you suppose that tech growth is on a sigmoid curve? Both evidence and the literal comment in question contradict this.
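The disagreement above can be made concrete in a few lines of plain Python (an illustration added here, not from the video): an exponential has the same relative growth everywhere, while a sigmoid genuinely does have a steep middle.

```python
import math

# An exponential is scale invariant: for f(x) = e^x the ratio
# f(x + c) / f(x) = e^c is the same at every x, so no single
# region is "the steep part" -- zoom anywhere and it looks alike.
def rel_growth(f, x, c=1.0):
    return f(x + c) / f(x)

exp_ratios = [rel_growth(math.exp, x) for x in (0.0, 10.0, 50.0)]
# every entry is ~e = 2.718..., regardless of x

# A sigmoid, by contrast, has a genuine steep middle: its relative
# growth is large on the left and decays toward 1 as it saturates.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

sig_ratios = [rel_growth(sigmoid, x) for x in (-5.0, 0.0, 5.0)]
# ratios shrink toward 1: growth flattens out
```

So both commenters have a point: "steep part" is meaningless for a pure exponential, but perfectly meaningful if growth is really sigmoidal.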
2023 was the year of surprise, enthusiasm,.. But 2024 will be the year of deployment. Most "look at this tech demo" will be commercialized, standardized, and deployed for production use.
Either 2024 will be full of workers' strikes and economic uncertainty, or it can be the year "we are working with the AI". It depends on the industry segment, I guess, but it will be hard.
I know for myself, GPT-4 turbocharged my productivity, and I can have up to 2 hours a day more free time. And this is not with specialized solutions; at first, just "copy/paste to GPT-4, tell it to sort by importance, suggest replies, ..." and ~4 hours become 1 or at most 2. This is the first "tool" I've seen in my 30-year career as a software dev that really changed the game.
You speak as if you have a perfectly functioning crystal ball. That, to me, marks you as a bit of a dummy. :) Perhaps you'll be proved right, but even if you are, that doesn't mean you know, today, what the future holds.
It is quite common for technological change to take longer to have any impact than "experts" and those most enthusiastically following developments expected, but then to go on to have a much greater impact in the longer term than almost anyone expected. I don't know if it is habits or conservatism or something else, but it's something. For instance, fifteen years ago I thought the internet would very rapidly transform how corporations organize work, because of the very strong economic incentives that existed to pay someone in Manila or Hanoi or Bangalore who is just as talented as me a fraction of my salary to do the same job - and with much weaker legal protections for those workers as an added bonus from the employer's point of view.
And maybe five years ago we were all told that voice recognition had finally become so good that voice interfaces would transform most of computing. And it is undoubtedly true that it works well enough for many use cases. Still, the majority of people don't use it at all, and the minority that does use it mostly uses it to set timers and reminders and prefer other interface options for everything else. I am usually in the "early adopters" group on most tech things, but I still find myself using the on-screen keyboard rather than the voice option even for things like searching NetFlix on the TV app (in this case, habit is likely the culprit, using voice is much easier, but I tend to forget that I can).
It took a pandemic for companies to be forced to discover that even without reinventing organizational structures and ways of working, remote work wasn't actually all that difficult. In the aftermath of that, some have begun to change their thinking about the necessity of expensive offices, and more new companies have sprung into existence that are putting new models with a global employee base to the test - but it is far from clear yet what comes of this. Many corporations believe that building a corporate culture and - my choice of words - manipulating employees to make them more loyal is much easier if they physically congregate, and that this more than makes up for the added short-term costs.
And when I try today to ask myself if the "level global playing field" vision of the future I held fifteen years ago still holds up, I no longer really believe it. Now I find myself thinking instead that if "the good jobs" started disappearing in droves, voters would demand protection, politicians would compete to be the most protectionist, and they'd get elected. Of course the top 0.1% care just as little about computer programmers and doctors as they do about blue-collar workers, but they would find it a lot more difficult to manipulate these people than the working class, which sadly often votes against its own best interest (in my opinion), swallowing propaganda on topics such as workers' unions, the only realistic way for workers to wield any power at all.
The point being, your personal experience and "the objective benefits" (to the extent such a thing exists) of a technology, and even strong economic incentives, do not necessarily lead so immediately to wide-spread adoption. A lot of fringe factors are also in play, and anything from a regulatory smackdown to people's perception of the tech may matter as much as any productivity gains that could be had from it. For example, I am also a computer programmer, but I am feeling very ambivalently about using AI coding tools. Even if that did boost my productivity (I am not convinced it would, in my current setting at least), if I use those tools I am also training a machine to become able to replace me. And yes, I know AI proponents shoot down all such objections, but they provide no logical reasons why it would not, they just point to how it hasn't so far, which is to be expected given the chronic lack of programmers and the fact AI is far from able to replace programmers at the moment... Anyway, "I don't like it" can be a big obstacle in its own right.
Yes. Deployment is key. We need AI experts, integrators, consultants etc. to add mundane but effective AI to businesses. That will happen over the next year or so. Layoffs - or more likely 'non-replacements' - will then start making dents in the employment numbers.
Fantastic video! That voice clone is eerily close now. Thank you so much for all the work you put into keeping us up to date!
Thanks so much!
@@aiexplained-official how did you generate it? It was so good!
Elevenlabs
if you want more engagement in your videos, i suggest adding timestamps to future uploads. this way people can spend time watching what THEY want to watch. other than that, amazing video well done.
Have a wonderful year everyone
17:50 got that correct pretty easily.
the right one just looks way too good. the left one on the other hand screams amateur tourist photo.
the crazy thing about image-gen is that it is not just fully out of the uncanny-valley by now, but somehow goes even beyond photo-realistic.
good gens can be both entirely believable yet at the same time unrealistic in their quality...
Yeah i had the same thought process.
I know how image generation prompts look and no-one prompts for "people under ..." as then the model will focus on the people and not the thing.
Also, once you zoom in you can see the telltale signs of JPEG compression and phone image processing
When I look at anything AI generated I just _feel_ the wrongness more than 90% of the time. I'll admit that more advanced models occasionally fool me, but my gut just wants to twist every time someone tries to make me submit an AI contribution
@@tomaszkarwik6357 It doesn't have to focus on people just because it's included. It entirely doable to generate people like in the photo, with the right prompt.
I believe the only reason we could tell is because of the people. If the AI had people or the real one didn't have people I think it would've been out of the water. Either way if someone showed you the AI photo, you would probably say that's cool and be none the wiser.
What I am waiting for is GPU manufacturers putting decent amounts of VRAM on GPUs. Seriously, that stuff is (relatively) cheap, so why are 8GB VRAM cards even still being sold?
I'd like to believe the absolute minimum next gen will be 16GB of VRAM.
This is mostly important for the open-source community, but I believe that's where the real gains are to be made.
High-speed VRAM is one of the most expensive parts of a GPU, so your starting presumption couldn't be more wrong
the cartoon guy predicted the exact year where it would become possible to create comics automatically, 100 years in advance. insane
Indeed!
Happy New Year! Thanks for all the great content! 🙏🏼
Great video!
Thanks for not only keeping us up to date but also encouraging and eliciting forward thinking. These are such fascinating (and worrying) times. Looking forward to another educational year with your content 🎉
Things will never be as slow as they are now. 4 million years ago we had stone tools, and it took us 3 million years to get controlled fire, one million years ago. The Neolithic revolution happened roughly 15,000 years ago... we use established tools, both hardware (tools) and software (knowledge/skills), to build better tools, with which we can build even better tools even faster. And soon the tools will build themselves, and better AI will build better AI to build better AI...
...and then, one very normal day, while one person is arguing with their boss, and another is picking their kids up from school, and another is singing karaoke, AI will surpass humanity, and humans will become tools.
@@41-Haiku I think it's more likely that we humans quickly become insignificant (being neither a rival nor a threat) in the grand scheme of things. Imagine if humans were an invention of cats to make sure life is nice and easy: from the cats' view a big success, although they would have no idea what is going on in the background. From a human's perspective, house cats are one interesting aspect of life among many, but not much more than that. Or (though this is more speculative) we go in the direction of human augmentation to stay at the same level, without a significant difference between ourselves and our creations in the long run. Or a spectrum between these extremes.
@@41-Haiku I don't think a human tool would be of any value to a super-intelligence.
All I want for New Years is AI Winter...
🙏
your videos are so important - i make sure to watch them above all else before going about my day
steep part of the exponential:
There's no such thing. The green graph is incorrect, and does not show an exponential.
An exponential curve _always_ looks like you are taking off compared to what it was before. It's scale invariant. It does not have a steep part; it is the same everywhere.
Very true!
Since there is practically no infinite growth, it is a sigmoid curve. A sigmoid has clearly defined points of maximum curvature. In this case it is quite appropriate to say that we are approaching the "steep part" of the curve.
Reminds me of subconscious vs conscious thought, teaching Ai what to “ignore”
A kind of negative learning.
There is a part of your brain that is constantly looking for threats and ignoring almost all potential threats at the same time. I once stayed awake for a little over 90 hours and I experienced the dropping of that filter, where anything that could be a snake or a spider or anything malicious was perceived as such. We've all had that moment where we mistake something harmless for a spider etc.
Well right now, AI is experiencing those kinds of hallucinations where it is confidently incorrect about all kinds of things, because it is not intelligently filtering noise. I think your analogy is spot on. AI, like us, might just need a bit of a filter that rejects the ridiculous while also taking the improbable into account. Weighting this manually will likely be impossible, but like all other successful approaches to AI, letting it build its own parameters might prove to be incredibly easy.
Thank you for all your videos
Happy New year - very exciting video thanks.
Happy New Year GPhilipT
Haha, you too
Thank you for the work you're doing! Amazing video as always!
Thank you!
Good 2024 dude! A lot of news for starting new year!
Great video. I'd love to hear more about data quality. What is bad about the current data? What would good data look like? What training sets would people ideally want if they could be created?
Check out my phi-2 video, and phi-1 for that matter!
The next generation of higher data quality will involve locally trained, high-density and high-precision models from wearable, always-active observers; think what Pi is doing, XRAI, etc.
@@aiexplained-official Like, of course good data matters. The real question is what are the limits of synthetic data for quality. Phi-2 is small. It's nifty to have strong small models that learn fast due to careful "pedagogy". But can those same techniques push forward 70B+ parameter models that are really data hungry?
Merry Christmas and Happy New Year AI Explained and the community! 🎉
You too sub!
Isn't every part of the exponential curve the steep part?
Haha fair enough!
maybe he meant sigmoid
@@btm1Most certainly. Since there is practically no infinite growth, it is a sigmoid curve. A sigmoid has clearly defined points of maximum curvature. In this case it is quite appropriate to say that we are approaching the "steep part" of the curve.
I can't wait for the day that AI suggests that researchers should publish dark-mode papers and slideshows :D lmao
Happy New Year Philip 🎉✨️. Looking forward to big things with you this year 😅
Thanks sola, you too!
Amazing video as always, thanks!
Thank you so much for all your amazing videos I look forward to each one
Hi! I'm Jess from VOC AI, absolutely loving every second of your content! 🌟Keep the awesome content coming, I'm here for it!
another great video, looking forward to this next one, thanks again Happy New Year !
Thanks ryan
I think it's so interesting how many different companies and approaches are trying to solve/improve the intelligence of AI. It's very reminiscent of natural selection. When one of these groups figures something out that works way better than anything else, it gets adopted by everyone else, and then they all keep pushing their own directions and ideas: focusing on scale for some, data quality, rigor and reliability, training method, etc. So many different approaches and ideas about what is most important for maximizing gains, when in reality all of those things are important, and the work from any one group will assist the others. It's amazing to watch it all.
It's also neat to me that companies whose ideas end up being bad or not useful die off in a natural-selection-like way too.
anyways just a random thought
You are correct if you look from a long-term perspective. The problem is that we have a limited lifespan, and even if your suggestion that the "bad"/useless companies die off is true in the long term, we will never experience it. Just think of the oil/gas corporations (as bad as you can get). Yes, they will die off eventually, but in the meantime we will be long gone.
I fear human greed for wealth, fame and power. (Because I read our human history.)
Thanks for the content as usual!
00:02 AI in 2024 is on an exponential rise
02:12 Maximizing data quality is crucial for AI model efficiency.
04:21 Mamba is a structured state space model for sequence modeling.
06:38 AI in 2024 aims for faster inference and hardware-aware state expansion
08:57 Mamba architecture excels at long sequence lengths
11:09 AI models will have the ability to generate sequences of things before giving an answer.
13:20 AI methods in 2024 show significant compute equivalent gain.
15:30 AI in 2024 will see dramatic improvement due to prompt optimization
17:35 AI predictions and capabilities in 2024
19:35 AI will soon be able to imitate human voice and video.
"By painting it orange!" You got me.
I guessed the arch correctly.
Your AI impersonator got your cadence quite well! My amazement-bar got set pretty high after hearing ElevenLabs some time ago.
The voice at the end is great. Wish i had a chatbot with a realistic voice like that
M. C. Escher's Snakes woodcut is a good visualisation and prescient of the quadratic complexity addressed by Mamba.
Wow, as always, you deliver the best and most concrete content for random people who have nothing to do yet with AI
None of us are ready for what's coming I feel. Wild times ahead.
I have been ready and waiting for it for a long time.
I want to have my mind blown.
There were people that crossed the ocean on a steamship and then crossed back on a jumbo jet. There were people who saw America from the back of a covered wagon that lived to see the American flag planted on the Moon, broadcast live into their homes.
I don't know what it would take to shock me by how far the species has come from when I was young, but I hope to find out.
I agree, and people, largely boomers, are delusional in thinking there will be enough new jobs for the displaced workers.
Sure, there are going to be new jobs, but mostly high-skilled ones. All well and good, but the problem is that the majority of the job market today consists of menial and repetitive jobs; there will simply be too many displaced workers. That's why people like Andrew Yang and Elon Musk are in favor of UBI (Universal Basic Income).
It gives me hope for the future.
Your videos are just invaluable … especially for following up on previous developments
Thanks Christoph!
I feel all your videos are a MUST see. Love your effort.
Also, I hope we will soon see a model that is actually like a teacher. OK, we will have AGI that solves the most complicated problems, but it won't think about the problems of an individual person.
And if there isn't a dedicated model, then I will build it.
How about exponentially improving AI and ourselves together.
Thank you for sharing your time and work Phillip. 2024 should prove to be fun, as long as chaos is part of the fun. Peace
Great as always!
Thanks thales!
Happy new year! 2024 is going to be a crazy AI year 😊
This stuff is Super exciting !
Thanks for making interesting videos.
Amazing content as always! best channel by far on AI
Thanks so much Lucius
Awesome video as always. Even the 2.8B Mamba is performing quite well on coding tasks in some cases, as well as Phi 2. Meta should make their move and release some new models, hopefully some multimodal models.
Transformers are so geared toward language. I think S4 seems profound because it's based on signals. It seems elegant. Maybe it's even closer to nature.
Thank you. Good work, indeed!
It would be nice to reflect on this video on Dec 2024
The voice dupe was impressive, but I'm sure I can hear the microphone effect in the final sound.
Well that 2024 prediction came true a bit quicker than expected.
Indeed :) can't say I didn't warn people
Love it!
Midjourney v6 is scary good. Can't wait to see how good the video model output will be. They said they will start training it this month.
Happy New Year. I would watch this video later x)
Happy new year to you too and enjoy!
2024 will be the year creators realize how much power they have to pump out ideas faster than ever before using AI. Your AI voice is already indistinguishable from your real one to me.
stay frosty, Philip ❤
You always (and I love you for this) find a way to bring that "holy shit" factor into your videos on AI. That voice at the end was truly frightening lmao
Happy new year and thanks for the great work in 2023! Q: Do you have a summary somewhere of the tips you shared to get the most out of ChatGPT and the like? How to prompt, chain of thought, etc.
Yep, on AI Insiders, video called 'Prompt Hacks', but if Patreon not for you, check out my SmartGPT 2 videos!
@@aiexplained-official thank you!!
About performance:
Tanenbaum once wrote that techniques in software are a bit cyclic and repeat a pattern whenever a new platform appears.
(For example, operating systems go through stages like server multiuser, personal monouser, personal multiuser, mobile monouser, mobile multiuser.)
Performance and optimization is like that. Optimization is a stage that comes after the product is developed.
So whenever you talk about new techniques that requires more power, you have to keep in mind that it will be met by optimization of previous stages and techniques.
IMHO it goes like this: first a naive implementation, then optimized code, then an optimized architecture, then hardware optimization, then specialized hardware acceleration.
For example, Orca, or a smaller AI trained by ChatGPT, could be an optimization of a general ChatGPT, just like modern clang was developed on computers with older compilers.
GPUs are still general purpose; one of the techniques in the video today is about hardware optimization (video SRAM), but we are still just about to get new custom hardware for LLMs and AI (like analog computer modules for general neural networks), so that will make the curve even steeper.
Also, some techniques, like trading compute power for time, are a very old, well-known strategy (aka the space-time tradeoff).
So I'm pretty sure there will be more optimizations at every stage of this, but that comes after the implementation.
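The space-time tradeoff mentioned above can be seen in miniature with memoization; here is a tiny Python sketch (illustrative only, not from the video):

```python
from functools import lru_cache

# Naive recursion: exponential time, no extra memory beyond the stack.
def fib_slow(n):
    return n if n < 2 else fib_slow(n - 1) + fib_slow(n - 2)

# Memoized: linear time, but spends O(n) memory on a cache --
# the classic trade of space for time.
@lru_cache(maxsize=None)
def fib_fast(n):
    return n if n < 2 else fib_fast(n - 1) + fib_fast(n - 2)
```

The same trade shows up at every scale, from a one-line cache decorator up to KV-caches in transformer inference.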
Dr. Dao is someone I have a ton of respect for. His work on FlashAttention 1 and 2 is just jaw-dropping: increased batch size and increased speed with lower memory requirements. It's funny how Microsoft, Llama, and Mistral all train with it. What a genius Tri Dao is.
Pumped for 2024!! I have an incurable condition that could definitely be solved with enormous sums of money, but better to have an enormous amount of compute thrown at it for a fraction of the cost.
Damn, this is all so cool. Also, it does sound like AI Phillip needs to adjust his microphone settings since he sounds quite compressed.
As a Phillip myself, I'm glad you make these high-quality AI news videos! Best one out there! Thank you for your great work :D
Thanks Phillip! I am of the one L variety...
It seems like Sora from OpenAI sped up that text-to-video prediction a bit 😅
I noticed some subtle things in your speech, so I'm interested to see what transfers.
Interesting behavior: pause at 17:40 and take a look at the Midjourney arch. When you pause and unpause the video (or toggle fullscreen), there's WAY more artifacting on the Midjourney arch, and none at all on the actual photo.
I guess you could well try to 'manually' optimise the prompts: instead of only asking it whatever you want to ask, prefix it with a simple "please optimise this question, and then use that optimisation as the prompt", e.g. when building a bot of your own. In fact I think I'll give it a shot.
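A minimal sketch of that two-step "optimise the prompt first" idea. `call_llm` is a hypothetical placeholder, not a real API; when building an actual bot you would swap in your chat client of choice:

```python
def call_llm(prompt: str) -> str:
    """Placeholder LLM call (hypothetical); replace with a real API client."""
    return f"[model response to: {prompt}]"

def ask_with_optimised_prompt(question: str) -> str:
    # Step 1: ask the model to rewrite the raw question into a better prompt.
    optimised = call_llm(
        "Please optimise this question so a language model can answer it "
        "as accurately as possible, and reply with only the rewritten "
        f"question: {question}"
    )
    # Step 2: use the model's rewritten question as the actual prompt.
    return call_llm(optimised)

print(ask_with_optimised_prompt("why sky blue"))
```

The point is just the control flow: one extra round trip spent on rewriting the prompt before the real query is sent.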
Excellent
10:19 😮 crazy
Solving the "infinite memory" issue will be a fundamental step towards sentient artificial beings. That, combined with general extrapolation, should make for some crazy convincing lifelike entities.
4:42 I was pretty sure DALLE3 and DALLE2 used diffusion, not transformers? I know the original DALLE used transformers, but then diffusion became the trend and DALLE2 utilised this, and I do not believe they reused transformers for DALLE3. I could definitely be wrong though.
'The input to the Transformer model of Dall-e is a sequence of tokenized image caption followed by tokenized image patches'
@@aiexplained-official Ah well, see cdn.openai.com/papers/dall-e-3.pdf; I am probably wrong again lol, but my understanding is they use transformers to produce an encoded representation of the text prompt, which is then used by the diffusion model to generate the image. Whereas with the original DALLE (arxiv.org/pdf/2102.12092.pdf) they use a transformer to model both text and image tokens as a single stream of data: when generating an image, the model uses the text tokens as a prompt and autoregressively generates image tokens, creating an image that matches the description. I should have been more specific, but I think DALLE2 and DALLE3 both still use diffusion for the actual image generation. Thanks for your response though!
One thing I would love to revisit is the original DALLE. It was a fine-tune of GPT-3, and I just wonder: if GPT-4 (or their next model) was fine-tuned to generate images (on a LOT of data) and still went through chat tuning, what would happen? I'm thinking of something like arxiv.org/pdf/2312.17172.pdf (Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action, a really cool multimodal LLM paper); on page 35 it seems that this single transformer-based multimodal model has a richer understanding of images, leading to more accurate and realistic generation compared to other methods.
I count that as using transformers!
@@aiexplained-official Haha, that's fair enough! Putting that aside, I'm really excited about the potential AI developments we might see in 2024 and I can't wait to see your in-depth, high-quality videos on these developments - you truly have an amazing way of delving deep and presenting things! Have a good day/night!
Mannn, I barely understand this shizz, but IT'S SO DANG INTERESTING. I wonder if OpenAI will use Mamba or similar architectures, and whether they would show slightly different emergent behaviors from what Transformers were doing.
2024 will introduce chips like H200, L40S, B100 and B40 from Nvidia. I am very excited to see how Mamba and new data input techniques will scale out with all the new hardware. Thanks for the video. Need to view this video a few more times so I do not miss anything.
Good stuff
The current state of AI makes me feel like we're still watching the intro of a video. I'm so excited to finally get to the meat of it! Jesus, where's the skip button?!
What tool do you use to visualize your PDFs this way (sources in green, links in blue...)?
Hypothesis on Chrome
17:51 - I guessed it right within about 30 seconds
Nice
@@aiexplained-official Cool, I did not expect you to reply to me and that quickly :)
Anyone know if any researchers are using Mamba on Cerebras WSE-2s? 40GB of SRAM per chip is pretty compelling (compared to 40MB on A100s).
I use GPT-4 to take my prompts for MJ v6 and turn them into more and better MJ v6 prompts, then feed the image results back to GPT-4 to get better prompts yet again.
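The GPT-4 ↔ Midjourney feedback loop described in that comment can be sketched as a simple refine-critique cycle. Both functions here are hypothetical placeholders (in practice, `refine_prompt` would call a chat API and `critique_image` would involve generating and inspecting an actual image):

```python
def refine_prompt(prompt: str, feedback: str) -> str:
    """Placeholder for asking GPT-4 to improve a Midjourney prompt."""
    return f"{prompt} | refined using feedback: {feedback}"

def critique_image(prompt: str) -> str:
    """Placeholder for feeding the generated image back for critique."""
    return f"critique of image from latest render of '{prompt[:40]}'"

prompt = "a lighthouse at dusk, cinematic"
for _ in range(3):                      # a few refinement rounds
    feedback = critique_image(prompt)   # generate an image, get a critique
    prompt = refine_prompt(prompt, feedback)
print(prompt)
```

Each round folds the critique of the previous result back into the prompt, which is the whole trick of the workflow.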
Happy new AI year!
You too!
I do think the architectures need to become a lot more modular; evolution didn't go that route for no reason. GPT-4 is pretty sure it needs recursive loops within the model and specialized modules, no matter which route I go with prompting. My intuition is that reusability of the same networks is the solution to reasoning, but it shouldn't first output a token and then feed it back; instead it should feed back the internal state representations.
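The "feed back internal state instead of tokens" idea reads like a recurrent reuse of one module. A toy shape-and-flow sketch (random weights, nothing trained, purely illustrative):

```python
import numpy as np

# Toy illustration: reuse the same network recurrently, feeding its internal
# state straight back in, rather than sampling a token and re-embedding it.
rng = np.random.default_rng(0)
d = 8                                   # hidden size
W_in = rng.normal(size=(d, d))          # input projection (shared each step)
W_h = rng.normal(size=(d, d))           # recurrent projection (shared each step)

x = rng.normal(size=d)                  # fixed input representation
h = np.zeros(d)                         # internal state
for _ in range(4):                      # recursive reuse of the same module
    h = np.tanh(W_in @ x + W_h @ h)     # state fed back directly, no token step

print(h.shape)
```

Token feedback would insert a discretisation (sample a token, re-embed it) between steps; here the continuous state `h` flows back unchanged, which is the distinction the comment is drawing.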
nice man
So, will there be a swing back to the LSTM RNNs from like 2014?
love the french accent!