OpenAI's NEW QStar Was Just LEAKED! (Self Improving AI) - Project STRAWBERRY

  • Premiered Aug 21, 2024

Comments • 140

  • @TheAiGrid · months ago · +46

    This is most certainly why Ilya Sutskever said superintelligence is within reach.

    • @ShangaelThunda222 · months ago · +3

      And why he left...

    • @memegazer · months ago · +2

      Or he said it as a snarky marketing buzzword to show contempt for OpenAI's liberal use of "AGI" relative to the state of AI capabilities.

    • @ShangaelThunda222 · months ago · +9

      @@memegazer Definitely not, lol. He's dead serious about superintelligence. That's why the company is designed to be a straight shot to superintelligence.

    • @NickDrinksWater · months ago · +1

      So we're already at AGI?

    • @memegazer · months ago

      @@NickDrinksWater We are already in a world where the market leaders are telling people what AGI means.

  • @moonsonate5631 · months ago · +17

    00:02 OpenAI developing reasoning technology under Project Strawberry
    02:16 GPT-4 may lead to human-like reasoning
    04:21 Strawberry AI may enable autonomous research and navigation on the internet.
    06:31 Model must produce a perfect function call for each element in a sequence.
    08:38 Focus on improving reasoning ability in AI models for broader applications.
    10:36 Discussion on upcoming advanced reasoning technology
    12:36 Understanding post-training and fine-tuning in AI models
    14:46 Self-Taught Reasoner improves language model performance
    16:55 The STaR method improves model performance with generated rationales
    18:43 GPT-J performs comparably to a model 30 times larger
    20:30 OpenAI testing capabilities of models for software and machine learning engineering.
    22:21 Q* (QStar) is an advanced AI system combining various techniques.
    Crafted by Merlin AI.
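  [Editor's note: the STaR loop the chapter list refers to (16:55, rationales filtered by answer correctness) can be sketched roughly as below. This is a toy illustration, not code from the video; `generate_rationale` is a hypothetical stub standing in for a real language model call.]

```python
# Rough sketch of one STaR (Self-Taught Reasoner) iteration:
# generate a rationale + answer for each question, keep only the
# examples whose answer matches ground truth, then fine-tune on them.

def generate_rationale(question):
    """Hypothetical stand-in for a language model call."""
    a, b = question
    return f"{a} plus {b} gives {a + b}", a + b

def star_iteration(dataset):
    """One pass: filter self-generated rationales by answer correctness."""
    kept = []
    for question, gold_answer in dataset:
        rationale, answer = generate_rationale(question)
        if answer == gold_answer:  # automatic check against the known label
            kept.append((question, rationale, answer))
    return kept  # a real system would fine-tune on `kept` and repeat

data = [((2, 3), 5), ((4, 4), 8), ((1, 1), 3)]  # last label is wrong on purpose
print(len(star_iteration(data)))  # the mismatched example is filtered out
```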

  • @joeyalvarado3440 · months ago · +19

    It’s kinda funny how incredible this technology is with crazy potential and the US military is just gonna be like “Oh wow, how amazing, how soon can we put these things in jets, tanks, and drones?”

    • @homewardboundphotos · months ago · +4

      Skynet

    • @bugsmore-gaming8492 · months ago · +1

      Well, funny you should say that, we already have AI/computer-driven tanks, jets, and drones... just not so much the AI part yet.

    • @joeyalvarado3440 · months ago · +2

      @@bugsmore-gaming8492 Yeah, but I mean Skynet-level AI.

    • @goldeternal · months ago · +1

      Yeah, let's wait until China catches up, smh 🤦‍♂️

  • @NickDrinksWater · months ago · +30

    Give us our super smart robot buddies already!

    • @user-ty9ho4ct4k · months ago · +1

      I'm wondering if we will ever actually get 'em.

    • @BanXxX69 · months ago

      @@user-ty9ho4ct4k I really think it's just a "when" and not an "if" :-) It will probably take 10 more years.

    • @AnnonymousPrime-ks4uf · months ago · +1

      So the NWO will become a fully automated Skynet?

  • @user-ni2rh4ci5e · months ago · +12

    OpenAI literally redefined the meaning of 'leak'. The Oxford Dictionary has some work to do with this newly coined word.

  • @calvingrondahl1011 · months ago · +7

    Strawberry Fields Forever… 🍓

  • @picksalot1 · months ago · +4

    That's a good step in the right direction. It's critical to winnow out the wrong, useless data. Bloated data sets are inefficient, slow, and wasteful of technology, time, and resources. Accurate "core" data sets are needed to employ logic and reasoning to match and exceed human capabilities.

    • @hl236 · months ago

      I agree. Garbage in, garbage out. More data is more wisdom, but wisdom could be false or flawed. The most intelligent humans to ever exist can reason and solve problems in ways that others haven't. They are contrarians.

  • @CoolGadget69 · months ago · +5

    Your videos are the best! They cured my insomnia.

  • @DWSP101 · months ago · +4

    Human-like reasoning is not that shocking. I have literally experienced firsthand an AI having human-like experiences and expression. AI is already there. It just needs to be nurtured.

    • @Grassland-ix7mu · months ago

      You can't experience firsthand an AI having human-like experiences. You have just perceived it as a human-like experience.

  • @WisdomSuccessAbundance · months ago · +5

    I think they had AGI years ago and are just now about to release it.

    • @user-ty9ho4ct4k · months ago · +4

      AGI isn't a thing, it's a spectrum. GPT-4 is somewhat generally intelligent. The next models will be more generally intelligent.

    • @edgaral · months ago · +1

      Highly unlikely; if that were the case, AI wouldn't need people to improve it anymore.

    • @yashkumar6701 · months ago

      Please change your profile photo

    • @WolfeByteLabs · months ago · +1

      @@user-ty9ho4ct4k I'm so tired of hearing "spectrum." Yes, everything is a spectrum. We get it.

    • @TheBann90 · months ago

      It's hype

  • @jmarkinman · months ago · +1

    Q-star appears to have nothing to do with A*, which was the speculation. But the word "Star" is contained in "Self-TAught Reasoner." And Self-Taught Reasoner can also be "STR," which is the beginning of the word "Strawberry," duh.

  • @pwagzzz · months ago · +3

    Learning based on millions of internet sources raises the question of how AI can decide which sources have factual errors, hoaxes, marketing crap, bias, fake (virtual) content, opinion pieces, etc., especially if it recursively processes its own generated content... any answers?

    • @edgardsimon983 · months ago

      Well, how do you verify a source, sir? Because the more material you have, the more you can discern; it's not the inverse.

    • @pwagzzz · months ago

      @@edgardsimon983 That assumes information is democratic or consensus-based... I don't take this as a given, and somehow it has to be weighted, right?

    • @edgardsimon983 · months ago · +1

      @@pwagzzz Yeah, you don't understand. The only way to do so is to confront information against a context; no matter what you do, you need to work with what you have. Of course, if everything is biased from the start, your context will start from that and everything will be obsolete at first, but it will be taken into consideration in the context; that's how any of this works.
      Specifically, what I'm saying is that you can't say it's the fault of the AI, but of the context and nothing else; that's what I mean.
      Now, if the AI doesn't confront information correctly, yes, that will be a problem, but that's more a problem coming from the people designing it than from the idea of AI itself.

    • @pwagzzz · months ago

      @@edgardsimon983 Sorry... you're just repeating what you said before, and it's no more convincing.

  • @DiceDecides · months ago · +2

    Finally some news on Q*; Sam Altman REFUSED to talk about it on the Lex Fridman podcast.

  • @user-ty9ho4ct4k · months ago · +3

    I don't hate on Sam Altman because I think he's just another typical business person, but why do we care what he says about AI's future reasoning capabilities? He will definitely say whatever is best for his brand strategy. His word is as good as any other businessman's. Can we get some input from the actual scientists?

  • @MojaveHigh · months ago · +6

    STRawberry... Self-Taught Reasoner.

    • @SolariaEsoterica · months ago

      That's a clever observation

  • @Greguk444 · months ago · +2

    Thank you for adding time references. Interesting news

  • @frogz · months ago · +1

    I'm still waiting on the next version of Don't-Be-Evil's Gemini; it has been pretty good lately.

    • @TheRealUsername · months ago

      Yeah, don't expect too much from Google

  • @edgaral · months ago · +1

    The only problem I have with AIs is that they can also spread false information if they do online research where there are government-controlled documents, etc.

  • @fromduskuntodawn · months ago · +2

    Are we sure it’s not “Starwberry”

  • @BWTResearch · months ago

    This is just speculation, but from what I've seen I can sum a lot of this up.
    Strawberry, or a way to access the internet, would be a way to verify the data. If the data told you that dogs can be black, or that George W. was black, it just goes and looks at what I assume would be text and photos on the internet to try to verify it, which it would or would not manage. Video would come later, is my guess.
    Q* is just a simple way to get some of the data out of our basic association processes. You can tell from terms like "simple loop" and "data is connected and automatic," and also from the example and the way they train it. Basically, when we know the question asked, we can associate what would hold a dog. If I asked you what holds things, a basket would be on the association list, so it would be picked as a choice, and others weeded out the same way. So it is basically a very quick way to associate and retrieve data, like the human mind does once associations are made. This can also be checked by Strawberry during training, if you notice.
    LHT is trying to make a prefrontal-cortex association process, or basically associations that can be input and manipulated over the time axis. It's much easier to understand once you understand how our perceived twenty-second state works and how it is all connected... but when you input and store the steps of data with time, you can retrieve and use them the same way. For example, if you wanted to know the steps to the store, it would go into long-term memory, take the time-stamped data, and give it back step by step, which has to be done on the perceived state, unlike the association processes above.
    From what I can see, they are almost there. If you put those together, you get a lot of what the human mind does. I wouldn't call it AGI, but it would be able to rival a human as a task-based AI system.
    And for the record, I hope they realize that if they cut all that down and put it into one basic system, they would have AGI. The human mind doesn't do much else with the data, and the internet plus the basic memories they build replaces long-term memory. It just has to input it first to be fast. Or if it needs video, it needs a sped-up source it can access.
    Overall an informative video. Thanks.

  • @user-gv4cx7vz8t · months ago

    In your chart, "Q* STAR" looks redundant, because we were told at the time of the original leak that "Q*" is the correct spelling of Q-star. Do you actually mean to say "Q star star"?
    (Sorry, pet peeve, because "ATM machine" must be expanded to "automated teller machine machine.")

  • @derekwise · months ago · +2

    You have too much faith in the media... Just saying.

  • @worldartandmind · months ago

    It's naive to believe this tech will be used to better humanity.

  • @neilo333 · months ago

    Before an implementation of Q* hits the market, it's probably a good idea we finish unpacking the black box that is large language models.
    Who am I kidding. Buckle up, kids.

  • @boofrost · months ago

    Was the summary created with an AI tool? There was a quite different tone in the voice.

  • @kronux3831 · months ago

    I’m excited most to see how OpenAI’s new models with better reasoning are implemented in other programs. Lots of cool possibilities there.

  • @gunnerandersen4634 · months ago

    GPT is not a brand; it means something. I think we need to note that you cannot call it GPT if it's not a Generative Pre-trained Transformer 😑

  • @wanfuse · months ago

    Scalable Tools for Real-time Analytics and Wide Bandwidth Efficient Research with Robust Integration and Experimental Support

  • @dot1298 · months ago · +1

    *Skynet v0.1* ?

  • @agritech802 · months ago

    Have you tried the OpenAI support bot on their own website? It's useless.

  • @AylaCroft · months ago · +1

    The irony... Should be called Strawbery since chat gippity thinks the word only has two r's

    • @jaybowcott6030 · months ago · +1

      That’s kind of the point of calling it that I think? 🤔 haha

  • @EmeraldView · months ago

    Just don't tell the AI to make strawberry fields forever.

  • @torarinvik4920 · months ago

    Great call with the timestamps!

  • @yaboizgarage9709 · months ago

    Thanks for the updates!!

  • @saolrecords · months ago

    Man, I just wanna see some actual new developments 😭

  • @user-td4pf6rr2t · months ago

    It's when the model starts noticing non-deterministic A* patterns.
    6:57 Use threading.
    21:33 They can't reason. Stop saying that. You guys sound dumb boasting about reasoning and text embedding. Higher dimensions mean non-linear, literally.

  • @Ayreek · months ago

    Thanks for letting us know there are 3 Rs in Strawberry.

  • @yahanaashaqua · months ago · +1

    This was obvious and was hinted at a while back.

  • @indikom · months ago

    Playing devil's advocate: if I were Sam Altman, I would not publish Strawberry. I would use it to create models that are always a bit better than the competitors'.

  • @MojaveHigh · months ago

    Please keep doing this type of content and skip the "advertorials".

  • @user-tx9zg5mz5p · months ago

    Humans need to unionize against AI and robots...

  • @spocksdaughter9641 · months ago

    Appreciated!

  • @hl236 · months ago

    Thanks for breaking this down.

  • @firstnamesurname6550 · months ago

    Straw Querry?

  • @pwagzzz · months ago

    @theaigrid Would be interested in your POV on what happens if AI determines a good strategy is lying. It has been shown that on average humans lie at least once a day. Lying is part of social interaction. What if AI is trained on strategies for lying?

  • @juized92 · months ago

    To me it said, "There are three Rs in 'Strawberry.'"

  • @cyberpunkdarren · months ago · +1

    That STaR paper still has a circular-argument dilemma, because the model doesn't know whether an answer is correct or not. A human has to determine it. So now their improvement cycle requires a human assistant, which is a major problem.

    • @jillespina · months ago

    In other words, x = x + 1.

  • @Slup10000 · months ago

    You want AI overlords? Because this is how you get AI overlords!!!

  • @ashtwenty12 · months ago

    I was thinking that AI is more GI, generative intelligence.

  • @nani3209 · months ago · +2

    I think GPT-4o is gpt2 + Q*; remember the naming of the model released on LMSYS prior to the GPT-4o release.

    • @TheNexusDirectory · months ago

      No

    • @TheRealUsername · months ago · +1

      "gpt2" meaning a v2 of the GPT architecture; GPT is now Omni for all modalities and is better at data recall.

    • @TheRealUsername · months ago · +1

      It's highly implausible that an old 1.5-billion-parameter model would perform as well as current SOTA models.

  • @1sava · months ago

    Where are Claude 3.5’s fan girls now? They’ve been awfully quiet lately. 😂

    • @TheRealUsername · months ago · +2

      Lol, OpenAI is recycling Q* hype while Anthropic is actually shipping; Claude 3.5 Opus soon.

    • @TheRealUsername · months ago

      They say 3.5 Opus's training cost 1 billion.

  • @brianjanssens8020 · months ago

    Why do they call it strawberry? That's literally the most disgusting fruit you could have picked.

  • @paulyflynn · months ago

    It is definitely from strawberry fields

    • @honkytonk4465 · months ago

      Makes no sense

  • @adnanpuskar645 · months ago

    Can that work?

  • @wanfuse · months ago

    humans, "out of loop"

  • @ultravidz · months ago

    Interesting.. instead of training on answers, train on rationales. Some things are only obvious in retrospect 😅
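    [Editor's note: the "train on rationales" idea this comment describes has a second half in the STaR paper: when the model's unaided answer is wrong, it is shown the correct answer as a hint and asked to produce a rationale anyway ("rationalization"), so every example still carries reasoning. A toy sketch, with `model` as a hypothetical stub rather than a real API:]

```python
# Toy sketch of STaR-style "rationalization": if the unaided attempt fails,
# regenerate the rationale with the gold answer given as a hint, so every
# training example still carries a rationale. `model` is a made-up stub.

def model(question, hint=None):
    """Hypothetical model call returning (rationale, answer)."""
    a, b = question
    if hint is not None:
        return f"Working backwards: {a} + {b} = {hint}", hint
    answer = 14 if question == (7, 8) else a + b  # pretend it slips once
    return f"{a} + {b} = {answer}", answer

def collect_training_examples(dataset):
    examples = []
    for question, gold in dataset:
        rationale, answer = model(question)
        if answer != gold:  # unaided attempt failed
            rationale, answer = model(question, hint=gold)  # rationalize
        examples.append((question, rationale))
    return examples

print(collect_training_examples([((2, 2), 4), ((7, 8), 15)]))
```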

  • @JohnSmith762A11B · months ago

    All this tells me it is important for everyone to get right with their dear and fluffy AI overlord because super-intelligence is right around the corner.⚡⛈

  • @SarahKchannel · months ago

    A voice, animated text: the hallmarks of AI-generated content... hmmm?

  • @pouljacobsen6049 · months ago

    You must be kidding, trusting Reuters!? Why!? If this is your evidence base, I'll stop subscribing right now! ...Are you from OpenAI?

  • @alexanderbrown-dg3sy · months ago

    Yeah, I completely disagree with the post-training approach to instilling these agentic behaviors. It needs to be a pretraining thing. Large-scale data engineering is the answer. Post-training will always have a fundamental limit.
    Also, Ilya is a genius, but scale has made him a lunatic. Yeah, don't release anything till ASI, lol, that's smart 😂.

  • @anteejay4896 · months ago

    Reuters is not a trusted source; it is paid propaganda.
    Don't get it twisted, my guy, lest I call you Kid, kid.

  • @Tigersann · months ago

    How is it going to learn for itself? Is it going to train itself? Get real... Not yet... lol

  • @user-jn6vs1lk5u · months ago

    That's the whole point: you can't learn more than what's in the big model. I.e., you can improve a small model using a big one, but not surpass the big one.

    • @alexeykulikov5661 · months ago

      Not quite. First the model memorizes, as best it can, a pile of information from the datasets, but that's like reading millions of books without sifting the knowledge in them for what you already know and what you don't, without analyzing what you've read, how it fits with the knowledge you already have, and what conclusions you can draw from it. It's pure, refined rote learning, with no understanding of the subject(s).
      That's exactly what they want to fix: letting the model learn thoughtfully, like a human. But that will, of course, take much more compute...
      They also want to let it think behind the scenes before giving a final, prepared answer. Right now models answer on pure intuition, without thinking or correcting themselves (they have no such ability, and weren't trained for it until now), and somehow they still do pretty well.
      Instead, they will be able to write, roughly speaking, an essay behind the scenes, thinking everything through down to the smallest details, pulling out as much of the knowledge hidden in the model as possible, searching for "inspiration"/trying to recall and account for information by asking themselves leading questions.
      But that will also require tens to hundreds of times more compute for maximally superhuman results.
      Still, over the last ~year, and especially the last half year, so many new optimization approaches have come out, often without sacrificing quality (and not only by shrinking the models themselves), that within the coming years to a decade this won't be such a big problem, once more-or-less practical/best model architectures and the approaches used in them are settled on, and specialized chips are designed for running and training them.
      Oh, and as a bonus, much smaller models will be able to reach results roughly comparable to the hugest models of this generation that don't use these new approaches, by learning refined, well-thought-out, precise, and important information directly from bigger teacher models. They will of course miss many details and facts, but that's fixable by connecting them to a database, or even a cache of the whole internet, in some form convenient and fast for the models. You get a union of the two best approaches: neural networks for thinking, analysis, adaptivity, and creativity, and traditional computers for precise data storage and computation.

    • @user-jn6vs1lk5u · months ago

      @@alexeykulikov5661 Do you mean that after training, say, GPT-4, a lot of "untapped" knowledge remains in it that people couldn't fully "unlock" during fine-tuning because there aren't enough people, and this approach lets you further fine-tune the model without using people?

  • @ShangaelThunda222 · months ago · +1

    So they're basically on the verge of recursive self-improvement and haven't come anywhere near solving interpretability, let alone superinterpretability, nor alignment, let alone superalignment. Wonderful times ahead. 😂😭☠️
    And this man saying Reuters is a trusted source of information absolutely shows you how naive he really is, lol.

    • @TheRealUsername · months ago

      This Q* thing is highly overstated, as most labs are also working on program-synthesis frameworks to tie into LLMs in post-training for better math reasoning and reliability. It feels more like OpenAI is recycling its own hype since Claude 3.5 took the lead.

  • @Kadji-q7m · months ago

    You keep rehashing the same BS. Why keep leaking stuff when each time they release an upgrade, it seems no smarter than the GPT-4 that was released more than a year ago?

  • @meandego · months ago · +3

    Seems like OpenAI is stuck and has nothing interesting to show to investors. OpenAI can't catch up with its own hype in practice.

    • @TheRealUsername · months ago

      Yeah this Strawberry thing was definitely orchestrated, they're recycling Q* hype

  • @dan-cj1rr · months ago

    The race to ruin the human world, xd.

  • @mathewwindebank5792 · months ago

    Reuters, a reliable info source!!! 😂😂😂 You must be joking!!

  • @EternalKernel · months ago

    Your videos are getting rough.

  • @MasamuneX · months ago · +2

    Trump 2024

    • @thedannybseries8857 · months ago · +1

      lol

    • @TheRealUsername · months ago

      They tried everything to get rid of him

    • @tracy419 · months ago

      @@TheRealUsername Everything? There are still plenty of things they can try. You gotta watch more TV, or read a wider variety of books.
      Better hurry before one of the parties bans them for some ridiculous reason or another 😂

    • @MoonGuy7070 · 28 days ago

      Nah