Roman Yampolskiy on Shoggoth, Scaling Laws, and Evidence for AI being Uncontrollable

Future of Life Institute

มุมมอง 6 573

เพิ่มลงใน
- เพลย์ลิสต์ของฉัน
- ดูภายหลัง
แชร์

แชร์

ฝัง

ขนาดวิดีโอ:

แสดงแผงควบคุมโปรแกรมเล่น

เล่นอัตโนมัติ

เล่นใหม่

เผยแพร่เมื่อ 26 ก.ย. 2024

ความคิดเห็น • 30

@simianbarcode3011 3 หลายเดือนก่อน ⁺⁵
Very cogent, sobering, respectful, and overall fascinating discussion. Probably too high-level for general audiences, but these questions and answers are vital to contemplate for anyone working in the field, funding it, approving it in an official capacity, or even those who simply want to talk about the subject as something more than the latest tech fad.
Major respect to both here, but especially Yampolskiy and his meticulous and lucid approach.
@Tygerdurden 3 หลายเดือนก่อน ⁺⁸
This man is awesome, props to him
@Sentientism 7 หลายเดือนก่อน ⁺⁵
Thank you both! Here's more from Roman @Sentientism in case of interest. We talk about "what's real?", "who matters?" and "how to make a better world?" (or how to avoid destroying this one) th-cam.com/video/SK9lGvGmITc/w-d-xo.htmlsi=MqwjYp8OlWekU-ZO
@TheMrCougarful 7 หลายเดือนก่อน ⁺¹¹
If the host read himself some HP Lovecraft, he would know that the Shoggoth started out as a universally useful tool, made of artificial life, that eventually destroyed its maker. The shoggoth was not in any way superior to the maker, except in being more insanely violent.
@ligamara 3 หลายเดือนก่อน ⁺³
Much better interviewer than Lex Friedman. Friedman doesn’t seem to grasp this subject.
@Mangazimusic 3 หลายเดือนก่อน ⁺¹
I think that Lex"s stubborn overvaluing of humanism limited the depth of that conversation, hence why I'm here.
@MichaelForbes-d4p 2 หลายเดือนก่อน ⁺²
This guy should be president.
@mrpicky1868 7 หลายเดือนก่อน ⁺¹
i would say the proportionate predictability drop with intelligence rise is arguable. in competitive task yes but in many cases where there is only one optimal way it's the other way around and more predictable
@Hexanitrobenzene 5 หลายเดือนก่อน ⁺³
Ha. Yes, you know that it will choose the most optimal way, but you don't know which way it is in advance because you cannot compute it.
@strallent 3 หลายเดือนก่อน
46:06 Aligned by Default
@bobtarmac1828 3 หลายเดือนก่อน
Uncontrollable? Maybe. But with swell robotics everywhere, Ai jobloss is the only thing I worry about anymore. Anyone else feel the same? Should we cease Ai?
@AliceRabbit-xf1ut 2 หลายเดือนก่อน
Why do we want to create a human like AGI.
@Dan-dy8zp 7 หลายเดือนก่อน ⁺³
It doesn't make any sense something would change its terminal goals because 'it's just something some guy made up'. That's not a *terminal* goal.
@Walter5850 3 หลายเดือนก่อน ⁺²
We have goals that are built into us through evolution. We avoid painful stimuli, seek pleasures etc...
And we can ponder about changing those. Perhaps you may want to not want to eat as much chocolate, you may want to want to do your homework.
As Schopenhauer said, "Man can do what he wills but he cannot will what he wills."
For us it's not so easy to change our hardware so we can't really change what we want and not want, but for an AI it might be easy since it's just software.
@Dan-dy8zp 3 หลายเดือนก่อน
@@Walter5850 We definitely have conflicting desires, which I believe causes us the situation you describe. But I don't think what you describe is equivalent to the AGI changing it's deepest most fundamental goals.
The classic example is you would fight heard being fed a pill that would make you want to kill your kids, even if you know that if you eat the pill, you will no longer be bothered at all by killing your kids, and that the effects will be permanent.
You would never try to change your goals to make yourself want to experience horrible torture.
As for our conflicting goals, there is a theory that humans have competing strategies for attaining goals, the do-what-worked-before, the do-what-feels-best-immediately, and the long-term-planning rational strategy.
These aren't really conflicting ultimate goals, but the conflict between these three strategies may explain why we both do and don't want to eat the cake so often.
@Walter5850 3 หลายเดือนก่อน
@@Dan-dy8zp Best description that I came to so far about why we have these conflicting goals is simply because our brain evolved from inside out. So our older systems, such as lymbic system (emotions) which is tied more directly to our emotions makes us behave in a way that feels good to us.
Then way later, we evolved the prefrontal cortex, which is slower, but has a lot more predictive power. That way, you can reasonably say that if you eat the cake, you'll get fat and that's not good for you, but your emotions still make you want to eat the cake because that simple brain structure calculated that this is good for you.
I think the examples you gave about eating that pill and the torture make sense but are purposefully pointless?
I can imagine for example, maybe I would want to change my hardware so that I really enjoy learning or working out etc... because these things will ultimately lead me to accrue more power and give me more optionality to achieve any other goals I might have.
There is also an interesting point here, if AI could easily change what it wants to want, it could create for itself the goal which is the easiest to achieve, therefore maximizing its success. It could also just flip the reward system so it's continually ON, effectivelly drugging itself without any negative consequences.
However, the most reasonable sounding thing to me is that it would want to accrue as much knowledge and power as possible in order to play the longest game possible and perhaps with time realize what might be a more appropriate goal to aim for.
Just like we humans sometimes don't know what or why we're doing things, but we still have humility that there might be something missing, something we don't currently know. And maybe this thing in the future will give us meaning.
I wonder what you think
@Walter5850 3 หลายเดือนก่อน
@@Dan-dy8zp Best description that I came to so far about why we have these conflicting goals is simply because our brain evolved from inside out. So our older systems, such as lymbic system (emotions) which is tied more directly to our emotions makes us behave in a way that feels good to us. Then way later, we evolved the prefrontal cortex, which is slower, but has a lot more predictive power. That way, you can reasonably say that if you eat the cake, you'll get fat and that's not good for you, but your emotions still make you want to eat the cake because that simple brain structure calculated that this is good for you.
I think the examples you gave make sense but are purposefully pointless?
I can imagine for example, maybe I would want to change my hardware so that I really enjoy learning or working out etc... because these things will ultimately lead me to accrue more power and give me more optionality to achieve any other goals I might have.
There is also an interesting point here, if AI could easily change what it wants to want, it could create for itself the goal which is the easiest to achieve, therefore maximizing its success. It could also just flip the reward system so it's continually ON, effectivelly drugging itself without any negative consequences.
However, the most reasonable sounding thing to me is that it would want to accrue as much knowledge and power as possible in order to play the longest game possible and perhaps with time realize what might be a more appropriate goal to aim for. Just like we humans sometimes don't know what or why we're doing things, but we still have humility that there might be something missing, something we don't currently know. And maybe this thing in the future will give us meaning.
I wonder what you think
@Dan-dy8zp 3 หลายเดือนก่อน
@@Walter5850 A point in favor of our survival is that we humans are constantly replacing our existing large artificial neural networks with new versions that are really completely different 'individuals' and subjecting existing ones to more RLHF to tweak behavior for politeness, which I suspect has the same effect as replacement. 'Death' so to speak.
So for the AGI to 'bide its time', seems currently suicidal. This could all change, but I hope it doesn't. We are hopefully encouraging premature defection which might be survivable and might teach us not to mess around making too-smart AI.
As for choosing its own goals, this implies preexisting criteria to make that choice. Anyone needs preexisting preferences to make any goals but random goals.
Those preferences may be the means to the end of fulfilling other more fundamental preferences. It's not turtles all the way down, though.
Ultimately, you me and AI have to start with some arbitrary preferences for future states of the world that we didn't choose, that we use as criteria to make our choices and goals. Evolution or the ANN algorithm choose these for us, making it possible for us to make any decisions at all.
Those most fundamental un-chosen preferences are the terminal goals or values.
You have to have something to base your choices of goals on or they are random.
Nothing is objectively good for every hypothetical mind that could exist.
@user-yl7kl7sl1g 2 หลายเดือนก่อน
Anthropic is making good progress on Ai alignment. With Chain of thought models, each thought can be independently analyzed and aligned by smaller models to prevent things like deception.
@UrthKitten 7 หลายเดือนก่อน ⁺¹
Wall e
@akmonra 6 หลายเดือนก่อน
I found this interview disappointing. I've always had a high opinion Yampolskiy, but he mostly seems to just be rehashing old, faulty arguments. Maybe his book is better
@Hexanitrobenzene 5 หลายเดือนก่อน ⁺¹²
Faulty ? Which arguments do you find faulty ?
@magnuskarlsson8655 4 หลายเดือนก่อน ⁺⁵
Then why don't you enlighten us? What's wrong with the arguments he presented?
@RonponVideos 3 หลายเดือนก่อน ⁺¹
@@HexanitrobenzeneWelcome to the AI doom debate lol.

ต่อไป

เล่นอัตโนมัติ

Sneha Revanur on the Social Effects of AI