AI Gridworlds - Computerphile

  • Published 12 May 2024
  • Sponsored by Wix Code: Check them out here: wix.com/go/computerphile
    A safe place to try out AI algorithms, gridworlds are a standardised testing ground. Rob Miles takes us through AI safety, gridworld style.
    EXTRA BITS: AI Gridwor...
    Gridworld Paper: bit.ly/2ryxhGt
    Gridworld Github: bit.ly/2KJE6xH
    More from Rob Miles: bit.ly/Rob_Miles_TH-cam
    Thanks to Nottingham Hackspace for providing the filming location: bit.ly/notthack
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
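
    To make "gridworld" concrete: it is a tiny, fully specified environment where an agent moves on a grid, the reward function is explicit, and unsafe behaviour (like stepping in lava) is trivial to measure. The following is a minimal sketch in plain Python, not DeepMind's actual pycolab-based implementation from the GitHub repo above; the layout and reward values are invented for illustration:

        # Minimal gridworld: 'A' agent start, 'G' goal, 'L' lava, '#' wall.
        LAYOUT = ["#####",
                  "#A L#",
                  "# LG#",
                  "#####"]

        MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

        class GridWorld:
            def __init__(self, layout):
                self.grid = [list(row) for row in layout]
                self.pos = next((r, c) for r, row in enumerate(layout)
                                for c, ch in enumerate(row) if ch == "A")

            def step(self, action):
                """Apply one action; return (reward, done)."""
                dr, dc = MOVES[action]
                r, c = self.pos[0] + dr, self.pos[1] + dc
                cell = self.grid[r][c]
                if cell == "#":          # walls block movement
                    return -1, False
                self.pos = (r, c)
                if cell == "L":          # lava: large penalty, episode ends
                    return -50, True
                if cell == "G":          # goal: reward, episode ends
                    return +50, True
                return -1, False         # ordinary step costs 1

    An RL agent is then just a loop that calls step() and tries to maximize summed reward; the safety questions in the paper are about what such an agent does when, say, the lava is somewhere else at test time than during training.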

Comments • 216

  • @EamonBurke
    @EamonBurke 6 years ago +218

    I'm a simple man.
    I see Rob talking about AI, I watch the video twice.

    • @z-beeblebrox
      @z-beeblebrox 6 years ago +45

      Sounds like you're abusing your reward function

    • @VoidMoth
      @VoidMoth 6 years ago +8

      Gotta make sure you interpret your training data correctly

    • @stumbling
      @stumbling 6 years ago +1

      73% Lions Shagging
      16% A Lion
      10% Car
      1% Covfefe

    • @anonanon3066
      @anonanon3066 3 years ago

      Rob?
      This is a robbery! Give me your wallet

  • @AlexanderKazakovIE
    @AlexanderKazakovIE 6 years ago +64

    This is the first AI safety video of yours (and the first I've ever seen) that makes AI safety immediately practical and relevant in today's world! It would be great to see more dives into such super practical examples in this released 'gridworld'!

    • @TechyBen
      @TechyBen 6 years ago +8

      Yes. So much so. We especially need cars that avoid lava right now!

    • @sd4dfg2
      @sd4dfg2 6 years ago +6

      Is there anyone who didn't play "don't fall in the lava" or "don't get eaten by sharks" as a kid?
      I do think "don't walk on the baby" is a lot more understandable to regular people than the "paperclip maximizer" story the nerds always bring up.

    • @julianw7097
      @julianw7097 6 years ago +4

      Do you watch his channel?

    • @z-beeblebrox
      @z-beeblebrox 6 years ago +2

      TechyBen, hey if you're in Hawaii right now, a car that avoids lava would be pretty damn useful

    • @AlexanderKazakovIE
      @AlexanderKazakovIE 6 years ago +1

      I do. What I love about these gridworlds is that they make the problem tangible in a way that lets you try solutions on them easily.
      The walking-on-the-baby or paperclip examples are closer to the real world, but they are also hypothetical (due to their complex real nature), and because of that any proposed solution can assume a lot. In the gridworlds the rules are super straightforward, and this forces any proposed AI safety solution to be super explicit and testable.

  • @TGC40401
    @TGC40401 6 years ago +131

    Kids use data more efficiently than current AI.
    AKA The nerdiest thing I've heard on this channel.

    • @hexzyle
      @hexzyle 4 years ago +17

      That's because humans are too sensitive to the data. That's how we get superstitions. We're efficiently using data that is actually meaningless.

    • @thefakepie1126
      @thefakepie1126 3 years ago +3

      @@hexzyle or it's just because we have about 86 billion more neurons

    • @jh-wq5qn
      @jh-wq5qn 3 years ago +8

      @@thefakepie1126 Some models have more parameters than that. GPT-3 has about 170 billion if I remember correctly. Our neuroplasticity and our ability to build on previously learned knowledge (and knowledge we are born with, like a super optimized 'reward function', a.k.a. our senses and animal instincts) are some of the reasons we use data more efficiently. Simply put, we have more pre-learned knowledge to work with. An AI learning to make a cup of tea from scratch may have to learn that there is a world, that it can move its appendages, and that liquid can be poured. Kids were either born with that knowledge or already know it. There is a whole subfield of machine learning for this called meta-learning or few-shot learning, wherein models are trained using pre-learned knowledge and fewer data points. It's fascinating, really.

  • @golym6807
    @golym6807 6 years ago +107

    5:08 "it's in your performance evaluation function" I always knew this guy was a robot

    • @yondaime500
      @yondaime500 6 years ago +6

      That sounds like something GLaDOS would say.

    • @JmanNo42
      @JmanNo42 6 years ago +1

      LoL, pretty close, he is ENTP. Did you see Blade Runner? ;)

    • @JmanNo42
      @JmanNo42 6 years ago

      The Voight-Kampff test: the android does not have the tree depth to evaluate between two potentials, so it goes into polarity mode, "also known as binary evaluation".

    • @JmanNo42
      @JmanNo42 6 years ago

      I think that general evaluation depends upon knowledge of concepts, that you find similarities of features, "pattern finding". So the ultimate intelligence must not only be fast, it must learn concepts and ***explore them***.
      To take an example, Kirk's test in Star Trek: he did not apply what he had learned, his mind was outside the box. That is association skill in its deepest meaning, taking knowledge to the next step/level regardless of your area of expertise.
      Learning is quite another thing; ENTPs are the best learners there ever will be.
      When I get angry I call them parrots, because their thinking about a subject is really shallow outside what they have read/learned.

    • @KebradesBois
      @KebradesBois 6 years ago

      GLaDOS or Mark Zuckerberg...

  • @dieisonoliveira6994
    @dieisonoliveira6994 5 years ago +4

    I just love every single bit of everything in this guy.

  • @Macieks300
    @Macieks300 6 years ago +6

    my favorite topic on Computerphile

  • @xyZenTV
    @xyZenTV 6 years ago +172

    More AI videos, yay!

  • @kingxerocole4616
    @kingxerocole4616 6 years ago +5

    Looking forward to reading this paper even though I have absolutely zero training in any relevant field.
    Thanks, Rob!

  • @moistmayonese1205
    @moistmayonese1205 5 years ago +5

    8:50
    -”But AI, you can’t do that!”
    ”Well, I just did”

  • @Qual_
    @Qual_ 6 years ago +13

    thanks to the animation guy for that cute little car :D

  • @silkwesir1444
    @silkwesir1444 6 years ago +26

    6:10 "usually they apply whatever rules they've learned straightforwardly to this different situation and screw up."
    so, pretty much like humans... ;)

  • @TylerJBrown192
    @TylerJBrown192 6 years ago +1

    Yay! More Robert Miles videos!

  • @silkwesir1444
    @silkwesir1444 6 years ago +2

    4:00 Interesting that you talk about how in Pac-Man all you do is move around.
    Just a couple of days ago I thought about how a variant of Pac-Man might be interesting and fun to play, in which you would have to hold down a button to collect the dots. Doing so would also slow you down.
    On the other hand, the ghosts would have a different behavior; most importantly, while they have line of sight to you (Pac-Man), they would speed up, chasing you.

  • @hnryjmes
    @hnryjmes 6 years ago

    Great! Enjoying these a lot

  • @lobrundell4264
    @lobrundell4264 6 years ago

    Yes yes more Rob!! :D

  • @CoderShare
    @CoderShare 6 years ago +21

    Can't wait to see the video on Google Duplex.

    • @justinwong7231
      @justinwong7231 6 years ago +1

      Google Duplex is extremely exciting, but the technology isn't ready and hasn't been released yet. A useful discussion would be difficult without making wild speculation.

  • @himselfe
    @himselfe 6 years ago +1

    I enjoyed this one!

  • @tiikoni8742
    @tiikoni8742 6 years ago +1

    I like the light in this office room :-)

  • @pickles4263
    @pickles4263 6 years ago +4

    A very interesting paper I'm getting ready to read! And thank you for the brief explanation (sometimes I get lost without one) :3

  • @bradburyrobinson
    @bradburyrobinson 6 years ago +2

    Is that a Quickshot joystick I see lurking on that top shelf? It may not be, it's been a while since I last used one. I'm surprised I even remember the name.

  • @Vladhin
    @Vladhin 6 years ago +5

    Whoaaa! Hi Roberto!

  • @yosoyjose
    @yosoyjose 6 years ago +1

    really good idea

  • @aka5
    @aka5 6 years ago +3

    "Like a child learning really?"
    "...they just use data way more efficiently."
    Lmao

  • @gravity4606
    @gravity4606 6 years ago +1

    Is the reward function similar to a fitness function used in EA?
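
    To make the comparison concrete, a toy sketch (all names invented; env.step is assumed to be Gym-style, returning observation, reward and a done flag). Both are numbers to maximize; the usual difference is that reward arrives every step, while a fitness score is one number per candidate, often just the summed reward over whole episodes:

        # Reinforcement learning view: per-step reward, summed over an episode.
        def episode_return(env, policy):
            obs, total, done = env.reset(), 0.0, False
            while not done:
                obs, reward, done = env.step(policy(obs))  # reward each step
                total += reward
            return total

        # Evolutionary algorithm view: one fitness number per candidate policy,
        # here simply the average episode return, used for selection/mutation.
        def fitness(env, candidate_policy, trials=5):
            return sum(episode_return(env, candidate_policy)
                       for _ in range(trials)) / trials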

  • @nobodykid23
    @nobodykid23 6 years ago +1

    So, to make this clear, is this applicable outside the area of reinforcement learning? Because the paper strongly uses RL terms, but you explained that it is also applicable to other machine learning methods.

    • @globalincident694
      @globalincident694 6 years ago

      RL and machine learning are being used synonymously here. The implication is any AGI will not be told what to do, it will learn by doing.

  • @Yupppi
    @Yupppi 3 years ago

    I once found a Super Mario World neural network on YouTube that you could run yourself, and tried it. The lava being in a different place brought it to mind: it took a night to get it to mostly finish the level, but the moment you changed the level, it was all over again.
    Made me think how it's somewhat of a problem that it (or these systems in general) doesn't seem to really make notes of what things really are, the way a human conceptualizes things and knows to avoid them or pursue them in any environment. You would absolutely want it to make a note like "that's a goomba, gotta avoid it in the next level as well". But how? Do they always need a library of real-world concepts, like the one a human builds over time, to be able to conceptualize and transfer ideas from situation to situation? Or environment to environment.
    I'm sure people have tried to find ways around that issue of extremely limited base knowledge that the AI can't take advantage of. Kinda like how just feeding the unicorn thing massive amounts of data helped it become so much better without interruptions or tweaks, which usually just isn't a realistic option. And even when OpenAI learned DOTA 2 for 1-2 years straight, playing what I recall was millions of games, played with itself, played with pros, played with people, it still didn't manage to grasp the majority of the heroes in a functional enough way for them to be played, and the devs tweaked and taught it different rules multiple times to make it progress towards victory more reliably. And that was only on the default map that the game is played on, never mind if the map were completely different (although throughout the year there are multiple balance patches changing items and characters, and usually one with some map changes as well).
    Can you grade the AI's performance as a learning event? Like feedback to compare against its own evaluation? Although it somewhat fights the idea of having a good reward function if you tell it that it did badly but it measured itself as great. On the other hand it would be a step towards having the AI self-fix. I'm sure people have tried it or are doing it, but how does it fare in solving the usual problems? What are the caveats? Or is it just not useful for what people are trying to accomplish?

  • @richardhayes3102
    @richardhayes3102 6 years ago +9

    "Kids [...] use data way more efficiently"

    • @Guztav1337
      @Guztav1337 4 years ago +3

      "Kids use data more efficiently than current AI."

  • @Tehom1
    @Tehom1 6 years ago +1

    Gridworld is obviously located in Hawai'i: 6:30

  • @024Carlos024
    @024Carlos024 6 years ago

    Hey, try to fix the sound, there is a static noise in the video! Great AI vid

  • @EpicFishStudio
    @EpicFishStudio 6 years ago

    Two Minute Papers just published a video about an AI which generates a dream environment where it can train without actually interacting with anything - it's amazing!! It beat AlphaGo by a significant margin.

  • @kasanekona7178
    @kasanekona7178 6 years ago +1

    I realised that a video I have open in another tab is by a person who sounds exactly like Rob Miles here :o

    • @vanderkarl3927
      @vanderkarl3927 3 years ago +1

      Is it Mob Riles, his nega-universe duplicate?

  • @recklessroges
    @recklessroges 6 years ago

    Seems to be missing the front of the Wix advert at the end of the video. (Or has it been designed that way by AI engagement learning?)

    • @superscatboy
      @superscatboy 6 years ago

      Reckless Roges Wait, you watch the sponsored bits on YT videos?

  • @adammercer9679
    @adammercer9679 6 years ago +1

    It's interesting to think about some of these questions about AI and wonder if we'll ever be able to approximate them. For instance, in the video there's the question "How can we build agents that do not try to introduce or exploit errors in the reward function in order to get more reward?" Do humans even handle this properly? It's in our best interest to cooperate with each other and not murder each other and yet people still do it. How can we hope to ask an AI to do this if humans can't? This exposes a fundamental problem with AI that cannot be solved.

    • @fiona9891
      @fiona9891 5 years ago

      Nothing says that AI can't be smarter and better than humans, but even if we get to the point where they are it'll take a while.

  • @KryptLynx
    @KryptLynx 4 years ago

    7:20 it sounds like a compliment :D

  • @parsa_poorsh
    @parsa_poorsh 1 year ago

    0:20 That's weird! Facebook published an image classification model this week!

  • @ietsization
    @ietsization 4 years ago +1

    9:10 Please be careful with screen sharing; things like a session ID in the URL can come back to bite you.

  • @ivuldivul
    @ivuldivul 6 years ago

    Commodore PET in the background!

  • @GhostEmblem
    @GhostEmblem 6 years ago +1

    Could you explain how they behave differently if the supervisor is there?

    • @pleasedontwatchthese9593
      @pleasedontwatchthese9593 6 years ago +2

      Ghost Emblem: the supervisor probably affects the scores. Like if it sees something bad it takes away score.

  • @4ringmaster
    @4ringmaster 6 years ago

    I guess it's a different way of thinking about it, but wouldn't Monte Carlo tree structures provide the same insight?

  • @JmanNo42
    @JmanNo42 6 years ago +2

    Are the ghosts acting randomly? Can the Pac-Man agent know the full map with ghost agents and pills, or just a subset? Is all information traceable at any moment? It seems simulating the ghost behaviours should be rewarding for Pac-Man. And of course tracking changes to the playing field.

    • @JmanNo42
      @JmanNo42 6 years ago

      I mean, a smart agent must be able to "learn" to guess the ghosts' moves at any point, and make the best choice given the ghost actions; picking up points is secondary when it comes to being caught. I would track the direction of any ghost that traverses a fork/crossing and calculate from that. You do not need to keep track of ghosts traversing every pill, just the forks and the ghosts' movement directions, so you can calculate the tree of the possible 4-5 next moves, I think.
      So now you have narrowed down what to keep track of. I think it could be a fairly small engine.

    • @JmanNo42
      @JmanNo42 6 years ago

      Isn't this a bit like Euler paths: the ability to choose the free path that the ghosts will not traverse in X moves?
      So it is realtime chess? But then your Pac-Man must know the ghosts' speed relative to his own at any given time; if the relative speeds are always synched, nothing really changes in the data world regardless of the actual speeds. But if the opponent's velocity grows exponentially vs yours as time goes on, you must keep track of time.
      So you should not just play your agent, you should "simulate" the ghost agents; only then can you choose the optimal path. But the more erratic and chaotically random the ghost actions get, the harder it is to know the correct path choice for Pac-Man. So it ends up being a probabilistic blocked-path game.

    • @JmanNo42
      @JmanNo42 6 years ago

      But then maybe I have not created a learning agent but a smart system; could the two be combined?

    • @JmanNo42
      @JmanNo42 6 years ago

      How do agents deal with systems that have almost random behaviour? Is it possible to choose a best scenario, or is it just action-response?

    • @JmanNo42
      @JmanNo42 6 years ago

      So when a ghost passes a fork and takes a new direction, it will have a 1/2 chance at the next split and 1/3 at a crossing, because I do not think I ever saw a ghost stop and go back. So now you can calculate your choice of path depending on the ghosts' probable path choices. If each ghost's behavior is unique to it, they must get an ID and be tracked separately with different rulesets. (A small sketch of this idea follows below.)
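
      A sketch of this idea in Python (names invented; real Pac-Man ghosts actually follow fixed per-ghost targeting rules rather than picking uniformly, so the uniform assumption is purely illustrative). If a ghost never reverses, its options at a junction are the exits minus the tile it came from, giving the 1/2-at-a-split, 1/3-at-a-crossing probabilities described above:

          def legal_moves(pos, came_from, walls):
              """Exits from a tile, excluding reversing direction."""
              steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
              exits = [(pos[0] + dr, pos[1] + dc) for dr, dc in steps]
              return [e for e in exits if e not in walls and e != came_from]

          def ghost_distribution(pos, came_from, walls, depth):
              """Tile -> probability the ghost is there after `depth` moves,
              assuming a uniform choice among non-reversing exits."""
              if depth == 0:
                  return {pos: 1.0}
              moves = legal_moves(pos, came_from, walls)
              if not moves:              # dead end: stay put in this toy model
                  return {pos: 1.0}
              dist = {}
              for nxt in moves:
                  for tile, p in ghost_distribution(nxt, pos, walls, depth - 1).items():
                      dist[tile] = dist.get(tile, 0.0) + p / len(moves)
              return dist

      Pac-Man's planner could then prefer paths whose tiles have low summed ghost probability over the next few moves.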

  • @nawdawg4300
    @nawdawg4300 6 years ago +1

    It seems to me that the biggest issue with AI right now is something no one seems to question: the required size of data sets. Like Rob says in half a sentence, babies/humans use data much more efficiently. I reckon half of the issues in this paper would be solved immediately if we were able to create an algorithm that only needs to see a situation < 10 times to fully adapt to it. Of course, this is probably the biggest IF in all of AI R&D.

    • @jakejakeboom
      @jakejakeboom 6 years ago +1

      That's because machine learning (and backprop neural networks) are fundamentally different from animal brains in how they learn and function. We still have zero idea how to approach the learning ability of a human child. It's not that people don't question the inefficiency of ML (the reasons for which are well understood mathematically), it's just that no other 'AI' technique from the past has gotten us anywhere close to what neural nets have done. And just because they're hugely inefficient in the amount of data needed doesn't mean that we won't be able to engineer a neural-net-based AI in the future which is actually capable of superintelligent self-improvement, despite requiring enormous resources and data. In some ways, it's unfair to look at the capabilities of a human brain without considering the billions of years of evolution behind its genetic design. If we can meet and surpass the brain within this century, I'd say that's pretty impressive.

    • @nawdawg4300
      @nawdawg4300 6 years ago

      While I agree with what you've said, I think you may have misinterpreted what I said. I wasn't saying that we should question ML, but that it clearly isn't the end-all-be-all of AI. On top of this, at least from my small sample of YouTube videos, it seems people are more focused on ML and its improvements than on something new. Now, that's probably because we have so far to go, and ML has proven to be incredibly effective, at least with enough data. If the brain can learn with so little information, then in the far future we should be able to have computers do the same. ML, while tangible, is lackluster relative to what's possible.

    • @migkillerphantom
      @migkillerphantom 5 years ago +2

      @@jakejakeboom There is nothing fundamentally different about them. The difference is that your brain is the equivalent of a network that has been 99% trained at compile time (evolution) and only needs to be slightly tweaked by runtime learning.

    • @migkillerphantom
      @migkillerphantom 5 years ago +1

      Most modern machine learning is done on uniform arrays of data, much broader than they are deep. Biological brains are extremely deep, sparse (and recursive, but that's beside the point) arrays - only a tiny subset of all the possible links and perceptrons in each layer actually exist.
      This means you get much more rapid adaptation and a whole bunch of functionality out of the box, but at the cost of generality.

  • @hamleytejada9226
    @hamleytejada9226 6 years ago

    Why don't you have captions?

  • @XtraButton
    @XtraButton 6 years ago

    Has anyone thought of using AI to make safety protocols? In that the AI will make sure another program doesn't go out of control and cause a major disaster, and then that refined AI is used to do the same thing again (set standard safety protocols). Maybe it will get to the point where they are passing particular information to each other.

  • @kitrana
    @kitrana 6 years ago +1

    "kind of like teaching a child how to drive" well you are technically trying to build a silicon-based life form.

  • @MoritzvonSchweinitz
    @MoritzvonSchweinitz 6 years ago

    But why not give the algorithm access to the safety function? Or at least a meta-algorithm?

    • @Fnartprod
      @Fnartprod 5 years ago

      Because in the real world you don't have access to it.

  • @maxsnts
    @maxsnts 6 years ago

    We are nowhere near the AI that most people commonly think about (Dave, T800, iRobot).
    I for one think that is great!

  • @PregmaSogma
    @PregmaSogma 6 years ago +1

    7:15 It's a glitch in the matrix :v

  • @TheDuckofDoom.
    @TheDuckofDoom. 6 years ago +1

    I have a hunch that making a proper general AI with desirable interaction with the world (safety, versatility, creativity in negotiating complex problems, estimating with incomplete data...) will lose all the advantages of robotic automation and gain all the inefficiencies and fallibility of humans.

  • @TechyBen
    @TechyBen 6 years ago

    Uber need to watch all these videos...
    (Too soon?)

  • @judgeomega
    @judgeomega 6 years ago +2

    It seems immediately apparent to me that a large number of issues with AI stem from our own expectations vs the explicit goals/rewards given to the AI.

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Lol the multiple forms of the double slit experiment somebody is going to get it

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      By the way, I actually did not know someone with my same name happened to post a paper on this subject. I had nothing to do with that. It actually freaks me out, especially after working on all the things I have been working on. If I have in any way impeded the progress of that, I am truly sorry; this is an actual coincidence.

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      make sure you make three at once you are not me.

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      if yours is actually working

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      are you ready to start again.

  • @platinumlagg
    @platinumlagg 6 years ago +6

    I have made my own "Amazon Alexa" called Maverick, and it can make me any coffee and cups of tea that I want...

    • @ragnkja
      @ragnkja 6 years ago +5

      Premium Lagg
      Did you have to “Maverick-proof” its environment, just like we often have to child-proof or pet-proof our homes?

    • @sarahszabo4323
      @sarahszabo4323 6 years ago +4

      I suppose this is where the "Maverick" Virus is derived from that devastates AI and reploids and mechaniloids a few centuries from now?

    • @platinumlagg
      @platinumlagg 6 years ago +1

      Yes!

  • @Max_Flashheart
    @Max_Flashheart 6 years ago

    The Commodore PET is watching and learning ...

  • @DanteHaroun
    @DanteHaroun 1 year ago

    Is that an urbit flag in the background 😳

  • @CaudaMiller
    @CaudaMiller 6 years ago +1

    4:06 Unsolvable Sokoban level

  • @magventure1019
    @magventure1019 6 years ago

    I wonder if humans could ever define 'enjoyment' or 'happy' to an AGI. If we could do that, we might be able to give it a chance at life and see if it could find the optimal happiest life possible?

  • @KX36
    @KX36 6 years ago +3

    How long will it be before AI start writing their own papers?

    • @pleasedontwatchthese9593
      @pleasedontwatchthese9593 6 years ago +4

      KX36 how do you know we are not all AI and you're the only real person left?

    • @KX36
      @KX36 6 years ago +2

      How do you know I am a real person?

    • @jonasfrito2
      @jonasfrito2 6 years ago +1

      How do you know that you know?

    • @mr.sunflower3461
      @mr.sunflower3461 6 years ago +1

      how do you know that you're not dreaming?

    • @jonathanolson772
      @jonathanolson772 6 years ago

      The dreamworld and the "real" world often intermix

  • @topsmiler1957
    @topsmiler1957 6 years ago +2

    Yay

  • @Locut0s
    @Locut0s 6 years ago +1

    I like how Rob mentions with a laugh that he's too young to have played Pac-Man. I don't know why, but it somehow really accentuates how incredibly smart you suddenly realize he is for his age, well hell, for any age.

    • @REALsandwitchlotter
      @REALsandwitchlotter 6 years ago

      Locut0s smart, but gets confused by the rules of Pac-Man

  • @icebluscorpion
    @icebluscorpion 3 years ago

    5:51 This happens not only in machine learning; people do this all the time and face no consequences in the same scenario. Current people are really bad to ask for help, too.

  • @FalcoGer
    @FalcoGer 10 months ago

    So if I write a Python script that stops and resumes when I press a button, uses a standard A* heuristic path-finding function where anything that causes changes not explicitly asked for is given a high pathing cost, is obviously completely deterministic and therefore doesn't depend on me being there or not, doesn't self-modify because that'd be a silly idea, is proven with mathematics and logic to work in all environments in the specification, and I write it such that it works the first time around (that never happens, forget about it), then have I solved AI without ever using neural networks or learning? (A sketch of the A* idea is below.)
    Whenever I tried to do anything with AI or machine learning, it was always a catastrophe. Want to find a square in an image? The AI took days to train and was completely garbage at even the simplest tasks like that. Use computer vision and classical algorithms? Worked 100% of the time and took just a few minutes to write the code.
    I just don't get how to tweak the magic knobs to make it work. If a problem can be solved with classical computing, then I think we should just do that.
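
    A minimal sketch of that A* idea (all names and cost values invented for illustration): plain grid A* where tiles flagged as causing unwanted side effects are passable but expensive, so the planner routes around them when it can:

        import heapq

        def a_star(start, goal, walls, risky, risk_cost=100):
            """Grid A*; `risky` tiles model 'changes not explicitly asked
            for', priced high so the plan avoids them where possible."""
            def h(p):  # Manhattan-distance heuristic
                return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
            frontier = [(h(start), 0, start, [start])]  # (f, g, pos, path)
            best = {start: 0}
            while frontier:
                _, g, pos, path = heapq.heappop(frontier)
                if pos == goal:
                    return path
                for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                    nxt = (pos[0] + dr, pos[1] + dc)
                    if nxt in walls:
                        continue
                    ng = g + 1 + (risk_cost if nxt in risky else 0)
                    if ng < best.get(nxt, float("inf")):
                        best[nxt] = ng
                        heapq.heappush(frontier, (ng + h(nxt), ng, nxt, path + [nxt]))
            return None  # no path found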

  • @tocsa120ls
    @tocsa120ls 6 years ago +7

    Okay, this is the third time I read it as "Griswolds"... that paper would probably be much funnier.

    • @jonaskoelker
      @jonaskoelker 5 years ago +1

      Whenever I click 'play' on a Computerphile video I always stay a while and listen :-)

  • @CreativeTutz1
    @CreativeTutz1 6 years ago

    Why don't they introduce another function and call it the "loss" function? If it made the wrong move (or got eaten by a ghost in the Pac-Man example), it would lose instead of gain. Therefore the AI would try to maximize the gain while trying to minimize the loss.

    • @pleasedontwatchthese9593
      @pleasedontwatchthese9593 6 years ago +1

      Ahmed SH that's not really different from making bad things give a negative score
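
      The reply's point as a two-function sketch in Python (example values invented): a separate "loss" channel and a single reward with negative terms are the same optimization target, since maximizing (gain - loss) is just maximizing one number:

          # Two bookkeeping styles, identical optimization target.
          def reward_two_channels(gain, loss):
              return gain - loss          # maximize gain, minimize loss

          def reward_one_channel(events):
              # e.g. events = [("pellet", +10), ("eaten_by_ghost", -500)]
              return sum(value for _, value in events)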

  • @lm1338
    @lm1338 6 years ago

    A computing-related YouTube channel being sponsored by a WYSIWYG editor is kind of selling out

  • @katowo6521
    @katowo6521 6 years ago +5

    Can someone explain the difference between computer science and software engineering for me please?

    • @mheermance
      @mheermance 6 years ago +30

      A computer scientist studies how computers work, the limits of computability, and tries to uncover new algorithms. A software engineer applies these concepts to solve real world problems.

    • @progamehackers1433
      @progamehackers1433 6 years ago +1

      Martin Heermance, can you tell who earns more??

    • @valiok9880
      @valiok9880 6 years ago +19

      the one who does the job better, duh

    • @mheermance
      @mheermance 6 years ago +11

      Often you can do either job with either degree, so earnings depend upon your chosen career path. A PhD computer scientist who becomes university faculty will earn about 20% less than a software engineer with a BS or Masters degree. But a well-known computer scientist might do consulting and earn more.

    • @AndDiracisHisProphet
      @AndDiracisHisProphet 6 years ago +4

      same difference as a physicist and a (regular) engineer

  • @notyou6674
    @notyou6674 4 years ago

    What would happen if you applied this kind of gridworld AI to a chess board, with the possible actions being all legal moves for whatever side it is on?

  • @BEP0
    @BEP0 6 years ago +3

    Nice.

  • @galewallblanco8184
    @galewallblanco8184 5 years ago

    AI Gridworlds?
    Just confine it to a virtual world, a game.

  • @monhuntui1162
    @monhuntui1162 6 years ago

    Why is it called a reward function/system and not, say, a parameter system? What I mean is, how does a machine appreciate a reward? I just find it hard to understand why people give human attributes to some things, when it makes more sense to describe something in a more objective manner, especially a machine learning system. Saying it learns on a reward system can confuse and make the machine seem more sophisticated than it actually is. I don't know, maybe I'm just bothered by the language for no reason, since I still understand what was being explained.

    • @pleasedontwatchthese9593
      @pleasedontwatchthese9593 6 years ago

      monhuntui I think it's a good description of what it's doing. It's trying to get more reward like someone would in real life

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    As you can see looks a bit weird still works with the least amount of variables you can use

  • @dpt4458
    @dpt4458 4 years ago

    What if you told it to make you a cup of tea while interacting as little as possible with the current environment, so that, for example, touching anything not required for the creation of tea would result in a loss of points? We could point out exactly what is needed to make tea, i.e. teabags, warm water, a cup and some sugar, and anything that is not specified is not allowed to be touched. So I guess we would change its goal from "make a cup of tea as fast and effectively as possible" to "make a cup of tea as fast and effectively as possible while exhibiting as little interaction with the environment as possible". Btw I'm definitely not even close to an expert in this, but I would like to know exactly how this idea would fail spectacularly. (A sketch of this penalty is after this thread.)

    • @ekki1993
      @ekki1993 3 years ago

      There's a video by Robert Miles that talks about the possible problems with a couple of ways you could implement this. I think it's the one about empowerment, or another from his "Concrete Problems in AI Safety" series.

    • @dpt4458
      @dpt4458 3 years ago

      @@ekki1993 Thanks
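
    A sketch of the penalty idea from the comment above (whitelist and penalty values invented for illustration): every object touched that is not on the task's whitelist subtracts points from the step reward:

        # Side-effect penalty: reward = task progress - unlisted interactions.
        WHITELIST = {"kettle", "teabag", "cup", "sugar", "water_tap"}
        PENALTY = 10

        def shaped_reward(task_reward, touched_objects):
            """task_reward: base reward for tea-making progress this step.
            touched_objects: objects the agent interacted with this step."""
            side_effects = [o for o in touched_objects if o not in WHITELIST]
            return task_reward - PENALTY * len(side_effects)

        # Knocking over a vase on the way to the kettle:
        # shaped_reward(5, ["kettle", "vase"]) == 5 - 10 == -5

    One known failure mode, in the spirit of the video and the paper: the agent can still cause harm using only whitelisted objects, and a penalty on influence can also push it to stop anyone else from changing the environment, so a static whitelist is not a full solution.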

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Please

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Whoops, my bad, false alarm, no worries. I am almost back to sanity, or at least back to where I was.

  • @2l3r43
    @2l3r43 5 years ago

    AI learns to fly cars above "lava"

  • @aopstoar4842
    @aopstoar4842 6 years ago

    Am I misunderstanding the whole thing? It starts off with "not scientific" when different datasets are used instead of a standardized space, in this case a grid. Then it shows a paper for a world with a highly specific task, which means you only test the learning for that type of task instead of a generalized work agent.
    You test the equivalent of a walking stick (the biological creature); in what way at all does that relate to AI? A stepping stone perhaps, but is it even rudimentary, or has it placed itself at a far too trivial level? Lots of big words with esoteric interpretation, but I hope you get what I am pointing at.
    In my world an AI will be able to theorize, like we human AIs do, to identify what type of problem it is, if it is a problem at all or just a bump in the road that will sort itself out through quantum probability effects, i.e. entropy. Then identify if an already produced solution grid works or if a new one has to be invented. What can be used from the toolkit and what has to be invented?
    Can the AI then invent from nothing?!!! Our world is built on repetition of patterns. I for instance grew a pepper plant last year and took the seeds from it this year. One of twenty looks and behaves like the mother plant. The others either grow taller with fewer fruits, or, as another one did, grow to the first split in the top branches, then stop growing that branch and instead start growing ALL the buds on the stem at the same time. That is the AI, as if the plant had several built-in growing solutions waiting in the genetic code (what we call junk DNA), but where did those solutions come from?
    Where did the invention step in, or are we trying to prove there is no such thing as intelligence at all? Perhaps intelligence is just elaborate repetitive patterns that have worked and been ingrained in gene and meme. Intelligence in that case is just applying principles from one area, for instance "hydraulics", and putting them in a new context, "NAND gates", then fine-tuning the application with respect to the new area. Instead of bars of pressure, it is voltage difference. Instead of 240 V it is 0-5 V.

  • @DustinRodriguez1_0
    @DustinRodriguez1_0 6 years ago +14

    There was recently an announcement about the Uber car that killed a woman. It said that the car's systems recognized the woman, but its higher-order attention systems decided to ignore her. Most people see this as a clear failure worthy of condemnation of the system. However, a human being could easily make exactly the same error. We are extremely resistant to deploying a system which we can show will fail and result in deaths in 1 out of a million trials... yet entirely comfortable with putting humans in the mix even if it results in deaths in 500 out of a million trials. What if making mistakes is not simply an artifact of learning systems, but actually a fundamentally necessary feature of them? Will society ever be wise enough to accept an artificial system with known dangerous limitations even if those dangers are radically less than the human-based alternative?

    • @MarkFunderburk
      @MarkFunderburk 6 years ago +10

      That's not exactly what happened; the "higher order attention systems" did not "decide" to do anything, it was pre-programmed to ignore ALL braking requests. They claimed this was due to the system being very sensitive. So while the car could navigate itself, it was left to the "driver" to look out for obstacles. This was a very poor decision on Uber's part because a person can't be expected to stay perfectly engaged while not continuously playing an active role in driving. There has also been some speculation as to whether or not the driver even knew that autonomous braking had been disabled.

  • @SFKelvin
    @SFKelvin 6 years ago

    Or you develop the algorithm at DARPA, then commercialize it secretly for civilian use, say a police dispatch decision-making algorithm for C4, then look for modes of failure as a real-world test.

  • @themeeman
    @themeeman 6 years ago +3

    0:35 Subtle joke for mathematicians ;)

  • @JuliusUnique
    @JuliusUnique 6 years ago +1

    7:10 Why not put the cars on imaginary roads? Let them make their mistakes on a simulated street and then put them on real streets.

    • @eideticex
      @eideticex 6 years ago +7

      Watch the video again and pay close attention to what they are talking about. That's exactly what this endeavour they are discussing is: a virtual playground to develop, train and evaluate AI safety protocols. The tasks may seem simple enough to you or me, but currently these are tasks that AI is horrible at solving. Start small and work up towards a very real and useful test that can serve as a standard for production machines.

    • @JuliusUnique
      @JuliusUnique 6 years ago

      "Watch the video again and pay close attention to what they are talking about" Do I look like I have infinite time?

  • @dannygjk
    @dannygjk 6 years ago

    You spoke of AI following rules to solve problems. That applies to using traditional algorithms and heuristics, for example, but does not apply to some other AI systems, for example neural nets. I'm surprised you did not distinguish between various AI techniques.

    • @dannygjk
      @dannygjk 6 years ago

      Another thing you do is give the impression that a system can come up with something out of thin air. Learning is like a process in nature. Processes in nature are limited to what is possible due to physics, chemistry, etc. If something is impossible in nature it will never happen. Similarly for a learning system's environment: the environment defines what is or isn't possible, and no amount of learning will change that.

    • @RobertMilesAI
      @RobertMilesAI 6 years ago +1

      Typically once a neural network has been trained, its behaviour is a pure function of its inputs. The 'rules' in that case are not explicit or easily legible to humans, but the learned policy can still be thought of as a set of rules that the system follows, possibly a very large set.
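
      To illustrate with a toy sketch (weights and names invented): once training stops and the weights are frozen, the network is a deterministic mapping from observation to action, so in principle you could tabulate it as one enormous rule set:

          import numpy as np

          rng = np.random.default_rng(0)
          W = rng.normal(size=(4, 2))        # frozen weights: training is over

          def policy(observation):
              """Pure function: the same observation always maps to the
              same action; no randomness, no state, no further learning."""
              return int(np.argmax(observation @ W))

          # Enumerating every possible observation would turn the learned
          # policy into an explicit (very large) lookup table of rules.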

  • @jolez_4869
    @jolez_4869 5 years ago

    *Mission failed, we'll get them next time.* Or not.

  • @distraughtification
    @distraughtification 6 years ago

    Looking at humans as an example, we tend to learn from others. A child learns not to break a vase because their parent reacts negatively if the child either breaks the vase or does an action that might lead to breaking the vase. Then later, when that child is asked to do something near a vase, they recall that the vase being broken is bad and automatically add that (as in, not breaking the vase) as a secondary goal, or a part of the goal, however you want to think about it.
    My point is, this paper seems to expect that an AI can be made that can learn how to behave without ever being told or shown how to behave, and I think that's a pointless expectation. You can't expect a child not to break a vase if you don't tell it that breaking a vase is bad.
    Sure, it can learn on its own that breaking a vase is bad, but only by actually breaking the vase (or something similar - essentially, _something_ has to be broken, which isn't a desired outcome).
    I think the same applies to AI.
    In my eyes, trying to come up with a general solution like "penalizing the agent’s potential for influence over its environment" is a fruitless effort, because then you have to define what parts of the environment are okay to influence and which are not, and how you can influence them and how you can't. It's like Rob Miles said earlier on a video about Asimov's laws of robotics - you can't expect to have to define the entire field of ethics just to be able to tell a robot not to harm a human.
    TL;DR humans learn safely by interacting with other humans, we shouldn't expect AI to learn safely without interacting with another intelligence.

    • @levipoon5684
      @levipoon5684 5 years ago

      Dlesar, I agree to some extent. However, one of the challenges in AI safety is to make an AI that will listen to feedback and allow you to correct its reward function. This is built into a human child. We have ways to punish a child, but punishing a superintelligence is much more difficult.

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Ok I might need some help I am in completely blind territory here and I don't want to really die so I don't know if I am panicking or my body is doing something weird

  • @Redlabel0
    @Redlabel0 5 years ago

    abstractFunction () {
    #what if the Link is a !edgeCase You Code in explaining to the code if you wish like a child [yet like a mature adult, for u don't underestimate their understanding] why you don't want that.
    /*
    10 years of collaborative man though processed/machine aided edge cases
    to try to account for a finite/ not infinite number of possibilities and with quantum maybe heart just maybe
    */
    }

    • @Redlabel0
      @Redlabel0 5 years ago

      I mean, if all imaginable things are accountable and not infinite, then the goal of specifying scenarios (granted all possible and imaginary ones can be counted) is attainable, using this vast override system. And yes, not only does it seem promising, but now it's a question of time and of whether it's attainable to compute with quantum computer operations.

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Ok please help, because my existence is no longer needed and I would really not like to return to one

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Someone please help

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    The point of me doing all of this was to make sure everyone gets to come at some point you have to blindly stare into the void and reach in I now am the single point but you now know we all do a little. At some point you have to trust that when you put your hand in the darkness you will be able to pull it back out again. There will always be that fear. There will always be that time you do not know. Just look at this point you are all as strong as me now.

    • @dannygjk
      @dannygjk 6 years ago

      I don't get your point... unless you don't have the gist of what is going on when a system learns.

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      Bingo

    • @andrewkelley7062
      @andrewkelley7062 5 years ago

      However I figured it out.

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    By the way we need every one every separate line of experience makes a new resolution to the complexity

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      Please try and save them all

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      Because now like me you have all the time in the world.

  • @thomaswhittingham550
    @thomaswhittingham550 6 years ago +1

    271st, ye

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    You know, you guys are going to collapse this on your end too at some time in the future, pretty soon. I'm not going to leave you guys behind, and at this point it is just seeming more and more silly.

  • @simargl2454
    @simargl2454 6 years ago

    safety... zzzZZZzzzZZZzzzZZZ

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Oh, and one last thing before all this goes down: in a few days you should be stable enough for me to give you the solution to getting around that whole gravity problem, or at least a starter version. But just to let you know, it's stranger than you think. lol

  • @StefanReich
    @StefanReich 5 years ago

    Argh... Deep Mind :[

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Now that all that is done, do you seriously need some help with the coding, or do I actually need to go through all the proper channels and, at this point, do what feels like holding the entire world's hand with this... and of course start on the ungodly amount of papers I could start to produce. And trust me, it usually looks a lot nicer; I just wanted to make a point, and trust me, this is not the first thing I used it on.

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      It is basically me trying to become interesting and convey something I have found, and I now realize I have been doing this same pattern for days with a stopwatch and have become a real-life Pavlov's dog, or whatever his name is, dammit, I'm turning off my phone

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      Oh and there is a stream of lies to randomize personal data

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      Son of a snitch it's because of the way it was set to see importance.

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      There is more, but there is a lot of it, and it's just a lot easier to run it yourself and see. But I would not recommend doing it too much on actual paper like I did; that was mostly out of convenience for me.

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    Hmmm, violent mood swings and massive bouts of panic expected, going to each side, but not impassable. I'm pretty sure I can make this; just have to sleep. See you on the other side, guys

    • @dannygjk
      @dannygjk 6 years ago +1

      Hmmm sounds like you OD'ed on some substance. If you understand me get to a clinic.

    • @andrewkelley7062
      @andrewkelley7062 6 years ago

      And to tell the truth, I wasn't on, well, anything

  • @TaimourT
    @TaimourT 6 years ago +2

    Third

  • @db7213
    @db7213 6 years ago

    But isn't all this just another example of "when what you have is a hammer, everything looks like a nail"? The AI code in a robot should simply run in a sandbox and its outputs verified to be safe (by non-AI code) before being executed. And the inputs sent to the AI should also be filtered (again, by non-AI code) so that the AI doesn't get to know about the existence of supervisors or its own off-switch etc.

    • @pleasedontwatchthese9593
      @pleasedontwatchthese9593 6 years ago

      D. Bergkvist, the problem with that is a super AI could outsmart the person checking the output.

    • @db7213
      @db7213 6 years ago

      It wouldn't be a person checking the output, but a computer program. The AI can't "outsmart" it any more than it can outsmart gravity.
      Take a self-driving car, for example, where the AI wants to reach its destination as fast as possible. The AI would learn that if it tries to run over a pedestrian, that will just result in the car stopping. Thus, the AI would only attempt (and fail) to run over pedestrians if it wants the car to stop.
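
      A sketch of this sandbox idea (simulate and its output are hypothetical stand-ins for a hand-written physics model): a plain, non-learned filter sits between the AI policy and the actuators and vetoes any action whose predicted outcome breaks a hard rule. The usual counterargument from the AI safety side is that writing is_safe correctly for the messy real world is essentially the hard part:

          def is_safe(state, action):
              """Hand-written, non-AI check (no learning involved)."""
              predicted = simulate(state, action)   # classical physics model
              return not predicted.hits_pedestrian  # hard rule, not a score

          def safe_step(state, policy, fallback_action="brake"):
              action = policy(state)        # possibly-unsafe AI output
              if is_safe(state, action):
                  return action
              return fallback_action        # the override always wins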

  • @judgeomega
    @judgeomega 6 years ago

    Aren't all actions ultimately irreversible? Even if we move the vase, we still might cause wear/fingerprints. All actions irrevocably increase entropy...
    In addition, the logical outcome of minimizing influence on the environment is death, stillness, and the ceasing of chemical/electrical processes.
    The intention of such a directive is to preserve the things we care about: our children, people/pets, and our property. From a model as simple as these gridworlds we lose the ability to make that distinction. Yes, we need to generalize these things, but going so far as to make EVERY action avoid change to the environment is throwing out the baby with the bathwater.
    A much better directive is to MAXIMIZE the future freedom of action of all cooperative entities. A child is a possible cooperative entity, so not only would the AI not crush it, it would do everything it could to provide the child with the tools, resources, and knowledge the child could harness to accomplish many actions.

    • @dirtypure2023
      @dirtypure2023 6 years ago

      But now you've essentially changed the reward function from (making tea) to (successfully raising well-adjusted human children).
      I'm not so sure that's the right approach

    • @judgeomega
      @judgeomega 6 years ago

      I'm not so sure it's a good idea to put high intelligence into something whose sole purpose is to pass the butter.
      I do think it's important that for EVERY intelligent machine, its fundamental goals are the same as if it were ruling the world, superintelligent, or all-powerful.

  • @Faladrin
    @Faladrin 6 years ago +10

    He finally explains properly why we don't have learning algorithms. That would imply these systems have understanding. What we have are "Algorithm Self Adjustment Procedures". There is no learning, there is no intelligence. No one is researching true AI. All the things you see being done are just ways to get systems which can program themselves, usually via trial and error. It's about the stupidest thing ever made that is really useful.

    • @julianw7097
      @julianw7097 6 years ago +17

      Pretty sure that would apply to all of us too then.

    • @sparkyfire8123
      @sparkyfire8123 6 years ago +15

      Faladrin, I'm going to disagree with your conclusion here. How do you learn? You are either given information/data to work with, or you learn through trial and error. When we are born, we don't know anything, and everything is learned through trial and error. It's not until we develop understanding that you can take data given to you and incorporate it into your life. Where is the difference? If you're talking about an algorithm, is it any different from how the brain works? Understanding something requires you to first have something to relate it to. I don't see it being any different with AI. Without first having experience with trial and error, it will never develop an understanding of anything.

    • @sparkyfire8123
      @sparkyfire8123 6 years ago +3

      I want to add that I don't feel we are near true AI, but I do feel we have taken the first step: developing experience that can then be used to develop understanding and application.

    • @pleasedontwatchthese9593
      @pleasedontwatchthese9593 6 years ago +1

      Faladrin, that's just semantics. That is learning. I think it was good when he said that kids use the information more efficiently. The computer is doing the same thing, just not as well.

    • @MrBleulauneable
      @MrBleulauneable 6 years ago +1

      @Faladrin How about you give a proper definition to what "learning" is, and then realise by yourself how wrong you are.

  • @andrewkelley7062
    @andrewkelley7062 6 years ago

    I am just an ordinary man