Natasha Jaques
Joined Oct 5, 2011
Social Reinforcement Learning talk at RLDM
Social learning helps humans and animals rapidly adapt to new circumstances and coordinate with others, and it drives the emergence of complex learned behaviors. What if it could do the same for AI? This talk describes how Social Reinforcement Learning in multi-agent and human-AI interactions can address fundamental issues in AI such as learning and generalization, and improve human-AI interaction. I demonstrate the difference between social learning and imitation learning, and show that when agents can learn how to socially learn from experts, they can generalize at test time to environments fundamentally different from those they experienced during training. I then present a method for selectively learning who and what to imitate by computing when following other agents’ policies would pay off under the learner’s own preferences. This technique, called PsiPhi-Learning, is a step toward enabling more human-like social learning. In the last part of the talk I discuss early work on training language models with human feedback, focusing on implicit cues present in the text itself, rather than manually curated binary labels. Together, this work argues that Social RL is a valuable approach for developing more general, sophisticated, and cooperative AI.
Views: 910
Videos
Badly trained policy after 40000 steps
879 views, 9 months ago
Multi-agent DQN training step 90000 trajectory video
273 views, 9 months ago
Multi-agent DQN training step 0 trajectory video
147 views, 9 months ago
Learning to grab with bell as reward
1.3K views, 9 months ago
Is reinforcement learning how babies learn?? This video shows clips that take place over the course of 3 months as baby Nathan learns to grab. Starting from random wiggling, he appears to use trial and error learning to learn to control his limbs to hit the bell and make it ring (bell acts as a reward shaping term). Eventually, he learns to grab the object.
Intel Deep Learning Community of Practice talk
5K views, 3 years ago
Social learning helps humans and animals rapidly adapt to new circumstances and coordinate with others, and it drives the emergence of complex learned behaviors. What if it could do the same for AI? This talk focuses on Social Reinforcement Learning, which leverages social learning and affective computing to enhance coordination, learning, generalization, and human-AI interaction. To improve coordina...
Natasha Jaques PhD Thesis Defense
786K views, 3 years ago
Presentation of my thesis "Towards Social and Affective Machine Learning" natashajaques.ai/publication/social-and-affective-machine-learning/
Personalized Multi-task Learning for Predicting Tomorrow's Mood, Stress, and Health
1.6K views, 3 years ago
This brief video describes our paper "Personalized Multi-task Learning for Predicting Tomorrow's Mood, Stress, and Health", which was selected as Best of Collection in the journal Transactions on Affective Computing. For more information, see: natashajaques.ai/publication/personalized-multitask-learning-for-predicting-tomorrows-mood-stress-a/
VHRED Cornell baseline
996 views, 5 years ago
Baseline Variational Hierarchical Recurrent Encoder Decoder trained on Cornell movie dialogs. Video shows interface of neural.chat
Influence agent in Harvest game
920 views, 6 years ago
Agents trained with influence are able to coordinate their actions to not exhaust the apples in this tragedy-of-the-commons game. This represents the best random seed over an extensive hyperparameter search.
A3C baseline in Harvest
434 views, 6 years ago
Selfish A3C agents consume all apples in this tragedy-of-the-commons game, so they are quickly exhausted. This represents the best random seed over an extensive hyperparameter search.
Agent trained with intrinsic social influence reward - Tragedy of the Commons
538 views, 6 years ago
The purple agent has been trained with an intrinsic reward for causally influencing other agents. Weight on the influence reward is .211.
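As a rough illustration of what a weighted intrinsic reward like this might look like, the sketch below adds an influence term to the environment reward. Only the 0.211 weight comes from the description above; the KL-based influence measure, the function names, and the example numbers are assumptions made for illustration and may not match the setup used in the video.

```python
import math

# Weight quoted in the video description; everything else here is hypothetical.
INFLUENCE_WEIGHT = 0.211


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists of probabilities.

    Assumes q is nonzero wherever p is nonzero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def influence_reward(conditional_policy, counterfactual_policies, my_action_probs):
    """Assumed influence measure: how far the other agent's policy, given the
    action actually taken, is from its policy averaged over the actions the
    influencer could have taken instead.

    conditional_policy: other agent's action probabilities given my actual action.
    counterfactual_policies: one such distribution per action I could have taken.
    my_action_probs: my own probability of taking each of those actions.
    """
    n_actions = len(conditional_policy)
    marginal = [
        sum(w * policy[a] for w, policy in zip(my_action_probs, counterfactual_policies))
        for a in range(n_actions)
    ]
    return kl_divergence(conditional_policy, marginal)


def total_reward(extrinsic, influence, weight=INFLUENCE_WEIGHT):
    """Environment reward plus the weighted intrinsic influence term."""
    return extrinsic + weight * influence


# Example: the other agent's behaviour depends strongly on my action.
influence = influence_reward(
    conditional_policy=[0.9, 0.1],
    counterfactual_policies=[[0.9, 0.1], [0.1, 0.9]],
    my_action_probs=[0.5, 0.5],
)
reward = total_reward(extrinsic=1.0, influence=influence)
```

Under this assumed measure, an agent whose actions make no difference to the other agent's behaviour receives zero intrinsic reward, so only the environment reward remains.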
Agent trained with intrinsic social influence reward
278 views, 6 years ago
The purple agent receives intrinsic reward for causally influencing other agents.
A3C will not free other agent trapped in a box
251 views, 6 years ago
Influence agent frees compatriot trapped in a box
323 views, 6 years ago
Affective Computing - Spring 2015 Virtual Visit
7K views, 9 years ago
5 Lego Robots Dancing to Gangnam Style
2.8K views, 11 years ago
Hello Dr. Natasha. Your thesis defense was perfect... Good job
Recommended.
Awful system. This should not be a requirement at all. No one should be forced to present and defend in public. Good thing I dropped out of my phd, what a waste of time.
Amazing presentation
This is a lecture.
Why does this have so many views?
Great learning options here.
✅
Nice presentation ❤❤
Your presentation was outstanding, delivering complex information clearly and engagingly. It captivated the audience with its insightful content and seamless flow.
Dear Natasha that was superb❤🎉
Great. You can be my gf
Oh wow! I am about to enter the second year of my master's program. To be honest, I feel a sense of doubt, like do I even know where I am and what I am required to be doing, but after watching this video there's been some level of excitement and some relief. Thank you so much, Dr. Natasha Jaques. Perhaps you could also help me get clarity in this field of machine learning, as I am lost.
Hello Natasha, I first found your work 'Social Reinforcement Learning' while looking for related literature for my Masters Dissertation. The video of your PhD defense encouraged me to virtually and socially learn from your methods while you were presenting your thesis. I'm interested in the application of Artificial Intelligence in Mechanical Engineering Production Processes.
Hi... what problem are you working on to apply AI in production process?
This is my second cousin
Hi Luke!
@natashajaques4478 hi
Wow, this must be very important, as DeepMind guys were part of the committee. These guys are doing crazy stuff with artificial intelligence and machine learning, like AlphaGo etc.
I wonder if there will be different results for different races, like Asians? We are very expressive in showing our emotions.
Hello Very great. Can I have your email.
Booooooo. This is lame. Quit following trends. Is p != np? Or what. This is just trend following bullshit. Dislike.
I’m confused here. When I got my Masters degree I had to present a Thesis and defend it. I thought that the research for a PhD was called a “dissertation”, not a “thesis”
You're right, partially. But you see, when you talk about a thesis, you have to clarify which degree you're talking about (MA or PhD). When you say "dissertation", no further clarification is needed.
Great research work in AI Social Learning from Humans. Congrats. Best wishes for your future endeavors !
the hell am i looking at
I am too ignorant to understand this. As a beginner in AI I will work hard until I can add something or at least follow this impressive presentation. Excellent work.
I don't buy into this, she's like a robot that has learned all of what they taught her at heart. FAKE!
Im here for the boobs
0:10 the camera was zoomed because of one reason -Have a good day-
Please buy tteokguk (rice cake soup) 😢
Our defense in the UK was by an external and internal examiner in a viva. It was a closed session.
And so it is mostly the case these days. A university's community usually doesn't have a clue that somebody is getting their PhD - because the defence is not open to the public.
Is it like SLAM algorithm?
This is incredible work Doctor, I find this subject so interesting! I am a junior computer science and stats major and all of this is my focus!
Wow, you're young, attractive, rich, entitled, and completely useless. So proud of u!
That is a great presentation, you have demonstrated expertise in your area of research. I also like the way you keep the audience engaged
Interaction between man and machines is a good, interesting subject. But it has more devastating consequences than benefits if not used in the right way. Scientists are trying to develop an artificial human, but without human action a machine can't respond from its own wisdom. In conclusion, it will later either be an utter failure or create such havoc in human society that it would be the cause of human extinction.
Are you married? Lol
Humbly Asking , Will You marry Me?
Rules of the WTC, WTO, Pentagon?? 600 years with no answer, 2024. 600 years later there are no people, 2024. What action will the WTC, WTO, Pentagon take in April 2024???😀😀🙏🙏🙏🙏🙏🙏🙏🙏
Marry me !
u go girl, impressive presentation
Very nice
Very nice
Ended on a passive note.
didn't understand a word but you sounded amazing, congrats XD
very well
Mankind will destroy itself by endlessly creating and solving problems