MIT Embodied Intelligence
Felix Yanwei Wang - Inference-Time Policy Customization Through Interactive Task Specification
Title: Inference-Time Policy Customization Through Interactive Task Specification
Abstract:
Imitation learning has driven the development of generalist policies capable of autonomously solving multiple tasks. However, when a pretrained policy makes errors during deployment, there are limited mechanisms for users to customize its behavior. While collecting additional data for fine-tuning can address such issues, doing so for each downstream use case is inefficient at scale. My research proposes an alternative perspective: framing policy errors as task mis-specifications rather than skill deficiencies. By enabling users to specify tasks unambiguously at inference-time, the appropriate skill for a given context can be retrieved without fine-tuning. Specifically, I propose (1) inference-time steering, which leverages human interactions for single-step task specification, and (2) task and motion imitation, which uses symbolic plans for multi-step task specification. These frameworks correct misaligned policy predictions without requiring additional training, maximizing the utility of pretrained models while achieving inference-time user objectives.
Biography:
Felix Yanwei Wang is a final-year PhD candidate in Electrical Engineering and Computer Science (EECS) at MIT, advised by Prof. Julie Shah. His research focuses on adapting pretrained manipulation policies for human-robot interaction. He earned his Bachelor's degree from Middlebury College and his Master's degree from Northwestern University. He has also worked under the guidance of Prof. Dieter Fox at the NVIDIA Robotics Lab. Felix is a recipient of the MIT Presidential Fellowship and the Work of the Future Fellowship in Generative AI at MIT. His research has been recognized with oral and spotlight presentations at CoRL and ICLR, featured on PBS, and is currently exhibited at the MIT Museum.
Views: 232

Videos

Sharut Gupta - Redefining Context for Powerful Test-Time Adaptation Using Unlabeled Data
375 views · 21 days ago
Title: Redefining Context for Powerful Test-Time Adaptation Using Unlabeled Data
Abstract:
Foundation models, while powerful, often struggle under distribution shifts in unfamiliar domains, typically requiring costly data collection and retraining to maintain performance. Test-Time Adaptation (TTA) has emerged as a promising approach to address these limitations, enabling models to adapt dynami...
EI Seminar - Jason Ma - Recent Progress on Foundation Model Supervision for Robot Learning
2.1K views · 1 month ago
Title: Recent Progress on Foundation Model Supervision for Robot Learning
Abstract:
Achieving general-purpose robotics requires robots to quickly learn diverse tasks without extensive training data or hand-engineered controllers for each scenario. While recent efforts in crowd-sourcing robot datasets have expanded available training data, these remain orders of magnitude smaller than datasets u...
EI Seminar - Tim Dettmers - The Promises and Pitfalls of Open-source Agent Systems
374 views · 1 month ago
Title: The Promises and Pitfalls of Open-source Agent Systems
Abstract:
Agent systems, AI systems that make their own plans and act on them, have shown promising results, particularly on coding challenges such as SWE-bench. However, most agent systems currently rely on closed-source API models such as GPT-4o and Claude, as it is believed that open-source models do not have the capabilities to make...
EI Seminar - Jaime Fernández Fisac - Games and Filters: A Road to Safe Intelligence
239 views · 1 month ago
Title: Games and Filters: A Road to Safe Intelligence
Abstract:
Despite their growing sophistication, autonomous systems still struggle to operate safely in uncertain, open-world situations, as highlighted by public skepticism toward early automated driving technologies. Meanwhile, the excitement around generative AI has been tempered by concerns about potential harms from poorly understood huma...
EI Seminar - Mengzhou Xia - Aligning Language Models with LESS Data and a Simple (SimPO) Objective
568 views · 2 months ago
Title: Aligning Language Models with LESS Data and a Simple (SimPO) Objective
Abstract:
Aligning pre-trained language models ensures they follow human instructions reliably to produce helpful and harmless outputs. Supervised fine-tuning and preference optimization are two key approaches for achieving this goal. In this talk, I will introduce two novel algorithms designed to enhance these two st...
EI Seminar - Danny Driess - Have Large Models Changed Robotics?
2.4K views · 2 months ago
Title: Have Large Models Changed Robotics?
Abstract:
In this talk, I will give perspectives on how large models have changed robotics, and why there is still fundamental research to be done. The main focus of the discussion is how we can achieve generalization in robotics. More traditional methods from Task and Motion Planning (TAMP) are capable of solving complex sequential manipulat...
EI Seminar - Mahi Shafiullah - On Building General, Zero-Shot Robot Policies
5K views · 2 months ago
Title: On Building General, Zero-Shot Robot Policies
Abstract:
Robot models, particularly those trained with large amounts of data, have recently shown a plethora of real-world manipulation and navigation capabilities. Several independent efforts have shown that given sufficient training data in an environment, robot policies can generalize to demonstrated variations in that environment. Howeve...
EI Seminar - Paul Liang - Foundations of High-Modality Multisensory AI
828 views · 2 months ago
Title: Foundations of High-Modality Multisensory AI
Abstract:
Building multisensory AI that learns from text, speech, video, real-world sensors, wearable devices, and medical data holds promise for impact in many scientific areas with practical benefits, such as supporting human health and well-being, enabling multimedia content processing, and enhancing real-world autonomous agents. However, m...
EI Seminar - Ge Yang - Learning Robust, Real-world Visuomotor Skills from Generated Data
330 views · 3 months ago
Title: Learning Robust, Real-world Visuomotor Skills from Generated Data
Abstract:
The mainstream approach in robot learning today relies heavily on imitation learning from real-world human demonstrations. These methods are sample efficient in controlled environments and easy to scale to a large number of skills. However, I will present algorithmic arguments to explain why merely scaling up imi...
EI Seminar - Naomi Saphra - Interpreting Training
326 views · 9 months ago
Title: Interpreting Training
Abstract:
For years, both learning theory and empirical science of deep learning used multipass training on small image classification datasets as a primary testbed and source of inspiration. As a result, our understanding of models and training has largely taken the form of smooth, simple, and continuous laws. Recently, the machine learning community has begun cons...
EI Seminar - Jacob Andreas - Good Old-fashioned LLMs (or, Autoformalizing the World)
1.3K views · 10 months ago
EI Seminar - Oriol Vinyals - The Deep Learning Toolbox: from AlphaFold to AlphaCode
2K views · 2 years ago
EI Seminar - Recent papers in Embodied Intelligence I
787 views · 3 years ago
EI Seminar - Deepak Pathak - Rapid Adaptation in Robot Learning
2.3K views · 3 years ago
MIT EI Seminar - Ted Adelson - Good tactile sensing lets robots do cool things
1.6K views · 3 years ago
MIT EI Seminar - Pulkit Agrawal - The Task Specification Problem
2.2K views · 4 years ago
MIT EI Seminar - Lerrel Pinto - Diverse data and efficient algorithms for robot learning
1.5K views · 4 years ago
MIT EI Seminar - Max Welling - Learning equivariant and hybrid message passing on graphs
7K views · 4 years ago
MIT EI Seminar - Pierre-Yves Oudeyer - Developmental Machine Learning, Curiosity and Deep RL
1.3K views · 4 years ago
MIT EI Seminar - Hector Geffner - Model-free, Model-based, and General Intelligence
1.5K views · 4 years ago
MIT EI Seminar - Russ Tedrake - Feedback control from pixels
3.7K views · 4 years ago
MIT EI Seminar - Stefanie Tellex - Towards Complex Language in Partially Observed Environments
888 views · 4 years ago
MIT EI Seminar - Laura Schulz - Curiouser and curiouser: why we make problems for ourselves
1.7K views · 4 years ago
MIT EI Seminar - Phillip Isola - Emergent Intelligence: getting more out of agents than you bake in
3.1K views · 4 years ago

Comments

  • @许子琪-x8b · 1 day ago

    different tasks bring different learning mode boundaries, how can we ensure that the robot can classify every task's boundary when meet a multi-task or a multimodal problem?

  • @harmeshkumar1748 · 22 days ago

    The work is highly interesting. It has a lot of future. The Confidence of the candidate😊 is very high.... 😊😊😊

  • @harmeshkumar1748 · 22 days ago

    Excellent talk....

  • @julienhovan5725 · 28 days ago

    Incredible research and vision for future

  • @shybry3371 · 1 month ago

    🙏🏻

  • @jpmitchell925 · 1 month ago

    This should have more views!

    • @shybry3371 · 1 month ago

      Unfortunately the algorithm doesn’t favor this content. Hopefully engagement helps boost this content.

  • @faaf42 · 2 months ago

    It would be great if you could repeat the asked question (11:13:16). Now it's a game of Jeopardy to guess the question. Thanks for sharing anyways! Lab tour fly was nice too.

  • @jazzvids · 4 months ago

    Thank you for this! presentation starts 19:33 :)

  • @the_engineer97 · 6 months ago

    This is so intriguing. I am so inspired

  • @LeoTX1 · 8 months ago

    Thanks!

  • @herbertarnold6372 · 9 months ago

    Promo-SM 😕

  • @LeoTX1 · 9 months ago

    It's a good presentation. Very useful for me! Thanks a lot!

  • @jeremydy3340 · 9 months ago

    Talk starts 32:57

  • @Shintuku · 10 months ago

    Does this presentation correspond to some paper? It would be nice to have access to the slides/citations, very interesting stuff

  • @jeremydy3340 · 10 months ago

    Talk starts at 16:30

  • @hansbleuer3346 · 11 months ago

    Interesting explanation

  • @CandidDate · 1 year ago

    I think a sense of humor in robotics would lead to clownish appeal.

  • @DhruvMetha · 1 year ago

    Starts at 15:25

  • @aennmatyasbarra-hunyor5506 · 1 year ago

    Great one, thank you! I would like to be part of it. One day it will be possible.

  • @JosephHeck · 1 year ago

    Content actually starts at 17:30, and the speaker's audio starts at 18:30

  • @araldjean-charles3924 · 1 year ago

    For the initial conditions that work, has anybody looked at how much wiggle room you have? Is there an epsilon-neighborhood of the initial state you can safely start from, and how small is epsilon?

  • @cbasile22 · 2 years ago

    is there any formal course that covers multi agent RL? I find it confusing thus far. Thanks!

  • @AngeloKrs878 · 2 years ago

    1:07 subtitles "my experience with drugs couldn't be better"

  • @bhaskartripathi · 2 years ago

    I was always confused by MA-MDP. You made it look very simplistic. Mathematical notations were very concise and research paper ready.

  • @keeperofthelight9681 · 3 years ago

    how to do convolution lstm and other things on JAX more tutorials please sir Mathew Johnson

    • @kshitijshekhar1144 · 2 years ago

      flax is a high level nn library built on top of Jax, check out its documentation. It's a very new library, built for flexibility. And you can make a mark in that by making PRs

  • @ImtithalSaeed · 3 years ago

    u and a confuse me

  • @georgemu7464 · 3 years ago

    Very insightful

  • @devjaiswal1685 · 3 years ago

    Thank you sir

  • @adamantinebipartite4732 · 3 years ago

    Nazi.

  • @LB-fx1kx · 3 years ago

    Great work!

  • @iandanforth · 3 years ago

    Really enjoyed the presentation. The 'Puzzle' slide is problematic. All three have 'lots of wiring', the camera has smaller wires in a better package.

    • @syedshahid8316 · 1 year ago

      I live in karachi Pakistan I like your

  • @p.z.8355 · 3 years ago

    How do you linearize the KG without getting into exponential complexity ?

  • @p.z.8355 · 3 years ago

    How do you combine selfsupervised learning with declarative knowledge ?

  • @pakistanbtsarmy2625 · 3 years ago

    👌

  • @ImtithalSaeed · 3 years ago

    Which book I can refer to

  • @f150bc · 3 years ago

    The Diehold Foundation, along with the Suspicious Observers, is pushing a 12-thousand-year cycle of supernova and magnetic reversal that brings a catastrophic event. Please debate them on their theories; they have tens of thousands of people following them. I fear that the theory might be partially right. Find them on YouTube under those names. Please look into this. Thanking you in advance, Carl.

  • @fredxu9826 · 3 years ago

    This is a great talk. Personally I haven't had the prerequisite for manifold learning, but the idea behind hybrid message passing is quite profound. Just wondering: if you have a Bayesian GNN where the prior encodes the linear assumption, would that be equivalent to the GNN + PGM model presented here? Or is there a limit to the expressiveness of a Bayesian prior?

  • @thanasisk · 4 years ago

    Great talk, thank you for uploading.

  • @DistortedV12 · 4 years ago

    Amazing work

  • @wgharbieh · 4 years ago

    Talk starts at 6:00

  • @harrysaini7702 · 4 years ago

    Can we get the PPT plzz.....??

  • @alinouruzi5371 · 4 years ago

    good

  • @TheRcCrazyFan · 4 years ago

    Starts at 12:08

  • @AvindraGoolcharan · 4 years ago

    Starts around 7:03

  • @viktoriyat1815 · 4 years ago

    this was amazing, thank you so much for uploading it!!

  • @Mefaso09 · 4 years ago

    Starts at 11:20

    • @mitembodiedintelligence8675 · 2 years ago

      Thank you! I have updated the video so that it starts playing from the very beginning! -Ge

  • @AndersonSilva-dg4mg · 4 years ago

    thank you for sharing information

  • @dimwitquack · 4 years ago

    excellent

  • @SaifUlIslam-di5xv · 4 years ago

    Here from Reddit. (Y)