Andrej Karpathy: Tesla Autopilot and Multi-Task Learning for Perception and Prediction

  • Published on Jun 8, 2024
  • Clips from Andrej Karpathy's talk at ICML (June 2019). I think multi-task learning is one of the most important (and understudied) subfields of machine learning. Most real-world problems are multi-task. I find the discussion on team workflow especially fascinating (see 18:25). I've been thinking about and working on this topic a lot lately, and will probably give a lecture on it. Here's the outline:
    0:00 - Sensors
    0:29 - Single-task learning challenges
    4:35 - Multi-task neural network architecture
    11:50 - Loss function considerations
    14:34 - Training dynamics
    18:25 - Team workflow
    Full talk:
    slideslive.com/38917690/multi...
  • Science & Technology

Comments • 44

  • @LexClips 4 years ago +32

    Clips from Andrej Karpathy's talk at ICML (June 2019). Here's the outline:
    0:00 - Sensors
    0:29 - Single-task learning challenges
    4:35 - Multi-task neural network architecture
    11:50 - Loss function considerations
    14:34 - Training dynamics
    18:25 - Team workflow

    • @robosergTV 4 years ago

      is this the full version?

  • @ghostlv4030 3 years ago +2

    What a presentation! Andrej Karpathy's talk is always full of insights while being enjoyable.

  • @safekidda46 4 years ago +36

    I get the feeling Andrej is giving academia a kick up the arse to start working on some of these problems.

  • @akarmdit2267 4 years ago +1

    indeed humanity at its finest kudos

  • @RoryFrenn 2 years ago

    Excellent presentation, it gave me good intuition on MTL

  • @johnlucich5026 6 months ago

    ANDREJ; KEEP UP ALL YOUR GOOD WORK WHEREVER YOU GO

  • @hchattaway 3 years ago

    I enjoy listening to insanely smart people... I don't mind that he talks fast; it helps keep me focused.

  • @HeavyDist 3 years ago +6

    Quoting Elon Musk about Andrej, "You wanted neural net cars? This is THE guy." We are so fortunate to have him working on this problem.

  • @pks. 3 years ago

    I remember the "different images of cars" explanation from cs231n (but that was cats).
    This guy is THE best person at explaining computer vision

  • @jijie133 1 year ago

    Great video!

  • @bobi461993 4 years ago +35

    I was actually in the audience. 😄

    • @MasterofPlay7 4 years ago

      he explains nothing but the challenges, trade secret i guess....

  • @vinson2233 3 years ago

    This is a really good video. I wonder how they solve the team workflow problems in the end.

  • @volotat 4 years ago +18

    May we expect more lectures on this channel? I was missing them a lot!

    • @lexfridman 4 years ago +5

      Full lectures will be on the main channel: th-cam.com/users/lexfridman
      This is a channel for shorter clips: th-cam.com/users/lexclips
      Please subscribe to both if you are interested in both clips and longer lectures.

  • @alankirkham5598 2 years ago

    Andrej, I have two questions. What contrast ratios are the current cameras and software capable of, and is the neural network utilizing photogrammetry or just photo data sets?

  • @FrenchingAround 4 years ago +9

    Have they tried using a depth map and putting a priority value on tasks and subtasks according to the distance of the task relative to the car? If a road sign is far away, it will have fewer resources allocated than the task used to detect a car cutting in.

    • @liuculiu8366 4 years ago +1

      I guess the distance to the road sign can only be obtained if the road sign detection network is properly working which requires an importance weight in advance

    • @FrenchingAround 4 years ago +1

      @@liuculiu8366 If the road sign is far away, the weight will be low (thanks to the depth map). You don't need to know about the road sign until the car needs to act upon it. Just like you do with your own brain, you're not super focused on a stop sign 500ft ahead. You could have a small resource allocation to detect there is "A sign" (not knowing what it is yet) far away, and put that on hold until you actually have to act upon it. I guess the question is how resource-intensive the depth map calculation is.

    • @liuculiu8366 4 years ago +2

      @@FrenchingAround As far as I know, Tesla cars have no laser radars. So the depth map can only be obtained by stereo vision (assumed to be not very accurate). I am not very familiar with 3D techniques, but maybe some algorithm is also required to obtain the position of road signs from a not very accurate depth map?

    • @QuintinMassey 2 years ago

      So, all you really get is depth and being able to tell if it is a sign from that depth alone might require some intermediate image processing. You might not need a detector per se, but you still have to recognize it is a sign given the scenario you posed. Same goes for the car.

    • @FrenchingAround 2 years ago

      @@QuintinMassey indeed
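
The distance-based prioritization proposed in this thread could be sketched roughly as below. This is only an illustration, not anything described in the talk: the function names, the decay formula, and the `scale_m` and `min_weight` parameters are all assumptions, and it takes per-object distances as given, which is exactly the step the replies question.

```python
def task_priority(distance_m, scale_m=50.0, min_weight=0.05):
    """Hypothetical priority in (0, 1]: near objects score close to 1,
    far objects decay toward min_weight (so they are never fully ignored)."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return max(min_weight, scale_m / (scale_m + distance_m))


def allocate_budget(distances_m, total_budget=1.0):
    """Split a fixed compute budget across tasks in proportion to priority."""
    weights = {name: task_priority(d) for name, d in distances_m.items()}
    total = sum(weights.values())
    return {name: total_budget * w / total for name, w in weights.items()}


# Assumed distances: a car cutting in at 8 m, a stop sign 150 m ahead.
budget = allocate_budget({"car_cutting_in": 8.0, "stop_sign": 150.0})
```

The floor `min_weight` corresponds to the "small resource allocation" idea: a far-away sign still gets some attention until the car has to act on it.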

  • @HappyLeoul 4 years ago +8

    “Cars upside down”
    lol

  • @pw7225 4 years ago +1

    Allocating Karpathity is tricky.

  • @adsk2050 1 year ago

    How is model versioning done exactly? Seems pretty complicated. Like are the weights updated and stored in a git repository? Or is there some other witchcraft involved in it?
    How can models be non-reproducible? If you are storing weights in git then you can always go back and get those weights, right?

    • @Splish_Splash 1 year ago

      The weights aren't the problem, since you can actually change the structure of the model (simple examples: delete or add some layers, change the loss function, activation function, or data sampling), but I am also curious how they can't reproduce some of the results if they are using git

    • @dailygrowth7967 1 month ago

      I think he is referring to the process of fine-tuning fine-tuned models for multiple iterations. After a while, this will become non-reproducible
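
One way to picture the point made in these replies: a stored weight file alone does not say which code, configuration, data sampling, or parent checkpoint produced it. A minimal sketch of a run manifest that records this lineage is shown below. It is not the workflow from the talk; `make_manifest`, its fields, and the example values are hypothetical.

```python
import hashlib
import json


def fingerprint(record):
    """Stable SHA-256 hash of a JSON-serializable record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_manifest(code_commit, config, data_seed, parent=None):
    """Describe one training run; `parent` is the id of the run it was fine-tuned from."""
    record = {
        "code_commit": code_commit,  # hypothetical git commit of the training code
        "config": config,            # architecture, loss, sampling settings
        "data_seed": data_seed,      # seed used for data sampling
        "parent": parent,            # None for a run trained from scratch
    }
    return {**record, "id": fingerprint(record)}


# A base run and a fine-tune of it with a changed loss and sampling seed.
base = make_manifest("abc123", {"loss": "l2", "layers": 18}, data_seed=0)
tuned = make_manifest("abc123", {"loss": "l1", "layers": 18}, data_seed=1,
                      parent=base["id"])
```

With a chain of such records, each fine-tuned model points back to the exact run it started from. If any link in the chain is missing, the final weights can still be stored but no longer regenerated.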

  • @prabdeepsingh7721 4 years ago

    Hey, it's badmephisto!

  • @jonclement 1 year ago

    didn't he recently say that going this multi route was a mistake? And having one x-large neural network was the way to go?

  • @kalebakeitshokile1366 3 years ago +5

    Y’all have to stop uploading this guy at 1.5x

  • @nishatmahmud2991 1 year ago

    Plz help me.
    which is best??
    TensorFlow or PyTorch???

    • @Splish_Splash 1 year ago

      torch

    • @Neonb88 10 months ago

      Torch for research and small networks, Tensorflow for distributed training

  • @m_sedziwoj 4 years ago

    In the first minute I got the feeling that maybe we need a different approach. I think people assume that the road has a constant width, and that if we do not see road, then something is there. As a driver, I often see something without knowing what it is, but most of the time that matters less than whether I should drive over it or around it.

  • @JD-kf2ki 3 years ago +1

    I guess they just added the flipped car recently; otherwise...

  • @SolidSnake013Duds 4 years ago

    Hm, I see some of the same material and images as in the February presentation.

  • @zelsu5646 4 years ago

    So funny

  • @markmd9 4 years ago +1

    Forget about identifying all kinds of types of objects on road and focus on identifying drivable road surface and everything else as obstacles

    • @piyushpatel9830 3 years ago +1

      On the surface what you say seems to make sense, but in order to coexist with human drivers and not be so slow as to clog traffic in realistic conditions, the driving needs to be what the industry calls "naturalistic", which means the machine needs to model behaviors, e.g., a pedestrian's behavior is different from a car's. See for example what Mobileye does (they have videos on their website). Companies in this space have complex behavioral models that they use for "path planning" so that the cars drive naturalistically while being safe, and this requires an awareness of what is in the scene (not merely the presence of things but their identification) to make intelligent decisions. You also have to follow traffic rules even if you could drive without causing crashes while not following them, because they are enforced locally and you don't want to be fined. To follow the rules, you have to be able to detect signs etc.

  • @theredflagisgreen 4 years ago

    2nd

  • @gkeers 4 years ago +1

    Ultimately, instead of using intuition/judgement to decide things like the hierarchy of the tasks, they should be optimised via some ML process too. More like the way a brain must do it.

    • @debayandas1128 4 years ago +9

      That is not a good problem statement. That is the issue: everyone wants to do what you stated, but defining the statement is difficult.