PROOF Tesla Is Creating a WORLD MODEL!! What are End-To-End Training, and Foundation World Models?

  • Published on 14 Jul 2023
  • Evidence is mounting quickly that Tesla has made a MAJOR move in their Full Self Driving (FSD Beta) technology: they're creating a foundation world model, likely with end-to-end training to solve car and Teslabot autonomy! But there is also a LOT of confusion about what all of these terms mean. Come with me to discover exactly what world models, foundation models, and end-to-end training mean, why they are NOT 3D reconstructions, and why they're so important!
    I recorded this prior to xAI's Twitter Spaces on July 14th, and that event provides EVEN MORE evidence that what I'm saying is correct. Stay tuned for a "part two" to this video!
    **Music by Zenlee. Check out his amazing music on Instagram - @zenlee_music
    or YouTube - / @zenlee_music
    **For business inquiries, please email me here: DrKnowItAllKnows@gmail.com
    Twitter: / drknowitall16
    Also on Twitter: @Tesla_UnPR: / tesla_un
    Instagram: @drknowitallknows
    **Want some outdoorsy videos? Check out Whole Nuts and Donuts: / @wholenutsanddonuts5741
    Sources:
    • ONE MODEL to Rule Them...
    • Tesla Engineer: Tesla'...
    www.sciencedirect.com/science...
    ai.stackexchange.com/question...
  • Science & Technology

Comments • 227

  • @KentLester
    @KentLester 10 months ago +61

    Just a word of encouragement to say that I love the geekiness of your videos on AI! You do a great job of taking an esoteric subject and breaking it down into understandable analogies for us "laygeeks." I'm particularly interested in MICS (massively complex interconnected systems) and am attempting to write a layman's guide on the subject, and I find your geek sessions highly inspiring and informative. Bravo.

    • @RedRyan
      @RedRyan 10 months ago

      Totally agree

    • @MooseOnEarth
      @MooseOnEarth 10 months ago +1

      Without formulae and numbers this is nowhere near geeky. I still like it, but do not mix up his explanations with actual geek stuff.

  • @tech-utuber2219
    @tech-utuber2219 10 months ago +24

    29:47 This is exactly what versions FSD 10 & 11+ have been missing, SELF-LEARNING. It presently encounters a driving scenario, as though it has never seen it before, even if it has been driven through that same scenario for weeks. (The Tesla team has been hand-tweaking the algorithms to compensate.) What will distinguish versions 12 and beyond is this fundamental ability to learn from previous errors, and then predict and avoid those errors in the future. It took some time to get here, but this is huge progress.

    • @johnross6314
      @johnross6314 10 months ago +4

      Elon Musk way back when talked up edge inference, as he should have, including discussions on chip design that mimics the brain. Perfect for his “visual” approach. The only chip design that is visual, neural at its basic core is FPGA architecture. He has shifted away from this idea (mistake), as part of his mass FSD firing several years ago. Field-programmable gate arrays actually process data into visuals like our eye does, and move those bits in a natural neural method at the silicon level. Very fast. Very brain-like. This should be exploited. It makes it so easy to do edge inference (self-learning, with communication back to the AI father trainer, Dojo). Also, any self-learning HAS to be at human-eyeball 3D quality, 360 degrees. That could make it faster than us humans, who have to move our heads for the 360 view (and/or use mirrors to augment).

    • @russm8193
      @russm8193 10 months ago

      Self learning happens only during training in batch mode on the server. Only inference (run-time) can occur in the car - takes huge compute and time to train. Models are iteratively trained and deployed to the cars with updates.

    • @tech-utuber2219
      @tech-utuber2219 10 months ago

      @russm8193 That's clear, but when FSD is eventually deployed in the future, what does each individual vehicle receive from Dojo and what is it computing? A tokenized subset of the geographical area model in which the vehicle is currently operating, which was gathered by the local vehicle "hive" in the area and compiled by Dojo? This is memory and bandwidth limited, implying that when there is no connectivity, in a new geographic location, the vehicle computation will revert to a simplified general model, or will HW4 or HW5 be able to do this stand-alone? How many adjustable parameters will the vehicle computer be using in order to accomplish the processing? (A modern LLM uses billions of parameters, for example.) Encoding of the area model will have to be very efficient, in order to reduce both memory and computation.
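The split discussed in this thread, where training happens in batch on a server and the car only runs inference on weights it receives in an update, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the names `train_on_server` and `infer_in_car`, the one-parameter-pair linear model, and the data are hypothetical and say nothing about Tesla's actual software.

```python
# Toy illustration of "train on the server, run inference in the car".
# All names and the linear model are hypothetical, chosen for illustration only.

def train_on_server(samples, epochs=2000, lr=0.05):
    """Expensive, iterative step: fit y = w*x + b by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradients of the mean squared error over the whole batch.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # A tuple is immutable: the deployed weights cannot change in the field.
    return (w, b)

def infer_in_car(weights, x):
    """Cheap step: a single forward pass with frozen weights, no learning."""
    w, b = weights
    return w * x + b

# "Fleet data" gathered centrally; here it follows y = 2x + 1 exactly.
fleet_data = [(float(x), 2.0 * x + 1.0) for x in range(5)]
deployed_weights = train_on_server(fleet_data)   # happens once per release
prediction = infer_in_car(deployed_weights, 10.0)  # happens on every frame
```

In this sketch the car improves only when a new `deployed_weights` tuple is shipped, which mirrors the point that models are iteratively trained and then deployed with updates.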

  • @RobertDickert
    @RobertDickert 10 months ago +20

    As a software engineer who is not in AI but took the Andrew Ng online ML class years ago, I found this extremely helpful and interesting. Things have progressed a long way.

    • @jbencze
      @jbencze 10 months ago

      😅😅

    • @jbencze
      @jbencze 10 months ago +1

      Instructions Instructions 😮😅😅and

    • @jbencze
      @jbencze 10 months ago

      😅😊 4:25

    • @jbencze
      @jbencze 10 months ago

      😅

  • @LoisSharbel
    @LoisSharbel 10 months ago +22

    Thank you for this explanation that actually seems clear to me, although my understanding of most of the details is fuzzy. It seems this World Model is going to learn more closely to how we humans learn to navigate the world. Your ability to clarify dense information is amazing. Thank you!

    • @2ndfloorsongs
      @2ndfloorsongs 10 months ago

      Yes, very clear explanations. So clear that I actually think I understand it for at least 5 minutes after I finish it. But that's 5 minutes more than before. 😁

  • @briansilver9652
    @briansilver9652 10 months ago +12

    I can see this system actually reading the text on signs and interpreting their intent and authority. In other countries it could translate this text to align with the existing AI training. ARRET = STOP etc. LIDAR cannot compete with this.

    • @MattOGormanSmith
      @MattOGormanSmith 10 months ago +1

      I think, for inference efficiency, separately trained context blocks of models could be switched and potentially blended at a high level, rather than having French road signs in the model used to drive in US, or snowy conditions to be considered when driving in a hot desert. This would be contrary to the end-to-end approach, so it doesn't look like it's the way they're going.

    • @briansilver9652
      @briansilver9652 10 months ago

      @MattOGormanSmith So if I were driving from, say, Ontario to Quebec, would the car download the French context module at the border, or have to have all languages loaded all the time just in case I decided to make a run for the border? This would also need to be considered for left-hand-drive jurisdictions that border right-hand-drive ones. The world is such a complex problem.

    • @zoemayne
      @zoemayne 10 months ago +1

      Lidar is just a sensor; it is not comparable to anything with regard to AI or coding.

  • @johnholman5341
    @johnholman5341 10 months ago +3

    Great video! One of your best so far IMO. I keep coming back to your channel for these great explanations.

  • @Flat4fun
    @Flat4fun 10 months ago +1

    Hope all is well as can be with your family. You being able to put things out while supporting your loved ones (in your special way) is inspiring.

  • @alexanderpoplawski577
    @alexanderpoplawski577 10 months ago

    Thank you for breaking it down! You have to give it to George Hotz. He said from the beginning, that end to end neural networks will be the solution and Tesla will come around eventually. Also when he saw the occupancy network image, his immediate reaction was: "They created a LIDAR image in software".

  • @alsetalokinalset
    @alsetalokinalset 10 months ago +5

    IMO the best and most insightful video you have created in a long time 👍🙂

  • @danielmadison4451
    @danielmadison4451 10 months ago +4

    Thank you, thank you. Excellent video. Foundation Model! That's the part I was trying to put a word to. I'm a 30-year software engineer who came through the maze of 8080 (Z80) assembly, Pascal, C, and C++ object design. My major was Software Architecture. I knew I was missing the doorway into AI because it is NOT based on deterministic algorithms. I have made a few comments and didn't mean to be snarky, but I knew there was something deeper that I wasn't understanding. Love your videos along with Warren's.

  • @donfields1234
    @donfields1234 10 months ago +1

    Hey Dr Know-it-all... just wanted to tell you you're awesome... I gave a bit of healthy criticism on your last video and just wanted to add that that's all it was... you're the bomb, and we all can still improve, right? Keep up the great work, bud.

  • @davidhawkins7138
    @davidhawkins7138 10 months ago +4

    This was a great summary. Thank you.
    I'm pretty sure an ontology will still be required to generate the FSD onboard display and as an audit trail for accident and situational analysis. If my guess is correct, hydra heads will be part of the reinforcement learning to keep key ontological elements in the foundation model so they can be extracted as needed.
    As you've said in your videos on this, a foundation world model is a significant breakthrough. I'm interested in your thoughts about Tesla going with a diffusion-based approach instead of generative transformers.

  • @DavidDuwaer
    @DavidDuwaer 10 months ago

    Your best video I’ve seen, by far. Thanks for this. More of these tech deepdives please 🙏 ! You also explain it really well.

  • @glwicks
    @glwicks 10 months ago

    One of your best ever ... now I get it ... AMAZING stuff... and can't wait for it to show up in my car!

  • @Harkonic
    @Harkonic 10 months ago +4

    In the xAI Q&A Twitter discussion Elon said: "If I look at the experience with Tesla, what we're discovering over time is that we actually overcomplicated the problem. I can't speak in too much detail about what Tesla's figured out, except to say that in broad terms the answer was much simpler than we thought. We were too dumb to realize how simple the answer was." Do you think what he is talking about is related to this?

    • @chriswright9096
      @chriswright9096 10 months ago

      Classic Musk nonsense. He makes it up as he goes along.

  • @HubertHeller
    @HubertHeller 10 months ago +1

    RLHF: Reinforcement Learning from Human Feedback.
    NeRF: Neural radiance fields (NeRFs) are a machine-learning technique that generates 3D representations of an object or scene from 2D images.

  • @sekching8197
    @sekching8197 10 months ago +1

    GPT-4 serves as a foundational model for understanding the world linguistically and is used to create systems like ChatGPT. In a similar vein, Tesla's world model, which learns to understand the world through raw images or photons, will form the foundation for their Full Self-Driving system. Correct me if I am wrong!

  • @z4zuse
    @z4zuse 10 months ago +2

    How long before the car would recognize it is driving towards itself, when confronted with a large mirror?

    • @alexanderpoplawski577
      @alexanderpoplawski577 10 months ago

      Doesn't really matter; it would have to brake anyway, as a human would. You don't know what's behind the mirror.

  • @JohnCorrUK
    @JohnCorrUK 10 months ago +3

    Your most brilliant video yet, superb explanation. Bravo 👏👏👏

  • @nettlesoup
    @nettlesoup 10 months ago

    Thank you for getting the right understanding and separating out the different aspects of what the CVPR videos taught us. It seems others don't get it unless it's spelled out like here.
    Final step is just to be able to pronounce Ashok's name correctly. At one point you mentioned Ashok and smoke in the same sentence and yet pronounced Ashok the way you always have!

  • @MarcelR-89
    @MarcelR-89 10 months ago +2

    Great video! Great explanation! Keep doing your work … at some point people will never drive better or safer than a Tesla car! The question is not if it will happen … the question is only when it will happen!

  • @jackcoats4146
    @jackcoats4146 10 months ago +3

    Great explanation. I see that getting corner and extreme cases right is still a problem, but it is for people too.

    • @justchaz.
      @justchaz. 10 months ago

      LOL. Stuff must be sold to sold buyers.

  • @jimparr01Utube
    @jimparr01Utube 10 months ago

    Hardly understood a word, but REALLY appreciate what you are doing here Sir.

  • @garyrooksby
    @garyrooksby 10 months ago

    Really well covered, given the complexity of the subject. Thanks!

  • @DarylOster
    @DarylOster 10 months ago +1

    It also represents the ability for virtual time travel... into the past (to the point of the start of the 4D model), and a short distance into the most plausible futures...

  • @vlad_417
    @vlad_417 10 months ago +2

    It's all cool, but as I understand it, it doesn't run on the car's computer. This is what will be run on Dojo to train neural networks. I think it will take a lot of additional data collection and enormous computing power.

  • @bertwright1790
    @bertwright1790 10 months ago +2

    It seems like a major difference between HW 3 and HW 4 is the radar module. Do we have any idea when and if radar is coming back into the picture? Could radar better calculate distances in inches and/or see in low-light/low-visibility situations? Putting radar back without a plan to use it doesn’t make sense, but having two major standards is a more difficult problem. It is possible to design a new computer (HW 4 lite) that has the computing power of HW 4, can be upgraded, and uses the older cameras, but the cars with no radar or LiDAR are forever limited no matter the computer or software.

  • @TheEriclwarner
    @TheEriclwarner 10 months ago +1

    Soo... I'm in the industry - but in different forms of AI... predictive maintenance... Security models... are some.
    Super great article for the layperson ... Thanks DKIA!

  • @at3941
    @at3941 10 months ago +1

    I think I understand what you’re saying about the world model and predictive training. But when I drive I also have a model of the actual environment I drive in: a 3D map in my head. I can draw it out (lane directions, number of lanes, curbs, buildings, changes in elevation, potholes, etc.) for roads I actually drive on, which helps me predict, in advance of even being at a particular location, what I need to do on the road I drive on every day. Is this also happening, or is this all part of their world model?

  • @olyalphy
    @olyalphy 10 months ago

    John, do you still think they will have heuristic code (software 1.0) for things like safety and hard boundaries (don’t over-accelerate or turn the wheel beyond a certain rate)? And do you think they will still use the voxel approach to define drivable space? Or does all this just get replaced with the new end-to-end world model?

  • @MathewBoorman
    @MathewBoorman 10 months ago

    Thanks, this is great to understand! Was definitely worth clarifying.
    It would be great to understand the kick-starting of these models, and where it is similar to or different from biological brains. Random training from scratch is clearly something you want to shortcut.
    I'd be interested to learn if the solution has been to merge the previously developed NNs together as the starting point for a single world model evolution, and train from there. If that is so, I suspect some future features will be helped along by developing new specific classifiers and adding them to the model for a kick along. For example, new regions with some totally different structures, vehicles, signage, and road rules. I imagine that is closer to how we learn, recognizing the new things as a special case until they are integrated into the lower-level 'don't think about it'.

  • @simonschmitz4894
    @simonschmitz4894 10 months ago +2

    Amazing explanations, thank you so much!! This whole debate reminds me of the days when I was studying philosophy, especially Wittgenstein. He first thought that explaining the world can only be done by describing "what is" (more like what is coming in via the photons), and later completely changed his mind and said something like language creates the world (more like what ChatGPT is doing). And it is fascinating to understand that now, when trying to emulate the human brain, we are essentially using this same process of deep learning for both dimensions (space and linguistics). Do you think with driving that part of the human feedback will be to edit out things where other humans made mistakes? Otherwise the system may not be safe and may even repeat accidents...

  • @PeteCorp
    @PeteCorp 10 months ago

    So basically the zero system from Gundam Wing where it predicts the future and feeds it to the pilot.

  • @dousiastailfeather9454
    @dousiastailfeather9454 10 months ago

    Ya, we saw your video from last year where you say the same thing.... You sure have the spirit! Too bad it was reported that Tesla's resale values plummeted along with virtually ALL EVs recently. Keep pumping! (Same time, next year?)

  • @vuththiwattanathornkosithg5625
    @vuththiwattanathornkosithg5625 10 months ago

    Very good explanation, and it gave me goosebumps when it clicked in my head.

  • @andromeda3542
    @andromeda3542 10 months ago +1

    An End-to-End Model, in the context of machine learning and AGI, refers to a model that takes raw input data and feeds it through a series of learned transformations, ultimately outputting a direct prediction or control action. This approach contrasts with traditional methods that involve multiple separately engineered stages of processing and human intervention. For instance, in the case of Tesla's self-driving technology, an end-to-end model would take raw sensor data (like images from cameras) and output control actions for the vehicle (like steering, acceleration, and braking commands) without separately hand-built intermediate modules such as object detection or path planning.
    The advantage of an end-to-end model is its ability to learn complex tasks without requiring explicit programming for each step. However, training such models can be challenging due to the complexity and the amount of data required.
    A World Model, on the other hand, is a representation of the environment that an AGI system uses to make predictions about future states based on its current state and actions. It's a kind of internal simulation of the world that the AGI uses to plan and make decisions. The world model should be able to understand and simulate the dynamics of the world in a way that's similar to how humans understand the world.
    In the case of Tesla's self-driving technology, a world model would need to understand the dynamics of driving, including the behavior of other vehicles, pedestrians, and the rules of the road. It would use this understanding to predict the outcomes of different actions and make decisions about how to control the vehicle.
    Tesla is working on creating a foundation model, which is a kind of world model that trains itself on a large amount of data without needing explicit labels. This foundation model would then be fine-tuned using reinforcement learning from human feedback (RLHF) to optimize its performance.
    An end-to-end model and a world model are key components of AGI systems. They allow the system to learn complex tasks from raw data and make decisions based on an understanding of the world, respectively.
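The idea in the comment above, a world model that predicts the outcomes of different actions so the system can choose among them, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Tesla's method: the `State` fields, the hand-written `world_model` dynamics (standing in for what would be a learned network), the cost terms, and the candidate accelerations are all hypothetical.

```python
# Minimal sketch: plan by rolling candidate actions through a world model.
# The dynamics below are hand-written placeholders for a learned model.
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    gap_m: float           # distance to the vehicle ahead, in metres
    speed_mps: float       # ego speed, in metres per second
    lead_speed_mps: float  # speed of the vehicle ahead

def world_model(state, accel, dt=0.5):
    """Predict the next state given the current state and an action."""
    new_speed = max(0.0, state.speed_mps + accel * dt)
    new_gap = state.gap_m + (state.lead_speed_mps - new_speed) * dt
    return State(new_gap, new_speed, state.lead_speed_mps)

def rollout_cost(state, accel, horizon=6, target_speed=15.0, min_gap=10.0):
    """Imagine `horizon` steps into the future and score the outcome."""
    cost = 0.0
    for _ in range(horizon):
        state = world_model(state, accel)
        if state.gap_m < min_gap:
            # Heavy penalty for predicted states that get too close.
            cost += 1000.0 * (min_gap - state.gap_m)
        # Mild penalty for deviating from the desired cruising speed.
        cost += abs(target_speed - state.speed_mps)
    return cost

def choose_action(state, candidates=(-3.0, -1.0, 0.0, 1.0)):
    """Pick the acceleration whose predicted future has the lowest cost."""
    return min(candidates, key=lambda accel: rollout_cost(state, accel))

closing_fast = State(gap_m=12.0, speed_mps=15.0, lead_speed_mps=5.0)
open_road = State(gap_m=200.0, speed_mps=10.0, lead_speed_mps=15.0)
brake_choice = choose_action(closing_fast)  # hardest braking candidate wins
cruise_choice = choose_action(open_road)    # gentle acceleration wins
```

A fully end-to-end system would not expose a separate planner like `choose_action`; the sketch keeps the two pieces apart only to make the "predict, then decide" role of a world model visible.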

  • @nickfosterxx
    @nickfosterxx 10 months ago +1

    OK, now I get it, thanks so much for persevering. In this purely visual world model, I assume they will have to devise some way to more explicitly teach about physical properties such as the slipperiness of road surfaces that are, e.g., wet or icy, perhaps by directly setting up training episodes so it can learn the safety envelopes?
    Also, I imagine, judgments about human interactions such as being flashed when it might mean 'after you, please go first' versus, 'warning, hazard ahead' etc. And of course 'Road Closed', 'No stopping' etc.
    With best wishes.
    Edit: I liked your ChatGPT analogy about building a purely statistical foundation model, onto which you train human preferences and higher level subtleties.

  • @ChattingwithKendall
    @ChattingwithKendall 10 months ago

    You went to Everest Base Camp? Wow! I’m impressed (and a bit jealous)!

  • @andrewsteinhaus8267
    @andrewsteinhaus8267 10 months ago

    Great deep dive! I know a lot of this is possible because of the transparency of Tesla. Could you also do a deep dive and maybe a comparison with other autonomous programs (Mobileye, Waymo, etc.), even though they may not have the same level of transparency? That would be super interesting.

  • @tomturnbull3723
    @tomturnbull3723 10 months ago

    Thank you, thank you, THANK YOU!!! You've helped many people to now truly understand how Tesla will finally solve L5 for everywhere, and why nobody else can.

    • @-whackd
      @-whackd 10 months ago +1

      Comma can

  • @markhooker8520
    @markhooker8520 10 months ago +1

    Thank you for making this fantastic informative video.

  • @2009RayMD
    @2009RayMD 10 months ago

    This is awesome didactics. Thank you for making this incredible step change easy to understand.

  • @spookytrigger
    @spookytrigger 10 months ago

    Thank you, I learned heaps and filled in some knowledge gaps from the way you explained this.

  • @JosefSvenningsson
    @JosefSvenningsson 10 months ago

    John Vervaeke is a neuropsychologist who's developed the 4P model of cognition, and it explains and summarizes very well the difference between "propositional" knowledge, i.e. seeing a baseball game on TV, vs "participatory" knowledge, i.e. playing baseball.

  • @NaughtyGoatFarm
    @NaughtyGoatFarm 10 months ago +1

    Awesome videos. Thank you.

  • @ApteraEV2024
    @ApteraEV2024 10 months ago

    Where do you work?? Viewer, instantly watching released videos, then commenting a paragraph response. ❤😮

  • @stewartmcleod4094
    @stewartmcleod4094 10 months ago

    You made it very easy to understand, thank you

  • @alexvoigt4247
    @alexvoigt4247 10 months ago

    Great video, appreciate your work!

  • @marce8760
    @marce8760 10 months ago

    Super video. Great explanation. Love it

  • @fredhearty1762
    @fredhearty1762 10 months ago +2

    Since the car's sensors/cameras cannot 'see' any of the car itself (though it must 'understand' its own physical boundaries), wouldn't Optimus need a significantly upgraded version of the World Model in which it can see/perceive its own hands, limbs, physical 'body' relative to the World Model attributes? An end-to-end training regime would be vastly more efficient if ego had self-awareness... seeing where its hand is relative to an object it is picking up, for instance. This interaction would border on consciousness.

    • @BenefitOfTheDoubtInquiry
      @BenefitOfTheDoubtInquiry 10 months ago +1

      FSD knows its body in the world, and its wheels. Park assist shows this. Optimus just has a more anthropomorphic footprint in the 3D space.

  • @evj6043
    @evj6043 10 months ago

    I just want to know how, through all of this, they address the biggest flaw that is affecting my drives today, which is old/inaccurate map data. Right now map data from an outdated system acts first and will constantly put the car in, say, the right lane when you are about to turn left, because the road was different 8 years ago and the maps are from that time. Is their solution to upgrade to a better mapping system, or somehow to have the vision system lead with the mapping second? Other than not recognizing signs, this is the only thing that causes me to disengage frequently where I live.

  • @paulmeynell8866
    @paulmeynell8866 10 months ago

    So the big question, if you collect all data then release an update all cars drive the same.
    If each car learns like a human will we have high mileage cars being much better at driving than one that hardly gets used?
    Or will they download the best driving car to everyone as a starting point?
    Loving the videos thank you.

  • @alexandreblais8756
    @alexandreblais8756 10 months ago

    very informative. thank you for this video

  • @ApteraEV2024
    @ApteraEV2024 10 months ago

    15:00 Thank you! Sharing your Brilliant Educational Course ❤

  • @roydenvickers6382
    @roydenvickers6382 10 months ago

    You ROCK DOC! Love your videos.

  • @haroldvargas01
    @haroldvargas01 10 months ago

    Excellent explanation now I have a better understanding thank you!

  • @earleyelisha
    @earleyelisha 10 months ago +1

    If you’re citing LeCun as an authority, just know he doesn’t think the field will have any agents with a world model for at least a decade.
    And even then his thoughts are that they may only be as capable as a mouse.

  • @JohnEAvenson
    @JohnEAvenson 10 months ago

    Excellent presentation

  • @XiallaLife
    @XiallaLife 10 months ago

    Awesome explanation thank you!!

  • @adriansaenz6853
    @adriansaenz6853 10 months ago

    Thanks a lot man! Great vid

  • @appl314
    @appl314 10 months ago

    Does show why cars that are coming towards you from a backwards angle are not identified. That angle doesn't appear to be in the shot. I wonder if they do similar projections from other than the front looking cameras?

  • @JohnBrown-pw3bz
    @JohnBrown-pw3bz 10 months ago +1

    Great explanation.

  • @cherubin7th
    @cherubin7th 10 months ago +1

    Foundation models are basically unsupervised learning making a comeback with superpowers.
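The "unsupervised learning" point above can be made concrete with a toy self-supervised setup, in which the training targets come from the data itself rather than from human labels. The sketch below is an assumption-laden illustration: the function names, the event strings, and the simple counting model are hypothetical stand-ins for a large neural network trained on video or text.

```python
# Toy self-supervised learning: each item's "label" is simply the next item.
# The counting model is a stand-in for a large neural network.
from collections import Counter, defaultdict

def make_training_pairs(sequence):
    """Turn raw, unlabeled data into (input, target) pairs automatically."""
    return list(zip(sequence, sequence[1:]))

def fit_next_token_model(sequences):
    """Count which item tends to follow which across all sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in make_training_pairs(seq):
            counts[current][following] += 1
    return counts

def predict_next(model, token):
    """Return the most frequently observed successor, or None if unseen."""
    if token not in model:
        return None
    return model[token].most_common(1)[0][0]

# Hypothetical unlabeled "driving logs": nobody annotated these sequences.
logs = [
    ["green", "go", "yellow", "slow", "red", "stop"],
    ["red", "stop", "green", "go"],
]
model = fit_next_token_model(logs)
after_red = predict_next(model, "red")
```

No one told the model that "red" means "stop"; it picked up the regularity from the sequences alone, which is the sense in which foundation models are trained without explicit labels before any human-feedback fine-tuning.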

  • @zaneenaz4962
    @zaneenaz4962 10 months ago

    world model essentially is that the system is defined .... uniform exits, lane markings, signage. woe be the case where the cones blew away or were removed.

  • @sportbikeguy9875
    @sportbikeguy9875 10 months ago

    28:09 the driver feedback could be hitting the brakes, re-taking control of the car, facial expressions recorded on the interior camera, body language, etc.... the options are endless

  • @DarylOster
    @DarylOster 10 months ago

    Unfortunately FSD is also learning some bad habits of average drivers - for instance camping out in the fast lane. In most states the left lane is for passing only.

  • @lewiswithrow1936
    @lewiswithrow1936 10 months ago

    Freaking great! Very informative. Thanks

  • @aomurdock
    @aomurdock 10 months ago +1

    Future Headline: 'Hackers change FSD USA to right hand drive & create chaos!' Hopefully this is just me being paranoid.

  • @damonknutson2855
    @damonknutson2855 10 months ago

    “There’s a difference between knowing the path and walking the path.”

  • @ApteraEV2024
    @ApteraEV2024 10 months ago

    18:00 just wish the text was LARGER, so I could read it... I'll just keep listening...❤

  • @FrunkensteinVonZipperneck
    @FrunkensteinVonZipperneck 10 months ago +1

    Doc, RE: ChatGPT: Are researchers creating foundation models in every language?

  • @0602bill
    @0602bill 10 months ago

    Making sense ... Once the world model is formed, you said that it organically becomes increasingly complete, or perhaps higher resolution, or in some way simply better and better based on its ongoing experience with the world. The news of a few days ago is that Tesla is proposing to hire people to drive their cars later this summer and early fall for, I think, $18 to $45 per hour. Does this hiring integrate with the world model refining you describe?

  • @crawkn
    @crawkn 10 months ago

    I suspect it might make some uncomfortable that it seems to be creating a sort of "black box" to process in this way, making the learning and interpretation process opaque to humans. What I realized is that this doesn't eliminate labeling for the purpose of the human interface, it simply separates it from the real-time prediction process.

  • @stevenjensen3653
    @stevenjensen3653 10 months ago

    Very few of these comments sound like they actually understood the video and the potential of Tesla growing a Foundational World Model... for Tesla investors, it means we get all da monies

  • @Mediiiicc
    @Mediiiicc 10 months ago +5

    Now if only I could actually use the FSD I paid for.

  • @julie-xd7rr
    @julie-xd7rr 10 months ago

    I think you are correct most of the time. However, concerning the problem with FSD going faster than the speed limit: FSD will not do that. The slow car is not causing dangerous conditions; it is the vehicles going too fast that are the problem. I am not sure how FSD is going to handle road rage.

  • @danam579
    @danam579 10 months ago

    I was talking to some safety drivers who got the opportunity to work with a self-driving system which had been end-to-end trained. It was capable of delivering spectacularly good behavior, better behavior than they had ever seen in an SD system made of modules like perception and planning. Unfortunately, the end-to-end trained system was also capable of absolutely terrifying behavior, and there was no ability to determine what would cause either behavior.
    Although the DMV does make you take a road test, you can only do that once you’ve demonstrated a high level of capacity to recognize signage and other abstract concepts of operating a vehicle on public roads. I’m still wondering how Tesla can make strong assurances that their FSD system won’t keep smashing into first responders’ vehicles, or first responders themselves.

  • @steventaylor4159
    @steventaylor4159 10 months ago +2

    Wonder if they could map the whole United States and make a 3D clone once there are more cars on the road, and then use that to also help train FSD?

    • @alexanderpoplawski577
      @alexanderpoplawski577 10 months ago

      They will need a lot of training data for sure. FSD should be able to handle even unknown roads. What they need is a lot of driving situations rather than an exact map. Human drivers can drive through a city without a map, just reacting to the current traffic situation. That's the goal.

  • @retrodraggin5540
    @retrodraggin5540 10 months ago

    I always wondered in the case of fog, precipitation, and smoke, if the headlights and cameras could be working in the infrared range to be able to see stuff humans cannot.

    • @robertwhite3503
      @robertwhite3503 10 months ago

      I don't believe the infrared has been mentioned. The computer processing has been described as photon counting. The mapping in the dark is quite impressive, much better than we can see. I think the Tesla team will have considered infrared and decided it was not justified. Obviously we do not know all the details of every decision they make.

    • @chrisheath2637
      @chrisheath2637 10 months ago

      Even seeing a little better than humans, with 8 cameras, under these conditions will help. The chances are these abilities will improve over time... Using "photons" rather than "filtered output" (which Tesla has implemented) will provide immediate improvements...

    • @chriswright9096
      @chriswright9096 10 months ago

      Well, Musk claims so many things that never come close to happening... so why not claim infrared headlights? Then the YouTube fanboys can make twenty-minute videos explaining how great this new tech that Tesla has (be sure to use present tense) is and how all the other car companies are therefore toast.

    • @chrisheath2637
      @chrisheath2637 10 months ago +1

      @chriswright9096 As my French teacher once said, "That, in Anglo-Saxon terminology, is a load of bolleaux..."

  • @ApteraEV2024
    @ApteraEV2024 10 months ago

    23:00 fine motor skills 😉

  • @williamdeoradesilva9444
    @williamdeoradesilva9444 10 months ago

    AMAZING!!!! Blade Runner is here. An exciting time to be alive.

  • @charlesrovira5707
    @charlesrovira5707 10 months ago

    @29:29 I only have two words to say: _Mirror Neurons._
    People who can _dance,_ not just _spaz out to a beat,_ usually learn the kinematics of a dance from just watching it a dozen times before they try it.
    *Dojo* can be considered like *Ego's* _mirror neurons._

  • @retrodraggin5540
    @retrodraggin5540 10 months ago

    Thanks for dumbing it down for the rest of us

  • @tech-utuber2219
    @tech-utuber2219 10 months ago +3

    What Ashok did not talk about in his presentation is the division of work between an individual Tesla vehicle with Hardware-4 and DOJO, so we presently do not know if Hardware-3 is already completely obsolete or still has some future utility beyond training the new system. Apparently HW3 is already maxed out regarding this new system. Elon has already responded to a Twitter question that HW3 will not be upgradable to HW4. We will see.

    • @SyntheticSpy
      @SyntheticSpy 10 months ago +3

      We have not been told that hw3 is maxed out, nor that it will not be capable of self driving. There will be a hw5, hw6 and so on because they will always be improving it. That doesn’t mean any previous hardware will not succeed, just that the next one will be that much better

    • @Will-be-free
      @Will-be-free 10 months ago +2

      If Tesla thought that HW3 wasn't capable of FSD, they would have made it upgradable.
      If they sometime in the future come to the conclusion that HW3 is insufficient, they will make an upgrade kit.

  • @unomilan
    @unomilan 10 months ago

    Excellent!

  • @sujanpoudel2
    @sujanpoudel2 10 months ago

    Like your shirt. Have you been to EBC?

  • @blengi
    @blengi 10 months ago

    hmm, I slow down and use the high level executive function module of my neocortex to address edge cases in the real world appropriately and not rely on some heavy duty black box world model inference process. Scares me to think others don't and that FSD wouldn't aim for a more modular high level abstraction layer over real world physical scenarios....

  • @andreasklossek9252
    @andreasklossek9252 10 months ago

    2:56 Most LIDAR scans do look more consistent, to be honest.

    • @alexanderpoplawski577
      @alexanderpoplawski577 10 months ago

      Because they contain a lot of redundant information.

  • @FinlayPG
    @FinlayPG 10 months ago

    Thanks!

  • @jabulaniharvey
    @jabulaniharvey 10 months ago

    Voice and text intervention for FSD

  • @CharlesVanNoland
    @CharlesVanNoland 10 months ago +4

    I just feel like there's going to be a loooooong tail of edge cases as long as FSD is only trained on the road. It will always end up encountering situations it fails to handle that any human could negotiate without issue. Humans understand the world through a lifetime of experience with it and the things in it. A computer backprop-trained on driving alone will never understand the world and the things in it in a way that will allow it to behave like a human reliably would. That being said, FSD will of course become (I believe it has been already) statistically safer than the 'average driver' (which doesn't exist), but when it makes a glaring mistake nobody can imagine making which results in someone's death, the consensus will be that it's not safer. People want FSD to at least be as good as, or better than, themselves, and witnessing it make mistakes they can't fathom making erodes confidence in it. As long as Tesla has to tell drivers to be alert and ready to take control, it's not really "Full Self-Driving". When they can say "Lie back, take a nap, watch a movie, enjoy the ride. We'll get you where you're going quickly and safely!" because it doesn't make any more obvious mistakes, that is when it's FSD. I appreciate their pursuit, but I just don't see something that doesn't understand the world like a human reliably behaving like a human.

    • @mariusm62
      @mariusm62 10 months ago +2

      Exactly. For all this talk about perception, I have not seen so far a plan from Tesla on how to resolve edge cases. I mean, if even now they are still struggling with perception, when are they going to tackle the almost as complex space of actions?

  • @WarrenLacefield
    @WarrenLacefield 10 months ago

    World models are key for living creatures with brains (insects to people). They appear both genetically hard-wired and plastically configured/learned (nature/nurture). As Piaget, Montessori, and other cognitive/developmental psychologists have learned, the "order" of things/principles learned is important. What and how an infant learns, as the brain grows and develops prior to the neural "trimming" (a "distillation"?) that occurs around 5-6 years old, establishes a sort of cognitive framework for a lifetime (very hard to change later). This is one reason suggested why "talents" and "IQ" seem to stabilize. It will be interesting to observe if, when, and how AI neural nets develop capabilities during similar predictive training.

    • @WarrenLacefield
      @WarrenLacefield 10 months ago

      For living things, while physics is rather universal, world models are shaped by experience over time in contextual environments (air, water, ground, maybe space) and by embodiment (senses, legs, wings, "cultures", etc.). Environments change (day/night, seasons, weather, etc.) but so do bodies (development, maturity, infirmity, aging, etc.). Let's hope our automobiles don't decide they are just members of a herd or school of fish and decide to follow the lemmings up ahead.

  • @ApteraEV2024
    @ApteraEV2024 10 months ago

    19:00 they need a Terminal Output OS...😅❤DARK MODE😅❤

  • @barrellcooper6490
    @barrellcooper6490 10 months ago

    I know how to play piano. (I press the key on the piano that corresponds to the note in the sheet music.) But I can't play the piano.
    Same idea you were trying to express with the rock climbing analogy.

  • @M3W3
    @M3W3 10 months ago

    Thanks a lot

  • @freddydad1
    @freddydad1 10 months ago +1

    scratching my speculative itch..

  • @Marc_de_Car
    @Marc_de_Car 10 months ago

    Thank you

  • @ArizVern
    @ArizVern 10 months ago

    RETIRED 80, USAF, VFW, FREEDOM. WHY DOESN'T THE WORLD MODEL GET SO BIG THAT THE DATA WON'T FIT ON THE VERSION 3 COMPUTER?

  • @mariusmeyer14
    @mariusmeyer14 10 months ago

    Thx, I think my brain just exploded but understand it a little more. 😅

  • @narxic
    @narxic 10 months ago

    Someone telling you how to shift a car with a manual transmission and clutch is different than actually doing it when you're driving.