Tesla Reveals The New DOJO Supercomputer!
- Published May 12, 2024
Last video: Tesla Reveals The New DOJO Supercomputer!
►The Tesla Space Merch Store Is Live! Shop our first release while quantities last: shop.theteslaspace.com/
► Patreon: / theteslaspace
► Join Our Discord Server: / discord
► Subscribe to our other channel, The Space Race: / @thespaceraceyt
► Subscribe to The Tesla Space newsletter: www.theteslaspace.com
► Use my referral link to purchase a Tesla product and get up to $1,300 off and other exclusive benefits. ts.la/trevor61038
Subscribe: / @theteslaspace
Welcome to the Tesla Space, where we share the latest news, rumors, and insights into all things Tesla, SpaceX, Elon Musk, and the future! We'll be showing you all of the new details around the Tesla Model 3 2023 and Tesla Model Y 2023, along with the Tesla Cybertruck when it finally arrives (it's already ordered!).
Instagram: / theteslaspace
Twitter: / theteslaspace
Business Email: tesla@ellify.com
#Tesla #TheTeslaSpace #Elon - Science & Technology
The fundamental unit of the Dojo supercomputer is the D1 chip,[21] designed by a team at Tesla led by ex-AMD CPU designer Ganesh Venkataramanan, including Emil Talpes, Debjit Das Sarma, Douglas Williams, Bill Chang, and Rajiv Kurian.[5]
The D1 chip is manufactured by the Taiwan Semiconductor Manufacturing Company (TSMC) on a 7-nanometer (nm) semiconductor node; it has 50 billion transistors and a large die size of 645 mm² (1.0 square inch).[22]
As an update at Artificial Intelligence (AI) Day in 2022, Tesla announced that Dojo would scale by deploying multiple ExaPODs, in which there would be:[20]
354 computing cores per D1 chip
25 D1 chips per Training Tile (8,850 cores)
6 Training Tiles per System Tray (53,100 cores, along with host interface hardware)
2 System Trays per Cabinet (106,200 cores, 300 D1 chips)
10 Cabinets per ExaPOD (1,062,000 cores, 3,000 D1 chips)
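The breakdown above multiplies out cleanly; here is a quick sketch verifying the listed totals (all numbers are the ones quoted from AI Day 2022):

```python
# Dojo ExaPOD scaling hierarchy, as announced at AI Day 2022
cores_per_d1 = 354
d1_per_tile = 25
tiles_per_tray = 6
trays_per_cabinet = 2
cabinets_per_exapod = 10

cores_per_tile = cores_per_d1 * d1_per_tile                  # 8,850
cores_per_tray = cores_per_tile * tiles_per_tray             # 53,100
cores_per_cabinet = cores_per_tray * trays_per_cabinet       # 106,200
cores_per_exapod = cores_per_cabinet * cabinets_per_exapod   # 1,062,000
d1_per_exapod = (d1_per_tile * tiles_per_tray
                 * trays_per_cabinet * cabinets_per_exapod)  # 3,000 chips
print(cores_per_exapod, d1_per_exapod)
```

Every level matches the figures listed above, so the ExaPOD totals are just the per-chip core count carried up the hierarchy.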
Tesla Dojo architecture overview
According to Venkataramanan, Tesla's senior director of Autopilot hardware, Dojo will have more than an exaflop (a million teraflops) of computing power.[23] For comparison, according to Nvidia, in August 2021, the (pre-Dojo) Tesla AI-training center used 720 nodes, each with eight Nvidia A100 Tensor Core GPUs for 5,760 GPUs in total, providing up to 1.8 exaflops of performance.[24] credit: wiki
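Working backwards from Nvidia's figure is instructive: 1.8 exaflops spread across 5,760 A100s comes out to 312.5 teraflops per GPU, which matches the A100's 312 TFLOPS FP16/BF16 tensor-core rate rather than its much lower FP64 rate — in other words, the quoted cluster performance is a reduced-precision figure. A quick check (an exaflop is 10^18 FLOPS):

```python
EXA = 10**18   # an exaflop: a 1 followed by 18 zeros
TERA = 10**12

total_flops = 1.8 * EXA           # Nvidia's quoted cluster performance
gpus = 720 * 8                    # 720 nodes x 8 A100 GPUs each
per_gpu_tflops = total_flops / gpus / TERA
print(gpus, per_gpu_tflops)       # 5760 GPUs at 312.5 TFLOPS each
```

This bears on several comments below asking which precision Tesla's and Nvidia's numbers are quoted at.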
Elon, with his outrageously audacious visions, attracts the most talented and brilliant people to his companies ❤
Thanks for sharing the actual numbers. Do you know if Tesla's numbers are for reduced precision like the ones used for AI inference (16-bit) or training (32-bit)? Thanks!
The speed of change and "successful change" is going to be staggering....
It should be noted that Telsa is still buying as many Nvidia GPUs as they can get their hands on.
So they had 1.8 exaflops in 2021 and now are building a computer that only has one exaflop?
Well-presented, with understandable analogies! Thank you for your hard work.
In an unprecedented move, Dojo changed its name to Skynet.
Underrated comment
SkynetX to be exact.
@@Tailspin80 Just X :)
Xnet💫
Why Yes lol 😅😆😅😆😅😆😆😅😆😅😆😅
You need to use the tensor core throughput of the A100. Probably even at lower precision (BF16) to have something realistic to compare against
it looks like they have the memory right on the chip to maximize the memory speed
The Dojo compute figure is 8 bit.
The A100 compute figures he uses are 16 bit.
Thank you for your hard work ❤
This is very well done. Thank you!
Great description of Dojo.
Amazing video for better understanding the implications and functionality of Dojo! Thanks :)
🎯 Key Takeaways for quick navigation:
00:00 🖥️ Tesla's AI Division has created a supercomputer called Dojo, already operational and growing in power rapidly, set to become a top 5 supercomputer by early 2024.
01:25 💹 Dojo's computing power forecasted to reach over 30 exaflops by Feb 2024, with plans to ramp up to 100 exaflops by Oct 2024.
03:02 💰 Tesla's Dojo, a specialized AI training cluster, equates to a $3 billion supercomputer, offering remarkable AI model training capabilities.
04:00 🚗 Dojo focuses on training Tesla's full self-driving neural network, surpassing standard supercomputer definitions for specialized AI training.
05:38 📸 Dojo processes immense amounts of visual data for AI model training through labeling, aiming to automate a task previously done by humans.
07:01 🧠 Dojo adopts a unique "system on a chip" architecture, like Apple's M1, optimizing efficiency and minimizing power and cooling requirements.
08:10 💼 Dojo operates on tile levels, fusing multiple chips to create unified systems, enhancing efficiency and power in AI training.
10:00 ⚙️ Tesla can add computing power through Dojo at a lower cost, avoiding competition for industry-standard GPUs, potentially leading to a new business model.
11:23 🌐 Future versions of Dojo could be used for general-purpose AI training, enabling Tesla to rent out computing power as a lucrative business model.
12:45 🔄 Renting out excess computing power from Dojo can potentially revolutionize Tesla's profitability, similar to Amazon Web Services.
Made with HARPA AI
You feed a link somewhere and it spits these out?! Please share the secrets of your ways?
Thanks.
Really a great video! Thanks!
Actually, the semiconductor trend for the past few years is moving away from single-chip SoC designs to multi-chip packages, which means the SoC is not on a single piece of silicon, but multiple pieces of silicon inside a single "cpu" package. This is what is used in the M1, the chips in the iPhone, and inside AMD and Intel's latest cutting-edge CPUs, etc. Multiple chiplets are placed very close to each other, even stacked one on top of the other inside a "cpu package," but the SoC is no longer a single piece of silicon in cutting-edge products.
The reason this is happening is, of course, economics. The different chips are produced on the process nodes that are most economical. So the I/O hub in an AMD CPU is on one process, while the CPU clusters are on cutting-edge processes in units of 8 or 16 cores per cluster. Then the CPU package has one or more of these separate cluster chiplets placed around an I/O hub chiplet, in the AMD example. In Apple products, the A-series and M1 CPUs, separate pieces of silicon for CPU and for memory are stacked inside the CPU package. This is why your M-series computer's system memory can't be upgraded.
Technically they could add additional bus logic to allow external memory for expansion, but that defeats the purpose of being compact.
🎯💯
The reason chiplets work well is also yields: smaller chips mean higher yields per wafer. Large chips can be made useless by one tiny imperfection, whereas with, say, 8 smaller chips covering the same area, that same imperfection only loses one smaller chip, with all the others being fully functional. Interposers are then constructed using very old and reliable techniques to stitch all the chiplets together. Not quite as fast as a single large chip, but considerably cheaper.
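The yield argument can be made concrete with the standard Poisson model, yield ≈ e^(−defect density × die area). The defect density below is an illustrative assumption, not a TSMC figure; the 645 mm² die area is the D1 size reported above:

```python
import math

defect_density = 0.1             # assumed defects per cm^2 (illustrative only)
big_die_cm2 = 6.45               # one large 645 mm^2 die (the D1's reported size)
small_die_cm2 = big_die_cm2 / 8  # eight chiplets covering the same silicon

# Poisson yield model: probability a die has zero killer defects
big_yield = math.exp(-defect_density * big_die_cm2)      # ~52% of large dies work
small_yield = math.exp(-defect_density * small_die_cm2)  # ~92% of chiplets work
print(round(big_yield, 3), round(small_yield, 3))
```

A defect that kills a whole 645 mm² die only costs one of the eight chiplets, which is exactly the economics the comment describes.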
I more than liked this video. It was a wealth of information in less than 15 minutes. 🙂
If every vehicle on public streets had a "GPS" transmitter giving out data like direction, speed, etc., FSD could take advantage by incorporating this localized (car-to-car) data to help determine its next action.
In the Futurama episode where the gang went to the Robot Planet, the robots moved like vehicle traffic but fit between each other at high speeds. Perfect traffic management.
no one wants to put in a tracking device in their car ffs, this isn't China
Privacy has left the chat
@@Fastotec9 What privacy are you talking about in this day and age?
@@getsideways7257 Trust me, we still have a lot of privacy in this day and age. And although I want technology to improve, and would love cars to share location data and such without anyone being able to intrusively monitor people, I would avoid any reduction in privacy.
They will kill us all
Excellent video my man!
I wonder if that would run DCS in VR with full graphics options?
Nvidia can do it
Can you Imagine, hundreds of thousands of teslas are feeding data to this machine every day
That is their main advantage, the limiting factor for AI systems is becoming the amount of training data available.
And still Tesla hasn't done much other than slightly improve FSD, which is still widely ridiculed. Don't even get started on the Bot until it can actually do something useful at a fast pace.
Nice pace, good graphics, not too "fanboy", plenty of terminology, and raised a few questions I need to go look up and think about. All around effective TH-cam. Well done.
Except for "Artificial Intelligence *traning* cluster" @ 04:12 :/
@@johnwillemsen6872 oh man we got an English major in our midst!! I could have watched that a thousand times and not caught it cause that is one superfluous "e" in my estimation and yet we still need to know how to differentiate long and short vowels.
Keep up the great work, Elon & Tesla Team.💯💯
Ready to see the luxury Tesla RVs also, Boss.😉😉
What is a wait if we’ve ever been sursnagged to unforgivable faulty price presumptions👽
Be interested in seeing how it compares with the new Grace Hopper processor and scalability capability from nvidia.
Nvidia will probably be faster, and for all purposes, in the same way AMD's 128-core Epyc/Threadripper smokes anything from Apple into oblivion, apart from being able to compute open source.
We squandered the train
Keep up the good work
At 2:11 your big number is missing three more zeros! That number is only 1 quadrillion.
That was quite interesting. Thanks.
🌴☀️🌴
Awesome video. funny too how at about 2 minutes in while explaining what an exa-flop is and this powerful computer, they show some basic html and css hehe
Another great vid. Thanks 👍
@2:10 - You're either missing 3 zeroes, or an exaflop is 15 zeroes.
To put it simply, FSD must produce a set of correct and safe driving responses to a set of situational images created by the car's cameras. That requires some amount of "prediction": what each object in the image is, and what it is likely to do next. Ignoring inattentiveness, even human drivers get that wrong a lot of the time. If FSD is to be successful, it needs to get that right more often than human drivers do. Also, driving responses need to be different under different road surface and weather conditions, and I don't even know if FSD accommodates this. But in any case, the "computational power" required for this probably cannot be "on board" the vehicle. It might resolve to image analysis, object identification within the image, probabilities of what each object will do next, and the driving response to that. That is a lot of possible "image"-to-"driving response" combinations to be processed in real time. Even if the supercomputer could do it, then there is also the "real time" communication between the computer and the vehicle (the bandwidth).
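The loop this comment describes — perceive objects from camera frames, predict their next move, pick a response conditioned on road state — can be roughly sketched as below. All names, labels, and rules here are hypothetical illustrations, not Tesla's actual FSD internals:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str              # e.g. "pedestrian", "cyclist" (hypothetical labels)
    position: tuple         # (x, y) in the car's frame, metres
    predicted_next: tuple   # where we expect the object to be next frame

def perceive(frame) -> list[Detection]:
    """Stand-in for the image-analysis / object-identification step."""
    raise NotImplementedError  # in reality this is a trained neural network

def plan(detections: list[Detection], road_condition: str) -> str:
    """Pick a driving response; surface/weather conditions change the answer."""
    if any(d.label == "pedestrian" for d in detections):
        return "brake"
    return "slow" if road_condition == "wet" else "proceed"
```

The latency and bandwidth point the comment raises is exactly why a loop like this has to run on the car's onboard computer, with Dojo only doing the offline training of the `perceive` model.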
I love your news letter!!
Huge Thank you 🎉❤
A 'flop' is a floating point operation which is more complicated than a mere computer instruction.
Great stuff! Please turn up the background music a little more in the next videos.
This is the beginning of true FSD, and will be an epic win if tesla plays their cards correctly.
which new marketing scam term will it be next? FULL self driving? TRUE FULL self driving? I SWEAR BY GOD THIS IS THE TRUEST AND FULLEST self driving? THIS TIME FOR REAL FULL self driving? I PROMISE NEXT YEAR IT'S READY FULL self driving?
@@L3nny666 The term is just full self driving. Always has been and always will be.
@@Astra2 as we all know, FULL SELF DRIVING is a marketing term as it's not fully autonomous. Now the comment above me said "true" full self driving...which is rather funny considering for how long musk has promised true autonomy...if you don't get a joke and rather be a butthurt tesla fanboy and billionaire boot licker, go ahead.
@@L3nny666 Full self driving means fully autonomous. It's currently in beta, that's why it's not fully autonomous. I understand what you mean but I think it would be unwise to doubt the same person who figured out how to land rockets.
@@Astra2 yeah sure..."beta". tesla is still at SAE level 2, while mercedes and toyota are already at sae level 3.
And you don't really believe Musk figured out any of this technology, right? This man is an investor, not an engineer.
Great video. Keep it up and dont make me hit the new button🤣
Great explanation, thanks!
Apart from the inaccuracies and generalizations in this video there were some nice images.
Compared to Apple CPUs, which still use DRAM for memory, Dojo uses lots of SRAM, which is highly expensive but much faster than DRAM. Most computers use SRAM only for caches like L1 in the CPU, and main memory uses cheaper DRAM tech.
Totally different applications, the Dojo processors only need small amounts of memory because their task is very specific and highly optimised for that single task. Apple CPUs are just general purpose CPUs with a lot of sub systems integrated into a single package to reduce communication power consumption and latency. Dojo is more like a GPU than a CPU.
@@schrodingerscat1863 Dojo also has system-wide DDR4 SDRAM but it's used as fast storage device instead of treating it like a traditional RAM. Load and store speeds to storage (I would assume SDRAM) are 400 GB/s and 270 GB/s according to Wikipedia article. If you compare this to modern computers, Intel i9-13900K has max memory bandwidth about 90 GB/s while using all cores in optimal memory channel configuration.
But yes, SRAM has single-clock latency: Dojo runs at 2 GHz, so that would be 0.5 ns, vs. the best available DDR4 SDRAM with latency around CL12, or about 6.7 ns. So obviously you would try to write apps so that you only use the memory that has 13x lower latency. However, that doesn't mean Dojo cannot run other apps, too, only that you cannot get optimal performance with apps that cannot fit at least the full inner loop into the available SRAM.
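Those two latency numbers are easy to sanity-check: a one-cycle access at the 2 GHz clock quoted above, against the ~6.7 ns DDR4 figure, gives roughly the 13x gap mentioned:

```python
clock_hz = 2e9                  # Dojo's reported 2 GHz clock
sram_latency_s = 1 / clock_hz   # single-cycle SRAM access = 0.5 ns
dram_latency_s = 6.7e-9         # ~CL12 DDR4 figure quoted in the comment above

ratio = dram_latency_s / sram_latency_s
print(sram_latency_s * 1e9, round(ratio, 1))  # 0.5 ns, 13.4x
```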
AMD deserves the credit for the MCM design. As they were the first to show its benefits large scale with their Ryzen processors.
Actually it was Threadripper.
super liked the video thank you so much
Its crazy to find a company like Tesla in auto industry
why?
Meaning of your comment?
I think he means crazy great!
This
Is out of my mind, amazing ❤❤❤❤❤
I wonder what the compute per watt is for Dojo vs A100?
exactly. ^^^^^^^^^^^^^^^^
THIS is what people should be asking and talking about.
As Dojo is more highly optimised for a specific task, it is almost certainly way more efficient than the A100 for that particular task.
They should compare it to h100, a100 is last gen so the comparisons are more favorable.
@@jakubiskra523 For AI the a100 is still the better card as it has hardware specific to deep learning tasks with the H100 being the better option for raw processing scientific workloads. The A100 is also more energy efficient making it a better fit for large multi card systems. They are basically designed for different tasks rather than being different generations of the same thing.
@@schrodingerscat1863 This is why all of the AI companies are using H100s for their clusters, and the H100 is more energy efficient in every way. Your source of information is not trustworthy.
2:10 Says 18 zeros. Shows 15 zeros.
@4:30 -- that casing makes it look like an ASIC imo.
Nice work thanks
Dojo is making the matrix!
Time will tell. Like the Hyperloop and the Tesla truck, it could go either way.
You said 1 with 18 zeros, but put 15, make sure minor details add up!
Great video
I thought i heard someone say that the Tesla AI used for FSD was no longer using labels.
Yeah they cracked the code - which means they probably unlocked it to mirror driving of all the live drivers on the fly and learn from it side by side weighing in on what counts as perfect driving.
Good content, well presented.
I loved your video but.......to define an exaflop you show 1 followed by 15 zeros and say the 1 should be followed by 18 zeros. I am just curious which you intended.
"That is a one with 18 zeroes behind it" and they show 15 zeroes... brilliant.
@1:55 Too funny. An Exoflop is a 1 with 18 zeros behind it.... and the video shows 15 zeros. A lot of good info here on Dojo... thanks for the update.
18 zeros would be too small on the display. We don't all have your perfect eyesight. Hehe
I take it he is a Mac man. In the old days we called this "cascading", and we had 27 iMacs connected. No one ever talks about the software needed to use this configuration.
This sounds impressive, but the hardware is far beyond the available software to run it. They still don't have much for it to do.
Back then we thought 10 gigaflops was incredible. Working on these things is what I used to do and explains why I garden now.
Not sure why Google/Google Cloud was not mentioned. This seems to be similar to Google's "TPU" or Tensor Processing Unit, right? Just asking to make sure I've understood correctly. Google uses TPUs for AI training as well. I imagine it's been used for their self-driving car too.
love it when talking about supercomputers and showing html and javascript, exactly the thing which needs exa-flops
A one with EIGHTEEN zeros, and showing 15 in the video. BTW, here we call that trillions.
Excellent video.
Elon: "Hey world don't do AI"
Elon: "Welcome Dojo"
Imagine car insurance companies deciding to only insure driverless cars.
That's so stupid to say. Think about what you just said.
That's going to happen sooner than you think! Governments will forbid people from driving cars that aren't autonomous! 95% of accidents are due to human error; that's an enormous cost to social security.
@phvaessen it's going to happen but it shouldn't, even if it means a higher mortality rate
@@daviddickey9832 what do you mean higher mortality rate? will driverless cars cause more accidents than human drivers in your opinion?
@@11insertusernamehere what I'm saying is that automated cars have a lower mortality rate, but we shouldn't allow institutions to effectively prevent any person from driving even though people driving has a higher mortality rate
Well...
It could be that Tesla or any other company invests in different branches. But don't worry: as a kid opening an old stereo set, I was surprised to find some Mitsubishi components as well.
Does Tesla also manufacture the DoJo chips?
Get that DoJo working on a solid-state battery 🔋 Elon!
what
@@biggles9604 Search solid-state batteries. If as advertised, they are most likely the brightest path for future battery tech: they hold more power, charge faster, cost WAY less, and can be made in a way that rare earth minerals are either not needed or are needed in a way lesser capacity. So yeah, sic that supercomputer on fleshing out the science of solid-state batteries.
I think that first picture of the number of instructions in an exa-flop, is wrong. It should show 18 zeros for 10 to the eighteenth power, no? It’s showing 15 zeros.
Wow I love it🎉😮🎉
I imagine a scenario where Tesla sells Training Tiles and makes more profit from TTs than cars. Your "game changer" is spot on.
Well done.
Is Tesla constantly uploading vehicle driving data to revise its autonomous driving program and then downloading those revisions?
nice video!!!!
2:11 - that's only 15 zeros, you're 3 zeros short.
350million miles of FSD data 🎉🎉🎉
What’s amazing is that the auto industry is just the beginning. This will be the foundation of advances in gaming, MMO-VR, physics research, simulations, and more.
Is dojo controlled by arm cpu?
Amazing but also makes me curious, cause if it’s so powerful why do they stop at putting it in a Tesla or possibly rocket, there’s so much this computer would do. They have the resources so why not? 😳
These wouldn't go in a Tesla or a Rocket. It's purely for the Tesla Warehouses, and it's custom designed for processing AI through videos, and is worse than other types of supercomputers. It's essentially hyper-specialised, and isn't really helpful in circumstances other than these.
great video ty
Basically back to mainframe computing.
IT CAN MEET ALL THE NEEDS
I just hope DOJO doesn't morph into the supercomputer portrayed in I Robot.
Sounds like a fractal array.
It's just hard to deliver a lighting engine, you know, laser sparks that have 3-dimensionality on a 2D line in space; otherwise it's a cell phone.
I want one... I want to simulate AI in social situations so it can learn to blend in better and create a more emotional AI.
AWS is far from "just rent excess" these days, that's how it started tho
Wait, so you're saying that on Black Friday, Amazon closes all the renters' services if needed?
Liked & subbed!!! Ireland.
That's all nice and good but the real question is: can it run doom?
Dojo tesla optimus, can't wait.
Is it 100 exaflops double-precision? Because that's how supercomputer performance is measured.
Excellent food for thought
Plot twist he using Tesla cars as a kind of bot net , maybe even also using starlink.
2:20 Why not just say 1 exaflop of compute is equivalent to 3,000 Nvidia A100 GPUs?
Yeye when I see it in action I will believe it
Where do they keep it?
Mini cluster CPUs? Quantum? Graphene? Laser?
What is the new dojo supercomputer? A lot of this content seems to explain what was revealed when they showed off dojo?
Thanks!
It's revolutionary!!!
Mind blown up
Dojocat supports this computer design.
Please, somebody explain to me how this is not going to be used as the most advanced targeting system for integrated weapons in the world
It's not. We promise.
@@maxidaho Good enough for me.
Paint it black and it looks like the neuro-processor in T2.