AI Expert Deep Dives NVIDIA vs. Tesla's DOJO
- Published 30 Mar 2024
- Enjoy this clip? Watch the full video here:
• Tesla FSD v12 Just Cha...
If you'd like to support the channel, you can find me on Patreon here:
patreon.com/HansCNelson
Shopping at Tesla? You can support the channel & save by using my Referral code:
ts.la/hans71453
PANELIST INFO:
James Douma
X Profile - x.com/jamesdouma
Farzad Mesbahi
X Profile - x.com/farzyness
TH-cam - @farzyness
CHANNEL LINKS:
Hans' X profile - x.com/HansCNelson
Hans' TH-cam channel - @HansCNelson
Hans' Patreon - / hanscnelson
Want to create live streams like this? Check out StreamYard: streamyard.com/pal/d/60406730...
#Tesla #elon #musk #elonmusk #teslaengineering #teslaai #teslafsd #fsd #fsdbeta #autonomousvehicles #autonomy #neuralnetworks #machinelearning #engineering #deepmind #nvidia #dojo #teslanews
Tesla, Tesla Stock, Elon, Musk, Elon Musk, Tesla Engineering, Tesla AI, Tesla Energy, Tesla Manufacturing, Tesla Stock Videos, Tesla Dojo, AI, AGI, LLM, LLMs, Technology, Software, Computing, machine learnings, AI models, Engineering, Tesla Bot, Bot, Optimus, Optimus Gen 2, Tesla Optimus
If NVIDIA HATES Competition, What About DOJO?
Watched the full video right after it came out ... astoundingly informative ... more please ...
Exaflops of compute is part of the equation... data is another... power usage of your neural net is another... shipping robots that run it, like Tesla's cars, is another... the ability to train is another... AND Tesla has done it!
This is the definition of GREAT info . . many thanks Hans!!
Nvidia confirmed that Tesla has committed to buy Blackwell.
It was a very interesting chat. I am always amazed at James' talking points ❤
My opinion, unsupported by facts:
Tesla intended to make Dojo its main training compute asset. As much effort as they put into it, they had priority.
Two things probably happened.
Dojo maturity slipped, either because programming it took longer than expected, or hardware manufacturing slipped.
And Nvidia GPUs, while not as optimized for the workload, got better. And were available in bulk.
So I don't see Dojo as originally intended as an insurance policy. Instead, I see Tesla as pragmatic and flexible. They went and got the compute cycles they needed.
But for sure, Dojo is *now* an insurance policy. Or a fallback.
Nvidia's Blackwell architecture is coming, with much better compute per watt than the H100. Tesla will probably stuff Blackwell into its data centers as fast as they can buy them.
And keep building Dojo, and work on Dojo 2.0.
It's nice to be in a strong cash position.
I think there is an interesting point that a lot of people aren't considering when comparing Nvidia's Blackwell to Dojo chips: Tesla's Dojo chips are application-specific integrated circuits (ASICs), versus graphics processing units (GPUs) in the case of Blackwell. Obviously Nvidia has built out their GPUs to target AI/ML workloads; however, a GPU is still general purpose, and an ASIC is far more efficient and performant than a GPU in the tasks it was specifically designed for. My guess is Tesla will be using the GPUs they buy for data processing, software development, and running Tesla's manufacturing management AI, while the bulk of FSD training will be on Dojo. People should check out the Groq accelerator card if they are interested in seeing the difference an ASIC can make in AI workloads. As one last note, the Groq accelerator uses a 14nm process (3 generations old) vs. 4NP (an advanced 5nm node) and is still very impressive.
I agree with you that the interconnect is the secret sauce: physically routing astounding amounts of data. The shuffle in map/shuffle/reduce.
I am looking forward to learning more about it. Intro to Dojo, chapter 2? 😅
The interesting thing is that the data center in New York is all Dojo; there is a hint there that Dojo is just fine.
Niagara Mohawk provides the hydropower. I'm sure that and the cool climate are good reasons to locate it there.
Incredibly timely discussion, thanks guys
I prefer this type of conversation rather than those that come over as blatantly Tesla fanboy. This is a bit more technical and a bit more real. I am a fan of Tesla btw.
Without those Tesla fanboys, you might not have this kind of discussion.
Look at his X account… it’s EXCLUSIVELY bullish statements about Tesla and other Musk companies.
These are both blatant Tesla fan boys lol
Totally agree 💯
You mentioned at the start of the video that you talked about the limits of inference by the onboard HW. Does anyone have a timestamp/link to it? Would like to hear as I think right now that’s the biggest constraint with HW3. Thanks
Thank you for the succinct analysis.
So the competitors that Tesla's DOJO should look at are also working on 4 or 5 applications instead of thousands like Nvidia? Those companies are their competition if they are interested in competing. But if Tesla has chosen the right mix of applications then they are already ahead of the competition in that mix and they are accelerating in that space.
Thanks Hans. This reminds me of the difference between Theory and Reality. In Theory, Tesla is an AI company. But in Reality, Nvidia is.
NVDA forward P/E is significantly lower than the competition. They are growing over 100% per year. I do not consider that frothy. Time will tell.
As corrected by a subsequent post, a Blu-ray is 25 GBytes, or 200 Gbits. So at 192 gigabits per second, it's downloading about 1 Blu-ray per second or about 5 DVDs per second.
Insane
a Blu-ray is 25 GBytes, or 200 Gbits. So it's 1 Blu-ray per second. That's still a lot. A DVD = 4.7 GBytes = 37.6 Gbits, so a little over 5 DVDs per second.
@@incognitotorpedo42 thanks for that. I've made the correction
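The arithmetic in this thread checks out. A quick sketch, using only the figures quoted above (192 Gb/s link speed, 25 GB Blu-ray, 4.7 GB DVD):

```python
# Back-of-the-envelope check of the bandwidth comparison in this thread.
# Figures quoted above: 192 Gb/s interconnect, 25 GB Blu-ray, 4.7 GB DVD.

link_gbps = 192            # gigabits per second
bluray_gbits = 25 * 8      # 25 GBytes -> 200 gigabits
dvd_gbits = 4.7 * 8        # 4.7 GBytes -> 37.6 gigabits

blurays_per_sec = link_gbps / bluray_gbits
dvds_per_sec = link_gbps / dvd_gbits

print(f"{blurays_per_sec:.2f} Blu-rays/s")  # ~0.96, i.e. about 1 per second
print(f"{dvds_per_sec:.1f} DVDs/s")         # ~5.1, a little over 5 per second
```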
Compute and software are big components of neural nets, but I'd also add memory to the "chips", since these matrices are all built in memory before they can be processed. Plus, how do you increase the processing power? Compute, software AND memory requirements all need to be expanded!
With the demand for NVIDIA processors exceeding supply, and the commensurate high prices, it seems that there is a huge opportunity for offering AI compute-as-a-service to the masses of users that cannot buy a 10,000-processor cluster. (Also, there are a limited number of locations that can supply the electrical power for such a facility.) So the opportunity for Tesla or someone to sell this service exists big time -- it will be interesting to see who steps up to meet the demand.
Something that needs to be said about AI Day and the Nvidia presentations: they almost mirrored each other! Each started with the baseline chip and scaled all the way up to cabinets from there! It was eerily similar. Did Nvidia take a page out of Tesla's presentation? They even bragged about having the world's first exaflop cabinet, when we know DOJO was presented as ten cabinets per 1.1 exaflops. It almost felt like Nvidia was competing with Tesla when they did this.
Very interesting, thanks guys. You spoke about 192 Gb/s interconnect bandwidth x # of connections. How does this compare to Nvidia's new Blackwell processors using Grace GB200 (10 TB/s) and NVLink (130 TB!)? Would love to see a comparison between the two architectures.
NVLink enables high-speed direct interconnection between GPUs within a server, and this is measured in bytes/s. What James is talking about is the interconnect between cabinets within a data center, which uses physical cables/media, measured in bits/s, to communicate. An InfiniBand switching solution can provide a high-speed interconnect of 800 Gb/s between cabinets. High-frequency trading companies have been using InfiniBand for years for high-throughput, low-latency compute to make nanosecond decisions on trades.
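The bytes-versus-bits distinction matters when comparing these figures. A rough sketch normalizing the numbers quoted in this thread (800 Gb/s InfiniBand between cabinets, 10 TB/s for GB200 within a server) to one unit; exact product numbers vary by generation, so treat these as illustrative inputs:

```python
# Normalize link speeds to gigabits per second so NVLink figures
# (quoted in bytes/s) and inter-cabinet links (quoted in bits/s)
# can be compared on one axis.

def tb_per_s_to_gb_per_s(tbytes: float) -> float:
    """Terabytes/second -> gigabits/second (1 TB/s = 8,000 Gb/s)."""
    return tbytes * 8000.0

infiniband_gbps = 800.0                  # inter-cabinet InfiniBand, already bits/s
nvlink_gbps = tb_per_s_to_gb_per_s(10)   # GB200 figure quoted above, bytes/s

print(f"InfiniBand between cabinets: {infiniband_gbps:.0f} Gb/s")
print(f"NVLink within a server:      {nvlink_gbps:.0f} Gb/s")
print(f"Ratio: {nvlink_gbps / infiniband_gbps:.0f}x")  # 100x
```

The takeaway: intra-server links are roughly two orders of magnitude faster than inter-cabinet links, which is why the cabinet-to-cabinet fabric James describes is the real bottleneck.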
There are at least 50 companies per continent that say they can build a battery that has more than 500 miles of range; only, they don't build it. That's the case with Dojo.
What will be will be.
Cheers for that.
I think we wanted to know how Nvidia's Omniverse would compare to Tesla's Dojo. They also have a collaboration with Foxconn that seems to be another supercomputer for training AI models. Not really worried about which hardware runs it better, which seems to be the preoccupation here.
The Thor driving system will run up to Level 4+ according to Nvidia. Not sure if they use Omniverse to train it.
James, is Dojo good at video training and not as good at LLMs? If that is the case, and the bot training requirement ramps up from all bot manufacturers, would Tesla rent out Dojo video training compute for bots? I think Tesla will mass-produce the Optimus bot and ship it with a training kit, and the client will have to pay for training bot skills on the Dojo video training system. Bot skills are like iPhone apps for Apple! Tesla will screen the videos sent in so no bad skills get uploaded to the bots.
Help my competitors ?
I don't think Dojo as a service will ever happen.
Boy, that's a lot of data flow. They must have some kind of full-flow staged computing cycle.
Dojo has shuffled off to Buffalo.
I'd like the Petaflop Basket of Buffalo Dojos please. Make those the Exaflop Hot ! Thank you. :-)
I would imagine Tesla may put a relative of the DOJO chip on Starlink satellites, where it would mostly be used as a mesh network node, remote concentrator, and front-end communications processor. At the edge endpoints, the mesh network will have billions of cars, IoT devices, robots, cell phones, neural-linked biological entities, and residential Wi-Fi networks. There will be a need to push huge amounts of data around. It will take massive amounts of processing power just to manage the data communications of the distributed edge application compute processors.
James, I love your content. I'm going to expand this discussion to where Tesla goes next.
First, Tesla isn't avoiding talking about their D1 chip and Dojo. They're just not publicizing their progress.
Your comment: The three architectures required for AGI are CPU, GPU, and neural processors. Neural processors are central to AI architecture. GPUs and CPUs are peripheral.
AMD challenged Intel in CPU architecture and won that battle. In 2006, AMD purchased ATI Technologies, who manufactured GPUs. (Personally, I liked the Radeon boards better than the Nvidia product.)
Tesla and AMD should be excellent partners. AMD can bring superb CPU and GPU design to the partnership, and Tesla can bring the best neural processing architecture in the world. Tesla also has the most advanced AI software in the world. The biggest AI problem ever addressed is autonomous driving and Tesla is about 6 generations ahead of the rest of the field.
I can see AMD and Tesla creating a new company together. It will be created by Tesla because this new company will need the Tesla Agile corporate structure. Honesty is the key.
That doesn't leave Nvidia out of the mix. Jensen and Elon have a huge respect for each other, and Elon will want Nvidia to join an American world-beating enterprise.
I'm a Kiwi. In my eyes, America is a sociopathic capitalist culture. That's why newspapers could write about Tesla-killers as though business is all a fight to the death. I built a tech company here in the US. I've hired lots of Americans. I know Americans don't look at health care as a right, and they don't care for their elderly or sick, so I implemented health insurance for my employees that covered their entire families, and long-term disability insurance to help them through tough times.
So Elon and I share cultural values. It isn't all about sociopathological capitalism. It's about altruism, honesty, and collaboration. Elon has given you the clues again and again and again, but most Americans still can't hear the message. What Elon wants is for everybody to be successful together, to work towards a good life for everybody. Americans are all amazed that he hasn't gone off to live on his own private island somewhere, with the American sociopathic "I've got mine" mentality.
Wake up, America. Elon is lighting the way to a new philosophy.
🤗 THANKS HANS, FOR YOUR SUMMARY. WE WATCHED THE WHOLE PROGRAM WITH FARZAD 🧐🤯💚💚💚
When does PHOTONIC NEUROMORPHIC LIGHT QUANTUM COMPUTE HAPPEN
Okay but with the bits they are now in direct competition with one another
Tesla will use Blackwell.
He is so darn smart.....Wow
A100 cost $4B. Dojo has to spend that kind of money
Extremely interesting. It is curious that Tesla has given away such an extensive trial of Full Self-Driving... could this be to generate more data that they're now ready to deal with?
NVDA and Tesla are working hand in hand to make FSD work. Tesla has a massive amount of vision data that NVDA does not. NVDA needs that data, and even with their compute it would take time to collect it.
I hear this all the time, but NVDA is using more simulation training. Would like to know if that is adequate. I can't see why not
The key is the number of sources of that data. Tesla has millions of cars collecting that data every day, and every one of them is connected to Tesla's satellite array. No one else has that. Plus, Dojo was designed for visual input from its data sources; LLMs may not have even been considered. It's possible that Dojo is better at some things than Nvidia is.
@@user-ko1yn8ci6c Every one of them is NOT connected to Starlink, if that's what you mean. Starlink is also not Tesla's; it's SpaceX's. Tesla's cars use cellular modems and the AT&T network.
@@incognitotorpedo42 I stand corrected. Thank you. I still think that FSD will be breaking out bigger than anyone expected.
Exaflops, Petaflops, Flop-a-flops ! I think it's time to alphabetize these Flipflops so lay-people can keep up with what is the most Maxi-Flops !
Holy crap! I feel like Foghorn Leghorn the rooster that the little nerd bird destroys as its babysitter. There's a scene where the nerd bird makes a calculation sticks a shovel in the ground and pops out Foghorn! As they walk by the tool bin foghorn had hidden in he says "Naw, I better not look. I just might be in there!"
That's where AI seems to be: just around the corner. Instead of a deluge-quenching system, I foresee an AI-controlled, water-sprayed funnel at sea for landing the SpaceX rockets. By using the grid fins as water-pressure targets, telescoping waterjets could grasp and land the rockets as gently as a peach being plucked from a high branch.
I laughed out loud! Love your enthusiasm. You're ready for the next three years, and it's going to blow your mind!😁
@@richardhamilton-gibbs6360 As Capt. Picard of the Enterprise might say: "Please; make it so."
I will listen to a man of faith.
The Dojo supercomputer will have 100 exaflops by the end of the year, while the Nvidia DGX supercomputer will have 400 exaflops by the end of the year
Reference to DGX capabilities?
Since we know how Elon works ("Full self driving next year!" every year for ten years in a row), it will likely be more around 10...
@@Xanthopteryx however, he might be optimistic with regard to when but he’s not wrong
@@jrsands Ehh... Boring tunnel = fail. Hyperloop = fail. $35,000 car = fail. Semi = fail (you will see). Starship = fail (yes, I promise you). Bricks from the tunnel thingy = failed. Self-driving predictions = failed over and over again. And so on.
He brags a lot, lies a lot, scams people a lot, makes things up a lot, and more.
The only thing you can trust is that you can not trust anything he says.
@@Xanthopteryx That's not accurate.
Great info at 11:30 for today's news
Tesla should focus on inference, where the fact that they make moving things that need low-power compute matters. It's stupid to focus on training hardware.
Tesla has proved that they have some serious software skills in the company; Dojo could end up being amazing.
5:14. AI is not a good business for Tesla??
James is saying that Tesla has bigger fish to fry than focusing on DOJO as a data center service to outside customers.
@@HansCNelson That makes sense. Thank you!
If anyone understood what he said could you summarize this video? Is the problem that Tesla cannot manufacture their own chips that they designed? Why not?
Neither Nvidia nor Tesla manufactures their own chips - they only design their chips, and use external chip fabs/foundries like Samsung and TSMC. But Nvidia and their partners can't supply chips fast enough at the moment to satisfy the huge demand. Tesla does not have this problem, since it has compute clusters using both Nvidia and their own Dojo chips, and is therefore no longer compute-constrained like companies that only use Nvidia might be. Tesla has decided not to produce their own chips, since it takes many years, billions of dollars, and great expertise and experience to build a chip fab and run it.
Thank you. Finally a real video and comment about where we are at. All the "yes men" around Elon and on YouTube all the time are pathetic, and made me run from the stock and the company. Drop X and this would not be such a shitshow, and he knows it and his fanboys know it. He is a tool, and anyone with Twitter/X is a tool, and I mean a total tool
Re X: he is in a bad Superman phase, and until he drops Twitter, your average Joe thinks he is a bit of a clown, an angry dude with a chip on his shoulder. He needs to watch Superman 3 and think long and hard about X. Credibility is an issue. The Cybertruck is amazing but such a mistake
Don't know much about this stuff, but I suspect that Tesla now has enough compute to utilize the tremendous amount of data they've been collecting for training FSD. Until now, the data they collected far exceeded their ability to process it in a useful way. With recent training results, Tesla is now expecting a much different product from what they had originally anticipated. The vision for FSD was conceived when AGI was still in its infancy, and even now we can only guess at its true potential. I'm guessing that Elon's expectations for FSD have greatly increased. FSD will be safer, smarter, and far superior to anything they could have even imagined when they started. With the advent of AGI, the Dojo supercomputer, and accelerated training, FSD will soon exceed all regulatory hurdles and become a part of our world. I'm 80 years old; I never would have believed such a thing when I was a kid.
The important thing is that Nvidia had all the tools and software written specifically for their hardware (not by them, but for their hardware)- that's why they were so investable if you believed in the future of AI back in 2015. I was telling anyone who would listen - most are sorry now that they didn't
It seems like Dojo V1 has failed.
Recall, Dr. Andrej Karpathy is back @ Tesla . . for DOJO ! ?
It’s my understanding that DOJO Escalates & Ramps Exponentially @ October 2024-So-WAIT & SEE ! ?
Isn't DOJO much more efficient as far as power usage, and isn't that really very important?
It uses less energy because the computer center is smaller and not as effective as others, like NVIDIA.
@@Xanthopteryxno, that’s not how it works. 😂
@@thescurry In Elons world, this means he can say it uses less energy.
Just like that shitty thing when he said he blew the main grid fuses or whatever it was... So silly. They really have no idea what they are doing.
It is not one or the other for Tesla; it is both. Indeed, it may be AMD as well.
Tesla has the ability to build chip-agnostic AI training clusters. They do not give a shit. They may even start buying Cerebras chips as well when their software stack and SDKs are ready to roll out.
I remember when this guy was talking about Tesla's 100 exaflops of compute for 2024, and the time it would take for Nvidia to come up with a new product. Blackwell scales to 645 exaflops. Nvidia is going from a two-year cycle to one. X100 will be released next year. Meanwhile, Tesla is notorious for delays. As for solving the supply shortage: it doesn't matter what you design if fabs cannot keep up with demand. Is Tesla going to build their own fabs, too?
What if Tesla just keeps doing what they've been doing? Rolling out Dojo while buying as much from Nvidia as they can? They're not selling Dojo hardware; it's all for internal consumption.
Look at the picture frame to the left of his head then look to the right of it!!✌🏾 🏃🏾♂️🏃🏾♂️everybody better run from this video
That's a piece from a Tesla Plaid motor.
Huh?
It's extremely unfortunate, but DOJO is a big letdown. Nvidia has leapfrogged it many times over.
100% of training is run on Nvidia. Dojo's D1 chip has hardware defects in the design
Source?
@@timcacciatore9033 D2
@@MegaWilderness So HW3 in the cars has defects too, because they developed HW4? What dumb logic you apply. Tesla never stops where they are; they always improve things if possible.
Nvidia needs to build a chip facility in the US. With current politics, I’m not sure the west will be able to protect them from China. The Republicans are weak on foreign policy right now!!!
TSMC is setting up in the USA, so they are good
Hardware is almost irrelevant; it's the software kernels that count. Nvidia has so much more software capability. Even companies like Groq have better software design than Tesla, because they designed the software kernels before even sketching a hardware architecture. Tesla will stay with Nvidia.
You sound like an expert. I'm a foreigner who came to America in 1983, did a high-tech startup with proprietary hardware and software design, and employed a lot of Americans. I know both hardware and software architecture, and I'm fascinated by your analysis of Tesla as having minimal software skills. How do you think Tesla designed their D1 chip? The hardware design optimizes the software architecture. It's all Tesla.
I'm interested in your hardware design experience. Tell me more.
⭐️⭐️⭐️👍🏽👍🏽👍🏽⭐️⭐️⭐️
Thanks Hans. My simple takeaway is that after seeing Blackwell, Elon realizes that Tesla's DOJO will never catch the compute capability of Nvidia. As you concluded, "the world is not standing still".
No offence but I am so unbelievably tired of the term “deep dive”. There are overused new terms that come and go and then there are ones that come, get used until people start to bleed through the ears and pray for death, then get used a little less. “Deep dive” seems to be going the second route.
Nvidia will crush Tesla DOJO computer because of the total network package. Tesla is having financial constraints and doesn't have the business contacts to compete at the highest levels.
I don't think Tesla is having serious financial constraints?
They don't have infinite money ATM like NVIDIA, but they are still in a really strong cash-flow position.
What a stupid reporter/expert! The question is speed, not data centers!
How many teraflops faster or slower is one over the other, per stacked cabinet?
That's not a good reply. You're not a hardware or software architect. Obviously.