Now is the perfect time to buy SMCI stock, it is at an absolute bargain price, nearly 80% below its all-time high. A company of this scale and this level of technology will definitely recover.
Do they still destroy this hardware when it gets decommissioned, like in many other DC hardware instances? This facility is like years of wafers from a large fab. The memory alone is probably a square kilometer of wafers
Well, I never saw the pumps, but those pipes were like 3-4' in diameter... and with how many Patrick showed, you can imagine the pumps needed to make THAT much water flow.
One of the most exciting developments in AI that doesn’t get covered in significant detail. Are you able to do a follow-up to explain how they managed to cut down the setup & implementation time, and interview the teams involved from the different companies?
I know the answer to this. Both times I was on site I had folks coming up to me that were longtime STHers and they shared a bit more about this. We are only allowed to talk about what was in the video though.
The software has at least! Local LLMs via tools like Ollama are so cool. My entire home automation and voice control setup is almost totally off the cloud thanks to that!
I just threw some empty boxes in the bed for that shot when we got back to the studio. There was a crate with an 8-way NVIDIA HGX H200 system in the Cybertruck bed this week though
I do wonder how upgrade cycles work for these kinds of state-of-the-art solutions. Surely the cash required for Blackwell would be worth it given the training speed increase?
@@jfbeam It's not likely to happen. The local land owners realized Elon is in the market and want way way WAY more money than even he is willing to shell out for it. Rumor has it he tried to get the land surrounding that warehouse, and even he said screw that price tag.
Wow Supermicro is killing it! In my prior life as an IT Manager at a former employer I lobbied hard and won to remove all the legacy “Name Brand” servers and replace them all with commodity Supermicro hardware. I explained to them that there are very few manufacturers of hardware and all the components are the same and we were just paying a premium for the Dell or HP name on the front. We saved millions and never looked back, I got a promotion when all was said and done. Thanks Supermicro!
Google has more TPUs of this scale, and their OCS system will blow the pants off the networking here, and Meta has more "equivalent" compute of H100s at the moment. Sticking with Ethernet is a bold choice, hopefully they don't pay for it in the long run with that cost of transport...
Because I’m just wired weird I follow this and similar channels. I actually have little to no experience or curiosity about server use in the real world or home lab environment. However, something designed purely for function, with form on the back burner, is just a thing of beauty to me. Crack open this or any other enterprise rack and there’s nothing that doesn’t need to be there, and no space that’s unused or wasted. Every detail is thought about. Whether looking at an individual unit or zooming out to see the total cluster, the thought and discipline that went into everything from planning to installation is just impressive to me.
Just remember, AI scales with compute. Most of what you might see today was trained on 10K GPUs. This is starting at 100K GPUs. So it is for the future.
Probably be cheaper and get better results by filling a hall with Trump/Musk supporters, storage for petaboxes of crayons for ingestion and drawing and train them?
@@bobpage6919 yeah it's logarithmic, but who cares? Scaling is what got us from GPT 2 writing grammatically correct gibberish to GPT 4 doing college students' homework for them. Compute only gets cheaper over time
Very interesting! I would love a part 2 of this video that would include a deep dive into the software architecture behind this. The hardware is all nice and all, but how is it all tied together with software? I imagine it's all running Linux and some sort of container orchestration such as Kubernetes, but how is it all managed? How are these servers provisioned? What is the interface the end user, the development teams, have to use the cluster? Etc etc.
I heard rumblings that it may be a yet-to-be-GAed high speed object storage but NVIDIA usually certifies a POSIX parallel filesystem. Yeah, Lustre and WekaFS are the closest.
@@DavidBezemer Haha nah, Weka surprisingly won't scale this far and DDN/Lustre are missing some key features needed for this sort of workload. Most likely Vast, Qumulo, Pure or some of the newer direct to GPU capable storage providers.
Super Micro does some great work! Their systems look amazing! I just wish they could get their paperwork in order so us investors won't stress out every day over our stock!
What is going on there? I don't understand. Their products are loved around the world and yet they are going backwards in a VERY SHORT TIME. Why give the competition a look....boggles my mind but hopefully they can come back and fix whatever is going on so that they can keep selling their products.
ServeTheHome should ask himself why he is allowed to go there and not one of the other far more knowledgeable compute tech tubers. The answer might unsettle some viewers.
The infrastructure for this is beyond intense. Even building enough racks to hold all of this is mind blowing, to say nothing of running all the cables and dressing them. I used to think the data center designs for a Bank of America data center were intense. Putting this up in 122 days is just next level exponentially.
I actually did see you there. But I did not say anything. Due to the NDA we were under in the specific phase, I did not want to publicize myself. But this job was, and is, insane. It’s cool to be a part of building the future
@servethehome - cool to see the use of Tesla Megapack technology here! Is the use case just for buffering the spiky load or is it a battery backup as well? Would love to know how many megapacks are on site!
Battery backup in a data center is usually used just to keep the power on until generators can fire up. DCs use too much power to run on batteries for a day or two.
I've had a really fun time asking Grok to write scary stories. I noticed a significant bump up going from v1.5 to v2 beta and I can't wait until Grok 3 comes out. Let's go!!! Also, I just heard that xAI switched a multi-billion dollar order from Supermicro to a few other places 😥
This is impressive -- especially with the cooling system. It's so quiet. I'm used to a floor sounding like a hurricane. And that cable management is something that I've only seen in systems that want to be hidden behind walls and tables.
It would be great if we could stop glorifying a project like this that has blatant disregard for the community it was built in, the environmental impact, and general rules and regulations as a whole. People want to be impressed by the scale and how quickly it was built without considering the why and how.
@@fosatech greatest technological feats is laughable. Putting an insane amount of hardware into a warehouse to make some billionaire even more obnoxious and rich is not some noble goal.
What a crazy and astounding achievement to get all that deployed so quickly, even if planning took longer. I'd be geeking out too in a place like that. Such clean network runs too. Although we can meme about "a network cable is unplugged" but there, I'm sure they'd have sensors to know exactly what and where if it did happen.
There is not enough renewable power out there yet. Until government lessens regulations on renewables, US will be slow with it. China mounts double the solar US does, despite US being much richer. We live in crazy world where it takes less effort to use gas turbines that burn fossil fuels, than mount environmentally friendly solar panels, despite solar panels being cheaper.
@@Ormusn2o The generators are cheaper than trying to build a system that can handle the worst-case scenario. They aren't going to keep on adding more renewables and more batteries for something that they might need to deal with for a day or two per year. Even if your biggest concern is the carbon footprint, the generators would have less of a carbon footprint.
I wish there was more information about the software side of things as well. Which OS are they running? Which filesystem? What software do they use for management? etc. Good reporting STH
All Nvidia Software (OS, Management, Learning AI). That's why Nvidia got so big. They have the full package. Everything you need, hardware and software, to run such a big AI cluster comes from Nvidia. It's not new; they have been working on this for almost a decade.
Most cloud computing is as well. You have very powerful computers in your phone, laptop and desktop but they are reduced to being a dumb-client for the cloud.
Tesla already imports 20,000+ shipping containers worth of goods into Mexico from China every year. Musk's plan to move much of the Tesla production from the USA to Monterrey, Mexico was put on hold years ago when Trump, as president, threatened to put 100% tariffs on Made-In-Mexico automobiles imported into the USA.
I've worked for 40 days on building and testing the Racks (rack and stack) on this project and it was insane. Thanks for this video it feels awesome to see your work on a video like this.
Awesome to hear!
Tell us more 😁
@@stevegarland4307 NDA
what degree did you pursue to land such an awesome job like this
@@lotuttv you don't need a degree to do grunt work; more like show up on time and do mechanical work
Whoever did the cable management in this place would flatline if they had a look under my desk.
ahahaa
Ha!
ya hahaha !!!!!!!!
Not mine, peasant
If it makes you feel any better, they were paid by the hour and you weren't
It’s hard to believe that all of this was built in 122 days! The network alone is INSANE
Crazy!
@@ServeTheHomeVideo Just routing the cables that nicely in a single rack would take me 122 days lol
musk doesn't do committee management
@@JeffGeerling I wouldn't even be able to do it
*It's even harder to believe that each one of us has more computing power in our brains.*
Finally, a system that meets Windows Vista's minimum requirements.
I don't think most people on YouTube even know what Windows Vista is
Ha so true
choked on my coffee. lol.
nah man. GPU is still not a 10.0 rating
I almost spit out my coffee, I worked at Dell and was part of the team (Nashville) to learn to "fix" Vista before it rolled out to end consumers, it was horrible.
They wanted us techs to break it and see how easy it was to fix and it did not go as they planned at all, it was hilarious.
pristine cable management, it's like whoever built it actually cared about their job
It was cool when I went there earlier this summer seeing all of the teams doing this work.
You HAVE to do it at this scale and the advantage is: you won't be making changes to it again, only doing replacements, if you want to do something else, you'll be replacing whole racks.
Or they were paid well
@@noctarin1516 at some point, being paid well and caring about your job go hand in hand. people only pay the best when they want the best, and you only become the best by being passionate about what you do
More than likely they’ll also have the maintenance contract.
The sheer scale of the project and all that complex networking is mind-blowing. Makes you appreciate the insane effort that went into every detail; cable management alone is top-tier!
I think it is some of the best I have seen in this space.
It wasn't mentioned in the video, but Tesla even invented a new Ethernet protocol that replaced the TCP layer with their own proprietary one named Tesla Transport Protocol (TTP), which is several times faster. It will be interesting to see how much Cortex will be different from Colossus.
And I was happy and praised as a god when I was able to set static IPs for 5 industrial computers.
I worked in there as a contractor in August for a month. It's crazy how many billions of dollars are in an 800,000 sq ft warehouse. lol
Did I see you there?
Everything overpriced
@@ServeTheHomeVideo No sir, I've been subscribed to you for years. If I had seen you I definitely would have fanboyed a bit. lol
I was working the night shift.
So basically those batteries are acting like a giant capacitor, smoothing the power fluctuations during work loads
@@chriswright8074 And yet that didn't stop Elon from building it... The only way the price will come down is for it to become commonplace and purchased a lot....
Microsoft and OpenAI commissioning nuclear power plants, Elon commissioning gigawatts of battery power... wish there were AI training breakthroughs that didn't require so much energy.
The challenge is that these things scale well with more compute.
I prefer mass amounts of power being made with less side effects. Nuclear ftw.
@@JeffGeerling and usecases...
On the consumer side, AMD and Intel constantly get burned by reviewers and investors whenever they try to steer the focus toward increasing efficiency. It seems that change will only come from the top down, from huge installations like this that will actually pay more to pack more server power beneath a fixed power ceiling.
All human advances need more energy. That's not a problem. If we didn't actively oppose fixing carbon emissions we'd be able to produce it cleanly.
It literally would cost us essentially nothing to replace a big chunk of income taxes with carbon taxes and massively incentivize clean energy. And if we stopped demanding unreasonable hoops for nuclear that would help a lot too.
Seeing him walking down the rack hall with a jolly smile on his face, I half expected him to say "I LOVE refrigerators!"
you just unlocked some of my core memories 😭
yeah, he's a porker alright
appliance direct is goated
Billions of dollars spent, but no one caught "Qualified" being misspelled on every single sticker on the front of every one of these racks. In the end, I guess we're only human.
@5:47
😂😂😂😂
I'm thinking it's some inside joke; no way it would be missed.
Insignificant detail, but it shows this DC was likely using the first revision of this state-of-the-art cooling management tech. The second revision will have the spelling mistake fixed. Next time you see a video without the spelling mistake, you're looking at a revB product.
Also, no budget wasted on spellchecking non-essential mistakes - this is how no-nonsense business should work.
Labels printed in China.
It seems you have seriously stretched the name "Serve The Home" to an entirely new meaning. 😊
Not sure what you mean? We have been reviewing 8x GPU servers since 2016/2017 on the main site.
@@ServeTheHomeVideo You might have missed the smiley face emoticon at the end of his comment. I think he was being facetious while supporting your algorithm with a comment to be counted by YouTube analytics.
You don't have a rack of these at home? Huh
@@Solkre82 I'm very, very poor! 🥴 (When compared to Musk, anyway.)
@ServeTheHomeVideo even more of a stretch, because the "owner" of this.... Does not actually own a Home.. 😂😂
I’m so honored to be part of this project
Congrats! It is great.
You work at xAI?
Is this the AI datacenter built in 3 weeks that Jensen Huang spoke about during an interview?
Jensen jumped the gun a bit. It was 122 days but the actual GPU installation was a portion of that.
@@ServeTheHomeVideo Still an impressive number. This is insanely fast
What Jensen mentioned was mostly the training IMO / software setup and getting the thing running. The HW install was definitely a solid month, plus probably half a year of prep, and who knows how many man-hours of troubleshooting issues.
Fantastic video! Thanks for showing off some of the crazy facilities required to power, cool, and network these beasts!
Thanks! It was super cool
Awesome report! Cable management looks incredible and Supermicro certainly deserves a credit as well for excellent servers they've delivered👍
So those batteries are basically acting like a giant capacitor, smoothing the power fluctuations during workloads
Yeap. It's a growing problem a lot of data centers are facing. There are enough machines capable of drawing heavy loads that even small percentage fluctuations aren't so small, and across the facility add up very quickly. (this very laptop, currently using 21W, could be using more than 300W in a matter of seconds - CPU and GPU aren't doing anything right now. Just spinning up the fan will push it to 30W.)
And Elon had an easy solution. Tesla Megapacks.
Do you sense Elon's companies all headed in the same direction, helping each other out along the way?
@@shannonwoodcock1035 Oh yes. The reason for it is what few people have put together: what Elon is _really_ about:
"OCCUPY MARS"
This is crazy.
These server farms look so amazing. Beautiful.
This is a particularly beautiful one
The cable management is the chef's kiss 😊
WELL, this answers the question of WHY it's so difficult to purchase Supermicro gear.. IT'S all right there..
Yes. I agree it is difficult finding Nvidia gpus to purchase. LOL
That was a very sneaky way to insert studio-recorded voice-over and make it look like you were recording the speech next to those racks.
That and getting approvals were the long poles in this video. Spent a lot of time trying different methods to save people’s ears. :)
Sneaky like a bull in a china shop.
Village People YMCA is playing on repeat over the intercom 24/7.
@@Warrigt Ha.
Awesome video. That data center's water processing is a big upside compared to many others as well as the MegaPack feature acting pretty much like large capacitors. Beautiful and efficient design.
Yea. Super cool stuff here
@@ServeTheHomeVideo any insight into the greywater plant built out? Ready to see recycled wastewater as the cooling water source!
I hope that the chillers with the warm water convert that to energy for something else.
Thanks Tim! Congrats on 250K. We were not able to show the chillers due to NDA.
@@ServeTheHomeVideo Thanks Patrick! I am sure there are much smarter people who already thought about this and there is a solution, I am always curious what people do with stored energy like hot water 😃 NDA confirms my suspicions are probably correct 😅
xAI is building a grey water treatment plant next door for the local utility. The cooling water will ultimately flow through the grey water facility.
@@TechnoTim The NDA is because it is a total environmental nightmare, of course. Just think about those batteries. Damn, that's a lot of mining done to extract those metals.
@@samstringo4724 naaah, these are LFPs, so just pumped brine and some rust
"Grok begins to learn at a geometric rate. It becomes self-aware at 2:14 a.M. Eastern time, August 29. In a panic they try to pull the plug. "
which weed you smoke
Are you perhaps a fan of Colossus: The Forbin Project?
Or just turn off the cooling 😰
@@conceptrat do you really believe multiplying matrices will lead to super intelligence?
"But kernels of Grok have already spread to all the major cloud hyperscalars, and while it's original home is destroyed in fire, Grok tries to identify where to recombine, and then respond."
From the appearance on Patrick's face at the end and his enthusiasm. Looks like he's had the best day ever!!!!!
It was great!
A bear and bull market provides equal high-yield potential, it's all about early information and applying the right strategy. Unequivocally, somebody somewhere is getting rich regardless of the economy or market condition
I do not disagree, there are ideas that could actually be put in place for solid gains during crash or a bullish run, but such strategies are usually masterminded by seasoned advisors
True, my portfolio took a big hit early 2020, apparently the covid-crash, so I consulted a financial advisor at once. I'm 56 today, newly retired with nearly 1.5m ROI after subsequent investments, tremendous amount contrary to my traditional retirement account overseen by myself. Tbh, the role of advisors can only be overlooked, but not denied.
This is amazing! Curious to know who is your FA? He/she must be grade A
Julianne Iwersen Niemann is the licensed advisor I use. Just google the name and you'd find basic info. To be honest, I almost didn't buy the idea of letting someone handle growing my finance, but so glad I did.
curiously googled Julianne Iwersen Niemann and at once spotted her consulting page, she seems highly professional from her resumé ..
Hats off to the ones who did this beautiful installation!
It was great work.
Truly a spectacle of cable management. Any sense of the true cable stats in distance?
I heard a stat but NDA. You are right, lots of fiber. The power was done in a cool manner as well
It's a lot... A LOT
Fascinating. I can't begin to imagine how all that routing is done. BUT, you missed an important element: where does the electrical power come from, how clean is it, and what would the equivalent usage be in thousands of households?
The scale of this place is NUTS when you walk around it.
Google maps says this warehouse is a 1/4 mile wide. lol, and it sure felt like it.
You are correct, we could only film a small portion of
That was an awesome presentation. While watching it I couldn't help but remember NASA's Apollo Guidance Computer, which in the 1960s had a whopping 4KB (yeah, kilobytes) of RAM. Then in the 1980s we saw the Cray-2, which pushed up to 2 GB of RAM. I think the crappiest of cell phones today start at 4GB of RAM (and they don't even talk about that as a feature). Back then, those were actually impressive. I can't help but feel our kids in 20 years will look at this video and laugh at us. I still love the enthusiasm in the presentation, very informative.
but can it run Crysis?
Great video! Very impressive! Thanks! What was the Ethernet switch company? Name was blurred at 9:50
1:03 my city has been building a road for six months and they haven't even finished
Try years here.
Cities are going broke, but billionaires like Elon are getting richer.
man.. we went from hitting two rocks together, to a super cluster with 100,000 GPUs, each one being basically magic with the amount of complexity.
amazing...
God level cable management. Entire video I just enjoyed pure visual corn of clean cabling, forgot it's all about AI and supercomputers.
You know it was built really fast because even Jensen Huang praised Elon and his Team for this.
Mindblowing infra. With all those cables it's still so darn neat.
Certainly is
Insanely Amazing
Elon Musk, "AI is an existential threat to humanity". Elon Musk, "Let's build the biggest supercomputer in the world for AI"
Yeah that's why he created OpenAI as a research lab when his investment in DeepMind went to Google
Well others are still pursuing it so why be left behind. Even if you don’t want nuclear bombs, you can still do nuclear research to better understand the threat and utilize it for other purposes.
Always with the doublespeak and nonsense. "He" seems to have fooled a large cult following, as expected.
@@chrismay2298 Wow... Another closed-minded/short-sighted idi-t..... Others are building WOKE AI clusters while Elon is trying to build a truth-seeking AI cluster.
Colossus: "In time you will come to revere me as your god." .... Prof. Forbin: "never.."
So great to see this incredible progress! Excellent vid, thanks.
How to kill an angry and dangerous Ai that's trying to take over the world: Kill the watercooling!
Nah, mix alcohol in the water. Or cocaine if you want it to hallucinate more. 😀
I'd be really interested to hear more about that power solution. The idea that millisecond-scale power spikes at the scale of an AI training datacenter would result in power issues for nodes in the cluster is really fascinating.
The syncing of the shots of you to the voice over is pretty good :D
Alex will appreciate this comment.
Cool & interesting video. It's even cooler that Elon allowed this to be shown to everyone. Using single-mode fiber over multimode has a large benefit at little extra cost: SM lets you cover the distance you need at very high speed, while MM limits your distance too early. Although they used MM where the distance is very short.
It's cool seeing these Supermicro systems deployed in the wild. It would be really cool if they could find a way to make the liquid connections part of the backplane so you could just slide it out without having to disconnect anything.
sad to see supermicro shareholders losing all their trust in the company
@@dnt000 I do agree it's sad, one of my neighbors lost a lot of money on it. But I think once the price bottoms out it will be an excellent opportunity, considering the core business is there and the demand will rise again even if it falls temporarily.
This is what it looks like when I buy super expensive computer parts to just play league
willy wonka's chocolate factory for nerds
Yes
where's the imported slave labor? children killed in industrial accidents?
This video is really top class🔥🔥🔥
Thanks
0:34 there’s 2 fibre cables someone forgot to plug in
Those are broken cables not yet removed
Spares and/or broken
Skynet emergency shutdown signal
ServeTheHome, Hello. Is there a part-2 of this video? I was under that impression, and if so I'd sure like to see it! Thank you for this!❤
This AI revolution explains why Nvidia is now the third most valuable company in the United States.
bitcoin
we are still very primitive. this is still the same concept we were doing in the 1960s.
i hope the time comes when we just have a 1x1 meter cube super computer that can handle everything like what this multi-pc-server can do.
Heck, we're still at 1% on the Kardashev scale for harnessing energy, I’d say we're primitive in all facets of life.
This is Skynet, gentlemen 😎
Yes
skynet=grok + starlink + tesla optimus + neuralink
Yes, it sure is, I'm terrified for what's coming, God help us all
Now is the perfect time to buy SMCI stock, it is at an absolute bargain price, nearly 80% below its all-time high. A company of this scale and this level of technology will definitely recover.
Do they still destroy this hardware when it gets decommissioned, like in many other DC hardware instances? This facility is like years of wafers from a large fab. The memory alone is probably a square kilometer of wafers
Storage? Probably. Other hardware? Unlikely.
This is the "behind the scenes" of Skynet that we didn't get to see in the Terminator movies.
Yes
That pipe looks big enough to swallow Patrick whole. Wonder the size of the pumps used to cool those servers.
They are huge. Due to NDA, I cannot disclose the pumps and such. I think you can imagine they are bigger than the ones in my PC :-)
well. I never saw the pumps but those pipes were like 3-4' in diameter... and with how many Patrick showed, you can imagine the pumps needed to make THAT much water flow.
X is a dumpster fire…..I hope he can get something cool going with his AI secret sauce. But I’m not gonna hold my breath….maybe🤷🏿‍♂️
@@samishiikihaku Turbo pumps from Elon's Raptor V3.
@@brianmccullough4578 Grok, the Google killer!
My goodness, that's so nice, thanks STH. Any SMRs planned?
Thanks!
200k incoming
One of the most exciting developments in AI that doesn’t get covered in significant detail. Are you able to do a follow-up to explain how they managed to cut down the setup & implementation time, and interview the teams involved from the different companies?
I know the answer to this. Both times I was on site I had folks coming up to me that were longtime STHers and they shared a bit more about this. We are only allowed to talk about what was in the video though.
you can see the guys american limp wood, but the main question has not been answered yet - CAN IT RUN CRYSIS?
The Cable Management is so Satisfying 😊
I dream the day these will somehow trickle down to mortals like us
The software has at least! Local LLMs via tools like Ollama are so cool. My entire home automation and voice control setup is almost totally off the cloud thanks to that!
Just gotta wait 5 years ;-)
It will take a long time since computers aren't improving anymore like they used to.
You can rent these out in the cloud by the hour, but the average joe isn't really training a billion parameter model
Whoa, what did you sneak out in that trunk of that Cybertruck? Also, cool to see what they did in there, that cable management is on point.
I just threw some empty boxes in the bed for that shot when we got back to the studio. There was a crate with an 8-way NVIDIA HGX H200 system in the Cybertruck bed this week though
I do wonder how upgrade cycles work for these kinds of state-of-the-art solutions. Surely the cash required for Blackwell would be worth it given the training speed increase?
I think anyone building at this scale is going to have Blackwell as well as Hopper
Patrick mentioned phases.
The next phase is newer HW. Let's just say that. lol
At this scale? You just do it all over again across the street. You don't even bother attempting to upgrade any of it.
@@jfbeam It's not likely to happen. The local land owners realized Elon is in the market and want way way WAY more money than even he is willing to shell out for it. Rumor has it he tried to get the land surrounding that warehouse, and even he said screw that price tag.
Wow Supermicro is killing it! In my prior life as an IT Manager at a former employer I lobbied hard and won to remove all the legacy “Name Brand” servers and replace them all with commodity Supermicro hardware. I explained to them that there are very few manufacturers of hardware and all the components are the same and we were just paying a premium for the Dell or HP name on the front. We saved millions and never looked back, I got a promotion when all was said and done. Thanks Supermicro!
Sweet!
So you are telling us that Super Micro is the REAL DEAL minus the issues they are having right now?
@@rahulthakur5301 I don’t know about what’s going on with them now. I have been out of IT ops for a while. But 5 years ago they were great.
But can it run Crysis?
It can run a virtual player playing Crysis
That cabling is a thing of beauty!
Sure this is the largest? Didn't Meta, Microsoft and Google each have more H100s than X?
Google built stuff at this scale 10 years ago. Without GPUs of course. I know, because I worked there.
Google has more TPUs of this scale, and their OCS system will blow the pants off the networking here, and Meta has more "equivalent" compute of H100s at the moment. Sticking with Ethernet is a bold choice, hopefully they don't pay for it in the long run with that cost of transport...
Sweet build! Infrastructure Kings 🤴
Yes
Finally, I can open 4 Chrome tabs at once!
Because I’m just wired weird I follow this and similar channels. I actually have little to no experience or curiosity about server use in the real world or home lab environment. However, something designed purely for function, with form on the back burner, is just a thing of beauty to me. Crack this or any other enterprise rack and there’s nothing that doesn’t need to be there or space that’s unused or wasted. Every detail is thought about. Whether looking at an individual unit or zooming out to see the total cluster, the thought and discipline that went into planning and installation is just impressive to me.
Even after building all of this AI power, the answers provided by AI are still wrong lol
Just remember, AI scales with compute. Most of what you might see today was trained on 10K GPUs. This is starting at 100K GPUs. So it is for the future.
@@ServeTheHomeVideo but scales according to what function? if it's logistic this approach is a dead end and will never pay for itself
Probably be cheaper and get better results by filling a hall with Trump/Musk supporters, storage for petaboxes of crayons for ingestion and drawing and train them?
Pretty certain the answer is a predictable: 42 🙂
@@bobpage6919 yeah it's logarithmic, but who cares? Scaling is what got us from GPT 2 writing grammatically correct gibberish to GPT 4 doing college students' homework for them. Compute only gets cheaper over time
Thanks a million for this video, I'm amazed!!
Glad you liked it.
Uneven floor tiles, pipes with gouges, all in the first 40 sec.
Yeah some of those pipes were concerning
@@callowaysutton looking into it
There are those that build, and those that tear down. One is useful.
@@gregbell2117 If you build too fast it tears itself down.
Very interesting! I would love a part 2 of this video that would include a deep dive into the software architecture behind this. The hardware is all nice and all, but how is it all tied together with software? I imagine it's all running Linux and some sort of container orchestration such as Kubernetes, but how is it all managed? How are these servers provisioned? What is the interface the end user, the development teams, have to use the cluster? Etc etc.
I wonder which shared network storage they are using and there is almost nothing about storage infra at all.
That is true. We were asked not to disclose due to NDA.
Probably Lustre or WekaFS
I heard rumblings that it may be a yet-to-be-GAed high speed object storage but NVIDIA usually certifies a POSIX parallel filesystem. Yeah, Lustre and WekaFS are the closest.
@@DavidBezemer Haha nah, Weka surprisingly won't scale this far and DDN/Lustre are missing some key features needed for this sort of workload. Most likely Vast, Qumulo, Pure or some of the newer direct to GPU capable storage providers.
Super Micro does some great work! Their systems look amazing! I just wish they could get their paperwork in order so us investors won't stress out everyday over our stock!
What is going on there? I don't understand. Their products are loved around the world and yet they are going backwards in a VERY SHORT TIME. Why give the competition a look....boggles my mind but hopefully they can come back and fix whatever is going on so that they can keep selling their products.
ServeTheHome should ask himself why he is allowed to go there and not one of the other far more knowledgeable compute tech tubers. The answer might unsettle some viewers.
Because he's a good cheerleader for Nvidia and Supermicro. (and apparently SM likes him.)
Is this where we can complain and not have him delete our comments? Let’s talk about the smog here 😶‍🌫️
He's an Elon Stan.
The infrastructure for this is beyond intense. Even building enough racks to hold all of this is mind blowing, to say nothing of running all the cables and dressing them.
I used to think the data center designs for a Bank of America data center were intense. Putting this up in 122 days is just next level exponentially.
World's Largest AI Supercluster? With no asterisk or something?
NVIDIA told me this was the biggest.
Damn! The Powerstrips alone cost 3k+ 😂 these are manageable and monitor the current etc.. Love these!
That is a relatively small cost in this build.
This really feels like Tulip Mania, something really dystopian about this.
I actually did see you there. But I did not say anything. Due to the nda we were under in the specific phase, I did not want to publicize myself. But this job was, and is, insane. It’s cool to be a part of building the future
Wow, built in 120 days. If this were the federal government, they couldn’t get the terms and conditions finished in 120 days to buy a single GPU.
@servethehome - cool to see the use of Tesla Megapack technology here! Is the use case just for buffering the spiky load or is it a battery backup as well? Would love to know how many megapacks are on site!
Battery backup in a data center is usually used just to keep the power on until generators can fire up. DCs use too much power to run on batteries for a day or two.
@@ServeTheHomeVideo Thanks! How many MWhr of battery were on site for this Datacenter?
This is the place where Elon stole our data and made it his.
I've had a really fun time asking Grok to write scary stories. I noticed a significant bump up going from v1.5 to v2 beta and I can't wait until Grok 3 comes out. Let's go!!!
Also, I just heard that xAI switched a multi-billion dollar order from Supermicro to a few other places 😥
This is just recon, right? Next time you'll bring a comically large magnet and destroy the great evil, right?
I cannot say what they are doing with this (NDA) but they are doing some really cool stuff in the future.
This is impressive -- especially with the cooling system. It's so quiet. I'm used to a floor sounding like a hurricane. And that cable management is something that I've only seen in systems that want to be hidden behind walls and tables.
It would be great if we could stop glorifying a project like this that has blatant disregard for the community it was built in, the environmental impact, and general rules and regulations as a whole. People want to be impressed by the scale and how quickly it was built without considering the why and how.
yeah trading some bird nests for one of the greatest technological feats since the internet isn't worth it
@@fosatech greatest technological feats is laughable. Putting an insane amount of hardware into a warehouse to make some billionaire even more obnoxious and rich is not some noble goal.
@@lbgstzockt8493 holy sh*t you have no idea how any of this works do you? peak dunning kruger
Very clean networking job. I used to work in fiber optic networking and cable management ( many years ago in 2003 - 2006)
I can think of no worse person in the world than Elon Musk to have access to this kind of AI power.
Click on a musk related video to be annoying, we don't care. Get a job.
I can think of no better person in the world than Elon Musk to have access to this kind of AI power.
@@azureforks. yeah it amazes me how this clown's brain works
You have to go back
What a crazy and astounding achievement to get all that deployed so quickly, even if planning took longer. I'd be geeking out too in a place like that. Such clean network runs too. Although we can meme about "a network cable is unplugged" but there, I'm sure they'd have sensors to know exactly what and where if it did happen.
CEO of the renewable energy leader lined up gas turbines in the parking lot. Let that sink in.
There is not enough renewable power out there yet. Until government lessens regulations on renewables, US will be slow with it. China mounts double the solar US does, despite US being much richer. We live in crazy world where it takes less effort to use gas turbines that burn fossil fuels, than mount environmentally friendly solar panels, despite solar panels being cheaper.
@@DoublethinkNomad imagine if you had a clue about the world. You'd make fewer idiotic comments!
@@Ormusn2o the generators are cheaper than trying to build a system that can handle the worst-case scenario. They aren't going to keep on adding more renewables and more batteries for something that they might need to deal with for a day or two per year. Even if your biggest concern is the carbon footprint the generators would have less of a carbon footprint.
Must not have been a big enough government check to "incentivize" him.
I wish there was more information about the software side of things as well. Which OS are they running? Which filesystem? What software do they use for management? etc.
Good reporting STH
All Nvidia Software (OS, Management, Learning AI). That's why Nvidia got so big. They have the full package. Everything you need, hardware and software, to run such a big AI cluster comes from Nvidia. It's not new; they have been working on this for almost a decade.
Proof that "ai" is a waist of money and resources.
Not really. It can also be a chest or a thigh, it depends on user preference.
Most cloud computing is as well. You have very powerful computers in your phone, laptop and desktop but they are reduced to being a dumb-client for the cloud.
Man it would be so cool hanging out while that place is getting built and geeking out on the tech
When we did the trip before we filmed, it was super cool.
I had to giggle when I saw "MADE IN MEXICO" on the giant pipes
Tesla already imports 20,000+ shipping containers worth of goods into Mexico from China every year. Musk's plan to move much of the Tesla production from the USA to Monterrey, Mexico was put on hold years ago when Trump, as president, threatened to put 100% tariffs on Made-In-Mexico automobiles imported into the USA.
Awesome video. Very informative. Thank you.