Can't wait to install this in my home for $700 in 2034
Nah, we’ll still be stuck at 1 gigabit.
I feel like the switch will probably be around that price point... But the cables won't be much cheaper. 😅
@@headlibrarian1996 I guess you are both wrong - unless mitch has a really huge network with devices that permanently need their fast connections to random devices OR librarian refers to the standard hardware on the motherboard (that might be enough for most "casual users" for quite a while).
But I like the idea of getting a "direct" fast connection to friends.
(As long as I ignore the permits required, construction/... destroying cables quite often, ... and the possibility that a friend might move - especially if that friend helped with the connection to another friend who is out of range.)
No, I'm not planning to do that. *minimizes table with distances in between family and friends*
And pay $700 a month for running this 2kw switch? 😢
@@headlibrarian1996 I don't think we're gonna be stuck at 1, it's gonna be at least 10 imo, mostly because of RJ45 and Cat6.
2.5G has already dropped to a manageable price at this point, so anyone looking to wire a new ethernet network should at least take a look at 2.5G prices, or at least wire the network with Cat6. That way when you upgrade your speed, you just need to swap out the switch.
Suddenly my 100Gb switch feels inadequate.
Let me know if you ever need some footage
"always has been". (moon meme)
Hey, if you want to send your old inadequate switch to me. I’ll take it, no questions 😂
And here I was thinking 25 Gbps was pretty hot.
@@JeffGeerling I'm happy with 1Gbit.
800Gbit/s is pretty fast. But nothing is faster than Patrick when he speaks about crazy fast technology 😂
You can ask my wife, I slow down for videos.
@@ServeTheHomeVideo If your son takes after you, he'll finish growing to 21 in half the time 😁
Listening to STH videos at 1.5x is a great way to sharpen your ears!
Fact: I actually came to check the player's settings, because I thought I must have inadvertently set the playback speed to 1.25x at some point. I had not...
@@galician-tolo8149 only Chuck Norris is faster ... perhaps...
This thing is a monster! Being able to break out into 512 x 100GbE ports is mind blowing!
Yea, or to have 256x 100GbE and 128x 200GbE for uplinks as an example.
@@ServeTheHomeVideo how about making a 51,200-port gigabit switch for your home network or LAN party?
@@retarepo It's so insanely fast that you could plug 50 GeForce 4090s in there... no, wait, its PCIe can't transfer 1TB/s, only its memory stack can. You could probably switch hundreds of GPUs.
yes, and 8,000 x 100MbE ports, the best type of port
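A quick back-of-envelope on the breakout math in this thread. The split options (8x100G or 4x200G per 800G cage) are the ones mentioned above; whether a given platform actually supports a particular mix of optics and cabling is a separate question, so treat this as illustrative arithmetic only.

```python
# Back-of-envelope breakout math for a 64 x 800GbE switch (illustrative only).
TOTAL_PORTS = 64
PORT_SPEED_G = 800  # per physical OSFP cage

def breakout(plan):
    """plan maps a logical-port speed in Gbps -> how many physical 800G cages run that split."""
    assert sum(plan.values()) <= TOTAL_PORTS, "plan uses more cages than the switch has"
    logical = {speed: n * (PORT_SPEED_G // speed) for speed, n in plan.items()}
    total_tbps = sum(speed * count for speed, count in logical.items()) / 1000
    return logical, total_tbps

# Every cage split 8 ways -> 512 x 100GbE, still 51.2 Tbps of capacity.
print(breakout({100: 64}))           # ({100: 512}, 51.2)

# The mixed example from the reply above: 32 cages as 8x100G, 32 cages as 4x200G.
print(breakout({100: 32, 200: 32}))  # ({100: 256, 200: 128}, 51.2)
```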
Looking at the thickness of the board at 09:03, and looking at the vast number of signals the single chip needs just for the front ports, this has to be one of the most complex PCBs ever made (if not the most). That's an incredible amount of high speed signal traces that need to be finely adjusted, one by one, just to have good timing constraints and good impedance matching between the IC and the transceivers.
Yea! It is super cool routing that much high speed signaling
@@ServeTheHomeVideo Kinda interested to see the UNDERSIDE of that board. I saw they blocked it off with metal on the backside, so no airflow down there, which is kinda odd since all the optical ports had fins on them. But if only the top row gets air, how hot do the other ones get then? And what is on the backside? Back with a screwdriver!
@@LiLBitsDK It's so thick and has so many metal vias that it will heatsink itself through its connections to cooled devices, I bet.
9:25
HOW MANY LAYERS IS THAT PCB!? That's like HALF A RJ45 port height in PCB depth. I've never seen one that thick before.
That's what she said...
In 2019 we started talking about PCIe Gen4 signaling in servers being a challenge leading to cables instead of PCB risers. We see that more now. PCIe Gen6 and Gen7 will get worse. The switch guys have to deal with crazy signaling much earlier.
My guess is easily 36 layers, but possibly 10 more. Very thick PCB, but also special materials and processes to control impedance and thin trace tolerances, and for sure also things like buried vias and blind vias. Pretty high tech.
@@movax20h Maybe you need the z height to get a bunch of controlled impedance traces on top of each other, rather than just needing a crazy number of layers, as I'm not sure why you'd need that if all the buses are fanned out intelligently.
1.8kW of power planes is no joke either
It's crazy that this 2U switch can replace 16U of 100Gb switches. Hate to imagine what 8 cables coming out of each port looks like though.
More than that if you want those 16x 32-port switches to be connected at a reasonable speed
I would imagine that anyone sane will be using breakout MTP cables into patch panels.
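Rough sketch of why the reply above says "more than that": if you built the same 512 ports out of generic 1U, 32-port 100GbE boxes and still wanted non-blocking any-to-any connectivity, half of every leaf's ports become uplinks and you need spines on top. The 32-port/1U leaf is an assumption for the arithmetic, not a specific product.

```python
# Rack-space comparison behind the "replace 16U of 100Gb switches" comment.
# Assumes generic 1U, 32-port 100GbE switches; numbers are illustrative.

server_ports_needed = 512
leaf_ports = 32

# Naive count: just enough boxes to expose 512 ports (no interconnect between them).
naive_leaves = server_ports_needed // leaf_ports          # 16 boxes -> 16U

# Non-blocking leaf/spine: half of each leaf's ports become uplinks,
# so each leaf only offers 16 server-facing ports, and you still need spines.
nb_leaves = server_ports_needed // (leaf_ports // 2)      # 32 leaf boxes
nb_spines = nb_leaves * (leaf_ports // 2) // leaf_ports   # 16 spine boxes of the same size

print(naive_leaves, nb_leaves + nb_spines)  # 16 vs 48 rack units of 1U switches
```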
This sounds like a good upgrade for dialup BBS
Still remember my first connection on my 8-bit C64 using an old 300 baud dialup modem. I was able to connect to a color BBS, but it was a painfully slow load :p
When I trained in the USAF we used the original acoustic modems, I think those were like 50-60 baud?
You placed the phone's handset onto the couplers and it communicated that way, still a dialup type connection, but there you go.
do yourself a favour and wait for the next generation
I thought I was living the life when I got a 9600 baud modem.
@@headlibrarian1996 You were - Trellis coding - magic back then.
I used to brag about my home lab with 100g switching.... I am so glad that you are able to bring us amazing content like this it pleases the hardware nerd that I am greatly
Our pleasure!
next week a 4-port 800Gbit switch from AliExpress :D LOL
9:07 Dammn, that is one thicc PCB!
Very. 51.2Tbps!
I need this for my homelab. It’s non-negotiable. Instagram won’t work without it.
Perfect timing, been looking for a new switch for the pool house.
Marvell is coming back with a vengeance with this one. The density in this thing is insane
I appreciate your detailed look at high-end networking equipment; it's something we network engineers don't often see. For future content, could we delve into the more technical aspects that are critical for those of us in the industry? Some things I wanted to see mentioned, specifically as a networking professional in the HPC space:
What type of ASICs and RPs are in a piece of equipment along with their L2/L3 forwarding rates for determining subscription. PPS capability would be helpful too.
You went into radix, which is very helpful for considerations about network topology like fat tree, and I do appreciate that you briefly went over spine/leaf in some diagrams.
A bit about the CAM/TCAM would be helpful for understanding RIB and FIB sizing and ACL capability.
Covering the different redundancy modes the switches support would be helpful as well: whether they handle active/active (MLAG) or active/passive (stacking) scenarios would tell us a lot about their control plane function from a redundancy perspective.
In line with that I'd like to see some exploration into how L2/L3 multipathing is implemented.
I'd also like to see some breakdown of what is meant by 'large buffers' and the specific quality of service methodologies implemented (inbound shaping, colored policing, DSCP tagging support, WRED or ECN); a generic sketch of the WRED/ECN idea is below.
I'd also like to see some mention of RDMA/RoCE support for HPC-centric switching platforms, because it's very important.
Maybe too detail oriented... But network engineers don't typically get to read anything but whitepapers, datasheets, and slide decks, so the format is very captivating.
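On the WRED/ECN point, here is a generic sketch of the textbook marking behaviour. It is not how Teralynx (or any specific ASIC) implements it, just the idea behind "mark or drop early, with a probability that grows with average queue depth between a minimum and maximum threshold".

```python
import random

def wred_mark(avg_queue_depth, min_th, max_th, max_prob):
    """Return True if this packet should be ECN-marked (or dropped, for non-ECT traffic)."""
    if avg_queue_depth <= min_th:
        return False                 # queue is shallow: never mark
    if avg_queue_depth >= max_th:
        return True                  # queue is deep: always mark/drop
    # Linear ramp of marking probability between the two thresholds.
    p = max_prob * (avg_queue_depth - min_th) / (max_th - min_th)
    return random.random() < p

# Example: 100 packets arriving while the average queue depth sits at 70 (thresholds 20/100).
marks = sum(wred_mark(70, min_th=20, max_th=100, max_prob=0.1) for _ in range(100))
print(f"marked ~{marks} of 100 packets")
```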
Yeah.. I was really wondering about whether the ASIC would be capable of line-rate routing for the underlays when implementing overlay fabrics in more complex setups.
I'd have to assume that something in this class would do RDMA, MLAG, and VXLAN at a minimum.
Good feedback. We are still learning and trying to get the right level of information
@@ServeTheHomeVideo Packing all that into the video would probably hurt viewership. I'd put it on your site and mention that in the video.
I do not know about Marvell, but Juniper sometimes releases technical details about their new products to the public. For example, I highly recommend reading the MX304 deep-dive (their newest "small" high performance router):
community.juniper.net/blogs/reema-ray/2023/03/28/mx304-deepdive
And here's me trying to find a 25/40Gb switch that I can actually use at home without my apartment sounding like a wind tunnel. Seriously though, it's crazy how quickly we went from 40GbE to 800. It feels like installing a 40GbE switch for the first time was a recent thing; videos like this make me realize that was 6-7 years ago.
I work in high performance computing and yeah, high end switches are insane. I don't work with the hardware itself, but I do work on the software side of it and make use of these sorts of speeds in parallel with many terabits of network bisection bandwidth. What's crazy is these networks can even be a big bottleneck when you put in too many GPUs per node.
The RJ45 cables connected for management look very cute.
My homelab needs this switch! I need it to watch videos on my Jellyfin.
480i never felt so quick
Where's the affiliate link to buy one? 😂
I wish!
And here I am, sitting at home, with my piddly 36-port 100 Gbps Infiniband switch, that's sitting in my basement.
There is a war brewing between IB and Ethernet! Just be thankful you do not have an OPA switch
@@ServeTheHomeVideo
Agreed.
Oh, I know (about the war between Ethernet and IB).
My Mellanox ConnectX-4 dual port cards can do either PHY LINK_TYPE (see the sketch after this comment).
Running my 100 Gbps IB cards as ETH ports, I can use them as a Linux Ethernet bridge in Proxmox, but VM-to-VM traffic between nodes maxes out at around 23 Gbps, which is SIGNIFICANTLY lower than what the port can do (about 96.9 Gbps out of a possible 100 Gbps).
On the IB side, my applications have topped out around 12 GB/s (96 Gbps), and ib_send_bw tests show a possible 97-ish Gbps out of a possible 100 Gbps.
The IB side doesn't support bridge networking like the ETH side does, so I tried to enable SR-IOV on the NICs/ports, but the OpenSM subnet manager for IB from the Debian package doesn't support virtual functions/virtualisation at the SM level (that requires RHEL and/or its derivatives), which then causes configuration issues for my Proxmox server. I'd have to do this very funky setup where I pass one of the ports to a CentOS VM, start the RHEL opensm, and only then try to enable virtualisation on my switch so that the Proxmox host's VFs *MIGHT* work.
It gets REALLY complicated, REALLY fast.
There are a whole host of issues that arise with a config like that, so I stopped trying/testing it for now, and am keeping it as just single point-to-point connections via the switch (for now).
(Don't really *need* the VFs to work because I'm using virtiofs for VM-host communications, but it might be useful if I want to try and spin up a cluster later so that I could start the VM/container with the HPC app that I want to run, rather than physically swapping out hard drives or multi-booting via the node's BIOS.)
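For anyone wondering how the "either PHY LINK_TYPE" switch is usually done: on ConnectX cards the port personality is normally flipped with mlxconfig from the NVIDIA/Mellanox firmware tools, then the card is reset or the host rebooted. A minimal sketch, with a hypothetical device path (check `mst status` on your own box for the real one):

```python
import subprocess

# Sketch of flipping a ConnectX port between InfiniBand and Ethernet with mlxconfig
# (part of the Mellanox/NVIDIA Firmware Tools). The device path below is hypothetical;
# a firmware reset or reboot is needed for the change to take effect.

DEV = "/dev/mst/mt4115_pciconf0"  # hypothetical ConnectX-4 device path
IB, ETH = "1", "2"                # LINK_TYPE values: 1 = InfiniBand, 2 = Ethernet

def set_link_type(port, value):
    subprocess.run(
        ["mlxconfig", "-y", "-d", DEV, "set", f"LINK_TYPE_P{port}={value}"],
        check=True,
    )

set_link_type(1, ETH)  # run port 1 as Ethernet
set_link_type(2, IB)   # keep port 2 as InfiniBand
```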
Patrick, I am curious if the switch handles mirror/sniff ports at full 800Gbps, or faster (multiple ports, LACP, etc)? We have been working on 100/400/800 capture to disk solutions, and would be curious if this switch would be able to provide 800+ Gbps of traffic to feed our server. Thanks!
OSFP is still kicking? I thought it got killed by QSFP-DD. It is definitely a better designed standard, especially concerning powering and cooling modules, but the backwards compatibility seemed to draw too many to QSFP-DD.
I agree, though choosing OSFP as an acronym was not the greatest decision...
I mean, under 3000W power consumption is not absurd. All the Cisco 6500-E chassis we ever purchased came with minimum dual 4500W PSUs. But bottom line, great to see competition against Broadcom and their almost monopoly in this segment.
Linus and Jake: "Upgrade time!"
Not kidding, they get switches from me. I sent a big one to Jake last year.
My 10Gbit Network seems like Peanuts against this 😂
But your 10G network is training for working on this. 2025 will be the big 800GbE year and then 2027 will be the 1.6T generation
Thinking of getting one of these for my pi cluster.
13:08 Nostalgia! That's what my bedroom sounded like back when I was overclocking in the AthlonXP days!
Ha!
Insane switch. Time to ServeTheEnterprise channel?
The non-obvious elephant in the room.
The CFO: "What if this switch has some crazy CPU bug like we saw with M-series silicon or Meltdown?"
Me: "Hold my beer."
Check our coverage of the AVR54 bug on the main site
DAMN!! That's a beast. Patrick, you are rocking this stuff!!
Pretty nice. Not going to lie. Two of our cores right now only give us 288x 100g ports. Waiting on upgrading some of our routers to 800g shortly
I reported the bots, holy hell. Love the uber SFP monster of a switch.
OSFP modules are quite big.
Nice. I saw their products and they don't get enough exposure compared to Broadcom. I have a small 100Gb switch using a Marvell chip and it works wonderfully: stable, low power, really nice.
In fact I am surprised the ASIC die is so small, I was expecting a way bigger die.
I imagine one of those OSFP modules already costs 10K.
Probably more than that for a brand name manufacturer. Back in 2017, one of my customers building a high-speed campus network (100GbE switching & routing) was already looking at an easy US$13k per 100GbE transceiver. For a project at US$2 mil total, the transceivers alone came in at around US$1.1 mil.
Ye... you will probably be able to get the switch for less than a thousand bucks in 2034, but the cables and transceivers though... 😅
@@BigBenAdv And why would you need 1000km OSFPs for a campus? lol. If you are talking about normal 800G OSFPs with like a 500m-10km range, then the cheapest are around $1k and the more expensive ones around $3k on the FS website.
Brand optics are more expensive, like by a factor of 10. But you can already get third-party modules for between 2,000 and 4,000 euros (e.g. FlexOptix), depending on SM/MM and power budget.
Marvell ain’t messing around 😳
Not at all
This is astounding.
Very
I am still on 1 gbit ;)
Maybe I should consider upgrading to 2.5?
Would love to know how much one would cost.
With the advances in manufacturing process nodes, power efficiency, and switch chip IPC that we're seeing at this extreme top end, do you see anything coming down the pike with regards to 1U *fanless* all-SFP+ or all-10GbE RJ45 (24-52 port) switches, or are the heat dissipation requirements still too high for that switch segment?
Here is a different way to think of it. If you use a newer and more expensive process node then costs go up. That is tough in a market 1/80th the speed of the current top-end where people are paying for that kind of advancement. A big challenge today is that 10GbE is so slow that there are not the investment justifications behind spinning those lower end chips. That can seem strange but it is not just about what should be possible, but also what people are willing to spend R&D on. 25GbE will be the next challenge. With PCIe Gen6 x1 you get 50Gbps so we will see investments shift even in 25GbE
Yeah, and in some companies people must fight with accounting to be able to buy a 10Gb NIC.
Nice video.
I hope this helps. 10GbE in an era of 800GbE is getting close to 100M in the era of 10GbE.
you're looking a little fitter than usual, bro. Good on you !
Not really. Have to get back to pre-COVID shape. Hoping the arm heals and the baby sleeps.
How are you supposed to get those internal ports cabled and reachable externally, exactly???
This is exactly what my apartment homelab needs
But can I link aggregate that into one connection?
would be smarter to only use HALF the ports into 1 connection :D else it is just a dead end
You mention AI a lot, but IMO the best use case for these is as storage and compute resource backbones in a hypervisor cluster. Low latency disk access to a large SSD JBOD or ZFS pool rack 🤔 yes plz.
1MW saved is really more than a megawatt, considering that you also save the energy needed to remove that much heat 🤔
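A worked version of that point, with an assumed (not measured) facility PUE of 1.4:

```python
# Every watt of IT load removed also removes the cooling/distribution overhead that
# supported it. Facility overhead is usually expressed as PUE; 1.4 is an assumed,
# fairly typical value, not one taken from the video.

it_savings_mw = 1.0
pue = 1.4  # assumed facility PUE

facility_savings_mw = it_savings_mw * pue
print(f"~{facility_savings_mw:.1f} MW saved at the facility level")  # ~1.4 MW
```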
Those fans, that heatsink! Just for the one chip!
What certifications/degrees do these manufacturers/designers look for?
Ah, yes. Just perfect for home -defense- labbing.
How many first borns and right arms does this thing cost?
Feeding my first born right now. I would not trade him for a switch
Audio is all over the place in this video 🤯
Holy sh*t how thicc is that PCB?? This looks like a cutting board 🔥.
Very
5:31 This thing is amazing, it's like 50 GeForce 4090s transferring things to memory.
Yea. Wild
So there's just no way to directly connect a server to this running 800GbE?
In a few months when we get PCIe Gen6 CPUs you can do an 800GbE port off of a Gen6 x16 slot. These 51.2T switches are really going to get deployed en masse in 2025 as well. Just showing this off before it becomes more common.
Maybe next year. But for now people will be using these with 256x200G and 128x400G connections to servers or directly to accelerators (bypassing CPU, memory and PCIe), mostly AI stuff, and maybe some FPGAs sometimes. There is no info on any product or system supporting PCIe 6 yet, but I do expect Nvidia might be the first to deliver something capable of 800GbE in their systems. After that maybe IBM and then Intel and AMD. As far as we are aware, AMD Zen 5 on desktop and server will still be PCIe 5, but IBM Power 11 is most likely to support PCIe 6.
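Back-of-envelope on why a single 800GbE port wants a Gen6 x16 slot. The per-lane rates are the PCIe spec numbers; the usable-efficiency factors are rough approximations of encoding/FLIT overhead, so treat the outputs as ballpark figures.

```python
# Approximate usable one-direction bandwidth of a PCIe slot, per generation.
GT_PER_LANE = {3: 8, 4: 16, 5: 32, 6: 64}                    # GT/s per lane (spec)
EFFICIENCY  = {3: 128/130, 4: 128/130, 5: 128/130, 6: 0.95}  # approximate overhead factor

def slot_gbps(gen, lanes=16):
    """Rough usable bandwidth of a PCIe slot in Gbps (one direction)."""
    return GT_PER_LANE[gen] * EFFICIENCY[gen] * lanes

for gen in (4, 5, 6):
    print(f"Gen{gen} x16 ~ {slot_gbps(gen):.0f} Gbps usable")
# Gen4 x16 ~ 252 Gbps, Gen5 x16 ~ 504 Gbps, Gen6 x16 ~ 973 Gbps
# -> only a Gen6 x16 slot clears an 800GbE port with headroom.
```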
Hey Patrick, is this switch able to do logging of packet metadata if it was deployed in an ISP scenario where it's required to log metadata for a year?
came here for 800GbE, stayed for a 2U switch.
The cable management will be nuts!
9:12 How thick is that PCB!?
Very
Thanks for this! Just a couple of questions:
- What are those PSUs rated to?
- That central processor, is that an FPGA or just an ASIC? I mean - you'll have to produce rather a few of these to make up the cost for an ASIC and if you need an upgrade, it'll be a PITA
Well over 2kW. You can see ratings and voltages on the main site where there is a label pic. It is a 5nm ASIC. Projections are for tens of millions of 800G ports installed by 2028
But can it run Crysis?
Sure. COM express modules and PCIe
This is a switch. You might have more luck with TOTK.
Great info. Thanks for sharing.
Thanks for watching!
That switch uses more power than my whole house lol
Finally a new switch to replace my Netgerar GS305.
On the next LTT: "Installing 800Gb between my room and the shed"
I had 64 ports of 400GbE in my Austin house :-)
Would this be good for home use with gigabit fiber?
Not kidding, I had a 64-port 400GbE switch running in my 1596 fiber home in Texas.
Heh I got to see Cisco's Plans for 1600GbE a few months back, So Crazy.
Yea. 1.6TbE is 2027
this is servethehome, this is a channel meant for homes
this is a *little* overkill i think
Not sure where you got that impression. We have been doing big switches for many years and 8 GPU servers since 2016-2017
Nothing wrong with looking at technology... if you haven't noticed, such stuff comes in at the big corps first and then finds its way down to us "simpletons" and homelabbers etc.
So... where do I get one for my homelab?
This opens up so many possibilities. Should be fast enough to hash model loads…… unlimited vram?
That is a big reason the industry is moving to 1.6TbE in 2027
@@ServeTheHomeVideo yeah. I can’t wait. I guess then that frees up the controls in place with vram limits / pci lanes
Imagine dropping a screwdriver inside while disassembling this switch 😂
And look at that vapor chamber on the radiator - it's a massive one, even RTX 4090 would be proud.
Sooo, are these TAA compliant? Because I know a few places that may be in the market for something like this...
This switch is perfect for my 1gbps home network
Do you have a chance to compare it to other 800G network switches like Mellanox/NVIDIA Quantum X800 Infiniband or Spectrum 4 Ethernet or Broadcom Tomahawk5 Ethernet based chipsets?
Hopefully we will get to do the Broadcom and NVIDIA ones too
Cooling this would be a challenge
Hi, can you tell me if the switch can do multicast, and if so, is multicast latency the same as unicast latency? Also, when you send a multicast flow into one port and replicate it to 60 ports, is the latency the same for each egress port? And what are the latencies between different speeds, like 800G to 100G or 800G to 400G, in both directions?
Do they ship a QSFP version? Can FEC be disabled?
Used to work at the factory that built the Cisco Catalyst 6500 and CRS-1 back in the day. You can tell this is a testing / qual / development build since it has a lot of 10 pin headers and clock signaling test ports on it that would likely get depopulated on final release.
Everyone is talking about how thick that PCB is, but to be honest, my question is how many balls does that ASIC have? I'm guessing at least 5K, and at least half of them are just for power. 🤣
While I'm sure a lot of that PCB thickness is for signals, another reason is that with that much power and heat on that ASIC, a thin PCB can warp under the strain and cause one of the solder balls to pop open. (It's what was causing the red ring of death on the Xbox 360; I know first hand because my company built those too, and I spoke directly with the engineer who was working on the problem.) Anyway, sometimes you make PCBs extra thick just to make them stiff and more heat tolerant, with large power and ground planes that not only provide shielding for the signal traces, but help spread that heat through the board.
My first “corporate” job was at the old Stratacom Cisco factory off of Blossom Hill in San Jose
This is how you disaggregate memory and aggregate compute clusters. Wow.
I want a 48-port 10G switch; any recommendations from Cisco or Extreme Networks?
They should sleeve and structure the fan wires for a neater, more manageable switch. It is like staging a house: first impressions count. #spaghetti
Awesome video!
I thought I was pretty good at grasping scale, but my brain is PSODing at this one.
I couldn't even afford to keep this thing powered on
Last week: "Yeah, 2.5 GBit/s for all smaller devices [those that can't be upgraded easily] and 10 GBit/s for connections to the big servers, our PCs and other switches is enough."
Yesterday: "This will be such a big upgrade. Let's put in the fibre and replace the switches."
Today: "OK, let's watch a video while installing the new cards into the systems so I can finally test the fastest connections. This will be so cool..."
Big mistake. :D
These are not really edge switches :) Still fun to see how things are changing at the high-end
@@ServeTheHomeVideo I know...
- my hardware maxes out at ~40 GBit/s (best value of a single device)
- the network would max out at ~100 GBit/s (in between switches in case I use multiple switches based on the old connections, ...)
- many smaller connections won't be that fast (only 2.5 Gbit/s - upgrade impossible or could result in more upgrades)
- daily use per user often is in the range below 1 GBit/s (or even less thanks to SSH, ...)
- ...
I didn't randomly pick the parts for the upgrade... :)
But it feels slow* anyway... so thank you for that... ;)
(* no, it's actually quite nice with 2.5 - 10x the speed I could get before [per device] while the connections don't slow down each other as much as before)
---
And I'll ask if we can upgrade the cluster at work... I guess there are ~200 nodes (if not upgraded already) on their own network that uses quite a few switches.
So thanks for the information! (Really useful!!)
I'm surprised that with this power usage in a 2U form factor it's not water-cooled in any way.
They can be, but most of the power is being used by the up to 28W OSFP modules. The 500W TDP switch chip is not too bad to cool. We will see 500W server CPUs air cooled this year
@@ServeTheHomeVideo it's just that the two aspects I've come across are:
1. Spinning air fans can take up to 30% of power usage
2. It's easier to collect heat with water block from high power chipsets
But then again I can imagine the airflow is more important for qsfps in this case
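A rough power budget using the figures from the reply above (up to 28W per OSFP module, ~500W switch ASIC). The fan/management overhead is an assumed number, just to show why the nameplate lands "well over 2kW" even though the ASIC itself is modest:

```python
# Rough power budget for a 64-port 800G box with fully loaded optics.
OSFP_PORTS = 64
OSFP_W     = 28    # worst-case per module, per the reply above
ASIC_W     = 500   # switch chip TDP, per the reply above
OVERHEAD_W = 300   # assumed: fans, management CPU, VRM losses, etc.

optics_w = OSFP_PORTS * OSFP_W          # 1792 W just for the optics
total_w  = optics_w + ASIC_W + OVERHEAD_W

print(f"optics: {optics_w} W, total: ~{total_w} W")  # optics: 1792 W, total: ~2592 W
```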
This is the switch I need for my home :p
how much is this? $20,000 ?
Switch pricing is very strange. For example, when we saw early 32-port 100GbE switches Arista might sell for $20K, Facebook was buying the Wedge100's for like $2500 each. Do not forget the optics though. 800G optics are not cheap.
If there is 500W needed for 512x 100G ports... that's insane efficiency. Please integrate the management CPU and make some passive home switches based on this technology: 16x 100G, splittable to 64x 25G, in the 16W range! On another note... nobody mentioned what such a TL10 switch costs? :D
Why is it so difficult to make faster switches? Is it more silicon bound?
Interesting that you have to deal with wires when swapping fans.
Marvell chips aren't known for their features, but if it's marketed towards AI then they will probably use as few features as possible in favor of raw speed.
Just to be clear, there is the lower-end line with chips like the Prestera, and then the Innovium acquisition's Teralynx line that we are showing here.
Whenever I see switches like this I do not understand how some ISPs cheap out on peering when solutions like this exist.
Why aren't our desktop CPUs packaged like that? Direct die with a metal frame around it to support the cooler.
Because it greatly increases the torque tolerances. In the V100 era HPE cracked many GPUs because their pre-screened thermal paste was too thick. On consumer devices, you need to account for people installing their first CPU without following a guide and still have them succeed, or else it will cause problems.
Next step: Just use PCIe to connect systems, as it was initially intended to.
Teardown of a 402 Tbs switch when?
Not sure. Hoping a 102.4T in two years
I'm puzzled why you would need a 10GBase-T port inside a 64-port 800GbE switch.
Finally something for my 1 user (me) jellyfin server!
How do we start a petition for Nvidia to let Patrick have a look at what they’re working on??
Just one of those OSFP module uses more power than my mini PC...
Imagine if your mini PC had two PCIe Gen5 x16 slots of I/O and could pump them both hundreds of kilometers.
1000km distance without amplification?? For real? This blows my mind more than the 800G. I thought there was some physical law that limited it to about 100km, apparently not.
Quite the beastie!! No mention of cost?
" If you have to ask, you can't afford it "
OSFP is going to go the way of XFP. QSFP-DD is the future.