Some corrections on Cisco's Cloud Scale ASICs (since that is what I primarily work with, in ACI mode): they do not use an additional ASIC to do VXLAN encap/decap with routing; removing that was one of the main reasons Cisco developed its own silicon. Only the Gen 1 N9K switches used the Broadcom dual-ASIC workaround. Regarding store-and-forward versus cut-through, on the N9300 for example, it depends on the ingress/egress speeds, which makes sense given it's a fabric. If the ingress interface is the same speed as or faster than the egress, it uses cut-through, and if a CRC error is detected it has to "stomp" the CRC by rewriting the frame's FCS, letting downstream ports know the frame has already been forwarded but is errored. If the ingress interface is slower than the egress, it falls back to store-and-forward. I imagine this is the opposite of what you described in relation to speed mismatch: here the buffer is used to smooth out the difference in serialization rate on egress, which is of course different from congestion-related speed mismatch.
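To make that mode selection concrete, here's a minimal sketch in Python of how I read the behavior; the function names and the stomp value are made up for illustration, this is not Cisco's actual implementation.

```python
# Illustrative sketch only; speeds are in Gb/s, names are invented for the example.
def forwarding_mode(ingress_speed: float, egress_speed: float) -> str:
    """Pick the forwarding mode for an ingress/egress port pair."""
    if ingress_speed >= egress_speed:
        # Bits arrive at least as fast as the egress serializes them,
        # so the frame can start going out before it is fully received.
        return "cut-through"
    # Slower ingress feeding a faster egress: buffer the whole frame first
    # so the faster egress line rate is not starved mid-frame.
    return "store-and-forward"

def handle_crc_error(mode: str, frame: bytearray) -> None:
    """In cut-through the frame may already be on the wire when the CRC error
    is detected, so the FCS is 'stomped' to a known-bad value that downstream
    ports recognize; in store-and-forward the frame is simply dropped."""
    if mode == "cut-through":
        frame[-4:] = b"\xde\xad\xbe\xef"  # placeholder for a stomped FCS
    else:
        frame.clear()                      # drop before transmission
```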
As for the different feature sets baked in (again, only focused on Cisco Nexus): microburst detection can be useful in some situations, and AFD/ETRAP/DPP are all unique ways of being fair to bursty traffic as well as to elephant and mice flows. As for the gPB telemetry, I've seen a large uptick in it becoming important, primarily because this data can be used to deliver non-sampled, full flow records directly to a collector with no CPU overhead. Built-in tools such as Nexus Dashboard Insights can use that data beyond troubleshooting these days, for example dependency mapping via connectivity analysis, letting you stitch together security policy based on what's actually happening rather than relying on the limited logging capabilities of these switches (they are not firewalls and are usually constrained by control-plane policing). The main problem with telemetry is the scale of the collector; Cisco's NDI app is limited to 20,000 flows/sec.
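Purely to illustrate that collector-scale point, here's a toy Python sketch of checking an incoming flow-record rate against a 20,000 flows/sec ceiling; the data structure and record source are assumptions for the example, not anything from NDI.

```python
# Toy sketch of the collector-scale concern: count incoming flow records
# against a 20,000 flows/sec ceiling. Everything here is invented for
# illustration; this is not NDI code.
import time
from collections import deque

FLOW_LIMIT_PER_SEC = 20_000
recent = deque()  # arrival timestamps of flow records seen in the last second

def record_arrived(now: float) -> bool:
    """Register one flow record; return True if the collector is over budget
    and would have to drop records or shard across more collector nodes."""
    recent.append(now)
    while recent and now - recent[0] > 1.0:
        recent.popleft()
    return len(recent) > FLOW_LIMIT_PER_SEC

# Example: a burst of 25,000 records inside one second trips the check.
overloaded = any(record_arrived(time.time()) for _ in range(25_000))
```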
Nice talk. I have a question: on a fixed-function bare-metal switch running some kind of Linux OS (ASIC not programmable), is it possible to use the CPU to process packets with headers the ASIC cannot handle in its pipeline, VXLAN for example? I don't care about performance here. If so, how would that work?
You could add a rule that forwards such packets to the internal CPU, which can be programmed to do the necessary processing, and then feed the modified packet back into the packet-processing pipeline, which simply forwards it out the desired output port. See the rough sketch below.
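A rough sketch of that punt path, assuming a Linux control plane where the ASIC has an ACL/trap rule that punts UDP/4789 frames to a CPU-facing netdev; the interface names "cpu0" and "swp1" and the IPv4-only parsing are illustrative assumptions, not any vendor's actual mechanism.

```python
# Software VXLAN decap on the switch CPU (sketch, not production code).
# Assumes punted frames arrive on "cpu0" and the decapsulated inner frame
# should be sent out "swp1". Requires root; Linux AF_PACKET sockets.
import socket
import struct

VXLAN_PORT = 4789
ETH_P_ALL = 0x0003

rx = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))
rx.bind(("cpu0", 0))   # port where punted frames arrive (assumed name)
tx = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
tx.bind(("swp1", 0))   # where to reinject the inner frame (assumed name)

while True:
    frame = rx.recv(65535)
    # Outer headers: 14 B Ethernet + 20 B IPv4 (no options assumed) + 8 B UDP
    if len(frame) < 14 + 20 + 8 + 8:
        continue
    eth_type = struct.unpack("!H", frame[12:14])[0]
    if eth_type != 0x0800:                 # IPv4 only in this sketch
        continue
    ip_proto = frame[23]                   # protocol field of the outer IPv4 header
    dst_port = struct.unpack("!H", frame[36:38])[0]
    if ip_proto != 17 or dst_port != VXLAN_PORT:
        continue
    # Strip outer Ethernet/IP/UDP plus the 8-byte VXLAN header,
    # leaving the original inner Ethernet frame.
    inner = frame[14 + 20 + 8 + 8:]
    tx.send(inner)
```

How the processed packet gets re-injected so the ASIC applies its normal egress pipeline depends on the specific SDK or driver; this sketch just sends the inner frame straight out another front-panel port.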
One of my favorite Packet Pushers presentations thus far. Well done.
I keep re-watching this video because it's full of info that every network engineer needs to understand; it clears up a lot.
Great presentation! Thank you!
Hello, can we get the PPT for this presentation too?
Brilliant presentation
Very informative talk. 👍
Great job
Nice summary, good job.
(Minor typo: StrataXGS, not StrataSGX)
ASICs, not ASCIs