UCS Fabric Failover Examined

Fabric failover is a unique feature of the Cisco Unified Computing System that provides a “teaming-like” function in hardware . This function is entirely transparent to the operating system running on the server and does not require any configuration inside the OS. I think that this feature is quite useful, because it creates resiliency for UCS blades or rack servers, without depending on any drivers or configuration inside the operating system.

I have often encountered servers that were supposed to be redundantly connected to the network, as they were physically connected to two different switches. However, due to missing or misconfigured teaming, these servers would still lose their connectivity if the primary link failed. Therefore, I think that a feature that offers resiliency against path failures for Ethernet traffic without any need for teaming configuration inside the operating system, is very interesting. This is especially true for bare-metal Windows or Linux servers on UCS blades or rack servers.

In this post I do not intend to cover the basics of Fabric Failover, as this has already  been done excellently by other bloggers. So if you need a quick primer or refresher on this feature, then I recommend that you read Brad Hedlund’s classic post “Cisco UCS Fabric Failover: Slam Dunk? or So What?“.

Instead of rehashing the basic principles of fabric failover, I intend to dive a bit deeper into the UCSM GUI, UCSM CLI and NX-OS CLI to examine and illustrate the operation of this feature inside UCS. This serves a dual purpose: Gaining more insight in the actual implementation of the fabric failover feature and getting more familiar with some essential UCS screens and commands.

Continue reading