UCS Fabric Failover Examined

Fabric failover is a unique feature of the Cisco Unified Computing System that provides a “teaming-like” function in hardware. This function is entirely transparent to the operating system running on the server and does not require any configuration inside the OS. I find this feature quite useful, because it creates resiliency for UCS blades and rack servers without depending on any drivers or teaming configuration in the operating system.

I have often encountered servers that were supposed to be redundantly connected to the network, as they were physically connected to two different switches. However, due to missing or misconfigured teaming, these servers would still lose their connectivity if the primary link failed. Therefore, a feature that offers resiliency against path failures for Ethernet traffic, without any need for teaming configuration inside the operating system, is very interesting to me. This is especially true for bare-metal Windows or Linux servers on UCS blades or rack servers.

In this post I do not intend to cover the basics of fabric failover, as this has already been done excellently by other bloggers. So if you need a quick primer or refresher on this feature, I recommend that you read Brad Hedlund’s classic post “Cisco UCS Fabric Failover: Slam Dunk? or So What?”.

Instead of rehashing the basic principles of fabric failover, I intend to dive a bit deeper into the UCSM GUI, UCSM CLI and NX-OS CLI to examine and illustrate the operation of this feature inside UCS. This serves a dual purpose: gaining more insight into the actual implementation of the fabric failover feature and getting more familiar with some essential UCS screens and commands.

So to start, I created a service profile for a VMware ESXi host that has two separate vNICs, named eth0 and eth1. vNIC eth0 has fabric A as the primary fabric with failover to B and vNIC eth1 has fabric B as the primary fabric with failover to A.

Note: This setup is not typical for an ESXi deployment, but more common for Windows or Linux bare-metal deployments. In a VMware setup, failover and load-balancing are usually configured at the vSwitch level. However, for this specific example I decided to use ESXi because I already had it running in the lab anyway, and it also illustrates how VM MAC addresses are handled by the fabric failover mechanism.
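For reference, the teaming and failover policy that ESXi itself applies to a standard vSwitch can be checked from the ESXi shell. A quick check, assuming a default vSwitch named vSwitch0 (the name is an assumption, adjust it for your setup):

~ # esxcli network vswitch standard policy failover get --vswitch-name=vSwitch0

This lists the load balancing policy and the active and standby adapters, which is the software-level counterpart of the hardware failover discussed in this post.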

The screenshot below shows the vNIC setup for my service profile:

vNICs configured for fabric failover
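For those who prefer the UCSM CLI over the GUI, the same vNIC failover setting can be applied there as well. The snippet below is only a minimal sketch based on the usual UCSM CLI scope/set syntax (verify it against your UCSM release), where fabric a-b means primary fabric A with failover to B:

UCS-60-A# scope org /
UCS-60-A /org # scope service-profile POD60-ESX-1
UCS-60-A /org/service-profile # scope vnic eth0
UCS-60-A /org/service-profile/vnic # set fabric a-b
UCS-60-A /org/service-profile/vnic # exit
UCS-60-A /org/service-profile # scope vnic eth1
UCS-60-A /org/service-profile/vnic # set fabric b-a
UCS-60-A /org/service-profile/vnic # commit-buffer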

In order to analyze the behavior of the fabric failover feature we need to find the vEthernet interfaces that are associated with these vNICs. Using the UCSM GUI we can find them in the “VIF Paths” tab:

VIF Paths

As you can see, vNIC eth0 is associated with two virtual Ethernet interfaces: veth 703 on fabric A and veth 704 on fabric B. Likewise, vNIC eth1 is associated with veth 705 on fabric B and veth 706 on fabric A. This view also illustrates the physical component-to-component path used by these interfaces. The same information can also be obtained from the UCSM CLI using the show service-profile circuit command:

UCS-60-B# show service-profile circuit name POD60-ESX-1 | egrep "eth|Fabric|VIF"
    Fabric ID: A
        VIF        vNIC            Link State  Overall Status Prot State    Prot Role   Admin Pin  Oper Pin   Transport
               703 eth0            Up          Active         Active        Primary     0/0        0/1        Ether
               706 eth1            Up          Active         Passive       Backup      0/0        0/1        Ether
    Fabric ID: B
        VIF        vNIC            Link State  Overall Status Prot State    Prot Role   Admin Pin  Oper Pin   Transport
               704 eth0            Up          Active         Passive       Backup      0/0        0/2        Ether
               705 eth1            Up          Active         Active        Primary     0/0        0/2        Ether

This command also reveals some additional detail about the failover roles of the VIFs. VIF 703 on fabric A is marked with role primary and state active for eth0, while VIF 704 on fabric B is marked with role backup and state passive for that same vNIC. For eth1 we can see that VIF 705 on fabric B is marked as primary/active and VIF 706 on fabric A as backup/passive. Another way to view the same information from the UCSM CLI is to set the scope to the physical adapter and view the vNICs from there:

UCS-60-A# scope adapter 1/1/1
UCS-60-A /chassis/server/adapter # show host-eth-if

Eth Interface:
    ID         Dynamic MAC Address Name       Operability
    ---------- ------------------- ---------- -----------
             1 00:25:B5:60:00:0E   eth0       Operable
             2 00:25:B5:60:00:0F   eth1       Operable
UCS-60-A /chassis/server/adapter # scope host-eth-if 1
UCS-60-A /chassis/server/adapter/host-eth-if # show vif

VIF:
    ID         Fabric ID Transport Tag   Status      Overall Status
    ---------- --------- --------- ----- ----------- --------------
           703 A         Ether         0 Allocated   Active
           704 B         Ether         0 Allocated   Passive
UCS-60-A /chassis/server/adapter/host-eth-if # up
UCS-60-A /chassis/server/adapter # scope host-eth-if 2
UCS-60-A /chassis/server/adapter/host-eth-if # show vif

VIF:
    ID         Fabric ID Transport Tag   Status      Overall Status
    ---------- --------- --------- ----- ----------- --------------
           705 B         Ether         0 Allocated   Active
           706 A         Ether         0 Allocated   Passive

Now let’s move to the NX-OS CLI and see how this same information is represented in the networking components of the UCS Fabric Interconnects. Fabric Interconnect A shows the following:

UCS-60-A# connect nxos a
UCS-60-A(nxos)# show int veth 703
Vethernet703 is up
    Bound Interface is Ethernet1/1/1 
  Hardware: Virtual, address: 000d.ecf3.1140 (bia 000d.ecf3.1140)
  Description: server 1/1, VNIC eth0
  Encapsulation ARPA
  Port mode is trunk
  EtherType is 0x8100 
  Rx
    2729 unicast packets  0 multicast packets  36 broadcast packets
    2765 input packets  695320 bytes
    0 input packet drops
  Tx
    1553 unicast packets  76282 multicast packets  116278 broadcast packets
    194113 output packets  18743038 bytes
    0 flood packets
    0 output packet drops

UCS-60-A(nxos)# show int veth 706
Vethernet706 is up
    Bound Interface is Ethernet1/1/1 
  Hardware: Virtual, address: 000d.ecf3.1140 (bia 000d.ecf3.1140)
  Description: server 1/1, VNIC eth1
  Encapsulation ARPA
  Port mode is trunk
  EtherType is 0x8100 
  Rx
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 input packets  0 bytes
    0 input packet drops
  Tx
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 output packets  0 bytes
    0 flood packets
    0 output packet drops
UCS-60-A(nxos)# show mac address-table int veth 703
Legend: 
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports          
---------+-----------------+--------+---------+------+----+------------------
* 198      0025.b560.000e    static    0          F    F  Veth703
* 199      0050.56b3.3173    dynamic   10         F    F  Veth703
* 932      0050.567e.66fb    dynamic   40         F    F  Veth703
UCS-60-A(nxos)# show mac address-table int veth 706

The output confirms that veth 703 is acting as the active interface for vNIC eth0. The statistics show that this interface has been forwarding packets, and in the MAC address table we can see both static (vNIC MAC) and dynamic (VM or VMK MAC) entries associated with the interface. Interface veth 706 is passive for vNIC eth1, which is confirmed by the fact that all packet counters are zero and no MAC addresses are associated with the interface.
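Since the failover mechanism keys off the pinning of the vEthernet interfaces to the border (uplink) ports, it can also be useful to look at the pinning directly. From the same NX-OS session this should be possible with (output omitted here):

UCS-60-A(nxos)# show pinning server-interfaces

This lists each vEthernet interface together with the border interface or port-channel it is currently pinned to.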

To complement this information let’s obtain the same output from FI-B:

UCS-60-B(nxos)# show int veth 704, veth 705
Vethernet704 is up
    Bound Interface is Ethernet1/1/1 
  Hardware: Virtual, address: 000d.ecf0.7a00 (bia 000d.ecf0.7a00)
  Description: server 1/1, VNIC eth0
  Encapsulation ARPA
  Port mode is trunk
  EtherType is 0x8100 
  Rx
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 input packets  0 bytes
    0 input packet drops
  Tx
    0 unicast packets  0 multicast packets  0 broadcast packets
    0 output packets  0 bytes
    0 flood packets
    0 output packet drops

Vethernet705 is up
    Bound Interface is Ethernet1/1/1 
  Hardware: Virtual, address: 000d.ecf0.7a00 (bia 000d.ecf0.7a00)
  Description: server 1/1, VNIC eth1
  Encapsulation ARPA
  Port mode is trunk
  EtherType is 0x8100 
  Rx
    34 unicast packets  53 multicast packets  72 broadcast packets
    159 input packets  24105 bytes
    0 input packet drops
  Tx
    33 unicast packets  89330 multicast packets  138303 broadcast packets
    227666 output packets  21926655 bytes
    0 flood packets
    0 output packet drops

UCS-60-B(nxos)# show mac address-table interface veth 704
UCS-60-B(nxos)# show mac address-table interface veth 705
Legend: 
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports          
---------+-----------------+--------+---------+------+----+------------------
* 198      0025.b560.000f    static    0          F    F  Veth705
* 199      0050.569a.67a3    dynamic   100        F    F  Veth705
* 199      0050.56b3.6001    dynamic   780        F    F  Veth705

This shows similar results: Interface veth 704 is passive for vNIC eth0 and not forwarding any traffic. Interface veth 705 on the other hand is active for eth1, as can be seen from the traffic statistics and MAC address table.

Now that we have examined the baseline setup in UCSM and NX-OS, it is time to let the fabric failover feature kick in and see what happens. To force a failover, I disable the uplinks to fabric interconnect A on the upstream switches. This will leave the vEthernet interfaces on FI-A without an uplink to be pinned to, and consequently they will go down. This should trigger the fabric failover mechanism for vNIC eth0 and make interface veth 704 on fabric B the active interface for vNIC eth0. It should also cause the MAC addresses that were learned on fabric A for eth0 to move to fabric B.
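On the upstream switches this simply comes down to shutting the ports facing FI-A, along these lines (the switch name and interface numbers below are purely illustrative, not the ones used in this lab):

upstream-sw1# configure terminal
upstream-sw1(config)# interface ethernet 1/19-20
upstream-sw1(config-if-range)# shutdown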

So I disable the uplinks to FI-A and then reexamine the situation from NX-OS on FI-A:

UCS-60-A(nxos)# sh int veth 703, veth 706 brief

--------------------------------------------------------------------------------
Vethernet     VLAN   Type Mode   Status  Reason                   Speed
--------------------------------------------------------------------------------
Veth703       198    eth  trunk  down    ENM Source Pin Fail        auto
Veth706       198    eth  trunk  down    ENM Source Pin Fail        auto

As expected, the vEthernet interfaces have gone down for lack of an uplink to be pinned to.
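The missing uplinks can also be confirmed from the border interface side; a quick check on FI-A (output omitted here) would be:

UCS-60-A(nxos)# show pinning border-interfaces

With all uplinks to FI-A down, there is no active border interface left for these vEthernet interfaces to be pinned to. Now let’s have a look on FI-B: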

UCS-60-B(nxos)# show int veth 704
Vethernet704 is up
    Bound Interface is Ethernet1/1/1 
  Hardware: Virtual, address: 000d.ecf0.7a00 (bia 000d.ecf0.7a00)
  Description: server 1/1, VNIC eth0
  Encapsulation ARPA
  Port mode is trunk
  EtherType is 0x8100 
  Rx
    78 unicast packets  0 multicast packets  16 broadcast packets
    94 input packets  15359 bytes
    0 input packet drops
  Tx
    48 unicast packets  4282 multicast packets  8613 broadcast packets
    12943 output packets  1255967 bytes
    0 flood packets
    0 output packet drops

UCS-60-B(nxos)# show mac address-table int veth 704
Legend: 
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports          
---------+-----------------+--------+---------+------+----+------------------
* 198      0025.b560.000e    static    0          F    F  Veth704
* 199      0050.56b3.3173    dynamic   110        F    F  Veth704
* 932      0050.567e.66fb    dynamic   460        F    F  Veth704

As can be seen from the packet counters, interface veth 704 is now carrying traffic for vNIC eth0 and the MAC addresses have moved to Fabric Interconnect B as well.

Not only have the MAC addresses moved inside UCS from FI-A to FI-B, but UCS has also sent out gratuitous ARP packets for each of these MAC addresses to notify the upstream switches of the move. This mechanism prevents traffic destined for these MAC addresses from being black-holed in the LAN. To show the gARPs being sent through Fabric Interconnect B, I ran ethanalyzer from NX-OS using the following command:

UCS-60-B(nxos)# ethanalyzer local interface inbound-hi display-filter "arp" limit-captured-frames 0

When I shut down the uplinks to FI-A, ethanalyzer captured the following packets:

2012-09-27 14:55:06.465188 00:50:56:7f:68:d8 -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.465210 00:50:56:7f:68:d8 -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.470990 00:25:b5:60:00:0c -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.471884 00:25:b5:60:00:0c -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.472197 00:25:b5:60:00:0c -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.480495 00:50:56:b3:31:73 -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.480518 00:50:56:b3:31:73 -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.480524 00:50:56:7e:66:fb -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.480545 00:50:56:7e:66:fb -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.488295 00:25:b5:60:00:0e -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.488723 00:25:b5:60:00:0e -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)
2012-09-27 14:55:06.488974 00:25:b5:60:00:0e -> ff:ff:ff:ff:ff:ff ARP Gratuitous ARP for 0.0.0.0 (Request)

This clearly shows that gratuitous ARPs are sent for the MAC addresses that were associated with the failed vEthernet.
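As a side note, ethanalyzer accepts Wireshark display filter syntax, so the capture could presumably be narrowed down to gratuitous ARPs only with something like the filter below; I have not verified this particular filter on the Fabric Interconnect, so treat it as an assumption:

UCS-60-B(nxos)# ethanalyzer local interface inbound-hi display-filter "arp.isgratuitous == 1" limit-captured-frames 0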

Now let’s revisit the UCSM CLI commands that we looked at before triggering the failover:

UCS-60-B# show service-profile circuit name POD60-ESX-1 | egrep "eth|Fabric|VIF"
    Fabric ID: A
        VIF        vNIC            Link State  Overall Status Prot State    Prot Role   Admin Pin  Oper Pin   Transport
               703 eth0            Error       Error          No Protection Primary     0/0        0/0        Ether
               706 eth1            Error       Error          No Protection Backup      0/0        0/0        Ether
    Fabric ID: B
        VIF        vNIC            Link State  Overall Status Prot State    Prot Role   Admin Pin  Oper Pin   Transport
               704 eth0            Up          Active         Active        Backup      0/0        0/2        Ether
               705 eth1            Up          Active         Active        Primary     0/0        0/2        Ether
UCS-60-B# scope adapter 1/1/1
UCS-60-B /chassis/server/adapter # show host-eth-if

Eth Interface:
    ID         Dynamic MAC Address Name       Operability
    ---------- ------------------- ---------- -----------
             1 00:25:B5:60:00:0E   eth0       Operable
             2 00:25:B5:60:00:0F   eth1       Operable
UCS-60-B /chassis/server/adapter # scope host-eth-if 1
UCS-60-B /chassis/server/adapter/host-eth-if # show vif

VIF:
    ID         Fabric ID Transport Tag   Status      Overall Status
    ---------- --------- --------- ----- ----------- --------------
           703 A         Ether         0 Allocated   Link Down
           704 B         Ether         0 Allocated   Active

The output here also indicates that vNIC eth0 has shifted its traffic to fabric B.

Finally, let’s see what the situation looks like in the “VIF Paths” tab in the GUI:

VIF paths after the failover

So let’s summarize the behavior that we observed:

When fabric failover is enabled for a vNIC, two vEthernet interfaces are created, one on each fabric. These two VIFs work together as an active/standby failover pair. By default, the vEthernet interface on the primary fabric carries all the traffic, but when the primary fabric fails, the vEthernet on the other fabric assumes the forwarding role. The failover also moves the associated MAC addresses within UCS and updates the MAC address tables of the upstream switches through the gratuitous ARP process.

In addition, this post has shown how to use the UCSM GUI, UCSM CLI, and NX-OS CLI to analyze the frame forwarding inside UCS.

19 thoughts on “UCS Fabric Failover Examined”

  1. Do you know a CLI command or other way to either bring down the active vEthernet interface or fail over to the backup, if you wanted to test it without taking down the FI uplinks or IOM module?

  2. Hi Joe,

    I don’t know of any action in the GUI or CLI that would allow you to shut down a specific vEthernet interface. However, there are several alternative actions that come to mind:

    – If you select a specific NIC on the blade in the equipment tab, there is a “reset connectivity” button. I haven’t tested this, but with a bit of luck this would trigger the failover, at least for a brief period of time.

    – If you use static pinning between the FI and the IOM (instead of a port-channel), then you could disable the link between the FI and the IOM that the blade is pinned to. However, this will also affect other blades that are pinned to that same uplink.

    None of these options does exactly what you want, but maybe one of them works well enough for your scenario. This of course really depends on what you are trying to achieve. I suppose you are looking at testing failover for a single blade on a live UCS system?

    Tom

    P.S. I haven’t had the opportunity to test these options in the lab yet, so you would have to try them out yourself to see if they do what you want.

  3. Hi Tom,

    Nice post, and I have one clarification.
    Veth 705 is active on Fab-B. But from ESXi I have two NICs, one connected to FAB-A and the second to FAB-B.
    From Esxi
    ========
    vmnic 0
    vmnic 1

    Even though veth 705 is active via FAB-B, when I go to networking in ESXi I can see
    vmnic 0 = full
    vmnic 1 = standby

    Why is vmnic 1 showing standby? Please explain.
    I would also appreciate it if you could explain how this technology works for VM-FEX.
    I heard that for VM-FEX we have to use one of the vmnics for VM traffic and the other vmnic for vMotion; in that case, how can I provide redundancy?

    • Hi Abdul,

      It’s a bit hard to answer this question without knowing a bit more about your configuration, but the “full” and “standby” states that you are seeing in ESXi do not map to the UCS active and standby states; they refer to the configuration in vSphere/ESXi. The hardware failover function is performed by UCS at the adapter level and is transparent to the OS (ESXi).

      Normally, I would either provide redundancy at the network adapter hardware layer through fabric failover, or at the software layer (vSwitch, teaming), not both.

      In this lab exercise I have created two vNICs with failover enabled. Each of these vNICs has two associated vEthernet interfaces (so four vEths in total) and is protected at the hardware layer. The two vNICs are presented to ESXi, which may then apply its own failover mechanisms at the vSwitch level. If you have this set up as “active/standby”, that may explain what you’re seeing, although it is hard for me to judge without seeing your actual setup.

      Hope this helps,

      Tom

  4. Hi, many thanks for your reply,

    We used the M81KR mezzanine card (two ports, one connecting to IOM-1 and the second to IOM-2).
    From the fabric interconnect we created two vNICs. In ESXi I can see two vNICs, but one of them is standby.

    vnic 0 and vnic 1 (standby).

    My question is: why is vnic 1 showing standby in ESXi?

    • Hi Abdul,

      As I said, I think this has more to do with your ESXi vSwitch configuration than with UCS. Hardware failover is transparent to the operating system, so it would not be visible at the ESXi level.

      In order to provide a better answer, it would be useful if you could send me a screenshot of what exactly you are seeing, along with screenshots of the vSwitch configuration. You can send them directly to me at tom.lijnse@layerzero.nl and I will have a look to see if I can give you a better explanation.

      Tom

      • Many Thanks,

        I observed this during my UCS study class. So at the moment I don’t have any screenshot to share with you.

        As you said, if we create two vNICs (one for FAB-A and the other for FAB-B, with failover enabled, for the same VLANs), will both of them be active for data traffic? In that case, is there no chance of duplicate MAC learning and MAC flooding on the uplink switch?
        Please explain: if both IO modules are actively forwarding traffic, how is the traffic load balanced?
        If we enable failover on the UCS side, is it still required to create two vNICs for the same VLAN?

        • Hi Abdul,

          If you have two vNICs connected to an ESXi host, the load balancing is determined by the VMware vSwitch, and the default method is to load balance on a per-VM basis. So the first VM would use the first vmnic, the next VM would use the second vmnic, and so on. This means that both fabrics are actively used, but each individual VM MAC address only appears on a single fabric.

          In this scenario you do not really need the hardware failover, because the vSwitch will not only load balance, but also fail over to the other vmnic if a vmnic fails.

          On non-hypervisor servers, such as Windows servers, the hardware failover can function as a replacement for software teaming/bonding in the OS. In this case, I would only create a single vNIC for a VLAN, not two. On the server side, you would only see one adapter, configured with the IP address for the VLAN. To the Windows admin it looks like the server is single-homed. However, the NIC is still protected by UCS, without the need to create an adapter team in Windows.
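          As a rough UCSM CLI sketch (the service profile name below is hypothetical, just to illustrate the idea, and the syntax should be checked against your UCSM release), such a single failover-enabled vNIC could be created like this:

          UCS-A# scope org /
          UCS-A /org # scope service-profile WIN-SRV-1
          UCS-A /org/service-profile # create vnic eth0 fabric a-b
          UCS-A /org/service-profile/vnic # commit-buffer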

          Tom

            • Well explained, and many thanks. I would really appreciate it if you could share some technical notes with screenshots or examples for VM-FEX (normal and universal pass-through), explaining how VMkernel and VM traffic is distributed over the vSwitch and the VM-FEX pass-through switch.

  5. Hi Tom,

    Any specific reason why VMs share the same veth XX? Is that because of the trunk configured on the interfaces?

    Thank you !!

    • Hi Santosh,

      A virtual Ethernet interface in UCS represents the port that the vNIC on a physical server (blade or rack mount) connects to. So the veth interface essentially connects to the ESXi host. This implies that multiple VMs on that host can connect through that same vEthernet interface.

      In this particular case, two vNICs were configured for the host (one on fabric A and one on B). It is then up to the vSwitch on the host to decide how to balance the VMs over these two uplinks.

      I hope this answers your question.

      Tom

  6. Hi,
    I have an issue with a virtual machine not being able to get a DHCP/PXE response when booting. Each VM has two vNICs (one to fabric A and one to fabric B); fabric A is on one FI and fabric B is on the other. When the VM is using vNIC 1 (fabric A) and the PXE server is using vNIC 2 (fabric B), the server does not boot; if they are connected to the same fabric, the server boots. It looks like something is stopping broadcast requests between the two FIs. Does anyone have any clues?
    thanks

    • Hi Justin,

      I am not sure what is going on there.

      The only obvious thing to point out is that there is no direct connectivity between the A and B fabric inside the UCS system, so these DHCP requests have to go from Fabric Interconnect A to Fabric Interconnect B across the upstream LAN switches. Is there any chance that these DHCP packets are dropped by the LAN switches? (VLAN does not exist/VLAN not allowed on trunks/…)

      Kind regards,

      Tom

  7. Great Article.

    I have one doubt. I have created two MAC address ranges, e.g.:

    MAC Address Range 1: to be used on all vNICs attached to FI-A
    MAC Address Range 2: to be used on all vNICs attached to FI-B

    In such a scenario, when my vNIC1 (configured with MAC Range 1) fails over to FI-B, will there be any issue moving this MAC to FI-B?

    • Hi Khurram,

      Sorry for the late reply.
      To answer your question: if a vNIC has an address in MAC range 1, then that address will fail over to the B side when the A fabric fails for that vNIC. A gratuitous ARP sent by fabric interconnect B will notify the upstream network that this MAC address has now moved to FI B. So even though in the general case the range 1 addresses only show up on the A fabric, during a failure condition some of these MAC addresses will be used on the B fabric.

      Hope this helps,

      Tom
