OTV and LISP on the CSR 1000v

OTV and LISP are two interesting new data center technologies that are worth examining when you are studying for a Cisco Data Center certification, such as CCNP or CCIE Data Center. Unfortunately, not everybody can afford a couple of Nexus 7000s to play with. As an instructor for Fast Lane I regularly have access to Nexus-based labs, but I still thought that it would be nice to have a lab setup of my own to experiment with. Fortunately, there is now a very nice way to get some hands-on experience with these protocols through the Cisco Cloud Services Router (CSR) 1000v, which I blogged about earlier.

The CSR 1000v is based on the same IOS XE code that runs on the ASR 1000, which supports both OTV and LISP. So I decided to try to build a lab to test VM mobility using OTV and LISP in my home lab using a number of CSR 1000v instances.

Note: The CSR runs IOS-XE, not NX-OS, and as a result it uses a different command set to configure OTV and LISP. In that sense, it cannot replace practicing with actual Nexus gear for the purposes of exam preparation. However, it does allow you to examine the underlying structures and mechanisms of the technologies, and it allows you to get an idea of the common configuration elements in an OTV or LISP configuration.

The Basic Lab Setup

I decide to implement the following topology:

LISP and OTV lab diagram

I create two separate VLANs on my vSwitch, which I intend to bridge together using OTV between the routers “dc1-otv” and “dc2-otv”. I create two VMs, one in each VLAN, to which I assign IP addresses from the same IP subnet 192.168.200.0/24. VM “VM-1” gets IP address 192.168.200.1 and “VM-2” gets 192.168.200.2. The routers “dc1-xtr”, “dc2-xtr”, and “branch-xtr” will be configured as LISP xTRs later.

I put the following basic configuration on router dc1-otv:

!
hostname dc1-otv
!
enable secret cisco
!
no ip domain lookup
!
interface GigabitEthernet1
 ip address 10.200.200.1 255.255.255.0
 no shutdown
!         
router ospf 1
 router-id 10.200.200.1
 network 10.200.200.1 0.0.0.0 area 0
!
line vty 0 4
 exec-timeout 0 0
 password cisco
 login    
!

And I put a similar configuration on router dc2-otv:

!
hostname dc2-otv
!
enable secret cisco
!
no ip domain lookup
!
interface GigabitEthernet1
 ip address 10.200.200.2 255.255.255.0
 no shutdown
!
router ospf 1
 router-id 10.200.200.2
 network 10.200.200.2 0.0.0.0 area 0
!
line vty 0 4
 exec-timeout 0 0
 password cisco
 login    
!

Configuring OTV

The next thing I do is prepare the OTV join interface for multicast operation. I enable multicast routing, set the IGMP version to 3, and enable PIM in passive mode on the OTV join interface:

!
ip multicast-routing distributed
!
interface GigabitEthernet1
 ip pim passive
 ip igmp version 3
!

Note: Unlike the Nexus 7000, the CSR requires multicast routing to be enabled in order to enable the IGMP functionality that is required for OTV. On the Nexus 7000 it is not necessary to enable multicast routing and PIM. Simply setting the IGMP version to 3 is sufficient on that platform.

Next I configure the OTV site ID and create the Overlay interface on router dc1-otv with the following parameters:

otv site-identifier 0001.0001.0001
!
interface Overlay1
 otv control-group 239.37.37.37
 otv data-group 232.37.37.0/24
 otv join-interface GigabitEthernet1
 no shutdown

Of course I configure router dc2-otv in a similar manner:

otv site-identifier 0002.0002.0002
!
interface Overlay1
 otv control-group 239.37.37.37
 otv data-group 232.37.37.0/24
 otv join-interface GigabitEthernet1
 no shutdown

I verify the OTV configuration and confirm that the adjacency between the two routers has been established:

dc1-otv#show otv overlay 1
Overlay Interface Overlay1
 VPN name                 : None
 VPN ID                   : 1
 State                    : UP
 AED Capable              : No, site interface not up
 IPv4 control group       : 239.37.37.37
 Mcast data group range(s): 232.37.37.0/24 
 Join interface(s)        : GigabitEthernet1
 Join IPv4 address        : 10.200.200.1
 Tunnel interface(s)      : Tunnel0
 Encapsulation format     : GRE/IPv4
 Site Bridge-Domain       : None
 Capability               : Multicast-reachable
 Is Adjacency Server      : No
 Adj Server Configured    : No
 Prim/Sec Adj Svr(s)      : None

dc1-otv#show otv adjacency
Overlay 1 Adjacency Database
Hostname                       System-ID      Dest Addr       Up Time   State
dc2-otv                        001e.bd03.a200 10.200.200.2    00:00:47  UP

One thing worth noticing is that the OTV devices are not marked as “AED capable” yet. This is because the OTV site VLAN is not configured and operational at this point. The site VLAN configuration is done slightly differently on the CSR compared to the Nexus 7000. The CSR is not a switch, and therefore does not support direct configuration of VLANs. Instead of a site VLAN, a site bridge-domain is configured. The bridge-domain represents the broadcast domain and can be linked to interfaces and VLAN tags using so-called “service instances”. To set up the site VLAN I use the following commands on router dc1-otv:

!
otv site bridge-domain 2001
!
interface GigabitEthernet2
 no shutdown
 service instance 2001 ethernet
  encapsulation dot1q 2001
  bridge-domain 2001

And similarly, I configure the following on router dc2-otv:

!
otv site bridge-domain 2002
!
interface GigabitEthernet2
 no shutdown
 service instance 2002 ethernet
  encapsulation dot1q 2002
  bridge-domain 2002

These commands essentially create a bridged domain on the router, which is then associated with interface GigabitEthernet2 for frames that carry an 802.1Q VLAN tag of 2001 (2002 on dc2-otv). For more information about Ethernet Service Instances refer to Configuring Ethernet Virtual Connections on the Cisco ASR 1000 Series Router in the ASR 1000 configuration guide.

At this point the OTV overlay has become fully operational on both sides:

dc1-otv#show otv overlay 1 
Overlay Interface Overlay1
 VPN name                 : None
 VPN ID                   : 1
 State                    : UP
 AED Capable              : Yes
 IPv4 control group       : 239.37.37.37
 Mcast data group range(s): 232.37.37.0/24 
 Join interface(s)        : GigabitEthernet1
 Join IPv4 address        : 10.200.200.1
 Tunnel interface(s)      : Tunnel0
 Encapsulation format     : GRE/IPv4
 Site Bridge-Domain       : 2001
 Capability               : Multicast-reachable
 Is Adjacency Server      : No
 Adj Server Configured    : No
 Prim/Sec Adj Svr(s)      : None

In this lab the site bridge-domain configuration is pretty meaningless, because there is only a single OTV edge device per site. Therefore, I just attached the bridge-domain to an arbitrary interface and VLAN tag, simply to ensure that the overlay interface would become operational. In reality, you should take care that the VLAN selected for the site bridge-domain is properly extended between OTV edge devices within the site, but not carried across the overlay.

The final piece in this configuration is to actually extend some VLANs across the OTV overlay. Again, this is done through a bridge-domain and corresponding service instance configurations. I add the following configuration to router dc1-otv:

!
interface GigabitEthernet2
 service instance 201 ethernet
  encapsulation untagged
  rewrite ingress tag push dot1q 200 symmetric
  bridge-domain 200
!
interface Overlay1
 service instance 201 ethernet
  encapsulation dot1q 200
  bridge-domain 200
!

And I add a similar configuration on dc2-otv:

!
interface GigabitEthernet2
 service instance 202 ethernet
  encapsulation untagged
  rewrite ingress tag push dot1q 200 symmetric
  bridge-domain 200
!
interface Overlay1
 service instance 202 ethernet
  encapsulation dot1q 200
  bridge-domain 200
!

This configuration is a little peculiar, which has to do with the specifics of my lab setup. The intention is to create a single VLAN 200 stretched across the two DC sites. However, in my lab this is all set up on a common virtual infrastructure. To still create two separate “VLAN 200” instances I essentially created two VMware port-groups and associated VLANs (VLAN 201 and VLAN 202). CSR dc1-otv and VM-1 are both connected to VLAN 201. Similarly, CSR dc2-otv and VM-2 are connected to VLAN 202 (see diagram). As a result, the “VLAN 200” frames arrive as untagged frames on the internal interfaces of CSR dc1-otv and dc2-otv. These frames then need to be bridged across the cloud as VLAN 200 frames. In a more realistic scenario the frames would already arrive with VLAN 200 tags on the internal interfaces and the rewrite commands would be unnecessary.
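
The tag handling on the internal interface can be modeled in a few lines of Python. This is only an illustrative sketch of the untagged-to-VLAN-200 rewrite described above, not a description of how IOS XE implements service instances; the Frame class and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    vlan: Optional[int]  # None means the frame is untagged


def ingress_internal(frame: Frame, push_vlan: int = 200) -> Frame:
    """Model 'encapsulation untagged' plus 'rewrite ingress tag push dot1q 200'."""
    if frame.vlan is not None:
        raise ValueError("this service instance only matches untagged frames")
    return Frame(frame.dst_mac, push_vlan)


def egress_internal(frame: Frame, pop_vlan: int = 200) -> Frame:
    """Model the 'symmetric' keyword: the pushed tag is removed again on egress."""
    if frame.vlan != pop_vlan:
        raise ValueError("frame does not carry the expected VLAN tag")
    return Frame(frame.dst_mac, None)


untagged = Frame("000c.297c.e283", None)    # frame as it leaves VM-1 toward VM-2
tagged = ingress_internal(untagged)
print(tagged.vlan)                          # 200
print(egress_internal(tagged) == untagged)  # True
```

In a setup where the frames already arrive tagged with VLAN 200, both functions would reduce to a simple tag match, which mirrors why the rewrite commands would then be unnecessary.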

With these final steps the OTV configuration is finished and I can put it to the test. So I ping from VM-1 in DC-1 (192.168.200.1) to VM-2 in DC-2 (192.168.200.2):

Ping across OTV overlay

The ping succeeds, confirming that OTV is operational.

Update: Brantley Richbourg and Brandon Farmer pointed out in the comments that this ping fails if you don’t have your VMware vSwitch set to accept promiscuous mode. Initially, I didn’t notice this behavior, because I already had my vSwitch set to accept promiscuous mode for different reasons. I retested the lab with promiscuous mode set to “reject” and confirmed that this stops the ping from working. The explanation for this behavior is that the frames from VM-1 to VM-2 have the VM-2 MAC address as the destination MAC address, which is not registered to the CSR virtual NIC. For unicast MAC frames, the vSwitch normally only sends frames with a particular destination MAC address to the VM that is associated with this MAC address (if it is local) or to an uplink (if it is remote). Therefore, the VM-1 to VM-2 frame is not sent to the CSR VM, so the CSR never sees the frame. As a result, the frame cannot be forwarded across the overlay. When promiscuous mode is set to “accept” on the vSwitch, the CSR receives all traffic on VLAN 201, allowing it to forward the VM-1 to VM-2 traffic across the overlay. So, if this ping fails in your lab, make sure that you have your vSwitch set to accept promiscuous mode! Thanks to Brantley and Brandon for pointing out this potential issue with the lab setup!
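
The forwarding behavior described in this update can be sketched as a toy model in Python. This is a simplification under stated assumptions: the function name and port names are made up, and the MAC address used for the CSR vNIC is a placeholder (the real one does not appear in this post). A real vSwitch has more rules than this.

```python
def vswitch_deliver(dst_mac, ports, promiscuous_ports=()):
    """Return the set of ports that receive a unicast frame.

    ports: dict of port name -> MAC address registered for that vNIC.
    promiscuous_ports: ports that see all traffic in the port-group
    (simplified model of promiscuous mode set to "accept").
    """
    receivers = {name for name, mac in ports.items() if mac == dst_mac}
    if not receivers:
        receivers.add("uplink")  # destination is not local, so send to the uplink
    receivers.update(promiscuous_ports)
    return receivers


# Port-group for VLAN 201: VM-1 and the internal interface of dc1-otv.
vlan201_ports = {
    "VM-1": "000c.296a.a4ad",
    "dc1-otv-Gi2": "000c.29ff.0001",  # placeholder MAC, not taken from the lab
}
vm2_mac = "000c.297c.e283"

print(vswitch_deliver(vm2_mac, vlan201_ports))
# {'uplink'}: the CSR never sees the frame
print(sorted(vswitch_deliver(vm2_mac, vlan201_ports, ["dc1-otv-Gi2"])))
# ['dc1-otv-Gi2', 'uplink']: the CSR can forward the frame across the overlay
```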

Now, let’s verify the MAC address entries and ARP entries on the OTV edge devices:

dc1-otv#show otv route 

Codes: BD - Bridge-Domain, AD - Admin-Distance,
       SI - Service Instance, * - Backup Route

OTV Unicast MAC Routing Table for Overlay1

 Inst VLAN BD     MAC Address    AD    Owner  Next Hops(s)
----------------------------------------------------------
 0    200  200    000c.296a.a4ad 40    BD Eng Gi2:SI201
 0    200  200    000c.297c.e283 50    ISIS   dc2-otv

2 unicast routes displayed in Overlay1

----------------------------------------------------------
2 Total Unicast Routes Displayed

dc1-otv#show otv arp-nd-cache
Overlay1 ARP/ND L3->L2 Address Mapping Cache
BD     MAC            Layer-3 Address  Age (HH:MM:SS) Local/Remote
200    000c.297c.e283 192.168.200.2    00:00:33       Remote

Also, the internal OTV IS-IS database can be examined to confirm that the MAC addresses are advertised by the OTV edge devices:

dc1-otv#show otv isis database detail 

Tag Overlay1:
IS-IS Level-1 Link State Database:
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime      ATT/P/OL
dc2-otv.00-00         0x0000000C   0xBED7        933               0/0/0
  Area Address: 00
  NLPID:        0xCC 0x8E 
  Hostname: dc2-otv
  Metric: 10         IS-Extended dc1-otv.01
  Layer 2 MAC Reachability: topoid 0, vlan 200, confidence 1
    000c.297c.e283 
dc1-otv.00-00       * 0x0000000C   0xE51A        959               0/0/0
  Area Address: 00
  NLPID:        0xCC 0x8E 
  Hostname: dc1-otv
  Metric: 10         IS-Extended dc1-otv.01
  Layer 2 MAC Reachability: topoid 0, vlan 200, confidence 1
    000c.296a.a4ad 
dc1-otv.01-00       * 0x0000000A   0x2916        753               0/0/0
  Metric: 0          IS-Extended dc1-otv.00
  Metric: 0          IS-Extended dc2-otv.00

Although it is usually not necessary to dive into the IS-IS database that is used by the OTV control plane, it is nice to be able to take a peek under the hood.

So now that we have a working OTV setup that extends VLAN 200 and the corresponding subnet 192.168.200.0/24 across the two DC sites, it is time to add LISP to optimize the inbound routing for mobile VMs.

Preparing for LISP

To start, I add two additional routers to the network, which will act as a LISP ingress tunnel router (ITR) and egress tunnel router (ETR) for their respective sites: router dc1-xtr and router dc2-xtr. I connect these routers to the IP core and enable HSRP on the interface that faces the stretched VLAN 200. This results in the following basic configurations:

!
hostname dc1-xtr
!
enable secret cisco
!
no ip domain lookup
!
interface GigabitEthernet1
 ip address 10.200.200.3 255.255.255.0
 no shutdown
!
interface GigabitEthernet2
 ip address 192.168.200.252 255.255.255.0
 standby 200 ip 192.168.200.254
 no shutdown
!
router ospf 1
 router-id 10.200.200.3
 network 10.200.200.3 0.0.0.0 area 0
!
line vty 0 4
 exec-timeout 0 0
 password cisco
 login
!

and

!
hostname dc2-xtr
!
enable secret cisco
!
no ip domain lookup
!
interface GigabitEthernet1
 ip address 10.200.200.4 255.255.255.0
 no shutdown
!
interface GigabitEthernet2
 ip address 192.168.200.253 255.255.255.0
 standby 200 ip 192.168.200.254
 no shutdown
!
router ospf 1
 router-id 10.200.200.4
 network 10.200.200.4 0.0.0.0 area 0
!
line vty 0 4
 exec-timeout 0 0
 password cisco
 login
!

While doing my verifications I notice something interesting in the behavior of HSRP:

dc1-xtr#show standby brief
                     P indicates configured to preempt.
                     |
Interface   Grp  Pri P State   Active          Standby         Virtual IP
Gi2         200  100   Active  local           unknown         192.168.200.254

dc2-xtr#show standby brief
                     P indicates configured to preempt.
                     |
Interface   Grp  Pri P State   Active          Standby         Virtual IP
Gi2         200  100   Active  local           unknown         192.168.200.254

Both routers dc1-xtr and dc2-xtr consider themselves to be the active router and do not list a standby router. Is OTV not properly bridging the traffic of these routers across the overlay? Let’s try a quick ping to see if these routers have connectivity across the extended VLAN 200:

dc1-xtr#ping 192.168.200.253
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.200.253, timeout is 2 seconds:
..!!!
Success rate is 60 percent (3/5), round-trip min/avg/max = 1/1/2 ms

So OTV seems to properly bridge the packets between routers dc1-xtr and dc2-xtr across the overlay. Let’s have a closer look at the OTV edge devices:

dc1-otv#show otv detail | include FHRP
 FHRP Filtering Enabled   : Yes
dc1-otv#show otv route                

Codes: BD - Bridge-Domain, AD - Admin-Distance,
       SI - Service Instance, * - Backup Route

OTV Unicast MAC Routing Table for Overlay1

 Inst VLAN BD     MAC Address    AD    Owner  Next Hops(s)
----------------------------------------------------------
 0    200  200    0000.0c07.acc8 40    BD Eng Gi2:SI201
*0    200  200    0000.0c07.acc8 50    ISIS   dc2-otv
 0    200  200    000c.290b.cab0 50    ISIS   dc2-otv
 0    200  200    000c.295d.29e5 40    BD Eng Gi2:SI201
 0    200  200    000c.296a.a4ad 40    BD Eng Gi2:SI201
 0    200  200    000c.297c.e283 50    ISIS   dc2-otv

6 unicast routes displayed in Overlay1

----------------------------------------------------------
6 Total Unicast Routes Displayed
dc1-otv#sh otv isis database detail 

Tag Overlay1:
IS-IS Level-1 Link State Database:
LSPID                 LSP Seq Num  LSP Checksum  LSP Holdtime      ATT/P/OL
dc2-otv.00-00         0x00000025   0xF13B        1178              0/0/0
  Area Address: 00
  NLPID:        0xCC 0x8E 
  Hostname: dc2-otv
  Metric: 10         IS-Extended dc1-otv.01
  Layer 2 MAC Reachability: topoid 0, vlan 200, confidence 1
    0000.0c07.acc8 000c.290b.cab0 000c.297c.e283 
dc1-otv.00-00       * 0x00000025   0x5EE0        1180              0/0/0
  Area Address: 00
  NLPID:        0xCC 0x8E 
  Hostname: dc1-otv
  Metric: 10         IS-Extended dc1-otv.01
  Layer 2 MAC Reachability: topoid 0, vlan 200, confidence 1
    000c.295d.29e5 000c.296a.a4ad 
dc1-otv.01-00       * 0x0000000D   0x2319        659               0/0/0
  Metric: 0          IS-Extended dc1-otv.00
  Metric: 0          IS-Extended dc2-otv.00

As it turns out, OTV on the CSR has FHRP filtering built in and enabled by default. This means that it is not necessary to configure customized access-lists to filter HSRP hellos across the overlay. Interestingly enough, it does seem to advertise the HSRP MAC address (0000.0c07.acc8) through OTV IS-IS. When you configure manual FHRP filtering on a Nexus 7000 you would usually suppress these advertisements as well as the actual HSRP packets. It looks like this behavior on the CSR could lead to continual MAC updates for the HSRP MAC address, which in turn could affect control plane stability. This may be a point worth investigating further if you are considering deploying CSR/ASR-based OTV in production. On the other hand, I really like the fact that the CSR has FHRP filtering straight out of the box and that it is controllable through a simple command (otv filter-fhrp), rather than a cumbersome access-list configuration.
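
To recognize the HSRP entry in the outputs above, it helps to know how the virtual MAC address is derived. The following Python sketch assumes HSRP version 1, which fits the lab because no other version is configured and the observed MAC uses the 0000.0c07.ac prefix; the function name is my own.

```python
def hsrp_v1_virtual_mac(group: int) -> str:
    """Return the HSRP version 1 virtual MAC address in Cisco dotted notation.

    HSRPv1 uses 0000.0c07.acXX, where XX is the group number in hexadecimal.
    """
    if not 0 <= group <= 255:
        raise ValueError("HSRP version 1 supports groups 0-255")
    raw = f"00000c07ac{group:02x}"
    return ".".join(raw[i:i + 4] for i in range(0, 12, 4))


print(hsrp_v1_virtual_mac(200))  # 0000.0c07.acc8
```

Group 200 is 0xc8, which matches the 0000.0c07.acc8 entries that both dc1-otv and dc2-otv learn locally.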

Configuring LISP

Now that we have verified the basic setup we can start on the actual LISP configuration. I start by configuring my LISP map-server and map-resolver. I want to make these vital functions redundant and I want to separate these functions from the LISP tunnel routers (xTR) to keep the configurations a bit cleaner. Although it is definitely possible to put the map-server/map-resolver functions on a router that also acts as a LISP xTR, I think this might make the configurations a bit harder to understand. Also, I intend to take some LISP sniffer traces on this lab later and having separate IP addresses for the different functions will make these traces more easily readable.

Of course I could have added two more routers to my lab, but since the OTV routers are separate from the LISP xTRs anyway, I decide to make these routers perform the role of LISP map-servers and map-resolvers. First, I add the LISP map-server function to routers dc1-otv and dc2-otv:

router lisp
 site DC1-DC2
  authentication-key DC1-DC2-S3cr3t
  eid-prefix 192.168.200.0/24 accept-more-specifics
 !
 ipv4 map-server

I added the accept-more-specifics keyword to the EID prefix to allow registration of individual mobile /32 routes later.

Next, I set up the map-resolver function on routers dc1-otv and dc2-otv:

interface Loopback37
 description LISP map-resolver anycast address
 ip address 10.37.37.37 255.255.255.255
!
router ospf 1
 network 10.37.37.37 0.0.0.0 area 0
!
router lisp
 ipv4 map-resolver

To make the LISP resolver function redundant I add an anycast address (10.37.37.37) to both dc1-otv and dc2-otv. This address will be configured as the LISP map-resolver address on the LISP xTRs.

Note: When configuring anycast IP addresses on routers, you should take proper care that these addresses never get selected as a router ID for any routing protocol. This is why I specifically configured the OSPF router ID on these routers using the router-id command.
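
The risk mentioned in this note can be illustrated with a simplified Python model of how IOS picks an OSPF router ID: an explicitly configured ID wins, otherwise the highest loopback address, otherwise the highest address on another active interface. The function below is a hypothetical sketch of that rule, not router code, and it ignores details such as when the selection takes place.

```python
import ipaddress


def ospf_router_id(configured, loopbacks, interfaces):
    """Pick a router ID using the (simplified) IOS selection order."""
    if configured:
        return configured
    candidates = loopbacks or interfaces
    return str(max(ipaddress.IPv4Address(a) for a in candidates))


# Without the router-id command both routers would pick the anycast loopback.
dc1 = ospf_router_id(None, ["10.37.37.37"], ["10.200.200.1"])
dc2 = ospf_router_id(None, ["10.37.37.37"], ["10.200.200.2"])
print(dc1, dc2, dc1 == dc2)  # 10.37.37.37 10.37.37.37 True

# With the explicit router-id configuration the IDs stay unique.
dc1 = ospf_router_id("10.200.200.1", ["10.37.37.37"], ["10.200.200.1"])
dc2 = ospf_router_id("10.200.200.2", ["10.37.37.37"], ["10.200.200.2"])
print(dc1, dc2, dc1 == dc2)  # 10.200.200.1 10.200.200.2 False
```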

Now that the LISP map-server and map-resolver have been set up, I configure the routers dc1-xtr and dc2-xtr as LISP xTRs for the 192.168.200.0/24 EID space that is associated with the OTV extended VLAN 200. On router dc1-xtr I add the following commands:

router lisp
 locator-set DC1
  10.200.200.3 priority 10 weight 50
 !
 database-mapping 192.168.200.0/24 locator-set DC1
 !
 ipv4 itr map-resolver 10.37.37.37
 ipv4 itr
 ipv4 etr map-server 10.200.200.1 key DC1-DC2-S3cr3t
 ipv4 etr map-server 10.200.200.2 key DC1-DC2-S3cr3t
 ipv4 etr

And on router dc2-xtr I add the following commands:

router lisp
 locator-set DC2
  10.200.200.4 priority 10 weight 50
 !
 database-mapping 192.168.200.0/24 locator-set DC2
 !
 ipv4 itr map-resolver 10.37.37.37
 ipv4 itr
 ipv4 etr map-server 10.200.200.1 key DC1-DC2-S3cr3t
 ipv4 etr map-server 10.200.200.2 key DC1-DC2-S3cr3t
 ipv4 etr

The only real difference between these configurations is the locator IP address, which is 10.200.200.3 for DC1 and 10.200.200.4 for DC2.

Next, I verify that the EID prefix 192.168.200.0/24 has been registered on the map-servers:

dc1-otv#show lisp site detail
LISP Site Registration Information

Site name: DC1-DC2
Allowed configured locators: any
Allowed EID-prefixes:
  EID-prefix: 192.168.200.0/24 
    First registered:     00:04:05
    Routing table tag:    0
    Origin:               Configuration, accepting more specifics
    Merge active:         No
    Proxy reply:          No
    TTL:                  1d00h
    State:                complete
    Registration errors:  
      Authentication failures:   0
      Allowed locators mismatch: 0
    ETR 10.200.200.3, last registered 00:00:09, no proxy-reply, map-notify
                      TTL 1d00h, no merge, hash-function sha1, nonce 0x24CDDA34-0x0CF777A1
                      state complete, no security-capability
                      xTR-ID 0x78ED0ACF-0x8B46F5F7-0xF896252E-0xEC47696E
                      site-ID unspecified
      Locator       Local  State      Pri/Wgt
      10.200.200.3  yes    up          10/50 
    ETR 10.200.200.4, last registered 00:00:19, no proxy-reply, map-notify
                      TTL 1d00h, no merge, hash-function sha1, nonce 0xEB7D2794-0x64B6413D
                      state complete, no security-capability
                      xTR-ID 0x58ABCEE0-0x5DF04443-0xBB0C7B58-0x4AEB3FD8
                      site-ID unspecified
      Locator       Local  State      Pri/Wgt
      10.200.200.4  yes    up          10/50

To test the LISP functionality, I need to have a third site that I can run connectivity tests from. This is the role of the branch router in the lab topology. I add the following basic configuration to that router:

!
hostname branch-xtr
!
enable secret cisco
!
no ip domain lookup
!
ip dhcp pool BRANCH-LAN
 network 172.16.37.0 255.255.255.0
 default-router 172.16.37.254
!
interface GigabitEthernet1
 ip address 10.200.200.5 255.255.255.0
 no shutdown
!         
interface GigabitEthernet2
 ip address 172.16.37.254 255.255.255.0
 no shutdown
!         
router ospf 1
 router-id 10.200.200.5
 network 10.200.200.5 0.0.0.0 area 0
!         
line vty 0 4
 exec-timeout 0 0
 password cisco
 login    
!

Because this router is part of the LISP enabled network and the prefix 172.16.37.0/24 is part of the EID space, we need to create a corresponding LISP site configuration for this site on the LISP map-servers. So I add the following commands to routers dc1-otv and dc2-otv:

router lisp
 site BRANCH
  authentication-key Br@nch-S3cr3t
  eid-prefix 172.16.37.0/24

Now that the map-servers have been set up, we can add the LISP xTR configuration on the branch router. I add the following to router branch-xtr:

router lisp
 database-mapping 172.16.37.0/24 10.200.200.5 priority 10 weight 100
 ipv4 itr map-resolver 10.37.37.37
 ipv4 itr 
 ipv4 etr map-server 10.200.200.1 key Br@nch-S3cr3t
 ipv4 etr map-server 10.200.200.2 key Br@nch-S3cr3t
 ipv4 etr

And again, I confirm that the EID prefix has been properly registered on the map-servers:

dc1-otv#show lisp site name BRANCH
Site name: BRANCH
Allowed configured locators: any
Allowed EID-prefixes:
  EID-prefix: 172.16.37.0/24 
    First registered:     00:00:11
    Routing table tag:    0
    Origin:               Configuration
    Merge active:         No
    Proxy reply:          No
    TTL:                  1d00h
    State:                complete
    Registration errors:  
      Authentication failures:   0
      Allowed locators mismatch: 0
    ETR 10.200.200.5, last registered 00:00:11, no proxy-reply, map-notify
                      TTL 1d00h, no merge, hash-function sha1, nonce 0x4A43B6D4-0xF8540179
                      state complete, no security-capability
                      xTR-ID 0xE8323230-0xFABD8623-0x2448E48B-0x2B80C3A0
                      site-ID unspecified
      Locator       Local  State      Pri/Wgt
      10.200.200.5  yes    up          10/100

So now it is time to put the LISP configurations to the test and ping from VM-3 at the branch site to VM-1 and VM-2. The pings succeed as expected:

Ping across LISP

However, at this point we have only implemented a straightforward LISP setup without introducing the VM mobility concept. The branch router does not know where to forward packets for individual VM IP addresses. It only knows how to reach the EID prefix 192.168.200.0/24 of VLAN 200. When I actually perform a traceroute to VM-1 and VM-2 from VM-3 I see that the traffic to both VMs is routed through DC-1:

Traceroute across LISP

This means that traffic from VM-3 to VM-2 actually needs to go across the OTV interconnect between DC1 and DC2 in order to reach VM-2. Clearly, this is sub-optimal and this is where LISP mobility comes in. By registering individual /32 EID prefixes for each VM, the location of the VMs can be tracked in the LISP enabled network and the traffic flow can be optimized. So I add the following commands to router dc1-xtr:

router lisp
 dynamic-eid MOBILE-VMS
  database-mapping 192.168.200.0/28 locator-set DC1
  map-server 10.200.200.1 key DC1-DC2-S3cr3t
  map-server 10.200.200.2 key DC1-DC2-S3cr3t
  map-notify-group 224.0.0.100
!
interface GigabitEthernet2
 lisp mobility MOBILE-VMS
 lisp extended-subnet-mode

And I add similar commands on dc2-xtr:

router lisp
 dynamic-eid MOBILE-VMS
  database-mapping 192.168.200.0/28 locator-set DC2
  map-server 10.200.200.1 key DC1-DC2-S3cr3t
  map-server 10.200.200.2 key DC1-DC2-S3cr3t
  map-notify-group 224.0.0.100
!
interface GigabitEthernet2
 lisp mobility MOBILE-VMS
 lisp extended-subnet-mode

For the mobile VMs I selected a sub-prefix of the overall 192.168.200.0/24 prefix that belongs to the twin-DC site. Of course, I could also have used the complete prefix for mobility. In that case the regular LISP database mapping for that prefix should be removed.
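
A quick way to check which addresses fall inside the dynamic-EID range is Python's standard ipaddress module. The sketch below only uses addresses from this lab; the dictionary labels are mine.

```python
import ipaddress

site_prefix = ipaddress.ip_network("192.168.200.0/24")
mobile_prefix = ipaddress.ip_network("192.168.200.0/28")

# The dynamic-EID prefix must be part of the registered site prefix.
print(mobile_prefix.subnet_of(site_prefix))  # True

hosts = {
    "VM-1": "192.168.200.1",
    "VM-2": "192.168.200.2",
    "dc1-xtr Gi2": "192.168.200.252",
    "dc2-xtr Gi2": "192.168.200.253",
    "HSRP virtual IP": "192.168.200.254",
}
for name, addr in hosts.items():
    is_mobile = ipaddress.ip_address(addr) in mobile_prefix
    print(f"{name:16} {addr:16} mobile={is_mobile}")
```

Only VM-1 and VM-2 fall inside 192.168.200.0/28 (addresses .0 through .15). The router interfaces and the HSRP virtual IP stay outside the mobility range.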

Note: There is one other thing that is specifically worth noting about this configuration: I configured the link-local multicast group 224.0.0.100 as the LISP map-notify group. This is not a best practice and I should have used a regular ASM multicast group (for example 239.1.1.1). However, when I first configured LISP using a random 239.0.0.0/8 ASM group I was experiencing all sorts of problems. My VMs were reporting duplicate IP addresses and the traceroutes from VM-3 were inconsistent. After a couple of hours of troubleshooting, I finally figured out that this was caused by the fact that the LISP map-notify multicast group wasn’t properly forwarded across OTV between routers dc1-xtr and dc2-xtr. I tried to tackle this problem in various ways, including converting my OTV configuration to a unicast setup with adjacency servers, but to no avail. In the end I started suspecting IGMP snooping (or the lack thereof) in the whole chain of vSwitches, bridge-domains, and OTV to be the cause of the problem. To test this hypothesis I decided to change the multicast group to a link-local group, because those groups should always be flooded within a VLAN regardless of IGMP snooping. It is a bit of a hack, and clearly it isn’t a real solution for the underlying multicast problem. But at least implementing this workaround allowed me to further concentrate on LISP rather than an obscure multicast issue. I am hoping that this is a VMware vSwitch problem, rather than an OTV multicast problem, but further testing is needed to pinpoint the issue.
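
To make the difference between the groups in this lab explicit, here is a small Python sketch that classifies a multicast address by its well-known range. The range names and the classify function are my own shorthand. The flooding remark in the comment reflects the usual switch behavior for 224.0.0.0/24, which is the assumption behind the workaround.

```python
import ipaddress

LINK_LOCAL = ipaddress.ip_network("224.0.0.0/24")   # normally flooded, not pruned by IGMP snooping
SSM = ipaddress.ip_network("232.0.0.0/8")           # source-specific multicast
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")  # administratively scoped ASM


def classify(group: str) -> str:
    """Return a coarse label for an IPv4 multicast group address."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    if addr in LINK_LOCAL:
        return "link-local"
    if addr in SSM:
        return "ssm"
    if addr in ADMIN_SCOPED:
        return "admin-scoped"
    return "other"


for g in ("224.0.0.100", "239.1.1.1", "239.37.37.37", "232.37.37.1"):
    print(g, classify(g))
```

The map-notify group 224.0.0.100 is the only one of these that lands in the link-local block, which is why it is not affected by whatever is pruning the 239.0.0.0/8 group in the lab.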

So with the proper LISP configuration in place we can now test the traceroutes from VM-3 to VM-1 and VM-2 again and see if the path was optimized:

Optimized traceroute

The two different RLOC addresses for the VMs can also be verified in the LISP map-cache on the branch router branch-xtr:

branch-xtr#show ip lisp map-cache 
LISP IPv4 Mapping Cache for EID-table default (IID 0), 3 entries

0.0.0.0/0, uptime: 00:47:51, expires: never, via static send map-request
  Negative cache entry, action: send-map-request
192.168.200.1/32, uptime: 00:45:47, expires: 23:54:47, via map-reply, complete
  Locator       Uptime    State      Pri/Wgt
  10.200.200.3  00:05:12  up          10/50 
192.168.200.2/32, uptime: 00:46:54, expires: 23:55:38, via map-reply, complete
  Locator       Uptime    State      Pri/Wgt
  10.200.200.4  00:35:25  up          10/50
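
The map-cache above is consulted with a longest-prefix match, which is why the two /32 entries steer traffic to different RLOCs while everything else still triggers a map-request. A minimal Python sketch of that lookup, using the entries from the output (the dictionary and function are illustrative only, not how the router stores its cache):

```python
import ipaddress

# EID prefix -> RLOC (or action), taken from the map-cache output above.
MAP_CACHE = {
    "0.0.0.0/0": "send-map-request",
    "192.168.200.1/32": "10.200.200.3",
    "192.168.200.2/32": "10.200.200.4",
}


def lookup(dst: str) -> str:
    """Return the RLOC or action of the most specific matching EID prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix in MAP_CACHE
        if addr in ipaddress.ip_network(prefix)
    ]
    best = max(matches, key=lambda net: net.prefixlen)
    return MAP_CACHE[str(best)]


print(lookup("192.168.200.1"))  # 10.200.200.3
print(lookup("192.168.200.2"))  # 10.200.200.4
print(lookup("192.168.200.7"))  # send-map-request
```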

As a final test I move VM-1 from DC-1 to DC-2. Due to the way the lab is set up this is not a vMotion, but simply a change of port-group from VLAN 201 to VLAN 202. When I trace again after the move I confirm that the traceroute now goes through router dc2-xtr (10.200.200.4). During the move I also ran a continuous ping from VM-3 to VM-1 and I only lost a couple of packets. Of course, the LISP map-cache on router branch-xtr also reflects the change in RLOC IP address:

branch-xtr#show ip lisp map-cache 192.168.200.1
LISP IPv4 Mapping Cache for EID-table default (IID 0), 3 entries

192.168.200.1/32, uptime: 00:53:55, expires: 23:59:16, via map-reply, complete
  Sources: map-reply
  State: complete, last modified: 00:00:43, map-source: 10.200.200.4
  Active, Packets out: 195 (~ 00:01:56 ago)
  Locator       Uptime    State      Pri/Wgt
  10.200.200.4  00:00:43  up          10/50 
    Last up-down state change:         00:00:43, state change count: 1
    Last route reachability change:    never, state change count: 0
    Last priority / weight change:     never/never
    RLOC-probing loc-status algorithm:
      Last RLOC-probe sent:            never

At this point, we have a working configuration combining OTV with LISP mobility, which is a good base for further experimentation with these protocols. Despite the multicast problems that I experienced, I feel that a virtual lab setup with a couple of CSR 1000v’s is a nice addition to the toolbox for testing advanced routing and data center technologies like LISP and OTV.

Note: For those that want to try this out in their own labs, I published my complete configurations for reference here.

41 thoughts on “OTV and LISP on the CSR 1000v”

  1. Hi Tom,

    Really nice blog! I was already testing with the CSR and the 1000V myself, but I will definitely try this as well!

    Regards,
    Fred

  2. Thanks for the effort on this blog, I was just thinking the same thing as my first CSR 1000v project. So google brought me here.

    I was able to get OTV working in my lab using your config, and also with Adjacency server in unicast mode.

    Having issues getting OTV to continue working when I bring up a 3rd site at my corporate data center on a spare ESXi host.

    Apparently OTV’s ISIS pads its IIH hello packets to 1422-byte PDUs. I tried setting the otv isis lsp-mtu to 1000 to force it to drop down, but that made no difference.

    Then I set the CLNS Mtu on the join interface and the overlay interface, and the OTV came up, but I couldn’t get traffic to pass across it.

    It’s weird, as if the 3rd site at the Data Center became the root of some sort of spanning tree: if 2-way traffic wasn’t possible between my lab and that site, then traffic between my two sites at the lab stopped working as well, which caught me off guard. As soon as I shut down the data center overlay interface, the rest of the OTV network works perfectly…

    So I need a good config for using this with a less than 1500 byte MTU on the join interface, in order to lab it across my VPN tunnel.

    I already set the IPSEC tunnel to clear the DF-bit, and configured otv fragmentation, but that doesn’t help. Apparently my packets arrive out of order or something maybe…

    • Hi Brian,

      Glad to see this post was useful to somebody else getting their feet wet with the CSR 1000v.

      I just quickly labbed up a simple three site scenario (all sites on the same vSwitch, no IPSec, no MTU tweaking) and that worked fine, both in multicast and unicast mode. Of course packets larger than 1458 bytes won’t go across the overlay, because of the 1500 byte MTU on the join interface. Other than that, OTV seems to work fine in the straightforward three-DC case.

      So it does indeed seem that your IPSec VPN setup is causing the trouble, not OTV itself. Are you configuring the IPSec tunnels on the CSR? Or is the tunnelling and encryption performed on a separate router behind the join interface?

      To be honest, what you are trying to achieve probably isn’t even possible to begin with. As far as I know, the transport network for OTV should simply have an MTU that is 42 bytes larger than the MTU on your inside interfaces. So to me it looks like carrying OTV across an IPSec-based network with a significantly smaller MTU won’t be possible. However, it is always worth giving it a try anyway, to better understand the exact limitations of the technology. Also, it sounds like you almost got it to work, in the sense that you at least managed to bring the OTV control plane online. If you want, you can unicast me your configurations and I can have a look to see if I can get it to work, or if I can spot where it goes wrong.
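The MTU arithmetic in this reply can be made concrete with a small Python sketch. It assumes the 42 bytes of OTV encapsulation overhead quoted above; the function names are hypothetical and only illustrate the calculation, they are not part of any Cisco tooling.

```python
# Assumption: OTV adds 42 bytes of encapsulation overhead,
# the figure quoted in the reply above.
OTV_OVERHEAD_BYTES = 42


def max_overlay_packet(join_mtu: int, overhead: int = OTV_OVERHEAD_BYTES) -> int:
    """Largest inner packet that fits across the overlay without fragmentation."""
    if join_mtu <= overhead:
        raise ValueError("join interface MTU must exceed the encapsulation overhead")
    return join_mtu - overhead


def required_transport_mtu(inside_mtu: int, overhead: int = OTV_OVERHEAD_BYTES) -> int:
    """Transport MTU needed to carry full-size packets from the inside interface."""
    return inside_mtu + overhead


if __name__ == "__main__":
    print(max_overlay_packet(1500))      # 1458
    print(required_transport_mtu(1500))  # 1542
```

With a 1500-byte join interface this gives the 1458-byte limit mentioned above. Plugging in the smaller MTU of a VPN path shows how much further the usable packet size shrinks before any fragmentation is configured.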

  3. Fantastic Blog! I have not attempted the LISP portion yet, but this write up was just what I needed to get OTV working on my lab ESXi host. Now I can demo this to customers and also use it for preparing for the CCNP Data Center.

    Thanks for taking the time to document this. Like Brian, I wanted to use the CSR to configure all of this stuff to learn, and google led me to your blog. This saved me A LOT of time.

    Not sure what your vCenter configuration looks like, but I configured this as a vApp called OTV, so that I can turn on my OTV lab when I need it and power it off when I don’t.

    Another thing I ran into was all vSwitches need to be set to allow promiscuous mode. I ran into issues getting things to work under the default vSwitch settings which reject promiscuous mode. I don’t recall if you mentioned that, but if not, I thought I would.

    -Brantley

    • Hi Brantley,

      Thanks for the positive feedback! It is nice to hear that my post was helpful in getting you started with OTV on the CSR.

      I also appreciate the note about promiscuous mode on the vSwitch. I didn’t actually notice that this was a requirement for the lab to run, because I already had my vSwitch set to promiscuous.
      I use the same ESXi host to run a nested ESXi environment, which also requires promiscuous mode. So for me it just worked from the start. I’ll have to experiment a bit to see which part of the setup breaks if I turn off promiscuous mode.

      Were you able to pinpoint exactly which functionality was lost without promiscuous mode?

      • Hi Brantley,

        I retested the lab with promiscuous mode set to “reject” on my vSwitch and indeed, as expected based on your feedback, the ping from VM-1 to VM-2 fails.

        The OTV control plane between the CSRs is still functional and also ARP between the VMs still works, but the ICMP echo from VM-1 to VM-2 never makes it across the overlay.

        The reason for this behavior is that these frames are addressed to the VM-2 MAC address, which is not associated with the CSR. So without promiscuous mode, the CSR never sees these frames and as a result, it cannot forward them across the overlay.

        I added a note to the post to point out this issue. Thank you very much for bringing this to my attention and helping me to improve this post!
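The behavior described in this reply can be illustrated with a deliberately simplified Python model of frame delivery on a port group. This is a sketch under assumptions: it only captures the rule that a unicast frame reaches the port that owns the destination MAC unless promiscuous mode is allowed, and it ignores uplinks and VLANs. The port names match the lab, but the MAC addresses and the `receivers` helper are hypothetical.

```python
from dataclasses import dataclass

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass(frozen=True)
class Port:
    name: str
    mac: str


def receivers(ports, src_port_name, dst_mac, promiscuous):
    """Return the names of the ports that receive a frame sent into the port group.

    Simplified rule: broadcasts reach every other port, unicast frames reach
    only the port that owns the destination MAC, and with promiscuous mode
    allowed every other port sees every frame.
    """
    result = []
    for port in ports:
        if port.name == src_port_name:
            continue  # a frame is never delivered back to its sender
        if promiscuous or dst_mac == BROADCAST or port.mac == dst_mac:
            result.append(port.name)
    return result


# Hypothetical MAC addresses, for illustration only.
PORTS = [
    Port("VM-1", "00:50:56:00:00:01"),
    Port("dc1-otv", "00:50:56:00:00:aa"),
]
VM2_MAC = "00:50:56:00:00:02"  # VM-2 lives behind the overlay, not on this port group

if __name__ == "__main__":
    print(receivers(PORTS, "VM-1", VM2_MAC, promiscuous=False))    # []
    print(receivers(PORTS, "VM-1", VM2_MAC, promiscuous=True))     # ['dc1-otv']
    print(receivers(PORTS, "VM-1", BROADCAST, promiscuous=False))  # ['dc1-otv']
```

In this model the broadcast ARP request still reaches the CSR without promiscuous mode, while the unicast ICMP echo addressed to the VM-2 MAC does not, which matches the failure described above.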

  4. This is a great write-up — thank you for all your efforts.

    In regards to promiscuous mode on the vSwitch: it seems that disabling this prevents OTV from functioning, so definitely make a note that promiscuous mode must be enabled. As a side note, and I believe it’s because of the promiscuous mode, I got sidetracked digging into what I thought was a configuration problem, because I was seeing duplicate ping responses no matter what I targeted. It turns out that Linux boxes will show you the duplicate responses, but Windows boxes won’t! So in the end my OTV was working properly the whole time, even though, as you mentioned, the peculiarities of using the VM environment mean it’s not exactly the same as if you were using dedicated hardware.

    • Hi Brandon,

      Glad you liked my write-up!

      Thanks for confirming the issue with promiscuous mode. I just retested the lab and added a note to point out this issue.

      With regards to the duplicate ping responses: When I retested the lab, I also checked to see if I could replicate this issue, but I didn’t see any packet duplication. I took a sniffer trace on VLAN 201 and I only see a single ICMP echo-reply coming back from VM-2 for every ICMP echo sent by VM-1. I am wondering what is happening in your setup. Do you have any idea which device is generating the duplicate ping responses?

      Thanks for helping me improve this post!

      • Hi Tom,

        It seems that my xTR routers are the source of the duplicates, and I can “resolve” the issue by disabling LISP mobility on the interfaces facing that network. But I’m still curious as to why you aren’t seeing the dups in your setup. I’ll keep digging…

        On another note, the command reference mentions that all LISP VM-router interfaces (the ones configured with a dynamic-eid policy) must have the same MAC address — did you configure a static MAC on both DC1/DC2-XTR interfaces?

        • Hi Brandon,

          It turns out that the reason that I didn’t see the duplicates during my retest was quite simple: Laziness 🙂

          Because I was focusing on the “OTV doesn’t work without promiscuous mode” aspect, I only turned up the OTV routers, not the LISP xTRs. So if your xTRs are generating the duplicate packets then it makes sense that I didn’t see them in the simple OTV scenario.

          I’ll do a proper retest with the complete setup and let you know if I also get the duplicates in that case.

          With regards to the static MAC address: I didn’t configure static MACs in my lab. I would have noted that in the post, if I had.

          However, I am wondering whether this is really a strict requirement for the LISP mobility, or just something that you would do to keep the default gateway MAC address identical so you don’t have to wait for ARP timeout on the VM after the move.

          In my case having an identical HSRP configuration on both sites combined with the built-in HSRP filtering in the CSR OTV implementation ensures that the default gateway MAC is identical on both sites. For that reason, I did not bother adjusting the actual interface MACs.
          I may have overlooked something there, so if you could point me to the statement in the command reference I’ll have a look.

          I’ll get back to you with more feedback as soon as I have done a proper retest.

          • Hi Brandon,

            I repeated my tests with the complete lab setup, including the xTRs, but I am still not seeing any packet duplication.

            It looks like there is some subtle difference in your lab setup. I added a note to the end of my post with a link to a zip file that contains my complete configurations for reference. You could have a look at them and see if you can spot any differences with your configurations.

  5. Hi Tom,

    Thanks for putting those configs up. I examined your configs and couldn’t find any meaningful difference that would cause the duplicate packets.

    I’m running ESX 5.1U1, using the latest CSR 1000V release (3.10.0s). It occurred to me that I could be running different code, so I went and downloaded the CSR version that predated your post (3.9.0aS) and copied my exact XTR config onto those devices. And what do you know — no duplicates.

    So now all of the routers except the XTR’s are running 3.10.0s, and the XTR’s have 3.9.0aS. I looked over the release notes and I couldn’t find anything between the two versions that I thought could cause the behavior I’m seeing. I wanted to try 3.9.1S as well, but I was unable to download it without a contract.

    So again, thanks for digging into this problem with me. At this point I’m going to chalk it up to a code difference, although I don’t have a definitive cause.

    • Hi Brandon,

      My routers were indeed on 3.9.0aS, so that is consistent with what you’re seeing. To me it looks like you’re hitting some bug that was introduced in the 3.10.0S code.

      From a technology standpoint I can’t come up with any good reason why you would want the xTR to replicate packets, so the fact that it doesn’t do this on 3.9.0aS, but then suddenly does on 3.10.0S makes me think that this behavior is simply a bug in the code.

      It may be worth retesting when the next release comes out, but for now this seems to be the best explanation.

      • It’s the promiscuous port group. You can’t have anything else in that promiscuous group with your OTV internal interface. You have to use a Layer 2 switch to bridge the port groups. Promiscuous mode is like a multi-port SPAN session: you see every packet on every interface, inbound and outbound, for every other port.

        • Ah, okay. That’s good to know.

          I hadn’t really thought of the port-group as a multi-port span session, but that makes sense and also explains the potential for creating bridging loops. Thanks for digging into this issue and providing feedback!

  6. Yes you have to have the Port Group in Promiscuous Mode. Also you only want the OTV Internal Interface and a single Uplink interface on that port group. If you try to do this in a Cisco UCS environment and you have 2 uplinks, you will create a network loop.

    The reason you saw the HSRP virtual MAC address on the OTV was that you had other interfaces from other VMware hosts, including your HSRP interface, on that same port group. Promiscuous mode is like a port mirror: you only want your CSR and your uplink to be “mirrored”; any other interfaces will confuse the MAC table. I have my OTV in production and we did DR testing today. I have been fighting the promiscuous mode issues for a while. I now see the value of the Nexus 1000v… However, with just the OTV internal interface and the single uplink on the port group, the CSR works fine.

    I am running my CSR from UCS Chassis to UCS Chassis across MPLS. I have solved MTU issue with Fragmentation, and I have solved unicast core with OTV Adjacency server. Works Great!! Who needs a Nexus 7k! 🙂

    • Cool! Very nice to see somebody actually run OTV on the CSR in production!

      Also, great job on solving the fragmentation and promiscuous mode issues! It looks like the CSR OTV solution is actually even more flexible than I thought it to be.

      Many thanks for digging into the issues and providing such great feedback!

      • Well, let’s be honest here. I went to Google and searched for CSR OTV and it brought me here 🙂 So without your blog I might never have gotten it to run in production. We were under a deadline to solve some problems and this was the best solution to meet our deadline. All the other engineers think I’m crazy for doing it, but hey, someone has to try it, right? 🙂

        • An update, I have moved one of the Call Managers to the DR site, replicating DB to sub over OTV. We also did a partial failover test this weekend, failing 12 of our servers and running them from DR while they used resources in Primary site.

  7. Hi,

    I’m doing some LISP on CSR1000v proof of concept labs with production PxTRs and MS/MRs.
    Configuring the CSR1000v with the “ip tcp adjust-mss 1360” command on the interface did not seem to change the MSS in the TCP packets.
    Has anyone experimented with MSS adjustment on the CSR1000v? I’m still investigating and doing some packet captures to see what is happening.
    From a PC behind the CSR, I would have most of the cisco.com pages load, but not others. Sites like yahoo.com and google.com would not load at all.
    I do not believe that it was a LISP transport issue.

    Regards,
    Lawrence Suciu

    • Hi Lawrence,

      I personally never tried the combination of LISP and TCP MSS adjustment. However, it is really hard to judge why this might not work without knowing a bit more about your configuration.

      Are you doing just LISP? Or LISP and OTV? And on which interface are you configuring the MSS adjustment. It would be helpful to have some sort of a diagram and the relevant pieces of the configuration.

      Regards,

      Tom
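For readers following this thread, the relationship between the MSS value in the question and the LISP encapsulation overhead can be sketched in Python. The sketch assumes IPv4 transport and TCP/IP headers without options; the 36 bytes of LISP overhead are the outer IPv4 header (20), the UDP header (8), and the LISP header (8). The function names are hypothetical, and this arithmetic does not explain why the adjustment appeared to have no effect in the setup described above.

```python
# Header sizes in bytes, assuming IPv4 and no IP or TCP options.
IPV4_HEADER = 20
TCP_HEADER = 20
UDP_HEADER = 8
LISP_HEADER = 8


def lisp_overhead_ipv4() -> int:
    """Bytes added by LISP encapsulation over IPv4: outer IP + UDP + LISP."""
    return IPV4_HEADER + UDP_HEADER + LISP_HEADER


def max_mss(path_mtu: int, encap_overhead: int = 0) -> int:
    """Largest TCP MSS that avoids fragmentation for a given path MTU."""
    return path_mtu - encap_overhead - IPV4_HEADER - TCP_HEADER


def encapsulated_size(mss: int, encap_overhead: int) -> int:
    """Size on the wire of a full-size TCP segment after encapsulation."""
    return mss + TCP_HEADER + IPV4_HEADER + encap_overhead


if __name__ == "__main__":
    overhead = lisp_overhead_ipv4()
    print(overhead)                           # 36
    print(max_mss(1500, overhead))            # 1424
    print(encapsulated_size(1360, overhead))  # 1436
```

Under these assumptions an MSS of 1360 is a conservative choice: a full-size segment is 1436 bytes after LISP encapsulation, which leaves headroom below a 1500-byte transport MTU.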

  8. hi,
    Using the HSRP option “use-bia”, I could ping the HSRP VIP without permitting promiscuous mode.
    ESXi 5.5
    IOS XE 3.11.00

  9. Hi guys

    Would you please send me or post the ESX configuration as a screenshot, so that I can build the same lab as yours?

    Thanks in advance

    Ali

    • Hi Ali,

      Which part of the configuration are you looking for exactly? It is a bit hard to capture all the details of an ESXi configuration in a single screenshot. Could you maybe specify which parts of my setup are unclear or missing?

      Just to give you a general idea, I am running a pretty basic setup: I have installed ESXi free on a Dell desktop with 16 GB of RAM and a quad-core i5 processor. I use iSCSI and NFS to connect to the data stores on my Synology NAS.
      Because I am using ESXi free, the networking setup isn’t very fancy either. Just a vSwitch with a bunch of port groups to create some VLANs to connect my CSRs to. In some cases I use a port group configured for “VLAN 4095” to allow 802.1Q trunking from the port on the CSR.

      Please let me know if there are specific details that you’re looking for.

      Tom

  10. Hi Tom

    Thanks for your kind reply.

    Just screenshots of the vSwitch configuration and the integration with the CSR.

    Thanks in advance

    Ali

  11. Thanks for the excellent write-up. I just passed my CCIE DC written exam and I was wondering how I was going to lab some of the items.

  12. GREAT article and just as good dialog in the comments.

    Tom, could you send over those same screenshots for the port groups, I’m just interested how to place the connections.

    I have all 5 csr’s built on a pair of ESX hosts , with 3750 switches between so I can do about any setup needed.

    Thanks, I just finished the CCIE DC lab, so I’m anxious to just tinker with LISP for a while.

    • Hi Eric,

      Thanks for the positive feedback! I am always happy to hear that the post was useful to somebody.

      I sent the screenshots as requested. Let me know if you have further questions.

      Tom

  13. Tom,

    Great info here. I have referred to your blog many times in constructing an OTV config in my lab. However, one item that perhaps should be checked is that in my lab multicast forwarding would not function until the following command was added to each router:

    ip igmp snooping querier

    In addition, the other commands you listed:

    ip multicast-routing distributed
    !
    interface GigabitEthernet1
     ip pim passive
     ip igmp version 3

    did not appear to have any effect one way or another on multicast forwarding.

    Maybe something is changing in the code from revision to revision (I’m using 3.12). Were you actually able to demonstrate bidirectional multicast forwarding with your config?

    • Hi Alex,

      That is interesting. As far as I know “ip pim passive” should also enable IGMP on the interface. (And from my note about the difference between CSR and Nexus it seems like I tested without the “ip pim passive” first and then added it to make it work).
      However, it is too long ago that I ran this setup to know for sure if I had bidirectional multicast working or not.

      Was your lab setup exactly the same as mine (everything on a single vSwitch)? Or did you have physical switches somewhere in the path?
      I am wondering whether this has anything to do with IGMP snooping operation in the L2 switched part of the network.

      Regards,

      Tom

        • Okay,

          Then the possible culprit could be IGMP snooping on the physical switch. Your situation is different from mine, because I had both CSRs on the same vSwitch, which means I didn’t have to deal with multicast forwarding on the physical network. I still think that “ip pim passive” should also enable the IGMP querier function, and therefore IGMP snooping should have operated correctly, but it is hard to say without rebuilding the lab and debugging the multicast operation. Also, IGMP snooping operation is implemented differently on different switches, so it may depend on the type of physical switch that you are using in your lab.

          This would be worth digging into if at some point I rebuild my lab the same way you set it up, using two different hosts.

          Regards,

          Tom

  14. Hey there,

    Great write-up and article. I actually just set up a lab this past week with OTV/LISP on CSR1000v.

    Did you have any strange issues in your lab with guest traffic to the VM breaking once you turned up the overlay? A debug ip packet just shows my CSR stops receiving traffic the minute I turn up the overlay, which is bizarre since the join interface is totally separate from my internal facing SI.

    The join interfaces and the internal interfaces of the CSRs are on different port-groups and my “mobility” is facilitated by a port-group change on the guest. Promiscuous mode doesn’t seem to fix it, either.

    Thanks, hopefully someone has run into it. 🙂

    • Hi Ryan,

      No, I haven’t seen that behavior. Hopefully somebody else has run into it and can provide you some clues. It’s been quite a while since I last tested this lab.

      Regards,

      Tom

  15. Hi Tom,

    Thank you for this great and helpful blog post.
    I’m trying to setup a lab with one router acting as MS/MR and xTR, and one router which act as xTR only.
    You’ve mentioned that configuring a router to act as MS/MR and xTR is possible, but it seems that I can’t get it working, no matter what.
    Do you know if there’s some special configuration needed, or maybe you can assist me with the configuration?

  16. This is a great article. Utilizing LISP with OTV and VxLAN will eliminate the sub-optimal traffic patterns and is a must if you are considering L2 stretch. I have built this in production and if anyone is interested send me an email: geekmicrochip@gmail.com
