OpenStack + ACI. Help me to understand what just happened

mmacdonald70
Level 1

We are setting up an OpenStack environment (Juno) and using ACI as the network transport. We originally used the plugin but decided against it, and now just treat the OpenStack servers as bare metal.

We just had an outage, and I don't understand OpenStack well enough (and the OpenStack people don't understand networking well enough) to explain it. I'm hoping that somebody has an idea of what might have happened.

The outage symptoms were that we stopped being able to communicate with OpenStack virtual machines from outside the OpenStack environment. Then we stopped being able to communicate with the leaf nodes as well.

Eventually I found that a large number of IP addresses from outside ACI and OpenStack had been discovered as endpoints in the floating IP EPG. These IP addresses all showed up with the same MAC address as the floating IPs from one of our Neutron nodes.

The issue was eventually fixed by enabling "enforce subnet check" in the floating IP BD and clearing all the endpoints, but I can't figure out why ACI would be learning external IP addresses from the Neutron nodes.
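For anyone hitting the same symptom, here is a rough sketch of the idea behind a subnet check: an IP is only accepted as an endpoint if it falls inside one of the subnets configured on the BD. This is not how ACI implements it internally. The function name, the subnet, and the IP addresses are all made up for illustration (documentation ranges), and the list of learned IPs is assumed to have been copied out of the endpoint table by hand.

```python
import ipaddress

def outside_bd_subnets(endpoint_ips, bd_subnets):
    """Return the endpoint IPs that fall outside every bridge domain subnet.

    endpoint_ips and bd_subnets are plain strings; both are hypothetical
    inputs, not an ACI export format.
    """
    # strict=False lets us pass gateway-style subnets such as 192.0.2.1/24.
    networks = [ipaddress.ip_network(s, strict=False) for s in bd_subnets]
    return [
        ip for ip in endpoint_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]

# Hypothetical values for illustration only.
bd_subnets = ["192.0.2.1/24"]
learned = ["192.0.2.10", "192.0.2.11", "198.51.100.7", "203.0.113.20"]

# Anything printed here would have been rejected by a subnet check.
print(outside_bd_subnets(learned, bd_subnets))
```

Any address the function returns is one that should never have been learned in that BD, which is a quick way to size how many bogus endpoints were present.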

2 Replies

Jason Williams
Level 1

In short, this sounds like a loop in the network. 

If external IPs were learned as endpoints in an EPG, then traffic sourced from that external IP entered an ACI leaf on that EPG/VLAN. That is the only way for an endpoint to be learned.

Using the subnet check feature might be a workaround, but it doesn't mean that the underlying issue has stopped.

If you do not have anything documented about the unexpected endpoint learning, then you could try checking the EP Tracker and look up an external IP which was learned on an EPG. If the search in EP Tracker is successful, it should provide you with the MAC address, the EPG, and the interface on which the endpoint was learned. Once you know the interface where the external IP endpoint learning happened, check the neighboring device. EP Tracker can be found in the GUI under Operations > EP Tracker.
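If you have the endpoint entries saved somewhere (for example, copied from the endpoint table during the incident), a small script can narrow down the suspect interface by grouping entries per MAC and interface and flagging any MAC that carries an unusually large number of IPs. The sketch below is only an illustration: the record fields (`ip`, `mac`, `interface`), the sample values, and the threshold are all assumptions, not an APIC or EP Tracker output format.

```python
from collections import defaultdict

def macs_with_many_ips(endpoints, threshold=5):
    """Group endpoint records by (MAC, interface) and return the groups
    that carry at least `threshold` distinct IP addresses.

    Each record is a dict with hypothetical keys: ip, mac, interface.
    """
    by_mac = defaultdict(set)
    for ep in endpoints:
        by_mac[(ep["mac"], ep["interface"])].add(ep["ip"])
    return {
        key: sorted(ips)
        for key, ips in by_mac.items()
        if len(ips) >= threshold
    }

# Hypothetical sample data: one MAC on eth1/10 claims many external IPs.
sample = [
    {"ip": f"198.51.100.{n}", "mac": "00:00:5e:00:53:01", "interface": "eth1/10"}
    for n in range(1, 7)
] + [
    {"ip": "192.0.2.10", "mac": "00:00:5e:00:53:02", "interface": "eth1/11"},
]

for (mac, interface), ips in macs_with_many_ips(sample).items():
    print(mac, interface, len(ips))
```

The interface that shows up in the flagged group is the one whose neighboring device is worth inspecting first.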

If the incident occurred recently (last night or today), then immediately collect tech supports on the leaf switches that learned the external IPs as endpoints. If the EPM-trace.txt logs have not rolled over, you will find details there about how the endpoint was learned. To collect tech supports on a switch, you could use the techsupport local command on each leaf switch. It takes about 5 minutes to complete. The tech support files will be located on the leaf switch under the /data/techsupport directory. If you decide to go down this route, you may need assistance from TAC in parsing through the EPM logs.

Thanks

Jason

Thanks. EP Tracker didn't list the endpoint anymore, but since "show endpoint" listed all the EPs at a specific OpenStack Neutron server during the issue, I am reasonably sure that this is the neighbor that caused the issues.

I am collecting show tech as I type this. Hopefully it will help, but if the issue is caused by OpenStack sending traffic with the source IP of an external EP and its own MAC address, I'm not sure what help that would be.
