ACI EP learning issue in vPC mode

compterds
Level 1

Hello everyone,

 

Recently, we had some virtual machines on a hypervisor connected to the ACI fabric (3.0.2h) in vPC to two leaves (93108).

When new virtual machines boot up, the private IP and MAC are learned, but only on one leaf (checked via show endpoint ip), which causes connectivity issues.

==> When I shut the link toward the leaf whose EP table is not populated, connectivity is restored.

 

With both links up, I can see traffic arriving on both leaves using ELAM and tcpdump.

 

The hypervisor has an IP in infrastructure VLAN 3967 (integration). As with the virtual machines, the EP in this VLAN is learned on only one side, but from the ARP table perspective, the IP is shown with

 

My server: IP 10.205.232.104, MAC 48df.3706.c778, connected to Po29 of my vPC pair.

 

1102# show ip arp | grep 48df.3706.c778
10.205.232.104 00:01:25 48df.3706.c778 vlan3

==> vlan3 is in fact vlan 3967

 

1102# show endpoint ip 10.205.232.104
Legend:
s - arp              O - peer-attached    a - local-aged       S - static
V - vpc-attached     p - peer-aged        M - span             L - local
B - bounce           H - vtep
+-----------------------------------+---------------+-----------------+--------------+-------------+
      VLAN/                           Encap           MAC Address       MAC Info/       Interface
      Domain                          VLAN            IP Address        IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+

1102#
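
The mismatch between the two outputs above (an ARP entry exists, but `show endpoint ip` returns no row) can be checked mechanically when comparing many hypervisors. The following is only a minimal sketch that works on pasted CLI text; the function names are hypothetical, and the parsing assumes the ARP row layout shown in this post (IP, age, MAC, interface).

```python
import re

# Matches one ARP row as pasted above: IP, age, MAC, interface.
ARP_LINE = re.compile(
    r"^(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+(?P<age>\S+)\s+"
    r"(?P<mac>[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+(?P<intf>\S+)\s*$",
    re.IGNORECASE,
)


def parse_arp(text):
    """Return one dict per ARP row found in pasted 'show ip arp' output."""
    rows = []
    for line in text.splitlines():
        match = ARP_LINE.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def ip_in_endpoint_output(text, ip):
    """True if the IP appears as a whole token in pasted 'show endpoint' output."""
    pattern = r"(?<![\d.])" + re.escape(ip) + r"(?![\d.])"
    # Skip the echoed command line so the queried IP itself is not counted.
    body = [line for line in text.splitlines() if "show endpoint" not in line]
    return any(re.search(pattern, line) for line in body)


def arp_without_endpoint(arp_text, endpoint_text):
    """ARP rows whose IP has no matching row in the endpoint output."""
    return [
        row for row in parse_arp(arp_text)
        if not ip_in_endpoint_output(endpoint_text, row["ip"])
    ]


# Sample text copied from the outputs in this post.
ARP_SAMPLE = """1102# show ip arp | grep 48df.3706.c778
10.205.232.104 00:01:25 48df.3706.c778 vlan3
"""

ENDPOINT_SAMPLE = """1102# show endpoint ip 10.205.232.104
Legend:
s - arp O - peer-attached a - local-aged S - static
V - vpc-attached p - peer-aged M - span L - local
B - bounce H - vtep
1102#
"""

if __name__ == "__main__":
    for row in arp_without_endpoint(ARP_SAMPLE, ENDPOINT_SAMPLE):
        print(row["ip"], row["mac"], row["intf"], "has ARP but no endpoint entry")
```

Running the same check against the output of the other vPC peer would show whether the gap exists on one leaf only.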

 

1102# show mac address-table address 48df.3706.c778
Legend:
* - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
age - seconds since last seen,+ - primary entry using vPC Peer-Link,
(T) - True, (F) - False
   VLAN     MAC Address      Type      age     Secure NTFY Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 36       48df.3706.c778    dynamic   -          F    F     po29
* 21       48df.3706.c778    dynamic   -          F    F     po29
* 3        48df.3706.c778    dynamic   -          F    F     po29

 

Detailed EP learning output for my server in VLAN 3967:

 

MAC : 48df.3706.c778 ::: Num IPs : 0
Vlan id : 3 ::: Vlan vnid : 16777209
::: BD vnid : 16777209
VRF name : overlay-1 ::: VRF vnid : 16777199
phy if : 0x1600001c ::: tunnel if : 0 ::: Interface : port-channel29
Ref count : 4 ::: sclass : 0
Timestamp : 02/11/1971 06:09:25.581000
::: Learns Src: EPM
EP Flags : local|vPC|MAC|sclass|timer|
Aging: Timer-type : HT ::: Timeout-left : 767 ::: Hit-bit : Yes ::: Timer-reset count : 1412

PD handles:
[L2]: Hdl : 0xd291609 ::: Hit: Yes
::::

 

As you will notice, there is no IP associated with this MAC on VLAN 3 (3967), and the L3 line is missing.
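
For anyone comparing this kind of output across several servers, the `key : value ::: key : value` layout is easy to parse. Below is a rough sketch, not an official tool: the function names are made up for illustration, and it assumes the field names exactly as they appear in the output pasted above (`Num IPs`, `EP Flags`, and so on).

```python
def parse_ep_detail(text):
    """Parse 'key : value ::: key : value' lines into a flat dict of strings."""
    fields = {}
    for line in text.splitlines():
        for piece in line.split(":::"):
            piece = piece.strip()
            # The aging line starts with an 'Aging:' label before its first field.
            if piece.startswith("Aging:"):
                piece = piece[len("Aging:"):].strip()
            key, sep, value = piece.partition(":")
            if sep and key.strip():
                fields[key.strip()] = value.strip()
    return fields


def ep_flags(fields):
    """Return the set of flags from the 'EP Flags' field."""
    return {flag for flag in fields.get("EP Flags", "").split("|") if flag}


def is_mac_only(fields):
    """True when the entry carries a MAC but no IP (Num IPs is 0)."""
    return int(fields.get("Num IPs", "0")) == 0


# Sample copied from the output in this post.
EP_SAMPLE = """MAC : 48df.3706.c778 ::: Num IPs : 0
Vlan id : 3 ::: Vlan vnid : 16777209
::: BD vnid : 16777209
VRF name : overlay-1 ::: VRF vnid : 16777199
phy if : 0x1600001c ::: tunnel if : 0 ::: Interface : port-channel29
Ref count : 4 ::: sclass : 0
Timestamp : 02/11/1971 06:09:25.581000
::: Learns Src: EPM
EP Flags : local|vPC|MAC|sclass|timer|
Aging: Timer-type : HT ::: Timeout-left : 767 ::: Hit-bit : Yes ::: Timer-reset count : 1412
"""

if __name__ == "__main__":
    fields = parse_ep_detail(EP_SAMPLE)
    print(fields["MAC"], fields["Interface"], sorted(ep_flags(fields)))
    print("MAC-only entry:", is_mac_only(fields))
```

The check relies only on `Num IPs`, since that is the field the post itself points to.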

 

 

This seems strange to me, because from the ARP perspective the IP is seen in the correct VLAN with the correct MAC address.

 

Best regards,

 

Thank you all and best wishes for the festive season 

 

Yoann

 

 


Jayesh Singh
Cisco Employee

Hi Yoann,

 

I have a few questions, if you could help me please:

1. Do you have VMM integration in this setup? If yes, which one (e.g. SCVMM, vCenter)?

2. Is AVE installed in this setup with the VMM integration?

3. Infra VLAN 3967 is for control communication between fabric nodes to bring up the fabric. I don't understand how the hypervisors belong to that VLAN.

 

Regards,

Jayesh

 

Hello,

 

It's an OpenStack integration (DVS), with no AVE.

It uses OpFlex to push information from OpenStack to ACI.

 

Each hypervisor receives an IP address in VLAN 3967, which provides a lot of visibility directly from the fabric. Basically, it builds a tunnel between the leaves and the compute node (hypervisor).

 

Until now everything went well, but this behavior seems a bit odd.

Since the hypervisor is attached in vPC, both members of the vPC pair should have their EP tables populated, but 1102 is not learning the hypervisor IP at all.

 

Could the vPC pair itself be having an issue?

 

Yoann

 

 

 

Hi,

So is there any switch (e.g. a blade switch) in between, or is the physical server directly connected to the leaves in vPC?

 

We can verify a few things:

1. The vPC protection group policy (whether there is a policy with both leaves in the same vPC domain)

2. The status of the vPC and port channel on both leaves

3. The vSwitch configuration in the VMM domain

 

As I understand it, the hypervisor IPs must come from the fabric TEP pool. Tunnel information can be seen in the GUI:

Fabric -> Inventory -> Pod# -> Leaf# -> Interfaces -> tunnel interfaces (probably the last ones)

 

I have no working experience with OpenStack integration, but we can definitely verify the basic configuration.

 

Regards,

Jayesh

Thanks for your reply.

 

The hypervisor is directly connected to the ACI fabric with 2x 10G links in vPC.

We already have more than 15 hypervisors running on this vPC pair (this is not a first installation).

The vPC and port channel are healthy on both leaves (correct LACP partner and so on).

We have 3 VLANs per hypervisor (2 for OpenStack services, 1 for TEP pool connectivity between fabric and compute).

The 2 OpenStack VLANs are working: we see both MAC and IP addresses learned on the leaves.

But for the infrastructure VLAN, we only learn the MAC address, even though there is communication between the leaves and the hypervisor on IP 10.205.232.104, as I showed before from the ELAM capture.

 

In my view, it should definitely learn the IP address on both vPC members (as is the case for our other hypervisors), and it does learn the IP correctly on the first link, toward our 1101 leaf.

 

Yoann 

Hi Yoann,

 

Thanks for the details.

 

Okay, so as I understand it, the data VLANs are working fine and only the infra segment is causing trouble.

The host connects to the leaf switches via vPC (port channel at the server end) on the infra VLAN, gets its IP address from the TEP pool, and forms a VXLAN tunnel with the leaf switches.

 

However, you see the host learned on only one leaf and not the other. Can you verify the status of the tunnel interfaces on the leaf switches that correspond to the hypervisor?

Tunnel information can be seen in the GUI:

Fabric -> Inventory -> Pod# -> Leaf# -> Interfaces -> Tunnel interfaces (probably the last ones)

 

Regards,

Jayesh

Hello,

 

Yes, that's a good point, and it was something I checked at the beginning:

 

tunnel_interface_toward_compute.png

Really strange behavior, to be honest. Everything is "green" and there shouldn't be any EP learning issue with our vPC pair (it works correctly for 15 other servers).

 

Yoann

csco10387876
Level 1

Good morning,

 

Which version of OpenStack are you running?

 

I faced a lot of issues with OpFlex in 2018; it is now stable running 3.2 code on both ACI and the OpFlex plugin (ML2).

I think your best option is to open a case with TAC, as all releases before 3.2 had some nasty issues with OpFlex/OpenStack. After upgrading to 3.2, you might have to run a script on all your OpenStack nodes to refresh the EP objects.

 

Hope this helps.

 

Morning,

We're running an old OpFlex feature (ML2).
As you noted, we have had a lot of issues since the beginning; OpFlex didn't seem stable at all.
We'll be upgrading to a newer release, but that's not simple as we're in production.

For now, I'd like to understand where the issue comes from: why is one leaf in the vPC domain not learning the EP, while the other member has no issue?

Thanks for the advice

Yoann

 

Hi again, 

 

I found something in the epm/epmc traces when I reconfigured one of our faulty hypervisors:

 

I reconfigured the server with MAC 1402.ec8b.5b24 and IP 10.205.232.106.

(The issue was that the fabric doesn't learn the IP on one specific leaf, while its vPC peer correctly learned 10.205.232.106.)

 

I reconfigured the vPC toward this server.

In epm-trace.txt:

[2019 Jan  8 17:38:16.953143426:284488422:epmc_sdk_learn_display:800:t]
Notify: key-type=L3 V4 vrf=1 mac=1402.ec8b.5b24
         ip-type=0x0 ip=10.205.232.106

 

In epmc-trace.txt:

[2019 Jan  8 17:38:06.870838652:1564666194:epm_debug_dump_epm_ep_req:359:t] log_collect_ep_event
req_src=MTS_SAP_COOP
optype = UPD; mac = 1402.ec8b.5b24; num_ips = 1
ip[0] = 10.205.232.106

--
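
To line up what the traces report against the endpoint table for more than one server, the MAC and IP pairs can be pulled out of the trace text. This is only a sketch under the assumption that trace entries use the `mac=...`, `ip=...` and `ip[N] = ...` spellings seen in the two excerpts above; the function name is hypothetical.

```python
import re

MAC_RE = re.compile(
    r"\bmac\s*=\s*([0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})", re.IGNORECASE
)
# Matches 'ip=1.2.3.4' and 'ip[0] = 1.2.3.4', but not 'ip-type=' or 'num_ips ='.
IP_RE = re.compile(r"\bip(?:\[\d+\])?\s*=\s*(\d{1,3}(?:\.\d{1,3}){3})")


def trace_bindings(text):
    """Collect {mac: set_of_ips}; each IP is attached to the most recent MAC seen."""
    bindings = {}
    current_mac = None
    for line in text.splitlines():
        mac_match = MAC_RE.search(line)
        if mac_match:
            current_mac = mac_match.group(1).lower()
            bindings.setdefault(current_mac, set())
        for ip in IP_RE.findall(line):
            if current_mac is not None:
                bindings[current_mac].add(ip)
    return bindings


# Sample copied from the two trace excerpts in this post.
TRACE_SAMPLE = """[2019 Jan  8 17:38:16.953143426:284488422:epmc_sdk_learn_display:800:t]
Notify: key-type=L3 V4 vrf=1 mac=1402.ec8b.5b24
         ip-type=0x0 ip=10.205.232.106
[2019 Jan  8 17:38:06.870838652:1564666194:epm_debug_dump_epm_ep_req:359:t] log_collect_ep_event
req_src=MTS_SAP_COOP
optype = UPD; mac = 1402.ec8b.5b24; num_ips = 1
ip[0] = 10.205.232.106
"""

if __name__ == "__main__":
    for mac, ips in trace_bindings(TRACE_SAMPLE).items():
        print(mac, sorted(ips))
```

Any MAC that shows an IP here but returns no row in `show endpoint ip` on the same leaf is a candidate for the behavior described in this thread.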

 

But the EP table is empty:

BEWF1102# show endpoint ip 10.205.232.106
Legend:
s - arp              O - peer-attached    a - local-aged       S - static
V - vpc-attached     p - peer-aged        M - span             L - local
B - bounce           H - vtep
+-----------------------------------+---------------+-----------------+--------------+-------------+
      VLAN/                           Encap           MAC Address       MAC Info/       Interface
      Domain                          VLAN            IP Address        IP Info
+-----------------------------------+---------------+-----------------+--------------+-------------+

 

What could prevent the fabric from learning an EP from a data plane perspective? How does it work with OpFlex integration?

 

Yoann
