12-13-2020 04:06 PM
I'm in the middle of a migration from Nexus 7K/5K to Cisco ACI. When we move servers over to copper Leafs, N9K-C93216TC-FXs, most of the time it seems that only one port comes up on the server. The other port remains in the down/not-connected state in ACI. I have to disable/enable the port in ACI to get it to come up. If I move the same server to other ports on the same switch, we're fine. If I move the server to ports on another set of VPC Leafs, one port comes up and the other one has to be bounced. I get the same issue when I move the server back to the original copper ports.
1. In this test the ports were set up as VPC and auto-negotiate.
2. We rebooted the server and we still had to bounce the port.
3. I switched a down VPC to no auto-negotiate and the downed port came up.
4. I rebooted a leaf and the problem persisted. After the reboot, all connected servers came up with no issue.
5. Our tests were with LACP ports and Microsoft active/active with LACP. We need to test this on non-LACP ports as well.
I have a ticket open with Cisco and they're seeing an auto-negotiate issue. It seems to be occurring on servers with different model NICs so it's not one model of NIC. We updated the NIC drivers on the server but the problem persisted. We don't have a resolution yet. Once the port is up we have no issues.
Has anyone else seen this issue?
12-13-2020 04:24 PM - edited 12-13-2020 04:26 PM
Hi!
I usually trust the info on Nexus switches when the port status says "notconnec" (not connected),
which usually indicates that no link pulse is detected on the link.
If this is reproducible on several different switches and on different ports that usually work well,
I would suspect the other end to be the issue here.
What exactly is the other end?
You mentioned Microsoft NICs.
Are these Microsoft servers installed on bare metal or on a hypervisor like VMware ESXi?
I would also engage the support of the vendor on the other end. If it happens to be a Cisco UCS server, ask the ACI TAC engineer to involve a UCS colleague.
BR
Juls
12-13-2020 08:00 PM
12-13-2020 11:46 PM
Hi!
Okay, then we would need to check all settings of the VPCs.
Can you post all settings of one of the VPCs?
All settings in the policy group would be nice, to see which interface policies are left at default and which have custom settings (and hopefully see from the name of each interface policy what setting it is).
BR
Juls
12-14-2020 09:39 AM
12-13-2020 07:30 PM
What Policy Group settings do you have configured (which specific policies and settings)? If the Leafs are showing the other port as 'down/not-connected', then that's typically the fault of something on the Server side. How much time passes between disconnecting the host from the 5K/7K and reconnecting it to an ACI Leaf? I'd suggest you test the same move operation with a Server using a static port channel (no LACP) and see if the issue persists. Could be an issue with MS LACP negotiation/timers.
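If it helps to keep the test combinations straight, here is a rough Python sketch for tallying move tests by setting. Everything in it is an assumption for illustration: the field names (`autoneg`, `lacp`, `came_up`), the `failure_rate_by` helper, and especially the sample records, which are placeholders and not results from this fabric.

```python
from collections import defaultdict

# Placeholder records, NOT real results: one entry per server move.
# "came_up" is True if the second port linked up without a manual bounce.
tests = [
    {"autoneg": True, "lacp": True, "came_up": False},
    {"autoneg": True, "lacp": True, "came_up": False},
    {"autoneg": False, "lacp": True, "came_up": True},
    {"autoneg": True, "lacp": False, "came_up": True},
]


def failure_rate_by(records, factor):
    """Return {factor value: fraction of moves where the port stayed down}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for record in records:
        totals[record[factor]] += 1
        if not record["came_up"]:
            failures[record[factor]] += 1
    return {value: failures[value] / totals[value] for value in totals}


# Print the failure fraction for each value of each setting.
for factor in ("autoneg", "lacp"):
    print(factor, failure_rate_by(tests, factor))
```

Replace the placeholder records with the real outcome of each move. With only a handful of tests this won't prove anything, but it makes it easier to see whether the failures follow auto-negotiate, LACP, or neither.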
Robert
12-14-2020 09:47 AM
I posted the VPC policy above. It seems like server side to me, but it's happened on different servers with different NICs. We disconnect from the 7K/5Ks and plug right into the ACI leafs, in less than 5 minutes. We'll test the static port-channel next. Maybe delaying for a short time will resolve it. I did move a server from one set of ACI leafs to another set and experienced the same issue. Once again, it was a disconnect and reconnect in less than 5 minutes. I'm not on site, but it could be more like 2 minutes.