
Nexus 3548 ARP issue

eekman
Level 1

We're having some kind of problem with ARP on Nexus 3548. Here is the topology:

[Figure: topology]

The Check Point firewall is connected via LACP to the two Nexus switches. When the Check Point FW sends an ARP request for the MAC address of Nexus 1 (VLAN interface 99), the following happens:

1. The ARP request is sent from the Check Point to Nexus 2, and always on that interface. I'm guessing the load-balancing hash on the FW selects that interface.

2. Nexus 2 receives the ARP request on Po3, tagged with VLAN 99. It forwards it via Po1 to Nexus 1 (according to debug).

3. Nexus 1 gets the ARP request on Po1 and sends the reply out on Po3, directly to the FW.

4. The FW accepts the reply and the ARP resolution is complete.
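
The guess in step 1, that the FW's load-balancing hash always picks the same LACP member for ARP requests, can be illustrated with a minimal sketch. It assumes a simplified layer-2 style hash (source MAC XOR destination MAC, modulo the number of links); the actual hash policy on the Check Point is not stated in this thread, and the FW MAC address below is hypothetical.

```python
# Simplified layer-2 hash sketch. This is an assumption for illustration,
# not the Check Point's documented hashing algorithm.

def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address into an integer."""
    return int(mac.replace(":", ""), 16)

def pick_member(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Return the index of the bundle member chosen for this frame."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % n_links

FW_MAC = "00:1c:7f:00:00:02"      # hypothetical firewall MAC
BROADCAST = "ff:ff:ff:ff:ff:ff"   # ARP requests go to the broadcast MAC
LINKS = ["link to Nexus 1", "link to Nexus 2"]

# Source and destination never change for an ARP broadcast,
# so the hash selects the same member link every time.
index = pick_member(FW_MAC, BROADCAST, len(LINKS))
print(LINKS[index])
```

Because both inputs to the hash are constant for every ARP broadcast from the FW, a hash of this kind would explain why the request always lands on the same switch.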

 

As long as the above happens, everything works fine. BUT occasionally, without any obvious explanation and with completely inconsistent timing, this happens:

1. The ARP request is sent from the Check Point to Nexus 2, as usual.

2. Nexus 2 receives the ARP request on Po3, tagged with VLAN 99. But it does not forward it! (according to debug ip arp packet)

3. Nexus 1 never gets the ARP request (according to debug), so it never responds. The FW does not learn the MAC address of Nexus 1, and all traffic destined to the switch itself fails.

Here's what I have checked:

1. There is no obvious timing in the failure. It can work fine for hours, it can be down for hours, it can be down a few seconds, it can be faulty 10 times within an hour...

2. This only happens to switch 1, never to switch 2, but that could be explained by ARPs always going to switch 2 first.

3. Traffic THROUGH the switch seems unaffected.

4. Switch 2 can always reach switch 1 and resolve its MAC via ARP. So can other units on the same VLAN. The difference is that the FW is connected through vPC.

5. vPC consistency parameters are all OK, and there is no change when the error occurs.

6. NOTHING is logged when the error occurs. Logging is local, level is warning.

7. No spanning tree issues as far as I can see, and no changes when the error occurs
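
To check whether the failures really have no timing pattern, one option is to probe the VLAN 99 address from the FW side at a fixed interval and collapse the results into outage windows. The sketch below only does the collapsing step; the sample format (a timestamp plus a success flag) is an assumption, and how the probes are collected is left open.

```python
from datetime import datetime

def outage_windows(samples):
    """Collapse (timestamp, ok) samples, sorted by time, into outage windows.

    Returns a list of (first_failure, last_failure) tuples, one per
    consecutive run of failed probes.
    """
    windows = []
    start = None
    last_fail = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts          # a new outage begins
            last_fail = ts
        elif start is not None:
            windows.append((start, last_fail))  # outage ended
            start = None
    if start is not None:
        windows.append((start, last_fail))      # still down at end of data
    return windows

def durations_seconds(windows):
    """Length of each window in seconds (0 for a single failed probe)."""
    return [(end - begin).total_seconds() for begin, end in windows]
```

Comparing the start times and durations of the resulting windows against other events (CPU graphs, ARP cache timeouts, vPC peer keepalives) may show a correlation that is hard to spot by eye.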

 

Software is 6.0(2)A8

 

I can't find a bug report that matches my situation.

Any ideas?

2 Replies

islow1303
Level 1

We've had similar issues with our Nexus 3548 (6.0(2)A8). There's a known Nexus bug which keeps the CPU usage running sky high.

Try *show processes cpu history* to observe the device's CPU usage. Due to the software bug, my Nexus is regularly above 70% on 'CPU% per hour (last 72 hours)'. The only way to clear the condition, at least in my case, was to reboot the Nexus.

It was a bug. TAC confirmed it. The bug is in 6.0(2)A8(3), and was resolved when we upgraded to 6.0(2)A8(4a).
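
To check several switches against the release mentioned above, a small comparison helper can be sketched as follows. It assumes release strings of the form `6.0(2)A8` or `6.0(2)A8(4a)` only, treats a missing rebuild number as 0, and takes the fixed release from this reply rather than from an official Cisco statement; the function names are hypothetical.

```python
import re

# Matches e.g. "6.0(2)A8", "6.0(2)A8(3)", "6.0(2)A8(4a)".
# Other NX-OS version formats are not handled (assumption).
_RELEASE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)A(\d+)(?:\((\d+)([a-z]?)\))?$")

def parse_release(text):
    """Turn a release string into a tuple that sorts in release order."""
    match = _RELEASE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised release string: {text!r}")
    major, minor, maint, train, rebuild, letter = match.groups()
    return (int(major), int(minor), int(maint), int(train),
            int(rebuild or 0), letter or "")

def is_older(running, fixed="6.0(2)A8(4a)"):
    """True if the running release sorts before the fixed release."""
    return parse_release(running) < parse_release(fixed)

print(is_older("6.0(2)A8"))      # release from the original post
print(is_older("6.0(2)A8(4a)"))  # release reported as fixed
```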

 

Cisco have updated a bugID on this (earlier I think it said it only affected HSRP):

https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvc55268/?reffering_site=dumpcr
