I have two 4500X switches configured in VSS mode, and I use the pair as my main gateway.
I am trying to ping 192.168.125.247 (VLAN 125) from a machine in VLAN 120 (192.168.120.164 / 255.255.254.0 / gateway 192.168.120.254), and the ping fails.
The 4500X VSS pair is the gateway for these two VLANs:
interface Vlan120
 ip address 192.168.120.252 255.255.254.0
 standby version 2
 standby 1 ip 192.168.120.254
 standby 1 priority 110
 standby 1 preempt
!
interface Vlan125
 ip address 192.168.125.252 255.255.255.0
 standby version 2
 standby 1 ip 192.168.125.254
 standby 1 priority 110
 standby 1 preempt
I can see an ARP entry on the 4500X:
gw01#sh arp 192.168.125.247
Protocol Address Age (min) Hardware Addr Type Interface
Internet 192.168.125.247 32 00e0.8615.8775 ARPA Vlan125
The MAC address is correct, but I cannot see it in the MAC address table:
gw01#sh mac address-table address 00e0.8615.8775
No entries present.
If I run this command on my 4500X, it works:
gw01#ping 192.168.125.247 source 192.168.120.252
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.125.247, timeout is 2 seconds:
Packet sent with a source address of 192.168.120.252
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/11/36 ms
Then I can see the MAC in the address table:
gw01a#sh mac address-table address 00e0.8615.8775
vlan mac address type protocols port
125 00e0.8615.8775 dynamic ip,ipx,assigned,other Port-channel13
Now I can also ping from my machine in VLAN 120, but after 5 minutes the entry disappears and the ping from VLAN 120 fails again.
If I clear the ARP entry, it also works, again for about 5 minutes...
Any ideas?
This is due to a difference between the CAM and ARP aging timers. The default ARP timeout is 4 hours, whereas the CAM aging time is 5 minutes, as you point out.
Try using the 'mac address-table aging-time' command:
...either globally or per VLAN.
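A minimal sketch of that command, assuming you want the CAM aging to match the default 4-hour ARP timeout (14400 seconds); verify the exact syntax on your release:

mac address-table aging-time 14400 vlan 125

or, globally:

mac address-table aging-time 14400

You can confirm the result with 'show mac address-table aging-time'.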
Hi, we had a very similar problem a while back, but with vPC and Nexus 7000s. On one VLAN, and one VLAN only, certain hosts could not communicate outside their subnet. I also found that clearing their entries from the ARP table fixed the problem for about 5 minutes. One 7000 could not ping them at all. TAC had us take packet captures to see where the packets were being dropped, and it turned out one 7000 was not forwarding the packets over its peer link. In the end, though, the problem could only be resolved by an IOS upgrade. They never identified the exact cause, but presumably it was a bug. It just started out of nowhere, with no config changes or anything.
So try tracing where the packets are getting lost. That might help you track down the problem.
My VSS pair is connected to two Nexus 5548P switches via vPC (Port-channel13).
I have now changed the MAC address-table aging time to 14400 seconds for VLAN 125 and everything is OK, but to me this is not normal.
I'll check later when I have more time...
Thank you, guys.
I think we might have exactly the same problem.
This is my theory:
- The switches are connected via EtherChannels.
- If you could trace which member interface is used within the EtherChannel, I think you would see the source client coming in on the first 4500X and the destination client on the second 4500X.
- The destination is most likely a host with little network traffic. If it communicated with many other devices (DHCP, AD, DNS, etc.), it would probably also send a packet out the other member interface of the EtherChannel, and that would solve the problem.
A ping from the 4500X also solves the problem temporarily.
Unfortunately I cannot verify my theory, because on the VSS 4500 I cannot determine which member interface is used within an EtherChannel.
This matters because the problem may only occur when the paths from the 4500X to the source and to the destination also use different member interfaces.
After some tests I had to adjust my theory:
If a device only sends packets via the EtherChannel member interface that lands on the standby VSS switch, the MAC address entry is lost from the MAC address table.
If a ping from a source goes via the active 4500X, it works. If a ping from a source goes via the standby 4500X, it does not work.
To be continued...
That's interesting, Rudi!
I also see this open caveat for the Cisco IOS XE release:
Packets that are routed out the same Layer 3 interface (or SVI) they entered on are dropped if received on the VSS standby switch.
Workaround: None. (CSCub63571)
What is the default gateway on the PC (the 4500 switch or a firewall)?
Since you are able to ping VLAN 125 with a source address in VLAN 120 from the 4500 switch,
try this on your PC in a command prompt with administrator privileges and check (note: the destination must be the network address, not a host address, or Windows rejects the /24 mask):
route add 192.168.125.0 mask 255.255.255.0 192.168.120.252 -p
I have done some additional tests:
- Packets coming in on the standby switch that can be sent directly out an interface on the standby switch do not update the MAC address table.
- If an incoming packet on the standby switch is destined for the lost MAC address, it is dropped (this prevents flooding).
- If an incoming packet on the active switch is destined for the lost MAC address, it is flooded to all interfaces. (This can be seen with a Wireshark PC on the destination VLAN: one ICMP packet is seen, going from the source to the destination IP address.)
The question is whether no incoming packets on the standby switch ever update the MAC address table, or whether it depends on more variables in the path from source to destination.
Two points which might be relevant:
- The passive 4500X was completely broken, it was replaced by a new one.
- We have an additional module:
2 8 10GE SFP+ C4KX-NM-8
I don't know whether only the interfaces on this module have the problem.
sw4500#sho platform hardware floodset vlan 4
Executing the command on VSS member switch role = VSS Active, id = 1
Po16(848) Po21(853) Po23(855) Po24(856) Po31(863) Po51(883) Po14(846)
Executing the command on VSS member switch role = VSS Standby, id = 2
PROBLEM: no interfaces listed
A solution might be: add a port to the specific VLAN, which triggers a new port being added to the port list above.
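A minimal sketch of that workaround, assuming any free port on the standby chassis can be used (the interface name below is only an example):

interface TenGigabitEthernet2/1/16
 switchport mode access
 switchport access vlan 4

After this, 'show platform hardware floodset vlan 4' should list at least one interface on the standby switch again.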
I have corrected this issue by adding two links between my 4500X VSS pair and my two Nexus switches. Now each 4500X is connected to each Nexus (so I have four links in the EtherChannel instead of two).
I also removed the command mac address-table aging-time 14400 vlan 125, and everything keeps working.
I think the problem was that some packets had to cross the VSL link between the 4500X switches when I had only two links.
I also upgraded the 4500X cluster to version 03.04.03.SG.
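For reference, a sketch of the change, with hypothetical interface names; the two new links simply join the existing bundle (assuming LACP):

interface range TenGigabitEthernet1/1/2, TenGigabitEthernet2/1/2
 channel-group 13 mode active

With one link from each VSS member to each Nexus, traffic no longer needs to cross the VSL to reach the port-channel.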
I know this is an old thread but this might help someone.
We also had this problem. For us it was a software bug. See here https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvb78700/?reffering_site=dumpcr
Workarounds 1 and 2 worked for us; however, Cisco recommended an IOS upgrade, which we will do. I'll post the results after the upgrade.
Not sure if you ever tried anything, but we have had this problem with our 4500X VSS setups. Upgrading to Cisco's recommended release 3.6.7E(MD) has resolved our issues.