
Duplicate MAC (correctly) in two VLANs, lost packets

We have a weird situation where Motorola devices are attached to our network at multiple points. Some of those devices form virtual IP addresses and use fabricated MAC addresses, and two of those MACs are duplicates.

These are on isolated VLANs, so at least in theory (as I understand it) they should not conflict, but they do.

Motorola says they cannot change the MACs (they reside on part of their voice network for E911 radios that they manage; we just provide the transport).

Specifically, we have a 3650 that appears to be the problem, running 16.12.05b. The duplicated MAC addresses are on VLANs 2982 and 2984, which transit this switch on the way elsewhere (these run all over a county over microwave links). The switch correctly sees them in separate VLANs, as below, and those are the appropriate egress ports to reach each VLAN.

 

#show mac add | incl 0101
2982    0000.5e00.0101    DYNAMIC     Gi1/0/2
2984    0000.5e00.0101    DYNAMIC     Gi1/1/1
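For anyone wanting to check many switches for the same condition, here is a minimal Python sketch that scans `show mac address-table` text for MACs learned in more than one VLAN. It assumes the four-column layout shown in this post (VLAN, MAC, type, port); the function name and regex are illustrative, not part of any Cisco tooling.

```python
import re
from collections import defaultdict

# Matches lines such as "2982    0000.5e00.0101    DYNAMIC     Gi1/0/2"
MAC_LINE = re.compile(
    r"^\s*(\d+)\s+([0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+(\S+)\s+(\S+)\s*$",
    re.IGNORECASE,
)

def macs_in_multiple_vlans(show_mac_output):
    """Return {mac: [(vlan, port), ...]} for MACs learned in more than one VLAN."""
    seen = defaultdict(list)
    for line in show_mac_output.splitlines():
        m = MAC_LINE.match(line)
        if m:
            vlan, mac, _entry_type, port = m.groups()
            seen[mac.lower()].append((int(vlan), port))
    return {
        mac: entries
        for mac, entries in seen.items()
        if len({vlan for vlan, _ in entries}) > 1
    }

# The two lines from the switch output in this post
sample = """\
2982    0000.5e00.0101    DYNAMIC     Gi1/0/2
2984    0000.5e00.0101    DYNAMIC     Gi1/1/1
"""
duplicates = macs_in_multiple_vlans(sample)
print(duplicates)
```

A MAC appearing here is not an error by itself. It only flags addresses worth a closer look on devices that may not keep per-VLAN tables.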

 

Generally we do not even have VLAN interface IP addresses on these VLANs, but I have put a temporary one on 2984. When I ping outbound from this switch toward the 2984 IP that is the virtual IP, it fails intermittently (about half or more of the packets are lost, in bunches). If I ping an IP that does not have the duplicate MAC, no packets are lost.

We arranged downtime (this is a live E911 radio traffic network, so that is not easy to arrange) and unplugged the 2982 device with that MAC address, and the problem vanished. So we are certain the duplicate is at fault. There are a LOT of devices along these paths that carry both VLANs, so it is somewhat speculative that the Cisco 3650 is the culprit, but given the topology and testing we are fairly confident.

Duplicate MACs are unusual - I get that. But I appear to be stuck with it.

But on separate VLANs, should the switch not be keeping these separate? When it does the MAC table lookup, should it not do the lookup with the VLAN tag?
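To make the question concrete, here is a toy Python model of the two ways a bridge can key its MAC table: per VLAN (independent learning, keyed on VLAN plus MAC) or shared (keyed on MAC only). The class and its counters are purely illustrative and do not describe any particular vendor's implementation. The 3650 output above, with one entry per VLAN, is consistent with the per-VLAN case.

```python
class MacTable:
    """Toy learning table. per_vlan=True keys on (VLAN, MAC); False keys on MAC only."""

    def __init__(self, per_vlan):
        self.per_vlan = per_vlan
        self.entries = {}
        self.moves = 0  # times an existing entry was re-pointed to another port

    def _key(self, vlan, mac):
        return (vlan, mac) if self.per_vlan else mac

    def learn(self, vlan, mac, port):
        key = self._key(vlan, mac)
        old = self.entries.get(key)
        if old is not None and old != port:
            self.moves += 1
        self.entries[key] = port

    def lookup(self, vlan, mac):
        return self.entries.get(self._key(vlan, mac))


VMAC = "0000.5e00.0101"
ivl = MacTable(per_vlan=True)   # independent (per-VLAN) learning
svl = MacTable(per_vlan=False)  # shared learning

for table in (ivl, svl):
    for _ in range(3):  # both devices keep sourcing frames from the same MAC
        table.learn(2982, VMAC, "Gi1/0/2")
        table.learn(2984, VMAC, "Gi1/1/1")

print(ivl.lookup(2982, VMAC), ivl.lookup(2984, VMAC), ivl.moves)
print(svl.lookup(2982, VMAC), svl.lookup(2984, VMAC), svl.moves)
```

With per-VLAN keys the two entries coexist and never move. With a shared key the single entry flaps between ports every time the other device transmits, which would produce the kind of bursty loss described here. Whether any device in this path actually behaves that way is an assumption to test, not a conclusion.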

Any way to definitively tell if this switch is the culprit, to somehow catch it in the act with debug?  (Rearranging the topology is not practical). 

No, there are no loops, all the traffic works perfectly other than the IP with the duplicated MAC.

Linwood

Accepted Solution

To put this one to bed, we found that this issue was not occurring inside the Ciscos, but was in a microwave link (specifically the one in the diagram from 3650#2 to 9402, which I think is a 9200; the label is a typo).

We set up our own sniffer and one poor guy had to carry it to various places around the county while we tested, and eventually could see packets entering that microwave link, and not coming out the other end reliably (just occasionally).
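The hop-by-hop approach can be sketched in Python as a comparison of which ping sequence numbers were seen at each capture point. The tap names and sequence numbers here are hypothetical placeholders, not our actual capture data, and the function assumes the taps are listed in path order.

```python
def first_lossy_segment(captures):
    """captures: ordered list of (tap_name, set of ICMP sequence numbers seen).

    Returns (upstream_tap, downstream_tap, missing_seqs) for the first segment
    where packets seen upstream are missing downstream, or None if none are.
    """
    for (up_name, up_seen), (down_name, down_seen) in zip(captures, captures[1:]):
        missing = sorted(up_seen - down_seen)
        if missing:
            return up_name, down_name, missing
    return None


# Hypothetical example: six pings sent, tap points listed in path order
captures = [
    ("3650#2 egress", {1, 2, 3, 4, 5, 6}),
    ("microwave ingress", {1, 2, 3, 4, 5, 6}),
    ("microwave egress", {2, 5}),
    ("far-end switch ingress", {2, 5}),
]
result = first_lossy_segment(captures)
print(result)
```

The first segment where the sets differ is where to look. Anything downstream of it will show the same loss and tells you nothing new.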

The microwave link input from the Cisco feeds a large ring on an NEC iPaso network (this is an iPaso 650), leading both to the 9402 and off to an ERP-protected ring of sites around the county, shown in blue as "Consumers of VLAN 2982". So the built-in switch in front of the microwave radios (five of them, if I recall, on different paths) is the likely culprit, as it would have both VLANs in its MAC table.

Unfortunately NEC sold that business shortly after these were installed, we are unable to get any real support, and so we have no one to work with to figure out the issue. I have admin access but the management interface's MAC address table display shows empty (i.e. clearly the admin display is broken).

We have leaned harder on Motorola and they have "discovered" a way to reset the virtual MAC to be unique, and now it is working.

So I do not know WHY it was failing in the NEC device, could not fix it there, but that is the culprit, not the Cisco. Which is a bit surprising, as we have generally treated those microwaves as a wire -- what goes in (always tagged) comes out.  But apparently not so. 


10 Replies

marce1000
VIP

 

 - Ref: https://mac-address.alldatafeeds.com/mac-address-lookup/923MQNA0k7 . This MAC seems to be related to VRRP, a redundancy protocol, so it should do no harm.

 M.



-- ' 'Good body every evening' ' this sentence was once spotted on a logo at the entrance of a Weight Watchers Club !

Yeah, I do not think it is incorrectly constructed. But I am dead certain the problem goes away if one of the two devices is removed from the network, so something about the duplicated MAC is at fault.

I just made yet another pass by hand, and another pass by program (NetDisco), to make sure there is no point at which these connect together. The VLANs are isolated and not routed, and all access ports (where untagged traffic could migrate) lead into the Motorola network only. Plus I think if they did loop I would see spanning tree BPDU errors appearing somewhere.

I've been looking for a way to debug packet switching on that switch, such as with an access list, but cannot find any. I've turned off any IP CEF options I can find (though they really should not be involved as that is layer 3).

Anyone? (Note I went ahead and opened a TAC call also, no word yet. Motorola has a couple of tickets open but just says "we can't change the MAC address".)

Friend, this looks like an issue in VRRP.

I think there are two L3 switches acting as master.

Can I see the output of

show vrrp

from both L3 switches?

The VRRP devices are not on our network, they are downstream on a Motorola switch (actually HP brand) to which we do not have access.

show vrrp all 

shows nothing on any of our switches, including the one closest to the HP (Motorola) switch that has one of the VRRP devices (beyond it there is an HP switch and then the Motorola routers forming one of the virtual IPs). The other Motorola router forming it is several switches downstream (switches to which we do not have access, belonging to a different county).

Motorola insists that this all be on two separate, flat VLANs, each of which has a pair of these VRRP devices (one actually has two such pairs, but one of those pairs does not duplicate any MACs).

Picture our topology as a large X, where each VLAN is on one diagonal. One of the VRRP pairs is at the top right, one at the bottom right, on separate VLANs. At the center where they cross there are two Cisco switches involved, plus one microwave segment and an L2 Netgear; on all these devices the traffic should be separated by VLAN ID, and all overt indicators (show mac) show that it is properly tagged. But when one Cisco (the one closest to the bottom right leg but on the shared area) pings, packets are lost. When a Cisco further down that same leg pings, it is successful, so the loss is coming from the overlapping segment, but that overlap is many links from the actual VRRP devices.

Motorola of course says "we do this all the time it should work, the problem is in your network".  And of course they may be right.

I have run DEBUG IP PACKET and watched at L3 the pings, and they do select the right egress VLAN.  What I cannot see in debug is what happens at L2, is it selecting the right egress port.  Or is there such a command? 

The VRRP virtual MAC format is 00-00-5E-00-01-{VRID},
so both MACs represent VRRP group 1.
Use debug vrrp (I hope it works):
debug vrrp packets

and check whether both L3 switches send and receive VRRP packets.

Also try adding ip igmp snooping.
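As a small illustration of that MAC format, the following Python sketch builds the IPv4 VRRP virtual MAC from a VRID and recovers the VRID from a MAC in Cisco dotted notation. The function names are made up for this example.

```python
VRRP_IPV4_PREFIX = "00005e0001"  # first five octets of 00-00-5E-00-01-{VRID}

def _hex_digits(mac):
    """Strip '.', '-' and ':' separators and lowercase the MAC."""
    return "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")

def vrrp_virtual_mac(vrid):
    """Return the IPv4 VRRP virtual MAC for a VRID, in Cisco dotted notation."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1-255")
    digits = VRRP_IPV4_PREFIX + f"{vrid:02x}"
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))

def vrid_from_mac(mac):
    """Return the VRID if mac is an IPv4 VRRP virtual MAC, else None."""
    digits = _hex_digits(mac)
    if len(digits) == 12 and digits.startswith(VRRP_IPV4_PREFIX):
        return int(digits[10:], 16)
    return None

print(vrrp_virtual_mac(1))              # the MAC seen in both VLANs
print(vrid_from_mac("0000.5e00.0101"))  # VRID encoded in that MAC
```

Because the MAC depends only on the VRID, any two VRRP groups configured with VRID 1 will use the same virtual MAC, regardless of VLAN or IP subnet.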

We may be overlapping in responses, but I do not know this area well (or at all). 

These router pairs are not on adjacent Cisco routers, but down inside the Motorola network, joining ours at two different points (one for each pair) and on different VLANs. We have several routers in between (as well as a lot of microwaves), but nothing adjacent.

We can see the mac address, but I see nothing with vrrp debugging or show vrrp. 

I have done a variety of DEBUG IP PACKET DETAIL and can see IP traffic at layer 3 leaving on the right vlan when I ping. I also can see traffic related to the multicast from the vrrp's, but I do not know how to interpret it (other than I would not expect it to be routable, and am not trying to route it).  10.205.100.248 is the physical IP of one of the routers forming the VRRP pair way downstream in the VLAN from where I captured this.  This does show the correct VLAN. 

Jul 22 12:39:14.278 cdt: FIBfwd-proc: Default:224.0.0.0/24 receive entry
Jul 22 12:39:14.278 cdt: FIBipv4-packet-proc: packet routing failed
Jul 22 12:39:14.278 cdt: IP: s=10.205.100.248 (Vlan2984), d=224.0.0.18 (nil), len 40, unroutable, proto=112
Jul 22 12:39:14.278 cdt: FIBipv4-packet-proc: route packet from Vlan2984 src 10.205.100.248 dst 224.0.0.18
Jul 22 12:39:14.279 cdt: FIBfwd-proc: Default:224.0.0.0/24 receive entry
Jul 22 12:39:14.279 cdt: FIBipv4-packet-proc: packet routing failed
Jul 22 12:39:14.437 cdt: FIBipv4-packet-proc: route packet from Vlan2984 src 10.205.100.248 dst 224.0.0.5
Jul 22 12:39:14.437 cdt: FIBfwd-proc: Default:224.0.0.0/24 receive entry
Jul 22 12:39:14.437 cdt: FIBipv4-packet-proc: packet routing failed
Jul 22 12:39:14.438 cdt: IP: s=10.205.100.248 (Vlan2984), d=224.0.0.5 (nil), len 64, input feature, proto=89, packet consumed, MCI Check(109), rtype 0, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
Jul 22 12:39:14.577 cdt: IP: s=10.205.100.248 (Vlan2984), d=224.0.0.18 (nil), len 40, input feature, proto=112, MCI Check(109), rtype 0, forus FALSE, sendself FALSE, mtu 0, fwdchk FALSE
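For reading the `IP:` lines in this output, here is a short Python sketch that pulls out source, ingress interface, destination and protocol, then labels the well-known values: IP protocol 112 is VRRP and 89 is OSPF, while 224.0.0.18 and 224.0.0.5 are the VRRP and OSPF all-routers multicast groups. The regex is an assumption based only on the lines pasted here, not a general parser for this debug.

```python
import re

IP_PROTOCOLS = {89: "OSPF", 112: "VRRP"}
MULTICAST_GROUPS = {"224.0.0.5": "OSPF AllSPFRouters", "224.0.0.18": "VRRP"}

# Matches e.g. "IP: s=10.205.100.248 (Vlan2984), d=224.0.0.18 (nil), ... proto=112"
DEBUG_LINE = re.compile(r"IP: s=(\S+) \((\S+)\), d=(\S+) .*?proto=(\d+)")

def decode(line):
    """Return a dict describing one 'IP:' debug line, or None if it does not match."""
    m = DEBUG_LINE.search(line)
    if not m:
        return None
    src, ingress, dst, proto = m.groups()
    proto = int(proto)
    return {
        "src": src,
        "ingress": ingress,
        "dst": dst,
        "group": MULTICAST_GROUPS.get(dst, "unknown"),
        "protocol": IP_PROTOCOLS.get(proto, str(proto)),
    }

line = ("Jul 22 12:39:14.278 cdt: IP: s=10.205.100.248 (Vlan2984), "
        "d=224.0.0.18 (nil), len 40, unroutable, proto=112")
decoded = decode(line)
print(decoded)
```

Read this way, the capture shows the downstream router's VRRP advertisements and OSPF hellos arriving on Vlan2984. These are link-local multicasts that the switch is not expected to route, so "unroutable" on those lines is not by itself a sign of a fault.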

My expectation (again, I do not understand VRRP well) is that this is the VRRP pair's communication between themselves, and they are on the same HP switch downstream in the Motorola network. They are not directly on the Ciscos, so my expectation is that the Cisco only sees the virtual MAC address, and when it ARPs it receives the correct virtual MAC and VLAN (which is happening; show ip arp is correct in all cases I can see).

This virtual IP really should be a black box as far as we are concerned, it is all supposed to be formed and respond inside the Motorola network. 

Please help with a few more words if I am misunderstanding. 

Linwood

Debug only captures traffic to or from the CPU. Here the traffic is data traffic that is not punted to the CPU, which is why I said I hope it works.
Now do the following:
ip access-list extended 100
permit ip any host 224.0.0.18
permit ip any any

Apply this ACL to the ports that connect toward the L3 switches (VRRP), one port at a time, not to both ports at the same time.
Then check the match counters.
Do you see any matches or not?

I'm sorry, I am still lost. Please, I beg you, explain what it is you want me to look for.

Which port?  And is that for the debug packet?  

Below is a simplification of the topology. The VRRP routers, of which there are two pairs, are in physically different cities. One pair is in our building, but they reside behind a Motorola switch to which I do not have access.

The two routers indicated by the callouts show where the VLAN 2984 pings fail (yellow) and work (green). This gives me reason to think that the problem is in that 3650 (labeled #2) and not above it (above in the drawing). The microwave links do have switches internally but should be acting as a wire, just passing end to end the Ethernet traffic (with tags) we throw at them, so I do not THINK the packets are lost in that link (and pings to other IPs work fine, with single-digit latencies and zero loss).

It is unclear to me what the VRRP protocol has to do with our packet loss. That protocol should be negotiating solely within the grey boxes, correct? We should simply see the virtual MAC address as arriving on the associated VLAN (one VLAN for each virtual IP).

And in ALL the Cisco switches as well as the Netgear I see the MAC address table containing the proper virtual MAC, duplicated, but labeled with the appropriate VLAN. Which seems correct. Similarly, the ARP table has the duplicated MAC associated with the VLAN 2984 IP (we do not have any interfaces on VLAN 2982 for IP address presence, though I can probably get Motorola's permission to add one if that is useful - VLAN 2982 has been in production for years, VLAN 2984 is being added).

It would appear that the VRRP process is working properly in that we do see both virtual addresses labeled with the right VLAN.

But something, somewhere in this blue/green crossover below is losing IP traffic to the VLAN 2984 virtual address. 

I apologize for the length of this description, and vastly appreciate your help, but I do not understand what  you want me to look for. 

Are you suggesting I apply the above as an ACL on the physical port (not VLAN interface) of the ingress ports?  I ask, because the "permit ip any any" would appear to not actually restrict traffic and so I do not see what that ACL actually does as an ACL, and if it's an access list for the debug, not sure which debug? 

Linwood

 

[Attached diagram: 2984-2982.jpg]

Did you mean this, with a deny for the multicast address? 

ip access-list extended 100
deny ip any host 224.0.0.18 
permit ip any any 
