08-21-2013 12:51 PM - edited 03-07-2019 03:03 PM
Hi all!
I've been cracking my head over this for more than a day now - either it's not possible or I am missing something very obvious.
Background: Troubleshooting/analyzing an issue where the first ARP broadcast is not forwarded by a switch when the ARPing host's source MAC is not yet in the CAM table. Yes, there is a veery complicated and looooong TAC case for this, still ongoing, and it's not directly related to this post.
Goal 1: to simulate multiple hosts (simple, TCP/IP enabled). Idea: use one VRF with one routed port for each pseudo host, connected to different ports/ASICs of the problematic switch.
Goal 2: the ARP cache of these pseudo hosts should behave like simple operating system's ARP cache; entries older than a few minutes should age out, so that the pseudo host is forced to ARP-Broadcast for another host in the same subnet.
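To make Goal 2 concrete, here is a minimal Python sketch of the host-style behavior being aimed for: an entry is learned once, never refreshed proactively, and simply disappears once it is older than the timeout. The class name `AgingArpCache`, the MAC address, and the timestamps are made up for illustration; the 300-second value mirrors the `arp timeout 300` used in the test rig.

```python
class AgingArpCache:
    """Toy model of a host-style ARP cache: entries expire passively."""

    def __init__(self, timeout_s=300):
        self.timeout_s = timeout_s
        self._entries = {}  # ip -> (mac, time the entry was learned)

    def learn(self, ip, mac, now):
        """Store or overwrite an entry, as an ARP reply would."""
        self._entries[ip] = (mac, now)

    def lookup(self, ip, now):
        """Return the MAC, or None if the entry is missing or aged out."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at >= self.timeout_s:
            # Too old: drop it, so the caller has to broadcast an ARP request.
            del self._entries[ip]
            return None
        return mac


cache = AgingArpCache(timeout_s=300)
cache.learn("10.33.53.12", "00:11:22:33:44:55", now=0)  # hypothetical MAC
print(cache.lookup("10.33.53.12", now=120))  # entry is still cached
print(cache.lookup("10.33.53.12", now=301))  # None: the host must ARP again
```

The point of the sketch is only that nothing happens between `learn` and the expiry; the cache never sends anything on its own.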
Test Rig:
[Note: we don't mean to reproduce the original ARP-broadcast-is-lost problem here. This is just to see if "one VRF equals one host" actually works.]
int f0/11
 description * to f0/21*
 no switchport
 ip vrf forwarding HOST1
 ip address 10.33.53.11 255.255.255.0
 arp timeout 300
int f0/12
 description * to f0/23*
 no switchport
 ip vrf forwarding HOST2
 ip address 10.33.53.12 255.255.255.0
 arp timeout 300
int f0/13
 description * to f0/23*
 no switchport
 ip vrf forwarding HOST3
 ip address 10.33.53.13 255.255.255.0
 arp timeout 300
[...]
int range f0/21-26
 switchport mode access
 switchport access vlan 53
 spanning-tree portfast
 switchport nonegotiate
int vlan 53
 ip vrf forwarding ROUTER
 ip address 10.33.53.254 255.255.255.0
Observation:
In the latest case, the "host's" ARP cache is not empty during the 30 seconds after "arp timeout".
I thought that this might have to do with CEF and ARP and their inner "connectedness" (as in
http://blog.ipspace.net/2007/05/what-is-cached-cef-adjacency.html ) - but you can't disable CEF on a 3560.
Therefore, I tried on a 3845 router with no VRFs and CEF disabled, and a (real) host with Wireshark connected to gig0/0. It's the same thing: a certain amount of time before the ARP timeout occurs, the router ARPs for the entries it has in its ARP cache.
clear arp-cache interface fast0/11 and clear arp interface fast0/11 have another effect: they not only cause every interface to re-ARP for its cache entries, they also cause it to broadcast gratuitous ARPs to its neighbors.
The behavior is identical when I swap the "no switchport" for a combination of vlan xyz & interface vlan xyz & switchport access vlan xyz.
Then I found https://supportforums.cisco.com/thread/2131395, from which I gather that it is standard operating procedure for a Cisco device to send ARP requests to maintain its cache.
So you actually can't "clear" the ARP cache - it repopulates itself instantly.
Questions:
a) why? What is the rationale for active ARP cache maintenance? No host seems to be doing it - why should a router?
b) Can I disable this "feature", and if yes - how?
best regards and thanks a lot for your thoughts and ideas.
Marc
PS: yes, we've also been thinking "Why don't we get a handful of Raspberry Pis and use those?". A Cisco with its debugging capabilities seemed a more obvious choice.
08-21-2013 01:54 PM
Hi Marc,
You've got some great questions at hand!
a) why? What is the rationale for active ARP cache maintenance? No host seems to be doing it - why should a router?
A normal router would not need to do this. However, CEF-based platforms are different. As you know, CEF precomputes the frame rewrite information - basically, it prepares the frame headers beforehand, so if any packets arrive that are to be routed towards or via a particular neighbor, the frame header for that neighbor is already prepared. This way, much of the daunting work when doing routing table lookups and ARP table lookups is already done, and the results are reused. This speeds up routing significantly.
In order to deliver packets to directly connected stations using CEF (which is what the multilayer switch always tries to do - otherwise, delivering packets to directly connected stations would fall back to process switching which could easily overload the CPU), the switch continuously needs to have proper ARP entries available to keep the adjacency table up-to-date. If a directly connected station is up, the multilayer switch assumes it may receive packets for that station at any time, so to avoid process switching, it repeatedly ARPs for this station. If it is up, it will respond, and the switch will keep the rewrite information prepared in the CEF's adjacency table. If it does not respond then it is most probably down or disconnected, and the rewrite information can be removed.
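The refresh cycle described above can be sketched as a small Python simulation. This is only a toy model under stated assumptions: the function name `simulate_refresh`, the `refresh_lead_s` parameter, and the 60-second lead used in the example are hypothetical, since the exact timing IOS uses is not given in this thread. What it illustrates is the logic: shortly before an entry would time out, the router ARPs for it; a reply restarts the timer, and only a missing reply lets the entry (and its adjacency) go away.

```python
def simulate_refresh(timeout_s, refresh_lead_s, neighbor_alive, duration_s):
    """Return (time, event) tuples for one ARP entry learned at t=0.

    neighbor_alive(t) -> bool tells whether the neighbor answers an ARP
    request sent at time t. refresh_lead_s is a hypothetical knob for how
    long before the timeout the router re-ARPs.
    """
    if not 0 <= refresh_lead_s < timeout_s:
        raise ValueError("refresh_lead_s must be in [0, timeout_s)")
    events = []
    learned_at = 0
    while True:
        refresh_at = learned_at + timeout_s - refresh_lead_s
        if refresh_at > duration_s:
            break
        events.append((refresh_at, "arp-request"))
        if neighbor_alive(refresh_at):
            # Reply received: the timer restarts, the adjacency stays valid.
            events.append((refresh_at, "entry-refreshed"))
            learned_at = refresh_at
        else:
            # No reply: the entry runs into its timeout and is removed.
            events.append((learned_at + timeout_s, "entry-removed"))
            break
    return events


# Neighbor answers until t=500 s, then goes silent.
for event in simulate_refresh(300, 60, lambda t: t < 500, 1000):
    print(event)
```

As long as the neighbor keeps answering, the entry never disappears, which matches the observation that the cache cannot really be "cleared" while the neighbor is up.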
b) Can I disable this "feature", and if yes - how?
I do not think it is possible, and given the explanation above, I am not sure it would be a wise thing to do.
Feel welcome to ask further!
Best regards,
Peter
09-03-2013 01:19 PM
Thanks Peter for the enlightening answer.
In the meantime, based on your explanation, we've been using this ARP refreshing of a router to our advantage to solve/hide/mask our original problem.
The devices that suffered from the effect on the "problematic switch" are now being "ARPd" every 3 minutes by their default gateway (in extenso: their distribution L3-switch). This keeps the CAM tables of their respective switchports alive.
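Why the 3-minute interval helps can be checked with a small Python sketch. It rests on assumptions not stated in the thread: that each ARP request from the gateway draws a reply from the device, that this reply refreshes the device's MAC entry on its switchport, and that the switch ages MAC entries after 300 seconds (a common default, but worth verifying on the switch in question). The function name `cam_gaps` is made up for illustration.

```python
def cam_gaps(frame_times, mac_aging_s, duration_s):
    """Return (start, end) intervals in which the MAC is absent from the CAM table.

    frame_times are the times at which the device sends any frame,
    for example its ARP replies to the gateway.
    """
    gaps = []
    expires_at = 0  # the MAC is unknown until the first frame is seen
    for t in sorted(frame_times):
        if t > duration_s:
            break
        if t > expires_at:
            gaps.append((expires_at, t))
        expires_at = t + mac_aging_s  # every frame restarts the aging timer
    if expires_at < duration_s:
        gaps.append((expires_at, duration_s))
    return gaps


# Device answers a gateway ARP every 180 s for one hour, MAC aging 300 s (assumed).
print(cam_gaps(range(0, 3601, 180), 300, 3600))  # no gaps
# Device sends a single frame and then stays silent.
print(cam_gaps([0], 300, 3600))  # entry missing from t=300 onwards
```

Any refresh interval shorter than the MAC aging time leaves no gap, so the device's MAC never falls out of the CAM table and the "first ARP" situation does not arise.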
Thanks a lot!
Marc
09-03-2013 01:58 PM
Marc,
It has been a pleasure. Thank you!
Best regards,
Peter