Solved: Re: Routing Loop - ICMP: time exceeded (time to live)

clyde.a.huffman.ctr@mail.mil · ‎09-18-2019

A syslog server (192.168.168.228) caused an ICMP storm; see the attached file. The weird thing is that 1.1.1.3 is the internal interface between the 2921 router and the EtherSwitch. The 2921 router can't traceroute to the ASA 5520:

ASA5520# traceroute 192.168.168.231

Type escape sequence to abort.
Tracing the route to 192.168.168.231
1 192.168.168.231 0 msec * 0 msec
ASA5520#

RTR2921#traceroute 192.168.168.233
Type escape sequence to abort.
Tracing the route to 192.168.168.233
VRF info: (vrf in name/id, vrf out name/id)
1 * * * 
2 * 
RTR2921#

There must be a routing loop but I can't see it. I don't have access to the Meraki FW.

Richard Burts · ‎09-18-2019

There are a number of things in this post that I do not understand.

- What is the relationship between the 2 traceroutes shown in the beginning of the post and the rest of the problem?

- Where is 1.1.1.3?

- I see a reference to 192.168.168.1 as nat inside but not quite clear whether this is for meraki or something else.

- I see a reference to a DHCP scope 192.168.168.70-140.

- I see a reference to 192.168.168.231 as nat outside, I assume for the 2921.

- I see a reference to 192.168.168.233 which appears to be a device in vlan 200.

- So 192.168.168 is being used in a number of different places in the network. Can you help us understand how they relate to each other?

- I do not see how the output in the first part of the attached file relates to the other parts of the post.

The output in the second part of the attached file does have a lot of time exceeded messages. You seem to believe that they reflect a loop. That is possible. But it is also possible (and given the context I think pretty likely) that the time exceeded messages are related to traceroute packets whose TTL has expired. Expected behavior in traceroute is to generate lots of TTL exceeded messages, one per hop along the path.
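To make the distinction concrete, here is a minimal Python sketch of the two cases. It is only a toy model, not anything taken from the devices in this thread: the node names (`R1`, `R2`, `A`, `B`, `dest`, `X`) and the `forward` and `traceroute` helpers are hypothetical, and each router is assumed to decrement the TTL by one and answer with a time exceeded message when it reaches zero.

```python
# Toy model: each router decrements TTL; when it hits 0 the router drops the
# packet and reports "time exceeded". All names here are hypothetical.

# A healthy two-router path to a destination.
LINEAR = {"R1": "R2", "R2": "dest"}
# Two routers that each believe the other is the next hop (a routing loop).
LOOP = {"A": "B", "B": "A"}


def forward(next_hop, start, dest, ttl):
    """Send one packet with the given TTL; return (node, outcome)."""
    node = start
    while node != dest:
        ttl -= 1                      # router 'node' decrements the TTL
        if ttl <= 0:
            return node, "time exceeded"
        node = next_hop[node]         # forward to the next hop
    return dest, "delivered"


def traceroute(next_hop, start, dest, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    replies = []
    for ttl in range(1, max_ttl + 1):
        node, outcome = forward(next_hop, start, dest, ttl)
        replies.append((ttl, node, outcome))
        if outcome == "delivered":
            break
    return replies


if __name__ == "__main__":
    # Normal traceroute: one time exceeded per router, then the destination.
    for reply in traceroute(LINEAR, "R1", "dest"):
        print(reply)
    # Routing loop: an ordinary packet (TTL 64) never arrives and also
    # ends in a time exceeded message.
    print(forward(LOOP, "A", "X", 64))
```

In this model a healthy path produces time exceeded replies only for the low-TTL traceroute probes, while a loop produces one for every packet sent into it, whatever its starting TTL. That is the difference to look for in the debug output.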

HTH

Rick

clyde.a.huffman.ctr@mail.mil · ‎09-19-2019

Hi Richard, good to hear from you again.

I've improved the drawing to answer some of your questions and changed the routable ISP static addresses to 5.5.5.x, 6.6.6.x, 7.7.7.x to avoid confusion with the RTR2921 internal int 1.1.1.3....

This was the original indicator -> 
  Caused me to "debug icmp" everywhere to find the culprit behind 1.1.1.3
  I can't figure out why the Guest RTR2921 got involved except that the 
  internal int 1.1.1.3 to the EtherSwitch has vlans 1,111,200
ASA5520# sh log
Sep 18 2019 17:21:49: %ASA-6-302021: Teardown ICMP connection for faddr 1.1.1.3/0 gaddr 192.168.168.233/0 laddr 192.168.168.233/0

RTR2921#sh ip int br
Interface IP-Address OK? Method Status Protocol
Embedded-Service-Engine0/0 unassigned YES NVRAM administratively down down
GigabitEthernet0/0 192.168.168.231 YES NVRAM up up
GigabitEthernet0/1 192.168.170.1 YES NVRAM up up
GigabitEthernet0/2 unassigned YES NVRAM administratively down down
GigabitEthernet1/0 1.1.1.3 YES NVRAM up up
GigabitEthernet1/1 unassigned YES unset up down
NVI0 192.168.168.231 YES unset up up

Vlan1 unassigned YES unset down down
RTR2921#

Note that the only physical ports on a 2921 are GI0/0 - 2

The culprit was the syslog server 192.168.168.228 (that was disguised behind 1.1.1.3). I should have used Wireshark to monitor the port from the syslog server but the ICMP storm brought down the network. When I deleted the syslog VM the ICMP errors stopped.

192.168.168.233 is the ASA5520 for Crypto tunnels to remote users.  
192.168.168.228 was the Centos syslog server.  
I could not get remote ASA5504s to north-bound the syslogs with 
  "logging host inside 192.168.168.228" (nor for outside).  
  But it could explain ICMP messages from the syslog server to the tunnel ASA.
RTR2921# deb ip icmp
Sep 18 17:24:49.656: ICMP: time exceeded (time to live) sent to 192.168.168.233 (dest was 192.168.168.228), topology BASE, dscp 0 topoid 0

To pass only vlan 111 to WiFi the WLC has its own switch.

Besides an 8 gi-port POE is expensive so the Office SW is 8-gi-port and WLC SW is 8-fa-port-POE ;-/

clyde.a.huffman.ctr@mail.mil · ‎09-19-2019

I think that I've found the routing loop. See https://community.cisco.com/t5/switching/etherswitch-configuration-on-cisco-2921/td-p/3213711

The internal interface between the 2921 and EtherSwitch apparently can be used like an actual trunk. 2921 gi1/0 and gi1/1 are not real ports and EtherSwitch gi0/51 and gi0/52 are not real ports. I thought that it was just for service-module sessions.

RTR2921#service-module gi1/0 session 
Trying 1.1.1.3, 2067 ... Open
User Access Verification
Username: user001
Password: 
EtherSwitch>

So I put a patch cable between RTR2921 sfp gi0/1 and EtherSwitch sfp gi0/1 for the trunk.

----------------> But look at CDP

CDP Neighbor from RTR2921 and EtherSwitch
EtherSwitch#sh cdp nei det
-------------------------
Device ID: RTR2921.mydomain.com
Entry address(es): 
IP address: 1.1.1.3
Platform: Cisco CISCO2921/K9, Capabilities: Router Switch IGMP 
Interface: GigabitEthernet0/52, Port ID (outgoing port): GigabitEthernet1/0
Holdtime : 175 sec
...
EtherSwitch#
RTR2921#sh cdp nei det
-------------------------
Device ID: EtherSwitch.mydomain.com
Entry address(es): 
IP address: 192.168.168.232
Platform: cisco SM-D-ES3G-48-P, Capabilities: Switch IGMP 
Interface: GigabitEthernet1/0, Port ID (outgoing port): GigabitEthernet0/52
Holdtime : 174 sec
...
RTR2921#

Is all this correct?

Richard Burts · ‎09-19-2019

I believe that is correct.

HTH

Rick

clyde.a.huffman.ctr@mail.mil · ‎09-20-2019

1) messed around with 1.1.1.3 gi1/0 & gi1/1 on RTR2921 (the internal interfaces without any physical ports):

gi1/1 - was already shut

gi1/0 - I did a shut

Communications kept working - no ICMP storm - but a message said to reload the EtherSwitch for it to work correctly

2) messed around with the other side of 1.1.1.3 on the Etherswitch gi0/51 & gi0/52

gi0/51 - was already shut

gi0/52 - I did a shut = BIG PROBLEM

Shutting gi0/52 turned off all POE and I have over 20 VoIP phones on it.

At least this was after hours so nobody was there to see the phones go down.

So I did a "no shut" to power the VoIP phones back on.....

I don't trust the EtherSwitch now :-( I think that the routing loop is stopped with 1.1.1.3 shut down but now there is no console to the EtherSwitch.... So I'm going to move everything off the EtherSwitch so that I can find out how the internal interfaces work. I presume that they are redundant to permit a port channel group....

Looking back at the attached txt file there is only about 0.005 seconds between ICMP errors. Much too fast for a VM server to output. It has to be a loop between the cat5 Trunk (RTR2921=>EtherSwitch) and the internal interface (RTR2921=>EtherSwitch).....
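For anyone who wants to measure that spacing rather than eyeball it, below is a rough Python sketch that pulls the timestamps out of "debug ip icmp" lines and prints the gap between consecutive time exceeded messages. The regular expression is an assumption based only on the single debug line quoted earlier in this thread, the helper names `parse` and `gaps_ms` are made up for illustration, and it ignores the date, so a capture that crosses midnight would give a wrong gap.

```python
import re

# Pattern modelled on the one "debug ip icmp" line quoted in this thread;
# other IOS versions or timestamp settings may format the line differently.
LINE_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}): ICMP: time exceeded \(time to live\) "
    r"sent to (\S+) \(dest was ([^)]+)\)"
)

# The line posted earlier in the thread, used here as sample input.
SAMPLE = (
    "Sep 18 17:24:49.656: ICMP: time exceeded (time to live) sent to "
    "192.168.168.233 (dest was 192.168.168.228), topology BASE, dscp 0 topoid 0"
)


def parse(line):
    """Return (milliseconds since midnight, sent_to, dest) or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    hours, minutes, seconds, millis = (int(g) for g in m.groups()[:4])
    stamp = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
    return stamp, m.group(5), m.group(6)


def gaps_ms(lines):
    """Gaps in milliseconds between consecutive matching lines."""
    stamps = [p[0] for p in map(parse, lines) if p is not None]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(parse(SAMPLE))
    # To check a saved capture, pass its lines instead, for example:
    #   with open("debug.txt") as f:   # hypothetical file name
    #       print(gaps_ms(f))
```

Gaps that stay at a few milliseconds for the whole capture would support the idea that packets are circulating in a loop rather than being generated one at a time by a host.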