
Cisco 4331 Router repeatedly losing connections

mvelatln79
Level 1
I have a customer with a Cisco 4331 router.  On the surface, the symptom is that this location is losing its internet connection repeatedly.  It first started with a 5-10 minute outage once or twice an hour.  Now it goes down far more often, so they are only up about 5 times an hour for a few minutes each.  I monitor their equipment with PRTG, with sensors reporting ping and bandwidth.  That, along with reports from the customer on site, is how I know how often they go down.
 
I've contacted and supplied information to our ISP, which has two parts.  One is the fiber supplier for layer 2; their engineers have been looking at it for several days now and haven't found any issues with their lines or the switches on their MPLS network.  The other is the partner that supplies the bandwidth; I had them check for attacks or anything else they could see.  They see nothing.  They put the site into DDoS mitigation in case they could scrub out some of the bad traffic, but this hasn't changed how often the site loses the internet in a given hour.
 
It's possible that either the Cisco router or the next hop, their Cisco ASA firewall, is getting hit by an attack of some kind.  This is my best guess.  However, I normally monitor the link from the router to the firewall, not the link to the ISP.  So, if that link goes down, I'm not always sure whether it's the firewall or our router.  So, I added the link to the ISP equipment to see what's going down and when.  The problem I have is that not only is the link to the firewall going down, so is the link to the ISP.  And if the ISP isn't having an issue, then is the router the issue?  Because if the site is getting attacked at the Cisco firewall with port scans or something of that nature, why would my router be affected?  They are not saturating their circuit.  Because both of my interfaces go down, I'm starting to question the status/health of my router.
 
Which brought me to two other possibilities.  One, the router needs to be patched.  I went to download the software for it and, according to the portal, I cannot.  This is odd since support confirmed I have Smartnet on the router.  Two, it's malfunctioning in some way.  The only thing is I don't know all of the diagnostic commands to look for issues.  When I run show log, the logs have nothing but OSPF messages right now.  The uptime on the router is 100%; it hasn't rebooted in months.  So, it's not cycling power or crashing in that way.  If the interfaces are resetting, I don't know how to tell other than with show interface commands.
 
On the interface facing the ISP I see input errors, overruns and pause outputs.  I normally don't see this, but maybe it's nothing.
  Queueing strategy: Class-based queueing
  Output queue: 0/40 (size/max)
  30 second input rate 16507000 bits/sec, 2149 packets/sec
  30 second output rate 9840000 bits/sec, 1958 packets/sec
     27211280955 packets input, 29649530972733 bytes, 0 no buffer
     Received 374658314 broadcasts (0 IP multicasts)
     0 runts, 0 giants, 0 throttles
     78 input errors, 0 CRC, 0 frame, 78 overrun, 0 ignored
     0 watchdog, 935278023 multicast, 0 pause input
     20973626015 packets output, 13076748351934 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 1032 pause output
     0 output buffer failures, 0 output buffers swapped out
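
To put the counters above in perspective, here is a minimal Python sketch (added for illustration, not part of the original post). It parses a few of the pasted counter lines and computes overruns as a fraction of packets received. The helper names `counters` and `overrun_ratio` are hypothetical. Keep in mind that these counters are cumulative since they were last cleared or the router was reloaded, so the more telling check is probably whether they increase during an outage.

```python
import re

# Counter lines copied from the "show interface" output in the post.
SHOW_INTERFACE = """\
     27211280955 packets input, 29649530972733 bytes, 0 no buffer
     78 input errors, 0 CRC, 0 frame, 78 overrun, 0 ignored
     20973626015 packets output, 13076748351934 bytes, 0 underruns
     0 output errors, 0 collisions, 1 interface resets
     0 lost carrier, 0 no carrier, 1032 pause output
"""

# Matches "<number> <label>" pairs separated by commas or line ends.
PAIR = re.compile(r"(\d+) ([A-Za-z][A-Za-z ]*?)(?=,|$)", re.MULTILINE)


def counters(text):
    """Return {label: value} for every counter found in the text.

    Labels that repeat (for example "bytes") keep the last value seen,
    which is acceptable for this sketch because only unique labels are used.
    """
    return {label.strip(): int(value) for value, label in PAIR.findall(text)}


def overrun_ratio(parsed):
    """Overruns as a fraction of all packets received on the interface."""
    return parsed["overrun"] / parsed["packets input"]


parsed = counters(SHOW_INTERFACE)
print("overruns:", parsed["overrun"])
print("pause output frames:", parsed["pause output"])
print("overrun ratio: {:.2e}".format(overrun_ratio(parsed)))
```

Taking two snapshots a few minutes apart and comparing the values would show whether the overrun and pause counters are still climbing while the site is down.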
 
Are there any obvious things I may be missing in looking into this?  At this point I suspect the router is doing something wrong.  If it's not the router, then maybe it's just an attack I cannot do anything about.
 
Thanks,
Mike
 

Richard Burts
Hall of Fame

We do not have much information about your situation to help us identify the issue. You have told us that there is a 4331 router, an ASA, and an ISP. Is the connection to the ISP to the ASA or to the 4331?

 

I find this part of your post especially interesting "The problem I have is that not only is the link to the firewall going down, so is the link to the ISP." Could you tell us more about this? How do you detect the interfaces going down?

 

Is it possible to get any diagnostic information while the problem is happening? I assume that you are not on site, but is there anyone who would be able to execute some commands and gather some output to help identify the issue? Otherwise we may need to depend on what shows up in the logs. Could you send us some of the logs? (the first group of lines of output from the command show log contains information about how logging is set up and this might be helpful - as well as some of the log messages)

 

HTH

 

Rick

 

 

 


Richard,

 

Thank you for the quick reply.


Sorry about not explaining the topology properly.  It's easy to forget when all my locations are identical and I usually don't talk to anyone but co-workers about it.

 

I'll try to describe it in text first.  I'm sure it will make sense, but I can create a quick graphic of it if needed.

 

AT&T internet comes in on fiber, they have a Ciena switch on site which then connects to my Cisco router on interface 0/0.  On that same Cisco router interface 0/1 connects to the Cisco ASA firewall on port 0/1.  Beyond that there are just normal switches etc.

 

So, PRTG Network Monitor uses what they call sensors.  SNMP data is requested from the router and displayed in the sensors I set up.  I monitor bandwidth usage on router ports 0/0 and 0/1, but PRTG polls the IP of interface 0/1, the one connected to the firewall.  The flaw in doing it this way is that when PRTG says the router is down, it doesn't necessarily mean the ISP is down: if the firewall is off or down, then the link is down, which takes down the IP that I monitor.  That is a false positive.  The ping sensor simply pings that IP on an interval to keep an eye on the uptime of the router.  PRTG tells me it's down when it stops getting a response for 30 seconds.  That could mean several things are wrong, which is why I added another PRTG sensor for the IP of interface 0/0, which connects only to the ISP.  Now that I monitor both, I can determine a few things; at least I think.  My thought is that if the firewall is getting tanked by scans and stops responding, or its interface goes down, then my interface 0/1 goes down.  However, 0/0 shouldn't necessarily go down too.  Since it does, I now think either (A) something is attacking the router or (B) something is wrong with my router.

 

You're right, I am not physically there.  I would love to see whether the links go down, or, when consoled into the router, whether it freezes up, does anything else that would be helpful, or nothing happens at all.  My plan is to go there Monday morning to see it in person and bring a spare router to put in place to see if the problem stops.  It's a hard test to eliminate the Cisco router.  If it still happens with a different router in place, then that excludes this router as the culprit.

 

Here is the show log output that I did get. I'll have to get the top of the output so you can see how it's configured.

.1 is the router leading out to the internet, .10 is the router in question.  This is a WAN running OSPF, which I didn't include in the description of how things were connected.  I'm not 100% sure if this is the cause or a result of the links going down and losing access to the internet.  My best guess is that this is the reaction to packet/link loss.

 

X.1.75.10 on GigabitEthernet0/0/0.74 from LOADING to FULL, Loading Done
*Nov 23 16:34:45.153: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from FULL to DOWN, Neighbor Down: Too many retransmissions
*Nov 23 16:34:53.764: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.10 on GigabitEthernet0/0/0.74 from FULL to DOWN, Neighbor Down: Too many retransmissions
*Nov 23 16:35:45.153: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from DOWN to DOWN, Neighbor Down: Ignore timer expired
*Nov 23 16:35:48.683: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from LOADING to FULL, Loading Done
*Nov 23 16:35:53.763: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.10 on GigabitEthernet0/0/0.74 from DOWN to DOWN, Neighbor Down: Ignore timer expired
*Nov 23 16:36:05.938: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.10 on GigabitEthernet0/0/0.74 from LOADING to FULL, Loading Done
*Nov 23 16:37:03.467: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from FULL to DOWN, Neighbor Down: Too many retransmissions
*Nov 23 16:38:03.467: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from DOWN to DOWN, Neighbor Down: Ignore timer expired
*Nov 23 16:38:11.024: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from LOADING to FULL, Loading Done
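
As a rough way to quantify the flapping in the log above, the following Python sketch (added for illustration, not part of the original post) pairs each FULL to DOWN event with the next transition back to FULL for the same neighbor and reports how long the adjacency was lost. The function name `adjacency_outages` is hypothetical, and because the syslog timestamps carry no year, the year 2000 is used purely as a placeholder for parsing.

```python
import re
from collections import defaultdict
from datetime import datetime

# Complete log lines copied from the post (addresses are masked there as X.1.75.x).
LOG = """\
*Nov 23 16:34:45.153: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from FULL to DOWN, Neighbor Down: Too many retransmissions
*Nov 23 16:34:53.764: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.10 on GigabitEthernet0/0/0.74 from FULL to DOWN, Neighbor Down: Too many retransmissions
*Nov 23 16:35:45.153: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from DOWN to DOWN, Neighbor Down: Ignore timer expired
*Nov 23 16:35:48.683: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from LOADING to FULL, Loading Done
*Nov 23 16:35:53.763: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.10 on GigabitEthernet0/0/0.74 from DOWN to DOWN, Neighbor Down: Ignore timer expired
*Nov 23 16:36:05.938: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.10 on GigabitEthernet0/0/0.74 from LOADING to FULL, Loading Done
*Nov 23 16:37:03.467: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from FULL to DOWN, Neighbor Down: Too many retransmissions
*Nov 23 16:38:03.467: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from DOWN to DOWN, Neighbor Down: Ignore timer expired
*Nov 23 16:38:11.024: %OSPF-5-ADJCHG: Process 74, Nbr X.1.75.1 on GigabitEthernet0/0/0.74 from LOADING to FULL, Loading Done
"""

ADJCHG = re.compile(
    r"(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\.\d{3}): %OSPF-5-ADJCHG: "
    r"Process \d+, Nbr (?P<nbr>\S+) on (?P<intf>\S+) "
    r"from (?P<old>\w+) to (?P<new>\w+),"
)


def parse_timestamp(text):
    # The log has no year; 2000 is a placeholder so strptime gets a full date.
    return datetime.strptime("2000 " + text, "%Y %b %d %H:%M:%S.%f")


def adjacency_outages(log_text):
    """Return {neighbor: [seconds between FULL->DOWN and the next FULL]}."""
    down_since = {}
    outages = defaultdict(list)
    for line in log_text.splitlines():
        match = ADJCHG.search(line)
        if not match:
            continue
        when = parse_timestamp(match["ts"])
        nbr = match["nbr"]
        if match["old"] == "FULL" and match["new"] == "DOWN":
            down_since[nbr] = when
        elif match["new"] == "FULL" and nbr in down_since:
            outages[nbr].append((when - down_since.pop(nbr)).total_seconds())
    return dict(outages)


for neighbor, durations in adjacency_outages(LOG).items():
    print(neighbor, [round(d, 1) for d in durations])
```

Running the same function over a longer capture of the buffer would show whether both neighbors always drop within seconds of each other, which is the pattern visible in this excerpt.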

 

Thanks!

Mike

 

Richard,

Below is the top of the show log output.

 

Syslog logging: enabled (0 messages dropped, 10 messages rate-limited, 0 flushes, 0 overruns, xml disabled, filtering disabled)

No Active Message Discriminator.

No Inactive Message Discriminator.

    Console logging: level critical, 0 messages logged, xml disabled, filtering disabled
    Monitor logging: level debugging, 0 messages logged, xml disabled, filtering disabled
    Buffer logging: level debugging, 3663 messages logged, xml disabled, filtering disabled
    Exception Logging: size (4096 bytes)
    Count and timestamp logging messages: disabled
    Persistent logging: disabled

No active filter modules.

    Trap logging: level informational, 3663 message lines logged
        Logging to 207.75.162.11 (udp port 514, audit disabled, link up),
            3662 message lines logged,
            0 message lines rate-limited,
            0 message lines dropped-by-MD,
            xml disabled, sequence number disabled
            filtering disabled
        Logging Source-Interface: VRF Name:
        GigabitEthernet0/0/0.74

 

Mike

Mike

 

Thanks for the additional information. The top part of show log does confirm that logging buffered is capturing all log messages (logging level is debugging). So we should be able to see anything that the router is logging. From the original description of the symptoms I thought that it might be an issue with an interface going down, then back up, then down again, and back up again. But there do not seem to be any messages about interfaces changing state.

 

I am a bit puzzled about an aspect of these log messages. All of these log messages relate to a subinterface of gig0/0/0.74, which I believe you describe as the connection to the ISP. But there seem to be 2 OSPF neighbors. What is the second neighbor?

 

Are you doing any monitoring of the ASA? If so does it appear to be stable (and consistently available)? Or does it show some loss of connectivity?

 

I am not clear from your description whether you are doing monitoring of the interface connecting the router to the ASA. Can you clarify this?

 

HTH

 

Rick


Leo Laohoo
Hall of Fame
1. What is the actual firmware the router is running on?
2. Post the complete output to the command "sh interface <WAN LINK>" (remove the description and the IP address).
3. Post the "sh logs"