Cisco 10720, ISIS crash via SRP1/1 interface

kz-support
Level 1

We are seeing periodic IS-IS adjacency drops on the SRP interface of c10720-ats4, and each drop causes the SRP ring to be rebuilt.
The topology is a ring:
ATS1 <-> ATS2 <-> ATS3 <-> ATS4 <-> ATS1

The nodes are connected by direct optical fibers.
No optical problems were recorded. I ran pings while IS-IS connectivity was being lost, and there was almost no loss (one or two packets dropped out of several thousand).

The problem started at about 11:30 on 04/17 on ATS4. According to the c10720-ats4 logs, IS-IS went down on SRP1/1:
Apr 17 18:30:56 MSK: %CLNS-5-ADJCHANGE: ISIS: Adjacency to c10720-ats3 (SRP1/1) Down, hold time expired
Apr 17 18:30:59 MSK: %CLNS-5-ADJCHANGE: ISIS: Adjacency to c10720-ats1 (SRP1/1) Down, hold time expired

Some time later, the problem also appeared on c10720-atc1 with several BGP peers, 172.16.100.150 - 172.16.100.166 (c10720-atc1 itself is 172.16.100.56). These peers are located behind the GigabitEthernet2/2 interface (towards another router, a C3845). c10720-atc1 was restarted on 04/17 at 17:59.

According to the logs, the problem did not occur at night, roughly between 22:00 and 06:00.

On the ats4 node, I noticed that the input queue depth and the drops counter on the SRP1/1 interface increase periodically:
Input queue 2/75, 194063 drops
Input queue 0/75, 194063 drops
Input queue 1/75, 194123 drops
Input queue 0/75, 194191 drops
Input queue 1/75, 194602 drops
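
To make the growth of the drops counter easier to see, the samples above can be turned into per-sample deltas. This is only a rough Python sketch: the function names are hypothetical, the input is the five lines pasted above, and the time between samples is not known, so the deltas are not rates.

```python
import re

# The five "Input queue" samples pasted above (sampling interval unknown).
SAMPLES = """\
Input queue 2/75, 194063 drops
Input queue 0/75, 194063 drops
Input queue 1/75, 194123 drops
Input queue 0/75, 194191 drops
Input queue 1/75, 194602 drops
"""

PATTERN = re.compile(r"Input queue (\d+)/(\d+), (\d+) drops")


def parse_samples(text):
    """Return a list of (queue_depth, queue_limit, drops) tuples."""
    rows = []
    for line in text.splitlines():
        match = PATTERN.search(line)
        if match:
            rows.append(tuple(int(g) for g in match.groups()))
    return rows


def drop_deltas(rows):
    """Return the increase of the drops counter between consecutive samples."""
    drops = [row[2] for row in rows]
    return [later - earlier for earlier, later in zip(drops, drops[1:])]


rows = parse_samples(SAMPLES)
print(drop_deltas(rows))       # [0, 60, 68, 411]
print(sum(drop_deltas(rows)))  # 539
```

Note that the queue depth never exceeds 2 of 75 in these samples, while the drops counter still grows.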

While these counters are increasing, the holdtime for the IS-IS neighbors counts down to zero and the sessions are then reset. Most likely ats4 stops receiving hellos from its neighbors.

c10720-ats4#sh isis neighbors
System Id      Type Interface  IP Address      State Holdtime Circuit Id
c10720-ats2    L2   SR1/1      172.20.0.1      UP    22       c10720-ats4.01
c10720-atc3    L2   SR1/1      172.20.0.45     UP    22       c10720-ats4.01
c10720-atc1    L2   SR1/1      172.20.0.56     UP    19       c10720-ats4.01
c7304          L2   Gi2/3.950  172.20.0.130    UP    18       c10720-ats4.02

c10720-ats4#sh isis neighbors
System Id      Type Interface  IP Address      State Holdtime Circuit Id
c10720-ats2    L2   SR1/1      172.20.0.1      UP    2        c10720-ats4.01
c10720-atc3    L2   SR1/1      172.20.0.45     UP    2        c10720-ats4.01
c10720-atc1    L2   SR1/1      172.20.0.56     UP    7        c10720-ats4.01
c7304          L2   Gi2/3.950  172.20.0.130    UP    7        c10720-ats4.02
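
If the output is collected repeatedly, neighbors whose holdtime is about to expire can be flagged automatically. The following is a minimal Python sketch, assuming the seven-column layout shown above; the function name and the threshold of 5 seconds are hypothetical choices, not values from the routers.

```python
# Second "sh isis neighbors" capture from above, one neighbor per line.
CAPTURE = """\
c10720-ats4#sh isis neighbors
System Id Type Interface IP Address State Holdtime Circuit Id
c10720-ats2 L2 SR1/1 172.20.0.1 UP 2 c10720-ats4.01
c10720-atc3 L2 SR1/1 172.20.0.45 UP 2 c10720-ats4.01
c10720-atc1 L2 SR1/1 172.20.0.56 UP 7 c10720-ats4.01
c7304 L2 Gi2/3.950 172.20.0.130 UP 7 c10720-ats4.02
"""


def low_holdtime(output, threshold=5):
    """Return (system_id, interface, holdtime) for UP neighbors below threshold."""
    flagged = []
    for line in output.splitlines():
        fields = line.split()
        # Neighbor rows have exactly 7 fields; the prompt and header lines do not.
        if len(fields) != 7 or not fields[5].isdigit():
            continue
        system_id, _level, interface, _ip, state, holdtime, _circuit = fields
        if state == "UP" and int(holdtime) < threshold:
            flagged.append((system_id, interface, int(holdtime)))
    return flagged


for neighbor in low_holdtime(CAPTURE):
    print(neighbor)
```

In this capture the two neighbors with a holdtime of 2 are both on SR1/1. The holdtimes on the remaining SR1/1 neighbor and on Gi2/3.950 have also dropped compared with the first capture, although less sharply.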

We assume that the loss of IS-IS neighbors is related to an increase in incoming traffic through SRP1/1 on ats4.
According to the graphs sent by the client, though, the traffic is very small and the processor load is far from 100%.

I tried increasing the queue on the SRP interface from 75 to 1000; it didn't help.

Two hours after the end of the working day, the IS-IS flaps stopped, but the flaps on c10720-atc1 remained.

I rebooted c10720-atc1, but BGP keeps flapping. The BGP peers 172.20.100.150 - 172.20.100.166 are behind Gi2/2 of c10720-atc56 (towards the C3845). Peers 172.16.100.4 and 172.16.100.3 have now joined those flapping via SRP1/1 (previously it was only 172.16.100.5).

The error is the same for all peers:

Apr 18 20:35:23 MSK: %BGP-3-NOTIFICATION: received from neighbor 172.16.100.4 3/1 (update malformed) 42 bytes 0904AC14 6403800E 21000180 0C000000 00
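
For reference, the code/subcode pair 3/1 in a BGP NOTIFICATION is "UPDATE Message Error / Malformed Attribute List" (RFC 4271). The Python sketch below names the code and takes a tentative look at the hex dump. The field layout is an assumption on my side: it supposes the dump starts at a path-attribute type byte, and only 17 of the 42 bytes appear in the log line, so the reading may be wrong.

```python
import ipaddress

# NOTIFICATION error codes and UPDATE subcodes from RFC 4271.
NOTIFICATION_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}
UPDATE_SUBCODES = {
    1: "Malformed Attribute List",
    3: "Missing Well-known Attribute",
    4: "Attribute Flags Error",
    5: "Attribute Length Error",
}


def describe(code, subcode):
    """Return human-readable names for a NOTIFICATION code/subcode pair."""
    name = NOTIFICATION_CODES.get(code, "unknown")
    sub = UPDATE_SUBCODES.get(subcode, "unknown") if code == 3 else "n/a"
    return name, sub


print(describe(3, 1))

# Hex data exactly as printed in the log line (whitespace is ignored).
dump = bytes.fromhex("0904AC14 6403800E 21000180 0C000000 00")

# ASSUMPTION: byte 0 is an attribute type and byte 1 its length.
# Type 9 would be ORIGINATOR_ID with a 4-byte value.
first_type, first_len = dump[0], dump[1]
first_value = ipaddress.IPv4Address(dump[2:2 + first_len])

# ASSUMPTION: next comes flags/type/length of a second attribute.
# Type 14 would be MP_REACH_NLRI (RFC 4760): AFI, SAFI, next-hop length.
flags, second_type, second_len = dump[6], dump[7], dump[8]
afi = int.from_bytes(dump[9:11], "big")
safi, next_hop_len = dump[11], dump[12]

print(first_type, first_len, first_value)
print(hex(flags), second_type, second_len, afi, safi, next_hop_len)
```

Under that assumption, the first attribute carries 172.20.100.3 and the second is MP_REACH_NLRI with AFI 1 and SAFI 128, but this should be checked against a packet capture or BGP debug output before drawing conclusions.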

Can you tell me what the problem might be?
