01-15-2013 08:53 PM - edited 03-04-2019 06:43 PM
Hi All,
Looking for a bit of assistance with an issue I am dealing with.
-I have 2 E1 Circuits bundled together. We shall say they are 2/3 and 2/4
-They were previously up and running fine but then had a Telco issue which took out both circuits.
-Eventually 1 E1 came up and the ML came back up running at 2mb and was running clean.
-Now the 2nd E1 has come up to restore to a full 4Mb...but with errors!
-Here is where it started to get a bit funky...
-If I remove, say, 2/3 from the bundle (PE side), the link runs clean. When I add it back in, it errors. If I then remove 2/4 from the bundle instead, it also runs clean??!
-These are real CRCs and not just accounting/reporting errors.
-The customer is complaining of issues, and when I test the circuit I am seeing packet loss.
-I read about bug CSCsj26883, but this is different.
-When I check the E1 controllers for errors there is nothing accumulating on either. Only the ML showing errors.
Router A
Using E1 controllers but configured for Unframed.
Cisco 3725: Version 12.2(8r)Bexit
E1 2/3 is up.
Applique type is Channelized E1 - balanced
No alarms detected.
alarm-trigger is not set
Framing is UNFRAMED, Line Code is HDB3, Clock Source is Line.
Data in current interval (449 seconds elapsed):
0 Line Code Violations, 0 Path Code Violations
0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins
0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
Total Data (last 24 hours)
0 Line Code Violations, 0 Path Code Violations,
0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins,
0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
E1 2/4 is up.
Applique type is Channelized E1 - balanced
No alarms detected.
alarm-trigger is not set
Framing is UNFRAMED, Line Code is HDB3, Clock Source is Line.
Data in current interval (458 seconds elapsed):
0 Line Code Violations, 0 Path Code Violations
0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins
0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
Total Data (last 24 hours)
1 Line Code Violations, 0 Path Code Violations,
0 Slip Secs, 0 Fr Loss Secs, 1 Line Err Secs, 0 Degraded Mins,
1 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
Multilink1 is up, line protocol is up
Hardware is multilink group interface
Internet address is x.x.x.x
MTU 1500 bytes, BW 4096 Kbit/sec, DLY 100000 usec,
reliability 200/255, txload 13/255, rxload 165/255 <- Bad reliability
Encapsulation PPP, LCP Open, multilink Open
Listen: CDPCP
Open: IPCP, loopback not set
Keepalive set (10 sec)
DTR is pulsed for 2 seconds on reset
Last input 00:00:00, output never, output hang never
Last clearing of "show interface" counters 00:45:02
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 27039
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 2654000 bits/sec, 275 packets/sec
5 minute output rate 215000 bits/sec, 234 packets/sec
897514 packets input, 930761334 bytes, 0 no buffer
Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
131211 input errors, 0 CRC, 71710 frame, 52785 overrun, 0 ignored, 6714 abort <-- Constantly incrementing
701050 packets output, 75979321 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 output buffer failures, 0 output buffers swapped out
0 carrier transitions
Router B
Using Serial interfaces for the Bundle
Cisco 7206 VXR: Version 12.2(8r)T2
There are no significant errors accumulating on the serial interfaces or the ML interface
#The configs on the serial interfaces and MLs are exactly the same and have not been altered since before the outage with the Telco.
#I would like to shut down the interfaces (one at a time) on the remote CE end, but I am worried in case I drop my connection, and I cannot really set the router to reload as that would require an outage window.
Router#ping vrf cisco x.x.x.x rep 10
Type escape sequence to abort.
Sending 10, 100-byte ICMP Echos to x.x.x.x, timeout is 2 seconds:
!.!!!!!.!!
Success rate is 80 percent (8/10), round-trip min/avg/max = 92/138/204 ms
#Could this be a weird telco issue? I am unfamiliar with what hardware the telco is using but nothing should have been changed...
#Any ideas or suggestions for tests?
Many Thanks in Advance,
F
01-16-2013 04:13 AM
First, you can check multilink connectivity locally: make a back-to-back connection between your 7600 and the other router and check whether the multilink errors still occur. If no errors happen, then the observed errors are caused by the telco side.
01-16-2013 11:48 AM
As a last resort I could arrange to do this, but not initially, as downtime would need to be arranged and I would also need to make arrangements to get hardware to the remote end of the CCT.
01-16-2013 01:46 PM
Hello f.sorrentino,
The symptom you are reporting could be related to differential delay between the member links of the multilink. This results from a poorly designed multilink path, so it is recommended to have the multilink rebuilt to make sure the path is optimized. Please use the "show ppp multilink" command and check whether lost fragments are observed and whether they match the number of input errors on the multilink interface. If we see lost fragments, we will know that some fragments are not arriving or are not arriving in time. They are then considered lost, which would trigger retransmissions.
On the other hand, a noisy line could also be the root of the problem, so it is a good idea to check whether the individual member interfaces of the multilink are also taking input errors. In that case, it is recommended to get your provider/Telco involved to ensure the link is clean.
On your side you can also apply a workaround, such as increasing the timeout period before fragments are considered lost in the multilink and also increasing the multilink buffer.
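To give a feel for why the buffer and timeout matter, here is a rough back-of-the-envelope sketch in Python. It is only a simplified model, not how IOS actually accounts for fragments: it assumes the reassembly buffer has to hold everything arriving on the faster member link while it waits for the slower one, and that a fragment is given up on when either the buffer fills or the timeout expires. The function names and the example values (a 24000-byte buffer, a 2048 kbit/s unframed E1, a 1000 ms timeout) are illustrative assumptions.

```python
def buffer_window_ms(buffer_bytes: int, link_kbps: int) -> float:
    """Milliseconds of traffic from ONE member link that fit in the buffer.

    bits / (kbit/s) works out directly to milliseconds.
    """
    return buffer_bytes * 8 / link_kbps


def tolerated_skew_ms(buffer_bytes: int, link_kbps: int, frag_timeout_ms: float) -> float:
    """Rough upper bound on differential delay before fragments get dropped.

    Assumption: whichever limit is reached first (buffer full or
    timeout expired) decides when a late fragment is declared lost.
    """
    return min(buffer_window_ms(buffer_bytes, link_kbps), frag_timeout_ms)


# Illustrative values only (hypothetical inputs, not a recommendation)
window = buffer_window_ms(24000, 2048)
skew = tolerated_skew_ms(24000, 2048, 1000)
print(f"buffer holds about {window:.2f} ms of one E1 at line rate")
print(f"estimated tolerated skew: {skew:.2f} ms")
```

Under these assumptions the buffer, not the timeout, is the tighter limit, which is why raising only the timeout may not help much if the buffer stays small.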
01-16-2013 03:11 PM
Hi Jose,
Looks like you might have something there. There are a lot of lost fragments, reordered, discarded and lost received. The lost fragments alone don't match up, but if you add the lost to the discarded (19+85=104), this equals the input errors on the Multilink interface.
Router#sh int mu1
Multilink1 is up, line protocol is up
Hardware is multilink group interface
Internet address is x.x.x.x
MTU 1500 bytes, BW 4096 Kbit/sec, DLY 100000 usec,
reliability 255/255, txload 6/255, rxload 51/255
Encapsulation PPP, LCP Open, multilink Open
Listen: CDPCP
Open: IPCP, loopback not set
Keepalive set (10 sec)
DTR is pulsed for 2 seconds on reset
Last input 00:00:00, output never, output hang never
Last clearing of "show interface" counters 00:02:16
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 747
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 825000 bits/sec, 183 packets/sec
5 minute output rate 111000 bits/sec, 155 packets/sec
25909 packets input, 10740194 bytes, 0 no buffer
Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
104 input errors, 0 CRC, 59 frame, 37 overrun, 0 ignored, 8 abort
23037 packets output, 2001087 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 output buffer failures, 0 output buffers swapped out
0 carrier transitions
Multilink1
Bundle name: Multi1
Remote Endpoint Discriminator: [1] Router B
Local Endpoint Discriminator: [1] Router A
Bundle up for 1d09h, total bandwidth 4096, load 6/255
Receive buffer limit 24000 bytes, frag timeout 1000 ms
0/0 fragments/bytes in reassembly list
19 lost fragments, 16196 reordered
85/47946 discarded fragments/bytes, 59 lost received
0x9E47F6 received sequence, 0xBC339 sent sequence
Member links: 2 active, 0 inactive (max not set, min not set)
Se2/4:0, since 1d07h
Se2/3:0, since 06:55:19
Do you think tearing down the Multilink interface and re-establishing it would clear this? I guess if it doesn't, I am looking at a telco issue?
What is the best way to tear it down? Just remove the interface from the bundle?
Thanks.
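For anyone wanting to repeat the counter comparison above after clearing counters again, a minimal Python sketch follows. It assumes the CLI text has been pasted in as strings and that the counter lines look exactly like the ones in this post; the function name and regular expressions are my own and only illustrative.

```python
import re


def parse_counters(show_int: str, show_mlp: str) -> dict:
    """Pull the counters of interest out of pasted CLI text."""
    patterns = {
        "input_errors": (show_int, r"(\d+) input errors"),
        "lost_fragments": (show_mlp, r"(\d+) lost fragments"),
        "discarded_fragments": (show_mlp, r"(\d+)/\d+ discarded fragments/bytes"),
        "lost_received": (show_mlp, r"(\d+) lost received"),
    }
    counters = {}
    for name, (text, pattern) in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"counter not found: {name}")
        counters[name] = int(match.group(1))
    return counters


# Lines copied from the output in this post
SHOW_INT = "104 input errors, 0 CRC, 59 frame, 37 overrun, 0 ignored, 8 abort"
SHOW_MLP = """19 lost fragments, 16196 reordered
85/47946 discarded fragments/bytes, 59 lost received"""

counters = parse_counters(SHOW_INT, SHOW_MLP)
mlp_losses = counters["lost_fragments"] + counters["discarded_fragments"]
print(counters)
print("lost + discarded =", mlp_losses,
      "| matches input errors:", mlp_losses == counters["input_errors"])
```

A match like this only suggests the input errors are reassembly drops rather than line errors; it is a single sample, so it is worth repeating over a few intervals.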
01-16-2013 04:52 PM
Hello f.sorrentino,
I think shutting down or tearing down the connection on your end will not eliminate the issue if a noisy line or differential delay is the root cause. The Telco is responsible for rebuilding the circuit and testing it too. You can perform HW loopback tests on your end just to rule out your own equipment, but the Telco should handle the issue. Also try getting "show interfaces" output from the links that are members of the bundle.
Regards,