Errors on Multilink but not on E1 controllers!?

f.sorrentino
Level 1

Hi All,

Looking for a bit of assistance with an issue I am dealing with.

-I have 2 E1 Circuits bundled together. We shall say they are 2/3 and 2/4

-They were previously up and running fine but then had a Telco issue which took out both circuits.

-Eventually 1 E1 came up and the ML came back up running at 2 Mb and was running clean.

-Now the 2nd E1 has come up to restore to a full 4Mb...but with errors!

-Here is where it started to get a bit funky...

-If I remove, say, 2/3 from the bundle (PE side), the link runs clean. When I add it back in, it errors. If I then remove 2/4 from the bundle instead, it also runs clean??!

-These are real CRCs and not just accounting/reporting errors.

-Customer is complaining of issues, and when I test the circuit I am seeing packet loss.

-I read about bug CSCsj26883, but this is different.

-When I check the E1 controllers for errors there is nothing accumulating on either.  Only the ML showing errors.

Router A

Using E1 controllers but configured for Unframed.

Cisco 3725: Version 12.2(8r)Bexit

E1 2/3 is up.

  Applique type is Channelized E1 - balanced

  No alarms detected.

  alarm-trigger is not set

  Framing is UNFRAMED, Line Code is HDB3, Clock Source is Line.

  Data in current interval (449 seconds elapsed):

     0 Line Code Violations, 0 Path Code Violations

     0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins

     0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs

  Total Data (last 24 hours)

     0 Line Code Violations, 0 Path Code Violations,

     0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins,

     0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs

E1 2/4 is up.

  Applique type is Channelized E1 - balanced

    No alarms detected.

  alarm-trigger is not set

  Framing is UNFRAMED, Line Code is HDB3, Clock Source is Line.

  Data in current interval (458 seconds elapsed):

     0 Line Code Violations, 0 Path Code Violations

     0 Slip Secs, 0 Fr Loss Secs, 0 Line Err Secs, 0 Degraded Mins

     0 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs

  Total Data (last 24 hours)

     1 Line Code Violations, 0 Path Code Violations,

     0 Slip Secs, 0 Fr Loss Secs, 1 Line Err Secs, 0 Degraded Mins,

     1 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs

Multilink1 is up, line protocol is up

  Hardware is multilink group interface

   Internet address is x.x.x.x

  MTU 1500 bytes, BW 4096 Kbit/sec, DLY 100000 usec,

     reliability 200/255, txload 13/255, rxload 165/255   <- Bad reliability

  Encapsulation PPP, LCP Open, multilink Open

  Listen: CDPCP

  Open: IPCP, loopback not set

  Keepalive set (10 sec)

  DTR is pulsed for 2 seconds on reset

  Last input 00:00:00, output never, output hang never

  Last clearing of "show interface" counters 00:45:02

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 27039

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 2654000 bits/sec, 275 packets/sec

  5 minute output rate 215000 bits/sec, 234 packets/sec

     897514 packets input, 930761334 bytes, 0 no buffer

     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles

     131211 input errors, 0 CRC, 71710 frame, 52785 overrun, 0 ignored, 6714 abort  <-- Constantly incrementing

     701050 packets output, 75979321 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 unknown protocol drops

     0 output buffer failures, 0 output buffers swapped out

     0 carrier transitions
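
The counters in the Multilink1 output above can be turned into more readable figures. The following is a minimal sketch, not an official tool: it assumes the lines are pasted as text, and the helper names `field` and `summarize` are made up for illustration. It converts the `reliability` and `rxload` fractions (both out of 255) into a percentage and an estimated receive rate, and breaks the input errors down by type.

```python
import re

# Counter lines copied from the Multilink1 output above (annotations removed).
SHOW_INT = """\
  MTU 1500 bytes, BW 4096 Kbit/sec, DLY 100000 usec,
     reliability 200/255, txload 13/255, rxload 165/255
     897514 packets input, 930761334 bytes, 0 no buffer
     131211 input errors, 0 CRC, 71710 frame, 52785 overrun, 0 ignored, 6714 abort
"""


def field(pattern: str, text: str) -> int:
    """Return the first integer captured by `pattern` in `text`."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return int(match.group(1))


def summarize(text: str) -> dict:
    """Derive readable figures from pasted 'show interface' counter lines."""
    bw_kbit = field(r"BW (\d+) Kbit", text)
    reliability = field(r"reliability (\d+)/255", text)
    rxload = field(r"rxload (\d+)/255", text)
    return {
        # reliability and load are reported as fractions of 255
        "reliability_pct": round(100 * reliability / 255, 1),
        "rx_kbit_estimate": round(bw_kbit * rxload / 255),
        "input_errors": field(r"(\d+) input errors", text),
        "crc": field(r"(\d+) CRC", text),
        "frame": field(r"(\d+) frame", text),
        "overrun": field(r"(\d+) overrun", text),
        "abort": field(r"(\d+) abort", text),
    }


summary = summarize(SHOW_INT)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this capture the CRC counter itself reads 0, and the input errors are made up of frame, overrun and abort counts. The receive-rate estimate derived from `rxload` is close to the 5 minute input rate shown in the same output.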

Router B

Using Serial interfaces for the Bundle

Cisco 7206 VXR: Version 12.2(8r)T2

There are no significant errors accumulating on the serial interfaces or the ML interface.

#The configs on the serial interfaces and ML's are exactly the same and have not been altered since before the outage with the Telco.

#I would like to shut down the interfaces (one at a time) on the remote CE end, but I am worried in case I drop my connection, and I cannot really set the router to reload as it would require an outage window.

Router#ping vrf cisco x.x.x.x rep 10

Type escape sequence to abort.

Sending 10, 100-byte ICMP Echos to x.x.x.x, timeout is 2 seconds:

!.!!!!!.!!

Success rate is 80 percent (8/10), round-trip min/avg/max = 92/138/204 ms

#Could this be a weird telco issue?  I am unfamiliar with what hardware the telco is using but nothing should have been changed...

#Any ideas or suggestions for tests?

Many Thanks in Advance,

F


adnane dakna
Level 1

First, you can check multilink connectivity locally: make a back-to-back connection between your 7600 and another router and check whether the multilink errors still occur. If no errors occur, then the observed errors are caused by the telco side.

As a last resort I could arrange to do this, but not initially, as downtime would need to be arranged and I would also need to make arrangements to get hardware to the remote end of the CCT.

Hello  f.sorrentino,

The symptom you are reporting could be related to a differential delay issue between the members of the multilink. This is the result of a poor multilink design, so it is recommended to rebuild the multilink to make sure the path is optimized. Please use the "show ppp multilink" command and check whether lost fragments are observed and whether they match the number of input errors the multilink interface is getting. If we see lost fragments, we will know that some fragments are not arriving, or are not arriving in time. They are then considered lost, which would trigger retransmissions.

On the other hand, a noisy line could also be the root of the problem, so it is a good idea to check whether the individual member interfaces of the multilink are also taking input errors. In that case, it is recommended to get your provider/Telco involved to ensure the link is clean.

On your side you can also apply a workaround, such as increasing the timeout period before fragments are considered lost in the multilink and also increasing the multilink buffer.
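
To get a rough feel for how the multilink buffer relates to differential delay, here is a back-of-envelope sketch. It rests on a simplifying assumption that is not taken from Cisco documentation: while the receiver waits for a delayed fragment on the slow member, everything arriving on the other member has to be held in the reassembly buffer. The function name `skew_tolerance_ms` is hypothetical. The 24000-byte figure is the receive buffer limit reported in the "show ppp multilink" output later in this thread, and 2048 Kbit/sec is the rate of one E1.

```python
def skew_tolerance_ms(buffer_bytes: int, link_kbit: int) -> float:
    """Milliseconds of full-rate traffic from one link that fit in the buffer.

    Simplified model (an assumption): the other member link keeps delivering
    at line rate while the receiver waits for the late fragment.
    """
    bytes_per_ms = link_kbit * 1000 / 8 / 1000  # Kbit/sec -> bytes per ms
    return buffer_bytes / bytes_per_ms


# Receive buffer limit from this thread, one E1 member at 2048 Kbit/sec.
tolerance = skew_tolerance_ms(buffer_bytes=24000, link_kbit=2048)
print(f"approx. tolerable skew: {tolerance:.2f} ms")

# Doubling the buffer doubles the tolerance under the same model.
print(f"with 48000 bytes: {skew_tolerance_ms(48000, 2048):.2f} ms")
```

Under this model the buffer, rather than the 1000 ms fragment timeout, would be the tighter limit at full load, which is one reason a larger buffer is suggested alongside a longer timeout. Actual IOS behaviour may differ.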

Hi Jose,

Looks like you might have something there. It looks like there are a lot of lost fragments, reordered, discarded and lost received. The lost fragments on their own don't match up, but if you add the lost to the discarded (19+85=104), this equals the input errors on the Multilink interface.

Router#sh int mu1

Multilink1 is up, line protocol is up

  Hardware is multilink group interface

   Internet address is x.x.x.x

  MTU 1500 bytes, BW 4096 Kbit/sec, DLY 100000 usec,

     reliability 255/255, txload 6/255, rxload 51/255

  Encapsulation PPP, LCP Open, multilink Open

  Listen: CDPCP

  Open: IPCP, loopback not set

  Keepalive set (10 sec)

  DTR is pulsed for 2 seconds on reset

  Last input 00:00:00, output never, output hang never

  Last clearing of "show interface" counters 00:02:16

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 747

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 825000 bits/sec, 183 packets/sec

  5 minute output rate 111000 bits/sec, 155 packets/sec

     25909 packets input, 10740194 bytes, 0 no buffer

     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles

     104 input errors, 0 CRC, 59 frame, 37 overrun, 0 ignored, 8 abort

     23037 packets output, 2001087 bytes, 0 underruns

     0 output errors, 0 collisions, 0 interface resets

     0 unknown protocol drops

     0 output buffer failures, 0 output buffers swapped out

     0 carrier transitions

Multilink1

  Bundle name: Multi1

  Remote Endpoint Discriminator: [1] Router B

  Local Endpoint Discriminator: [1] Router A

  Bundle up for 1d09h, total bandwidth 4096, load 6/255

  Receive buffer limit 24000 bytes, frag timeout 1000 ms

    0/0 fragments/bytes in reassembly list

    19 lost fragments, 16196 reordered

    85/47946 discarded fragments/bytes, 59 lost received

    0x9E47F6 received sequence, 0xBC339 sent sequence

  Member links: 2 active, 0 inactive (max not set, min not set)

    Se2/4:0, since 1d07h

    Se2/3:0, since 06:55:19
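
The comparison described above (lost plus discarded fragments against input errors) can be scripted so it is easy to repeat after each counter clear. This is a minimal sketch that assumes the two outputs are pasted in as text; the helper name `first_int` is made up for illustration.

```python
import re

# Lines copied from the "show ppp multilink" and "show interface" output above.
SHOW_PPP_MULTILINK = """\
    19 lost fragments, 16196 reordered
    85/47946 discarded fragments/bytes, 59 lost received
"""
SHOW_INTERFACE = (
    "     104 input errors, 0 CRC, 59 frame, 37 overrun, 0 ignored, 8 abort"
)


def first_int(pattern: str, text: str) -> int:
    """Return the first integer captured by `pattern` in `text`."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return int(match.group(1))


lost = first_int(r"(\d+) lost fragments", SHOW_PPP_MULTILINK)
discarded = first_int(r"(\d+)/\d+ discarded fragments", SHOW_PPP_MULTILINK)
input_errors = first_int(r"(\d+) input errors", SHOW_INTERFACE)

# True when the bundle-level fragment losses account for every input error.
matches = lost + discarded == input_errors
print(f"lost={lost} discarded={discarded} input_errors={input_errors}")
print("fragment losses account for input errors:", matches)
```

Both outputs need to be captured at about the same moment, since the counters keep incrementing.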

Do you think tearing down the Multilink interface and re-establishing it should clear it? I guess if it doesn't, I am looking at a telco issue?

What is the best way to tear it down? Just remove the interface from the bundle?

Thanks.

Hello f.sorrentino,

I think shutting down or tearing down the connection on your end will not eliminate the issue if a noisy line or differential delay is the root of it. The Telco is the one responsible for rebuilding the circuit and for testing it. You can perform HW loopback tests on your end just to rule your side out, but the Telco should handle the issue. Try getting "show interfaces" output from the links that are members of the bundle too.

Regards,
