wayfaring
Beginner

packet loss freak show

Office A's ISP was taken over by another ISP. Since then we see ~10-20% packet loss to select internet hosts and to many of our 70 other offices, which run Cisco equipment on different ISPs across the country. Jerkiness and stuttering in SSH to the WAN interface, and in RDP and VNC sessions over the DMVPN tunnel to this location from the other offices, indicate it's not just ICMP being dropped. The packet loss occurs whether the office A router or a laptop set to the same static IP is plugged directly into the fiber ONT. Bad destinations have packet loss 24/7; good destinations have no packet loss 24/7. There is no packet loss between the office A router WAN interface and the ISP gateway, or to Google as an external example. Prior to the ISP takeover the location had no packet loss using the same network equipment and ONT. All packet loss involves office A as source or destination; there are no packet loss issues at or between the other offices. If another ISP is plugged into the same router WAN interface at office A, the packet loss is gone. Offices are on 881s, 2901s and 2911s with a variety of IOS versions, and a couple of locations have ASAs + ASRs.

Tests from different ISPs in different states turned up some unusual results. A 2911 running c2900-universalk9-mz.SPA.152-4.M1 at office B has packet loss to/from office A's 2911 on the same IOS, yet a 2821 running c2800nm-adventerprisek9-mz.151-4.M1, also at office B on the same ISP connection but using a different public IP within the /29 address block, does not have packet loss to/from the office A 2911. Both of the office B routers reach the internet through interfaces on an ISP-managed Juniper SRX 240. None of the other offices, with or without packet loss to office A, have a Juniper gateway on site; it just happens to be the setup for office B. The office A 2911 has no packet loss to the office C ASA5520 outside interface public IP but does have packet loss to the office C ASR1002, which is reached through another public IP on the same ISP NAT'd through the ASA5520 outside interface. The office A 2911 has packet loss to the office D ASA5520 public outside interface IP and much worse packet loss to the office D ASR1002, on the same ISP connection, reached through another public IP NAT'd through the ASA5520 outside interface. Tickets with the new ISP, with traceroutes, have yielded no explanation or solution. I'm feeling out whether service interruptions for analysis will be required to get to the bottom of this. A packet debug from the office A 2911 during ICMP loss with the office B 2911 is attached. Any ideas or suggestions based on similar experiences?



Office A, B, C WAN interfaces during tests (ACLs make no difference):
interface GigabitEthernet0/1
 bandwidth 75000
 ip address **** 255.255.255.0
 ip access-group edge-ingress in
 ip access-group edge-egress out
 ip flow ingress
 ip flow egress
 ip nat outside
 ip virtual-reassembly in
 duplex auto
 speed auto

     Full Duplex, 1Gbps, media type is RJ45
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out



2911 office B to 2911 office A:
B-2911#ping a.a.a.a repeat 500
Type escape sequence to abort.
Sending 500, 100-byte ICMP Echos to a.a.a.a, timeout is 2 seconds:
!!!!..!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!.!!.!!!!!!!!!!!!!.!!..!!!!.!!!!!.!!!!!!!.!!!!!!.!.!!!!!!!.!!.!!!!!!
!!.!!!!!.!!!!!!!!!!!!!!!!!!!!.!!!!!!!!.!!!!!!!.!!!!!!!!!!!!!.!!!!!!!!.
!!!!!!!!!!...!!!!!!.!!!!!!.!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!..!!!!.!!!!!!
!!..!!!.!!!.!!!!!!.!!!!..!!!!!!.!!!!!!!!!.!!!!!!!!!!!!!.!!!!!!!!!!!!!.
!!!!!!!!!!!!..!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!.!!!
!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!.!!!!!!!!!!!!
!.!!!..!.!
Success rate is 89 percent (446/500), round-trip min/avg/max = 76/79/180 ms


2821 office B to 2911 office A -
B-2821#ping a.a.a.a repeat 500
Type escape sequence to abort.
Sending 500, 100-byte ICMP Echos to a.a.a.a, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!
Success rate is 100 percent (500/500), round-trip min/avg/max = 76/78/116 ms



2911 office A to 2911 office B:
A-2911#ping b.b.b.146 repeat 500
Type escape sequence to abort.
Sending 500, 100-byte ICMP Echos to b.b.b.146, timeout is 2 seconds:
!!!!!!!!!!.!!!!!!!!!!!!!!!.!!.!.!!!.!!!!!.!.!!!!!!!!!!!!!!!!!!!!!!!!!.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!..!!!!!!!!!!!!!!!!!
!!!!!!!.!!!!!!!!!!!.!!!!!!!!!!!!!.!!!!!!.!.!!!.!!!!!!!!!!!!!!!!!!.!!!!
!!!!!!.!.!!!!!!!!!!!.!!!!!!.!!!!!!!!.!!!!..!!!!!!!!.!!!!!!!!!!!!!!!!!!
!!!!!.!.!.!!!!!!!!!!!!!!!!!..!!!!!!!!!!!!!.!!!!!!...!!!.!!!!!.!!!!!!!!
!!.!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!.!!!!!.!!!!
!!!!!!!..!!.!!.!!!!!.!!!!!!!!!.!!.!!!!!!!!..!!!!!!.!!!!.!!!!!!!!!!!!!!
!!!!!!!!!!
Success rate is 89 percent (447/500), round-trip min/avg/max = 76/78/156 ms


2911 office A to 2821 office B -
A-2911#ping b.b.b.147 repeat 500
Type escape sequence to abort.
Sending 500, 100-byte ICMP Echos to b.b.b.147, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!
Success rate is 100 percent (500/500), round-trip min/avg/max = 76/78/104 ms



2911 office A to ASA5520 office C -
A-2911#ping c.c.c.asa repeat 500
Type escape sequence to abort.
Sending 500, 100-byte ICMP Echos to c.c.c.asa, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!
Success rate is 100 percent (500/500), round-trip min/avg/max = 68/70/88 ms


2911 office A to ASR1002 office C -
A-2911#ping c.c.c.asr repeat 500
Type escape sequence to abort.
Sending 500, 100-byte ICMP Echos to c.c.c.asr, timeout is 2 seconds:
!!!!!!!!!!!.!!!!!.!!!!.!!!!!!!!!!!!!..!!!!.!!!!!!.!!!!.!!!!!!!!.!.!!!!
!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!.!!!!!.!.!!!!!!!!!!!!!!!.!!!!!!.
!!!!!!!!!.!!.!!!!!!!!!.!!!!!!!!!!!!!!!.!!.!.!!!!!!!.!!!!!..!!!!!.!!!!!
!!!!!!!!!!!!!!!!!!!!!..!!!!!!!.!!!!!.!!!!.!.!!!!!!.!!!!!!!!!!!!!!!!!.!
!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!.!!!!!!!!!!!!!..!!.!!!!
!.!!!!!!!!!!!!!.!!.!!!!!!!!!!!!!!!!!!!!!...!!!.!!!!!!!!!!!!!!!!!!!.!!!
!!!!!.!.!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!.!!!!!!.!!!!!!!!.!.!!!!.!!!!!!!!
!!!.!!!!!.
Success rate is 88 percent (441/500), round-trip min/avg/max = 68/70/76 ms



2911 office A to ASA5520 office D -
A-2911#ping d.d.d.asa repeat 500
Type escape sequence to abort.
Sending 500, 100-byte ICMP Echos to d.d.d.asa, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!.!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!.!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!.!.!!!!!!!.!!!!!!!!!!
!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!
Success rate is 97 percent (489/500), round-trip min/avg/max = 44/48/64 ms


2911 office A to ASR1002 office D -
A-2911#ping d.d.d.asr repeat 500
Type escape sequence to abort.
Sending 500, 100-byte ICMP Echos to d.d.d.asr, timeout is 2 seconds:
!!!!!!!.!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!.!!!!!!!!!!!!!!!
!!!!!!.!!!!!!!!!!.!!!!!!!..!!!!!!!!!!!!!!!!!!!!!!!!.!.!!!.!.!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!.!!!
!!!!!!!!!!!!!!!!!!!!!!!.!!!!.!!!!.!!.!!!.!!!!!!!!!.!!!!.!!!!!!!!!!.!!.
!!!!!.!!!..!.!.!!...!.!.!!.!!!.....!!!!!!!!!...!!!!!!.!!.!.!!!!!.!!.!!
!!!!!.!!!!!.!.!!!.!.!.!..!!!.!!!.!!!.!.!!!..!!!.!!!!!!...!!.!.!!!.!!!!
.!!..!!.!!!!!...!.!!!.!!!!..!.!.!!!.!!!!!!!.!!..!.!..!!.!.!!!.!!!!.!.!
!.!!..!.!!
Success rate is 80 percent (403/500), round-trip min/avg/max = 36/43/56 ms
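
Comparing transcripts like the ones above by eye is error-prone, so here is a minimal Python sketch that counts the `!` and `.` marks in an IOS ping transcript and reports the loss percentage and the longest run of consecutive timeouts. The function name `summarize_ping` and the shortened 20-probe sample are made up for illustration; they are not from the original tests.

```python
import re

def summarize_ping(output: str) -> dict:
    """Summarize the '!' (reply) and '.' (timeout) marks of an IOS ping transcript."""
    # Keep only lines made up entirely of reply/timeout marks.
    marks = "".join(
        line.strip()
        for line in output.splitlines()
        if re.fullmatch(r"[!.]+", line.strip())
    )
    sent = len(marks)
    lost = marks.count(".")
    # Runs of consecutive timeouts show whether loss is bursty or scattered.
    bursts = [len(run) for run in re.findall(r"\.+", marks)]
    return {
        "sent": sent,
        "lost": lost,
        "loss_pct": round(100.0 * lost / sent, 1) if sent else 0.0,
        "longest_burst": max(bursts, default=0),
    }

# Hypothetical, shortened transcript in the same format as the tests above.
sample = """\
Sending 20, 100-byte ICMP Echos to a.a.a.a, timeout is 2 seconds:
!!!!..!!!!
!!!.!!!!!!
Success rate is 85 percent (17/20), round-trip min/avg/max = 76/79/180 ms
"""

print(summarize_ping(sample))
```

Pasting each full transcript into `summarize_ping` gives figures that can be lined up per destination when reporting to the ISP.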


1 ACCEPTED SOLUTION

Philip D'Ath
Advisor

If nothing else has changed except the ISP takeover then it won't be your kit.

What did the ISP have to say about this?

It is possible they are now using policing to limit your bandwidth, whereas maybe they were using shaping before.

It is possible they are advertising the prefixes out of multiple bearers, and one of them has an issue (routing loop, circuit issue, etc).

You need to start with your ISP on this one.
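
The "multiple bearers" possibility can be pictured with a toy model. This is a sketch under stated assumptions, not a description of what the ISP actually does: it assumes traffic is spread over two parallel paths by a per-flow hash of the source and destination addresses, and that one path is faulty. The XOR hash, the `pick_path` name, the loss figures and the addresses (RFC 5737 documentation ranges) are all hypothetical. It only illustrates how two neighbouring addresses such as .146 and .147 could see different loss over what looks like the same connection.

```python
from ipaddress import ip_address

def pick_path(src: str, dst: str, n_paths: int) -> int:
    """Toy per-flow hash: XOR the two addresses, then take the result modulo the path count."""
    return (int(ip_address(src)) ^ int(ip_address(dst))) % n_paths

# Assumed loss rate per parallel path: path 0 is faulty, path 1 is healthy.
# The 11% figure mirrors the 89 percent success rate in the ping output.
PATH_LOSS = [0.11, 0.0]

office_a = "198.51.100.10"  # placeholder documentation address
office_b = {"2911": "203.0.113.146", "2821": "203.0.113.147"}  # placeholders

for router, addr in office_b.items():
    path = pick_path(office_a, addr, len(PATH_LOSS))
    print(f"{router} ({addr}): path {path}, expected loss {PATH_LOSS[path]:.0%}")
```

In this toy model the .146 flow lands on the faulty path and the .147 flow on the healthy one. Real equipment uses different hash inputs and algorithms, so this is a hypothesis to raise with the ISP rather than a diagnosis.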


The problem has been presented to the new ISP repeatedly for a couple of months. I hint at peering issues but can't prove that. Responses have ranged from "not our problem" to insults. It's never clear whether they're acknowledging a problem and acting on it or have written me off as a time waster. ISP-performed testing from their gateway a week ago showed no packet loss to some internet destinations that we see packet loss with, but did show packet loss to other office IPs that we don't see packet loss with, which is another unusual test result. Is there a clue in office B's 2821 (.147) vs 2911 (.146) result that could give them a better idea of what to look for? I did find another 2821 on the same IOS at office E in another state, with two different ISPs connected on different interfaces, and I see packet loss with office A from one ISP but none from the other, so it doesn't seem the 2821 or its IOS is relevant. The office E ISP that has packet loss with office A also happens to be the same ISP in use at office B, but packet loss with office A occurs across multiple ISPs.

This is going to be inconclusive but may turn up helpful information.  Take a broken path.  Do a traceroute along it, noting down all the IP addresses.  Send 100 or so pings to each hop, starting at the closest hop.

You should hopefully reach a point where the ping to one hop works fine, but the next hop has the packet loss.  This means the issue is likely to lie between these two points.  You can then take this info back to the ISP.
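
The per-hop procedure above can be sketched in Python once the ping counts for each hop have been collected by hand. This is a minimal sketch: the `locate_loss` name, the 2% threshold and the sample addresses and counts are assumptions for illustration. Because some routers rate-limit ICMP addressed to themselves, it only reports a hop if the loss persists from that hop all the way to the destination.

```python
from typing import List, Optional, Tuple

Hop = Tuple[str, int, int]  # (address, probes sent, replies received)

def loss_pct(sent: int, received: int) -> float:
    """Percentage of probes that got no reply."""
    return 100.0 * (sent - received) / sent if sent else 0.0

def locate_loss(hops: List[Hop], threshold: float = 2.0) -> Optional[Tuple[Optional[str], str]]:
    """Return (last clean hop, first lossy hop) where loss starts and persists to the destination."""
    first_bad = None
    # Walk back from the destination while every hop is still lossy.
    for i in range(len(hops) - 1, -1, -1):
        _, sent, received = hops[i]
        if loss_pct(sent, received) < threshold:
            break
        first_bad = i
    if first_bad is None:
        return None  # the destination itself is clean
    prev = hops[first_bad - 1][0] if first_bad > 0 else None
    return prev, hops[first_bad][0]

# Hypothetical results, ordered from the closest hop to the destination.
hops = [
    ("192.0.2.1", 100, 100),
    ("198.51.100.1", 100, 70),   # lossy, but later hops are clean: likely ICMP rate limiting
    ("198.51.100.9", 100, 100),
    ("203.0.113.5", 100, 89),
    ("203.0.113.146", 100, 88),
]

print(locate_loss(hops))
```

With these sample numbers the loss is attributed to the segment between 198.51.100.9 and 203.0.113.5, and the isolated loss at the second hop is ignored.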

A period of back and forth with the ISP NOC identified problems in the paths but resulted in no improvement for us, so in the end we had to change ISPs, which resolved the problem.  It's difficult to tell in some cases when carriers are dropping ICMP and skewing test results, but checking hops along the traceroute as you suggested is useful and was looked at with the NOC.