
New WAN network rollout - performance issues

Ableton34
Level 1

Hi all.

Wondering if you can help. Where I work, we have just finished rolling out a new WAN to replace the previous one, moving to a single ISP; previously it was a mixed-ISP network with different flavours of circuit connecting around 140 sites into our core.

Since we migrated the sites, an increasing number of them have been complaining about the performance of all applications.

To give you some background: we use BGP to connect the CE and PE routers over point-to-point serial addresses. BGP advertises the WAN ranges into the ISP cloud, the ISP sends all traffic back to our core network (again via BGP), and we advertise our internal LAN server ranges back out into the cloud.
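For illustration, a minimal sketch of the CE-side setup described (the AS numbers, addresses and ranges below are placeholders, not our real values):

router bgp 65001
 ! eBGP peering to the ISP PE across the point-to-point link
 neighbor 10.0.0.2 remote-as 65000
 ! advertise this site's WAN range into the ISP cloud
 network 172.24.1.0 mask 255.255.255.0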

We have a core site and a DR site, which between them have two pairs of resilient ASR routers that load-share the traffic from the various WAN sites, inbound and outbound.

We also have an issue with the ISP: the standard MTU allowed in their cloud is 1492. I know this isn't standard, and we have raised the issue, but they cannot change it, so we have had to apply the ip mtu command on our WAN edge routers so that the server traffic is not affected. This seemed to solve the original problem of nothing working properly.
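The workaround is just the interface-level IP MTU setting, along these lines (the interface name is a placeholder):

interface GigabitEthernet0/0/0
 ip mtu 1492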

Anyways, the complaints are getting more and more all the time and we can see  no network issues at all to the affected sites. Testing with ping/traceroute to check routing issues, no packet loss or high response times seen, a very healthy over all core to end site link. We just cannot work it out and have been scratching our heads for weeks.
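For anyone checking path MTU the same way: in IOS the ping size is the total datagram size, so across a 1492-byte cloud the first of these should succeed and the second should fail (the address is a placeholder):

ping 10.1.1.1 size 1492 df-bit
ping 10.1.1.1 size 1500 df-bit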

The server team and most others are blaming the new network, with good reason, but the network engineers just cannot find anything wrong! Traces have been done and all TCP conversations take place normally, IP SLA probes do not show anything untoward, and our network monitoring tools aren't seeing anything.
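The IP SLA probes are along these lines (the entry number, destination and source interface are placeholders):

ip sla 10
 icmp-echo 10.1.1.1 source-interface GigabitEthernet0/0/0
 frequency 60
ip sla schedule 10 life forever start-time now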

Has anyone experienced anything like this before? Any ideas, as we have run out of them!

Thanks

1 Accepted Solution


Hello,

configuring both ends is good practice but, as Joseph suggested, HQ is probably sufficient.

I am very curious if that adjustment makes a difference...


23 Replies

lpassmore
Level 1

A couple of things to look at if you haven't already:

  • Have the ISP prove they aren't dropping any packets due to mismatched duplex or other interface errors on their end of your main links. Make sure you have done the same.
  • If you are running a non-native speed (e.g. 200 Mb/s) into the WAN, check whether they are policing or shaping to that limit, and make sure you are shaping correctly at your end.
  • Do you have QoS over the network? Are they honouring the markings and applying the correct bandwidth profiles according to your needs? (A quick way to check the last two points is sketched below.)
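On a Cisco edge router, class-level drops and any policer/shaper exceed counters show up here (the interface name is a placeholder):

show policy-map interface GigabitEthernet0/0/0
show interface GigabitEthernet0/0/0 | include errors|drops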

LP

Thanks for your reply.

-No duplex or other interface issues on the main links.

-No issues with QoS: no drops on PE or CE, and correct bandwidth settings.

QoS config below for reference:

Core WAN routers

class-map match-all DACP_PREMIUM
 description All EF traffic
 match dscp ef
class-map match-all DACP_ENHANCED-MARKING
 description DACP_ENHANCED traffic matching
 match access-group name DACP_ENHANCED-marking
class-map match-all DACP_ENHANCED
 description All AF traffic
 match dscp af31
class-map match-all DACP_PREMIUM-MARKING
 description DACP_PREMIUM traffic matching
 match access-group name DACP_PREMIUM-marking
!
policy-map WAN-QoS-policy
 class DACP_PREMIUM
  bandwidth 20000
 class DACP_ENHANCED
  bandwidth 10200
policy-map WAN-traffic-marking
 class DACP_PREMIUM-MARKING
  set dscp ef
 class DACP_ENHANCED-MARKING
  set dscp af31


WAN branch site

class-map SHAPE-ALL
 match any
class-map DACP_PREMIUM_QUEING
 match dscp ef
class-map DACP_ENHANCED2_QUEING
 match dscp af31
!
policy-map WAN-QUEING-CHILD
 class DACP_PREMIUM_QUEING
  priority 3000
 class DACP_ENHANCED2_QUEING
  bandwidth 10200
policy-map WAN-QUEING-PARENT
 class SHAPE-ALL
  shape average 1000000000
  service-policy WAN-QUEING-CHILD
Hello,

it would be useful to know how your load sharing/load balancing is configured. Can you post the configs of the main/core site and one of the remote sites?

Hi,

No load balancing on the branch sites. The traffic is organised in groups and redistributed into our IGP (EIGRP), and it has a choice of paths out of either the HQ site or the DR site. We have noticed asymmetric routing happening on some of the WAN ranges, but I'm led to believe this shouldn't be an issue as there are no firewalls/NAT anywhere in the path. An example from one of our ASR core WAN routers:

access-list 50 remark route-map eigrp-to-bgp-redist catch all to deny advertising routes
access-list 50 permit 0.0.0.0
access-list 55 remark route-map eigrp-to-bgp-redist define specific routes for inbound load balancing
access-list 55 permit 172.24.xxx.0 0.0.0.255
access-list 55 permit 172.24.xxx.0 0.0.0.255
access-list 55 permit 172.24.xxx.0 0.0.0.255
access-list 55 permit 172.24.xxx.0 0.0.0.255
access-list 60 remark route-map eigrp-to-bgp-redist define specific routes for inbound load balancing
access-list 60 permit 172.24.xxx.0 0.0.0.255
access-list 60 permit 172.24.xxx.0 0.0.0.255
access-list 60 permit 172.24.xxx.0 0.0.0.255
access-list 60 permit 172.24.xxx.0 0.0.0.255
access-list 60 permit 172.24.xxx.0 0.0.0.255
access-list 65 remark route-map bgp-to-eigrp-redist define specific routes for outbound load balancing
access-list 65 permit 172.23.xxx.0 0.0.31.255
access-list 65 permit 172.23.xxx.0 0.0.31.255
access-list 65 permit 172.23.xxx.0 0.0.31.255
access-list 70 remark route-map bgp-to-eigrp-redist define specific routes for outbound load balancing
access-list 70 permit 172.23.xxx.0 0.0.31.255
access-list 70 permit 172.23.xxx.0 0.0.31.255
access-list 70 permit 10.99.xxx.0 0.0.0.255
access-list 70 permit 10.99.xxx.0 0.0.0.255
!
route-map default-originate permit 10
match ip address 50
set metric 100
!
route-map default-originate deny 20
!
route-map bgp-to-eigrp-redist permit 10
match ip address 65
set metric 10040000 10 0 1 1
!
route-map bgp-to-eigrp-redist permit 20
match ip address 70
set metric 10010000 40 0 1 1
!
route-map bgp-to-eigrp-redist permit 30
set metric 10040000 10 0 1 1
!
route-map eigrp-to-bgp-redist permit 10
match ip address 55
set metric 100
!
route-map eigrp-to-bgp-redist permit 20
match ip address 60
set metric 400
!
route-map eigrp-to-bgp-redist deny 30
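For anyone reading along: the five values in 'set metric' for EIGRP redistribution are, in order, bandwidth in Kbit/s, delay in tens of microseconds, reliability (0-255), load (1-255), and MTU in bytes. So, annotating one line from the config above:

set metric 10040000 10 0 1 1
! bandwidth 10040000 Kbit/s, delay 10 (x10 usec), reliability 0, load 1, MTU 1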

Hello,

is your setup that your application servers are located at the central site, and users at the remote sites access those servers?

Can you post the output of 'show interface' for the interfaces connecting the remote sites and the central site, as well as the configs of both of those interfaces?

Hi,

Yes, remote sites connect to application servers at the central site. There is a cloud network in between, so I can only give you the interface on the CE router at the remote site and the interface on the central WAN router:

remote site example:

GigabitEthernet0/0/0 is up, line protocol is up
Hardware is BUILT-IN-2T+6X1GE, address is 843d.c6ac.7102 (bia 843d.c6ac.7102)
Description:
Internet address is 10.xxx.xxx.xxx/30
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive not supported
Full Duplex, 1000Mbps, link type is auto, media type is SX
output flow-control is on, input flow-control is on
ARP type: ARPA, ARP Timeout 04:00:00
Last input 02:53:40, output 02:53:40, output hang never
Last clearing of "show interface" counters never
Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: Class-based queueing
Output queue: 0/40 (size/max)
30 second input rate 4855000 bits/sec, 987 packets/sec
30 second output rate 2141000 bits/sec, 878 packets/sec
3102695976 packets input, 1522999527533 bytes, 0 no buffer
Received 0 broadcasts (0 IP multicasts)
0 runts, 0 giants, 0 throttles
1 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
2739840973 packets output, 923455122744 bytes, 0 underruns
0 output errors, 0 collisions, 4 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out

central site:

GigabitEthernet0/0/0 is up, line protocol is up
Hardware is BUILT-IN-2T+6X1GE, address is 843d.c6e6.bd02 (bia 843d.c6e6.bd02)
Description: WAN link to
Internet address is 10.xxx.xxx.xxx/30
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 4/255, rxload 3/255
Encapsulation ARPA, loopback not set
Keepalive not supported
Full Duplex, 1000Mbps, link type is auto, media type is SX
output flow-control is on, input flow-control is on
ARP type: ARPA, ARP Timeout 04:00:00
Last input 02:16:17, output 02:16:17, output hang never
Last clearing of "show interface" counters 34w0d
Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: Class-based queueing
Output queue: 0/40 (size/max)
5 minute input rate 14492000 bits/sec, 6053 packets/sec
5 minute output rate 17265000 bits/sec, 4222 packets/sec
24295970568 packets input, 6244383276120 bytes, 0 no buffer
Received 6 broadcasts (0 IP multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
13904865541 packets output, 6704315439883 bytes, 0 underruns
0 output errors, 0 collisions, 36 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out

Hello,

the interfaces look clean.

I had a case once where adjusting the buffers on the edge routers made a huge difference. Can you post the output of 'show buffers' from both routers?

remote site:

Buffer elements:
1480 in free list
133842621 hits, 0 misses, 1019 created

Public buffer pools:
Small buffers, 104 bytes (total 1200, permanent 1200):
1198 in free list (200 min, 2500 max allowed)
47835648 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
Middle buffers, 600 bytes (total 900, permanent 900):
899 in free list (100 min, 2000 max allowed)
70569704 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
Big buffers, 1536 bytes (total 900, permanent 900, peak 907 @ 7w0d):
900 in free list (50 min, 1800 max allowed)
41504395 hits, 7 misses, 7 trims, 7 created
0 failures (0 no memory)
VeryBig buffers, 4520 bytes (total 100, permanent 100, peak 101 @ 7w0d):
100 in free list (0 min, 300 max allowed)
1805946 hits, 58 misses, 1 trims, 1 created
58 failures (0 no memory)
Large buffers, 5024 bytes (total 100, permanent 100, peak 101 @ 7w0d):
100 in free list (0 min, 300 max allowed)
58 hits, 0 misses, 1 trims, 1 created
0 failures (0 no memory)
VeryLarge buffers, 8256 bytes (total 100, permanent 100):
100 in free list (0 min, 300 max allowed)
3 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
Huge buffers, 18024 bytes (total 20, permanent 20, peak 21 @ 7w0d):
20 in free list (0 min, 33 max allowed)
0 hits, 0 misses, 1 trims, 1 created
0 failures (0 no memory)

Interface buffer pools:
CF Small buffers, 104 bytes (total 101, permanent 100, peak 101 @ 7w0d):
101 in free list (100 min, 200 max allowed)
0 hits, 0 misses, 685 trims, 686 created
0 failures (0 no memory)
Generic ED Pool buffers, 512 bytes (total 101, permanent 100, peak 101 @ 7w0d):
101 in free list (100 min, 100 max allowed)
0 hits, 0 misses
CF Middle buffers, 600 bytes (total 101, permanent 100, peak 101 @ 7w0d):
101 in free list (100 min, 200 max allowed)
0 hits, 0 misses, 685 trims, 686 created
0 failures (0 no memory)
Syslog ED Pool buffers, 600 bytes (total 1057, permanent 1056, peak 1057 @ 7w0d):
1025 in free list (1056 min, 1056 max allowed)
136 hits, 0 misses
EOBC0 buffers, 1524 bytes (total 256, permanent 256):
256 in free list (0 min, 256 max allowed)
62 hits, 0 fallbacks
CF Big buffers, 1536 bytes (total 26, permanent 25, peak 26 @ 7w0d):
26 in free list (25 min, 50 max allowed)
0 hits, 0 misses, 685 trims, 686 created
0 failures (0 no memory)
IPC buffers, 4096 bytes (total 2604, permanent 2604):
2599 in free list (868 min, 8680 max allowed)
61 hits, 0 fallbacks, 0 trims, 0 created
0 failures (0 no memory)
CF VeryBig buffers, 4520 bytes (total 3, permanent 2, peak 3 @ 7w0d):
3 in free list (2 min, 4 max allowed)
0 hits, 0 misses, 685 trims, 686 created
0 failures (0 no memory)
CF Large buffers, 5024 bytes (total 2, permanent 1, peak 2 @ 7w0d):
2 in free list (1 min, 2 max allowed)
0 hits, 0 misses, 685 trims, 686 created
0 failures (0 no memory)
IPC Medium buffers, 16384 bytes (total 2, permanent 2):
2 in free list (1 min, 8 max allowed)
0 hits, 0 fallbacks, 0 trims, 0 created
0 failures (0 no memory)
Private Huge IPC buffers, 18024 bytes (total 1, permanent 0, peak 1 @ 7w0d):
1 in free list (0 min, 4 max allowed)
0 hits, 0 misses, 685 trims, 686 created
0 failures (0 no memory)
Private Huge buffers, 65280 bytes (total 1, permanent 0, peak 1 @ 7w0d):
1 in free list (0 min, 4 max allowed)
0 hits, 0 misses, 685 trims, 686 created
0 failures (0 no memory)
IPC Large buffers, 65535 bytes (total 17, permanent 16, peak 17 @ 7w0d):
17 in free list (16 min, 16 max allowed)
0 hits, 0 misses, 110885 trims, 110886 created
0 failures (0 no memory)

Header pools:
Header buffers, 0 bytes (total 266, permanent 256, peak 266 @ 7w0d):
10 in free list (10 min, 512 max allowed)
253 hits, 3 misses, 0 trims, 10 created
0 failures (0 no memory)
256 max cache size, 256 in cache
51039987 hits in cache, 0 misses in cache

Particle Clones:
1024 clones, 0 hits, 0 misses

Public particle pools:
F/S buffers, 256 bytes (total 384, permanent 384):
128 in free list (128 min, 1024 max allowed)
256 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
256 max cache size, 256 in cache
0 hits in cache, 0 misses in cache
Normal buffers, 512 bytes (total 512, permanent 512):
384 in free list (128 min, 1024 max allowed)
128 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
128 max cache size, 128 in cache
0 hits in cache, 0 misses in cache

Private particle pools:
lsmpi_rx buffers, 416 bytes (total 8194, permanent 8194):
0 in free list (0 min, 8194 max allowed)
8194 hits, 0 misses
8194 max cache size, 0 in cache
75731524 hits in cache, 0 misses in cache
lsmpi_tx buffers, 416 bytes (total 4098, permanent 4098):
0 in free list (0 min, 4098 max allowed)
4098 hits, 0 misses
4098 max cache size, 4095 in cache
52664741 hits in cache, 0 misses in cache

core site:

Buffer elements:
1480 in free list
290666885 hits, 0 misses, 1019 created

Public buffer pools:
Small buffers, 104 bytes (total 1200, permanent 1200):
1198 in free list (200 min, 2500 max allowed)
66769202 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
Middle buffers, 600 bytes (total 900, permanent 900):
899 in free list (100 min, 2000 max allowed)
199330039 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
Big buffers, 1536 bytes (total 900, permanent 900, peak 901 @ 7w0d):
900 in free list (50 min, 1800 max allowed)
136510026 hits, 0 misses, 1 trims, 1 created
0 failures (0 no memory)
VeryBig buffers, 4520 bytes (total 100, permanent 100, peak 102 @ 7w0d):
100 in free list (0 min, 300 max allowed)
4359478 hits, 22 misses, 2 trims, 2 created
22 failures (0 no memory)
Large buffers, 5024 bytes (total 100, permanent 100, peak 101 @ 7w0d):
100 in free list (0 min, 300 max allowed)
22 hits, 0 misses, 1 trims, 1 created
0 failures (0 no memory)
VeryLarge buffers, 8256 bytes (total 100, permanent 100):
100 in free list (0 min, 300 max allowed)
3 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
Huge buffers, 18024 bytes (total 20, permanent 20, peak 21 @ 7w0d):
20 in free list (0 min, 33 max allowed)
0 hits, 0 misses, 1 trims, 1 created
0 failures (0 no memory)

Interface buffer pools:
CF Small buffers, 104 bytes (total 101, permanent 100, peak 101 @ 7w0d):
101 in free list (100 min, 200 max allowed)
0 hits, 0 misses, 2246 trims, 2247 created
0 failures (0 no memory)
Generic ED Pool buffers, 512 bytes (total 101, permanent 100, peak 101 @ 7w0d):
101 in free list (100 min, 100 max allowed)
0 hits, 0 misses
CF Middle buffers, 600 bytes (total 101, permanent 100, peak 101 @ 7w0d):
101 in free list (100 min, 200 max allowed)
0 hits, 0 misses, 2246 trims, 2247 created
0 failures (0 no memory)
Syslog ED Pool buffers, 600 bytes (total 1057, permanent 1056, peak 1057 @ 7w0d):
1025 in free list (1056 min, 1056 max allowed)
1262 hits, 0 misses
EOBC0 buffers, 1524 bytes (total 256, permanent 256):
256 in free list (0 min, 256 max allowed)
64 hits, 0 fallbacks
CF Big buffers, 1536 bytes (total 26, permanent 25, peak 26 @ 7w0d):
26 in free list (25 min, 50 max allowed)
0 hits, 0 misses, 2246 trims, 2247 created
0 failures (0 no memory)
IPC buffers, 4096 bytes (total 2604, permanent 2604):
2599 in free list (868 min, 8680 max allowed)
90 hits, 0 fallbacks, 0 trims, 0 created
0 failures (0 no memory)
CF VeryBig buffers, 4520 bytes (total 3, permanent 2, peak 3 @ 7w0d):
3 in free list (2 min, 4 max allowed)
0 hits, 0 misses, 2246 trims, 2247 created
0 failures (0 no memory)
CF Large buffers, 5024 bytes (total 2, permanent 1, peak 2 @ 7w0d):
2 in free list (1 min, 2 max allowed)
0 hits, 0 misses, 2246 trims, 2247 created
0 failures (0 no memory)
IPC Medium buffers, 16384 bytes (total 2, permanent 2):
2 in free list (1 min, 8 max allowed)
0 hits, 0 fallbacks, 0 trims, 0 created
0 failures (0 no memory)
Private Huge IPC buffers, 18024 bytes (total 1, permanent 0, peak 1 @ 7w0d):
1 in free list (0 min, 4 max allowed)
0 hits, 0 misses, 2246 trims, 2247 created
0 failures (0 no memory)
Private Huge buffers, 65280 bytes (total 1, permanent 0, peak 1 @ 7w0d):
1 in free list (0 min, 4 max allowed)
0 hits, 0 misses, 2246 trims, 2247 created
0 failures (0 no memory)
IPC Large buffers, 65535 bytes (total 17, permanent 16, peak 17 @ 7w0d):
17 in free list (16 min, 16 max allowed)
0 hits, 0 misses, 363225 trims, 363226 created
0 failures (0 no memory)

Header pools:
Header buffers, 0 bytes (total 266, permanent 256, peak 266 @ 7w0d):
10 in free list (10 min, 512 max allowed)
253 hits, 3 misses, 0 trims, 10 created
0 failures (0 no memory)
256 max cache size, 256 in cache
136441706 hits in cache, 0 misses in cache

Particle Clones:
1024 clones, 0 hits, 0 misses

Public particle pools:
F/S buffers, 256 bytes (total 384, permanent 384):
128 in free list (128 min, 1024 max allowed)
256 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
256 max cache size, 256 in cache
0 hits in cache, 0 misses in cache
Normal buffers, 512 bytes (total 512, permanent 512):
384 in free list (128 min, 1024 max allowed)
128 hits, 0 misses, 0 trims, 0 created
0 failures (0 no memory)
128 max cache size, 128 in cache
0 hits in cache, 0 misses in cache

Private particle pools:
lsmpi_rx buffers, 416 bytes (total 8194, permanent 8194):
0 in free list (0 min, 8194 max allowed)
8194 hits, 0 misses
8194 max cache size, 0 in cache
138012908 hits in cache, 0 misses in cache
lsmpi_tx buffers, 416 bytes (total 4098, permanent 4098):
0 in free list (0 min, 4098 max allowed)
4098 hits, 0 misses
4098 max cache size, 4095 in cache
143529959 hits in cache, 0 misses in cache

Hello,

we could do some buffer adjustments, but before that (and referring to your original post), since you changed the MTU size to 1492, make sure that the interfaces configured with that MTU size also have:

ip tcp adjust-mss 1452

configured.
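For reference, the edge interface would then look something like this (the interface name and addressing are placeholders):

interface GigabitEthernet0/0/0
 description WAN link to ISP cloud
 ip address 10.0.0.1 255.255.255.252
 ip mtu 1492
 ip tcp adjust-mss 1452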

Thanks.

That's interesting, the network architect did not ask for the MSS adjustment to be applied. Could this be a cause of the performance issues?

What buffers do you think need adjusting?

Hello,

MSS could cause problems because it is based on the standard MTU size. If you adjust the MTU, the MSS should be adjusted, too.
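The 1452 value follows directly from the header overhead: 1492 (IP MTU) - 20 (IP header) - 20 (TCP header) = 1452 bytes of TCP payload.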

The buffers don't look too bad, actually; you could adjust the VeryBig buffers. Try the MSS adjustment first, and if that doesn't help, we'll adjust the buffers.
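If it does come to that, the relevant global commands are along these lines (the values are purely illustrative, aimed at the 58 VeryBig failures in your remote-site output, not a recommendation):

buffers verybig permanent 150
buffers verybig min-free 20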

OK, thanks for your help. There is a plan to add the MSS adjustment on the core WAN routers so we can implement a video solution which requires 1452. I presume this will also need to be added to all the remote branch sites.

Thanks for all your help

Hello,

yes, the branch sites need to be adjusted too. Actually, MTU and MSS should be the same at both ends.

Actually, I understand MSS adjust works anywhere along the path, for both ingress and egress, so if my understanding is correct, you only need to apply it to the HQ devices (however, I consider it good practice to configure it on both sides of the smaller-MTU link).
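One way to confirm the clamp is working is to capture a TCP SYN on the WAN edge and check its MSS option. On IOS-XE, something along these lines (exact syntax varies by release; the interface and capture names are placeholders):

monitor capture CAP interface GigabitEthernet0/0/0 both
monitor capture CAP match ipv4 any any
monitor capture CAP start
! generate a new TCP session from a branch, then:
monitor capture CAP stop
show monitor capture CAP buffer detailed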
