
GLBP weighting load balancing

fgasimzade
Level 4

Hello,

 

We have two 2951 routers with GLBP configured. Both have 'glbp 30 weighting 50' and 'glbp 30 load-balancing weighted' configured.

However, the load is not balanced 50/50; I can see that from the CPU usage. One router is at around 90%, the other at around 40%.

As far as I understand, both should carry approximately the same load when weighted 50/50, but that is not what we see.

Am I understanding it correctly?


Hello,

 

post the output of:

 

sh proc cpu | ex 0.00

 

from both routers. CPU utilization is not necessarily related to interface traffic.

 

Also, post the output of 'sh interfaces x' from both routers, where 'x' is the interface that is participating in GLBP.

Hello Georg,

Both routers have the same configuration:

rt-front02-occ#sh proc cpu | ex 0.00
CPU utilization for five seconds: 83%/79%; one minute: 80%; five minutes: 79%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
2 5602056 7446582 752 0.07% 0.06% 0.07% 0 Load Meter
32 728439636 2358937600 308 1.27% 1.19% 1.18% 0 ARP Input
84 67975744 37562221 1809 0.15% 0.17% 0.16% 0 Per-Second Jobs
103 55977864 148776044 376 0.15% 0.14% 0.15% 0 Netclock Backgro
121 167023124 769255553 217 0.55% 0.49% 0.44% 0 IP Input
125 5128836 577149511 8 0.07% 0.03% 0.02% 0 VRRS Main thread
128 13887988 327281552 42 0.15% 0.14% 0.15% 0 Ethernet Msec Ti
193 17087952 56117122 304 0.07% 0.05% 0.05% 0 CEF: IPv4 proces
232 143878688 281588900 510 0.15% 0.18% 0.20% 0 ADJ resolve proc
345 75453636 37074670 2035 0.31% 0.26% 0.25% 0 CFT Timer Proces
352 9660940 1179611605 8 0.07% 0.05% 0.07% 0 GLBP
356 37700516 221115111 170 0.39% 0.42% 0.40% 0 IP SLAs XOS Even

 

RT-FRONT01-BCT#sh proc cpu | ex 0.00
CPU utilization for five seconds: 44%/40%; one minute: 43%; five minutes: 45%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
6 47450204 3280448 14464 0.79% 0.26% 0.18% 0 Check heaps
32 377285540 889329219 424 1.19% 1.14% 1.17% 0 ARP Input
84 46115780 14958400 3082 0.15% 0.19% 0.18% 0 Per-Second Jobs
103 40687268 58900535 690 0.15% 0.15% 0.15% 0 Netclock Backgro
118 44575500 29454910 1513 0.07% 0.04% 0.05% 0 BPSM stat Proces
121 119281256 381831488 312 0.23% 0.23% 0.24% 0 IP Input
128 8556744 1817554630 4 0.15% 0.15% 0.15% 0 Ethernet Msec Ti
165 2869432 447928928 6 0.07% 0.03% 0.02% 0 IPAM Manager
256 886320 76492955 11 0.07% 0.04% 0.05% 0 PPP manager
337 129899760 17558708 7398 0.07% 0.56% 0.59% 0 SNMP ENGINE
345 60140284 14975150 4016 0.15% 0.15% 0.17% 0 CFT Timer Proces
354 23916632 3422184027 6 0.55% 0.51% 0.50% 0 IP SLAs XOS Even

 

rt-front02-occ#sh interfaces gigabitEthernet 0/1.30
GigabitEthernet0/1.30 is up, line protocol is up
Hardware is PQ3_TSEC, address is e4c7.2292.8a61 (bia e4c7.2292.8a61)
Internet address is XX.XXX.XX.222/24
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 25/255, rxload 25/255
Encapsulation 802.1Q Virtual LAN, Vlan ID 30.
ARP type: ARPA, ARP Timeout 04:00:00
Keepalive set (10 sec)
Last clearing of "show interface" counters never

 

RT-FRONT01-BCT#sh interfaces gigabitEthernet 0/1.30
GigabitEthernet0/1.30 is up, line protocol is up
Hardware is PQ3_TSEC, address is c067.afc7.f061 (bia c067.afc7.f061)
Internet address is XX.XXX.XX.223/24
MTU 1500 bytes, BW 10000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 30/255, rxload 30/255
Encapsulation 802.1Q Virtual LAN, Vlan ID 30.
ARP type: ARPA, ARP Timeout 04:00:00
Keepalive set (10 sec)
Last clearing of "show interface" counters never

PS.

CPU usage on both routers keeps changing, e.g. 80%/20% and then 20%/80%. It looks like they preempt each other.

The interface load looks about equal on both routers. Can you post the full output (or a screenshot) of the following from both routers?

show process cpu | ex 0.00%

I ask because I do not see GLBP itself causing a CPU problem:

 

352 9660940 1179611605 8 0.07% 0.05% 0.07% 0 GLBP

 

If possible, which software version is running on both devices?

BB


Hello Balaji,

RT-FRONT01-BCT#show process cpu | ex 0.00%
CPU utilization for five seconds: 88%/84%; one minute: 80%; five minutes: 76%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
2 4082212 2961516 1378 0.07% 0.08% 0.06% 0 Load Meter
15 21252108 14755237 1440 0.23% 0.05% 0.05% 0 Environmental mo
32 378342404 892807641 423 0.95% 1.08% 1.12% 0 ARP Input
84 46197156 15016469 3076 0.15% 0.17% 0.16% 0 Per-Second Jobs
103 40734632 59131260 688 0.15% 0.14% 0.15% 0 Netclock Backgro
118 44731804 29570230 1512 0.07% 0.04% 0.05% 0 BPSM stat Proces
121 119544504 383363661 311 0.47% 0.57% 0.56% 0 IP Input
125 3855836 228757826 16 0.07% 0.03% 0.02% 0 VRRS Main thread
128 8569852 1824812111 4 0.07% 0.15% 0.15% 0 Ethernet Msec Ti
193 11267192 22101588 509 0.07% 0.04% 0.05% 0 CEF: IPv4 proces
232 103929184 158924683 653 0.31% 0.31% 0.31% 0 ADJ resolve proc
281 740124 78285096 9 0.07% 0.02% 0.02% 0 PPP Events
345 60233000 15036171 4005 0.39% 0.34% 0.32% 0 CFT Timer Proces
354 23959800 3436153589 6 0.39% 0.44% 0.46% 0 IP SLAs XOS Even

 

 

rt-front02-occ#show process cpu | ex 0.00%
CPU utilization for five seconds: 31%/28%; one minute: 30%; five minutes: 27%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
32 729445288 2362522793 308 1.27% 1.20% 1.15% 0 ARP Input
84 68051888 37620407 1808 0.23% 0.19% 0.18% 0 Per-Second Jobs
103 56022152 149007497 375 0.15% 0.15% 0.15% 0 Netclock Backgro
118 92218620 74487725 1238 0.07% 0.04% 0.05% 0 BPSM stat Proces
121 167232792 770572989 217 0.15% 0.22% 0.22% 0 IP Input
128 13900328 334591317 41 0.15% 0.15% 0.15% 0 Ethernet Msec Ti
345 75528492 37135546 2033 0.15% 0.14% 0.15% 0 CFT Timer Proces
356 37736408 235193443 160 0.47% 0.49% 0.48% 0 IP SLAs XOS Even

 

As you can see, even though the weighting is now configured as 50 and 50, CPU utilization is 31% on one router and 88% on the other. The router at 88% shows txload 48/255, rxload 49/255; the router at 31% shows txload 7/255, rxload 7/255.
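For readers comparing the two routers: IOS reports txload and rxload as a fraction of 255 relative to the interface's configured bandwidth. A minimal sketch that converts the values quoted above into percentages (the helper name `load_percent` is made up for illustration):

```python
def load_percent(load_value: int) -> float:
    """Convert an IOS txload/rxload numerator (out of 255) to a percentage."""
    return round(load_value / 255 * 100, 1)

# Values quoted in the post above
busy_router_tx = load_percent(48)   # router at 88% CPU
quiet_router_tx = load_percent(7)   # router at 31% CPU

print(f"busy router txload:  {busy_router_tx}%")
print(f"quiet router txload: {quiet_router_tx}%")
```

This only restates the posted counters in a more readable form; it says nothing about which hosts generate the traffic.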

 

RT-FRONT01-BCT#sh version
Cisco IOS Software, C2951 Software (C2951-UNIVERSALK9-M), Version 15.2(4)M4, RELEASE SOFTWARE (fc2)

 

rt-front02-occ#sh version
Cisco IOS Software, C2951 Software (C2951-UNIVERSALK9-M), Version 15.2(4)M4, RELEASE SOFTWARE (fc2)

 

EDIT:

Now the CPU utilization has changed: the routers swapped, so the one that was at 80% is now at 30%, and the other went from 30% to 80%.

OK, I understand the CPU is high, as shown below:

 

RT-FRONT01-BCT#show process cpu | ex 0.00%

 

But the output does not show which process is driving the CPU. What bandwidth and throughput is this device handling?

 

Where is the device located in the network? Is it at the Internet edge?

 

I suspect the CPU load is due to throughput.

BB


Joseph W. Doherty
Hall of Fame
As I recall (?), GLBP doesn't do dynamic load balancing; it balances by assigning gateways round-robin against ingress MACs. I.e. one busy host can skew your load balance. Even if you had weighted it 99:1, it's possible a busy host could totally invert that balance. Over time, you might see the overall load balance more closely track the ratio you set, but I think I recall the same host MAC will always go to the same assigned gateway. If so, a constantly busy host will keep the load balance different from what's configured.
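To illustrate the point about per-host assignment, here is a rough sketch in Python. It is not GLBP's actual implementation; the host names and traffic figures are invented, and the only assumption carried over from the post is that each host stays pinned to one gateway. Hosts are split evenly by count, yet the traffic split is far from even because one host is much busier than the rest.

```python
def assign_round_robin(hosts, gateways):
    """Pin each host to one gateway, cycling through the gateways in order."""
    return {host: gateways[i % len(gateways)] for i, host in enumerate(hosts)}

def load_per_gateway(assignment, traffic_mbps):
    """Sum the traffic of all hosts pinned to each gateway."""
    totals = {}
    for host, gateway in assignment.items():
        totals[gateway] = totals.get(gateway, 0) + traffic_mbps[host]
    return totals

# Hypothetical hosts and traffic rates (Mbit/s); host-a is the busy one
traffic_mbps = {"host-a": 400, "host-b": 20, "host-c": 30, "host-d": 25}
gateways = ["rt-front01", "rt-front02"]

assignment = assign_round_robin(list(traffic_mbps), gateways)
load = load_per_gateway(assignment, traffic_mbps)
print(assignment)
print(load)
```

Each gateway ends up with two hosts, but the gateway that happens to serve host-a carries most of the traffic.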

Hello Joseph,

 

I thought about that as well, but we have weighted load balancing, not round-robin. It is configured to be 50/50, but CPU utilization on one router is higher than on the other.

For the sake of testing, I set up two hosts sending equal-sized packets to a remote destination. The CPU utilization on both routers was roughly the same. As soon as I had one host send very large packets, the CPU utilization on one of the routers spiked, not on both. So I think Joseph's observation makes sense, and GLBP does not do per-packet load balancing...

Hello Georg,

Thank you. So it means one host is using much more bandwidth than another. We will take a deeper look into it.

 

 

When I noted "round-robin", I perhaps implied "equally", but weighted GLBP balances proportionally. So, for example, with 3:1, three hosts will be directed to one gateway for every one host that's directed to the other gateway.
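The 3:1 example can be sketched as follows. This is an illustration of proportional host assignment under assumed names (`gw-a`, `gw-b`, `host-0` and so on), not the algorithm GLBP uses internally: a repeating schedule lists each gateway as many times as its weight, and hosts are handed out along that schedule.

```python
from collections import Counter
from itertools import cycle

def weighted_assign(hosts, weights):
    """Assign hosts to gateways in proportion to each gateway's weight."""
    # Repeating schedule in which each gateway appears `weight` times
    schedule = [gw for gw, weight in weights.items() for _ in range(weight)]
    return {host: gw for host, gw in zip(hosts, cycle(schedule))}

hosts = [f"host-{i}" for i in range(8)]   # hypothetical hosts
weights = {"gw-a": 3, "gw-b": 1}          # the 3:1 ratio from the example

assignment = weighted_assign(hosts, weights)
counts = Counter(assignment.values())
print(counts)
```

The ratio applies to the number of hosts per gateway, not to the bytes they send, which is why a 50/50 weighting does not guarantee a 50/50 traffic or CPU split.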