11-07-2011 01:16 PM - edited 03-07-2019 03:15 AM
We run a college campus with 4000+ clients and two internal core 7206VXRs set up to load-balance each VLAN's default gateway between them via GLBP. We use weighted load balancing, as both routers are essentially identical in their specs and connections to the core. Our core switching is actually a stack of EtherChannelled 3560Gs (sad, I know), and our fiber distribution plant is a set of redundant 3750G stacks that run dual fiber links out to each network closet. The closets are all Layer 2. For the most part GLBP does exactly what I want: it evenly balances traffic between the two core routers, and if one dies (tested) the other picks up the slack seamlessly. However, on occasion I see a massive number of GLBP state changes, where the routers' forwarders flip between Active and Listen repeatedly.
For example:
Nov 2 09:53:24 172.16.0.2 4123445: Nov 2 09:52:10: %GLBP-6-FWDSTATECHANGE: GigabitEthernet0/3.254 Grp 254 Fwd 1 state Listen -> Active
Nov 2 09:53:25 172.16.0.2 4123448: Nov 2 09:52:10: %GLBP-6-FWDSTATECHANGE: GigabitEthernet0/3.254 Grp 254 Fwd 1 state Active -> Listen
======================
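To put a number on how often this happens, the syslog output can be tallied per interface, group, and forwarder. The following is only a rough sketch: it assumes the messages always look like the two lines above, and the names `FWD_CHANGE`, `count_fwd_state_changes`, and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Assumes the %GLBP-6-FWDSTATECHANGE layout seen in the two log lines above.
FWD_CHANGE = re.compile(
    r"%GLBP-6-FWDSTATECHANGE: (?P<intf>\S+) Grp (?P<grp>\d+) "
    r"Fwd (?P<fwd>\d+) state (?P<old>\w+) -> (?P<new>\w+)"
)

def count_fwd_state_changes(lines):
    """Count forwarder state changes per (interface, group, forwarder)."""
    counts = Counter()
    for line in lines:
        match = FWD_CHANGE.search(line)
        if match:
            key = (match["intf"], int(match["grp"]), int(match["fwd"]))
            counts[key] += 1
    return counts

# The two syslog lines quoted above, used as sample input.
SAMPLE = [
    "Nov 2 09:53:24 172.16.0.2 4123445: Nov 2 09:52:10: %GLBP-6-FWDSTATECHANGE: "
    "GigabitEthernet0/3.254 Grp 254 Fwd 1 state Listen -> Active",
    "Nov 2 09:53:25 172.16.0.2 4123448: Nov 2 09:52:10: %GLBP-6-FWDSTATECHANGE: "
    "GigabitEthernet0/3.254 Grp 254 Fwd 1 state Active -> Listen",
]

for (intf, grp, fwd), n in count_fwd_state_changes(SAMPLE).most_common():
    print(intf, "group", grp, "forwarder", fwd, "changes:", n)
```

Feeding it a full day's syslog file instead of `SAMPLE` (any iterable of lines works, including an open file) would show which sub-interfaces flap most and whether the bursts line up across groups.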
Sometimes this occurs on multiple sub-interfaces at the same time, causing some pretty hefty router overhead. Here is how that particular sub-interface is configured on both routers:
Router 1:
interface GigabitEthernet0/3.254
 encapsulation dot1Q 254
 ip address 192.168.254.5 255.255.255.0
 ip nbar protocol-discovery
 ip flow ingress
 ip pim sparse-dense-mode
 glbp 254 ip 192.168.254.1
 glbp 254 timers msec 250 msec 750
 glbp 254 priority 150
 glbp 254 preempt delay minimum 180
 glbp 254 load-balancing weighted
 glbp 254 authentication text ******
!
Router 2:
interface GigabitEthernet0/3.254
 encapsulation dot1Q 254
 ip address 192.168.254.7 255.255.255.0
 ip nbar protocol-discovery
 ip flow ingress
 ip pim sparse-dense-mode
 glbp 254 ip 192.168.254.1
 glbp 254 timers msec 250 msec 750
 glbp 254 priority 140
 glbp 254 preempt delay minimum 180
 glbp 254 load-balancing weighted
 glbp 254 authentication text *****
!
====================================
All of our sub-interfaces are set up the same way. Does anything stand out as dead wrong? I don't have many peers who run full Cisco shops with GLBP implemented. Is this high frequency of GLBP events normal? Any advice would be great, even further reading or training on campus design (aside from the generic Cisco Campus HA and First Hop Redundancy docs; I've been through all of them). Real-world info on large-scale GLBP deployments is very hard to find.
Thanks,
Jim Phillips
Network and Communications Support Technologist
Cambrian College