
Snipaste_2023-07-03_22-03-43.png

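In Cisco SD-WAN environments, one problem that has long troubled me is VRRP flapping: whatever the environment, the Backup router keeps preempting the Master. The symptom looks like this: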

Jul  3 13:02:09.194: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state BACKUP -> MASTER
Jul  3 13:02:09.409: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state MASTER -> BACKUP
Jul  3 13:02:09.890: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state BACKUP -> MASTER
Jul  3 13:02:09.939: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state MASTER -> BACKUP
Jul  3 13:02:13.965: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state BACKUP -> MASTER
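
To quantify the flapping in a longer log capture, the state transitions can be counted programmatically. The following is a minimal Python sketch, assuming the syslog format shown above. The function names and the `year` argument are illustrative assumptions (the log timestamps carry no year), not part of any Cisco tool.

```python
import re
from datetime import datetime, timedelta

# Sample taken from the log excerpt in this post.
SAMPLE_LOG = """\
Jul  3 13:02:09.194: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state BACKUP -> MASTER
Jul  3 13:02:09.409: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state MASTER -> BACKUP
Jul  3 13:02:09.890: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state BACKUP -> MASTER
Jul  3 13:02:09.939: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state MASTER -> BACKUP
Jul  3 13:02:13.965: %VRRP-6-STATE: GigabitEthernet5.10 IPv4 group 8 state BACKUP -> MASTER
"""

LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\.\d{3}): %VRRP-6-STATE: "
    r"(?P<intf>\S+) IPv4 group (?P<group>\d+) state (?P<old>\w+) -> (?P<new>\w+)"
)


def parse_transitions(log_text, year=2023):
    """Return (timestamp, interface, group, old_state, new_state) tuples."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore anything that is not a VRRP state message
        # Collapse the padded day ("Jul  3") before parsing; the year is assumed.
        stamp = " ".join(m.group("ts").split())
        ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")
        events.append((ts, m.group("intf"), int(m.group("group")),
                       m.group("old"), m.group("new")))
    return events


def max_transitions_in_window(events, window):
    """Largest number of transitions that fall inside any sliding window."""
    times = sorted(e[0] for e in events)
    best, start = 0, 0
    for end, t in enumerate(times):
        while t - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


if __name__ == "__main__":
    events = parse_transitions(SAMPLE_LOG)
    print(len(events), "transitions,",
          max_transitions_in_window(events, timedelta(seconds=1)),
          "within one second")
```

A stable VRRP group should log a transition only on a real failover, so several transitions within a single second is a strong indication of flapping.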

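I went through the VRRP configuration many times without finding any obvious problem. The VRRP commands themselves are simple, with nothing complex or hard to understand. Below is the default configuration we get after attaching the template: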

C8K-13#sho run inter Gi5.10 
Building configuration...

Current configuration : 325 bytes
!
interface GigabitEthernet5.10
 encapsulation dot1Q 10
 vrf forwarding 10
 ip address 192.168.103.252 255.255.255.0
 no ip redirects
 ip mtu 1496
 ip nbar protocol-discovery
 vrrp 8 address-family ipv4
  timers advertise 100
  vrrpv2
  track omp shutdown
  address 192.168.103.254 primary
  exit-vrrp
 arp timeout 1200
end

C8K-13#

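Attentive readers may notice that timers advertise in the output above is set to 100 (the unit here is ms), in other words 0.1 s, whereas the official Cisco documentation states that the default VRRP timer is 1 s (that is, 1000 ms).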

Parameter Name: Description

Group ID: Enter the virtual router ID, which is a numeric identifier of the virtual router. You can configure a maximum of 24 groups.
  Range: 1 through 255

Priority: Enter the priority level of the router. The router with the highest priority is elected as the primary VRRP router. If two routers have the same priority, the one with the higher IP address is elected as the primary VRRP router.
  Range: 1 through 254
  Default: 100

Timer: Specify how often the primary VRRP router sends VRRP advertisement messages. If subordinate routers miss three consecutive VRRP advertisements, they elect a new primary VRRP router.
  Range: 1 through 3600 seconds
  Default: 1 second

Track OMP / Track Prefix List: By default, VRRP uses the state of the service (LAN) interface on which it is running to determine which router is the primary virtual router. If a router loses all its WAN control connections, the LAN interface still indicates that it is up even though the router is functionally unable to participate in VRRP. To take WAN-side connectivity into account for VRRP, configure one of the following:

  Track OMP: Click On for VRRP to track the Overlay Management Protocol (OMP) session running on the WAN connection. If the primary VRRP router loses all its OMP sessions, VRRP elects a new default gateway from those that have at least one active OMP session.

  Track Prefix List: Track both the OMP session and a list of remote prefixes, which is defined in a prefix list configured on the local router. If the primary VRRP router loses all its OMP sessions, VRRP failover occurs as described for the Track OMP option. In addition, if reachability to one of the prefixes in the list is lost, VRRP failover occurs immediately, without waiting for the OMP hold timer to expire, thus minimizing the amount of overlay traffic that is dropped while the routers determine the primary VRRP router.

IP Address: Enter the IP address of the virtual router. This address must be different from the configured interface IP addresses of both the local router and the peer running VRRP.

https://www.cisco.com/c/en/us/td/docs/routers/sdwan/configuration/System-Interface/systems-interfaces-book-xe-sdwan/configure-interfaces.html#id_107075

In the vManage GUI, the default value shown for the timer is indeed 100 ms.

Snipaste_2023-07-03_22-19-06.png

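From this we can conclude that there is a mismatch: the documented VRRP default is 1 second, while the default Timer value in the Feature Template is 0.1 second. This is bound to cause problems in normal network operation, and it is one of the causes of the VRRP state flapping.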
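
To see why a 100 ms advertisement interval leaves so little margin, it helps to work out how long a Backup waits before it declares the Master down. The sketch below uses the Skew_Time and Master_Down_Interval formulas from RFC 5798 (VRRPv3). This is an assumption on my part: the device here runs in vrrpv2 mode, and Cisco's implementation may compute the timers differently. The priority of 100 is the documented default, not a value visible in the configuration above.

```python
def skew_time_ms(priority, adver_interval_ms):
    """RFC 5798: Skew_Time = ((256 - priority) * Master_Adver_Interval) / 256."""
    return (256 - priority) * adver_interval_ms / 256


def master_down_interval_ms(priority, adver_interval_ms):
    """RFC 5798: Master_Down_Interval = 3 * Master_Adver_Interval + Skew_Time."""
    return 3 * adver_interval_ms + skew_time_ms(priority, adver_interval_ms)


if __name__ == "__main__":
    PRIORITY = 100  # assumed: the documented default priority
    for interval in (100, 1000):  # template default vs. documented default, in ms
        print(interval, "ms advertise ->",
              master_down_interval_ms(PRIORITY, interval), "ms master-down")
```

Under these assumptions the Backup waits roughly 361 ms with the 100 ms template default, against roughly 3.6 s with the documented 1 s default, so a brief delay in delivering or processing advertisements is far more likely to trigger a takeover.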

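The workaround, of course, is to change the timer to the default value given in the Cisco documentation. (Note that once it is set to the default, the timers advertise command no longer appears in the CLI output, because default parameters are not displayed.)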

C8K-13#sho run inter Gi5.10 
Building configuration...

Current configuration : 325 bytes
!
interface GigabitEthernet5.10
 encapsulation dot1Q 10
 vrf forwarding 10
 ip address 192.168.103.252 255.255.255.0
 no ip redirects
 ip mtu 1496
 ip nbar protocol-discovery
 vrrp 8 address-family ipv4
  vrrpv2
  track omp shutdown
  address 192.168.103.254 primary
  exit-vrrp
 arp timeout 1200
end
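
To check other devices for the same template default, a saved running configuration can be scanned for VRRP groups whose advertisement timer is below 1000 ms. This is only a sketch under the assumption that the configuration follows the layout shown in this post; the function name and the `minimum_ms` parameter are hypothetical.

```python
import re

VRRP_RE = re.compile(r"^vrrp (\d+) address-family ipv4$")
TIMER_RE = re.compile(r"^timers advertise (\d+)$")


def find_fast_vrrp_timers(config_text, minimum_ms=1000):
    """Return (interface, group, timer_ms) for timers below minimum_ms."""
    findings = []
    interface = None
    group = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("interface "):
            interface = line.split(None, 1)[1]
            group = None
            continue
        if line == "exit-vrrp":
            group = None  # left the VRRP sub-mode
            continue
        m = VRRP_RE.match(line)
        if m:
            group = int(m.group(1))
            continue
        m = TIMER_RE.match(line)
        if m and group is not None and int(m.group(1)) < minimum_ms:
            findings.append((interface, group, int(m.group(1))))
    return findings


if __name__ == "__main__":
    sample = """\
interface GigabitEthernet5.10
 encapsulation dot1Q 10
 vrrp 8 address-family ipv4
  timers advertise 100
  vrrpv2
  track omp shutdown
  address 192.168.103.254 primary
  exit-vrrp
"""
    print(find_fast_vrrp_timers(sample))
```

Because the default timer is not displayed in the running configuration, a group without a timers advertise line is treated as using the default and is not reported.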

As a follow-up, Cisco has in fact published a corresponding bug, which can be looked up in the Bug Search Tool.

Snipaste_2023-07-03_22-15-05.png

https://bst.cisco.com/bugsearch/bug/CSCwd95581

I hope others who run into this problem can avoid this pitfall. The fix for this bug may not arrive until vManage 20.11.x or later.

Based on the above, changing the Timer value in the Feature Template to 1000 avoids the problem.

Snipaste_2023-07-03_22-19-19.png
