
Catalyst 9500 Stack is in Half ring setup; Reloading a switch might cause stack split message

laukik.nahar1
Level 1

Hi,

 

We have configured two Catalyst 9500 switches in StackWise Virtual (SVL) with one dual-active detection link and two SVL links.

 

The switches are now in Active-Standby state, and whenever we try to reload the Core 2 switch [that is, the Standby switch], we get the message below -

"Stack is in Half ring setup; Reloading a switch might cause stack split".

 

The actual issue is that we have also connected two WLCs with LAG enabled. Whenever we reload the Standby device, the WLCs lose HA and form HA again after a few seconds [we receive at least 15-20 "Request timed out" responses before the WLCs form HA again]. The issue is seen exactly when the switch comes back up and is trying to take the Standby role [observed while the Current state is 'HA sync in progress'].

 

Because of this, whenever the Standby switch is reloaded and the WLCs try to re-form HA, all users connected over wireless lose connectivity for a few seconds until the HA issue is resolved.

 

We have opened a case with Wireless TAC and are working on it, but the issue seems to come from the Catalyst 9500 switch and to be related to the message above.

 

WLC software version - 8.5.140.0 [the most stable version according to Cisco TAC]

Catalyst 9500 IOS version - 16.9.1

 

 

SVL configuration on SWITCH -

 

interface TenGigabitEthernet1/0/38
 stackwise-virtual dual-active-detection
 description "SVL Dual-Active Detection Link"

interface TenGigabitEthernet2/0/38
 stackwise-virtual dual-active-detection
 description "SVL Dual-Active Detection Link"

interface TenGigabitEthernet1/0/39
 stackwise-virtual link 1
 description "SVL StackWise Virtual Link-1"

interface TenGigabitEthernet2/0/39
 stackwise-virtual link 1
 description "SVL StackWise Virtual Link-1"

interface TenGigabitEthernet1/0/40
 stackwise-virtual link 1
 description "SVL StackWise Virtual Link-1"

interface TenGigabitEthernet2/0/40
 stackwise-virtual link 1
 description "SVL StackWise Virtual Link-1"

 

 

show switch
Switch/Stack Mac Address : 5061.bf28.9d00 - Local Mac Address
Mac persistency wait time: Indefinite
                                             H/W   Current
Switch#   Role    Mac Address     Priority Version  State
-------------------------------------------------------------------------------------
*1       Active   5061.bf28.9d00     1      V01     Ready
 2       Standby  5061.bf28.a380     1      V01     HA sync in progress

 

 

 

 

Experts, please help.

 

Thanks,

Laukik.

10 Replies

Hello,

 

The message you get is informational and would also appear if you manually forced a switchover (redundancy force-switchover).

 

That said, are your switches configured for stateful switchover (SSO)?

Hi Georg,

Yes, they are configured for SSO. Below is the output of 'show redundancy' -

show redundancy
Redundant System Information :
------------------------------
Available system uptime = 9 weeks, 4 days, 8 hours, 14 minutes
Switchovers system experienced = 28
Standby failures = 4
Last switchover reason = active unit removed

Hardware Mode = Duplex
Configured Redundancy Mode = sso
Operating Redundancy Mode = sso
Maintenance Mode = Disabled
Communications = Up

Current Processor Information :
-------------------------------
Active Location = slot 1
Current Software state = ACTIVE
Uptime in current state = 23 hours, 52 minutes
Image Version = Cisco IOS Software [Fuji], Catalyst L3 Switch Software (CAT9K_IOSXE), Version 16.9.1, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2018 by Cisco Systems, Inc.
Compiled Tue 17-Jul-18 17:00 by mcpre
BOOT = flash:packages.conf
CONFIG_FILE =
Configuration register = 0x102

Peer Processor Information :
----------------------------
Standby Location = slot 2
Current Software state = STANDBY HOT
Uptime in current state = 47 minutes
Image Version = Cisco IOS Software [Fuji], Catalyst L3 Switch Software (CAT9K_IOSXE), Version 16.9.1, RELEASE SOFTWARE (fc2)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2018 by Cisco Systems, Inc.
Compiled Tue 17-Jul-18 17:00 by mcpre
BOOT = flash:packages.conf
CONFIG_FILE =
Configuration register = 0x102


Thanks,
Laukik

Was this working before?

They're in Active/Standby Hot, which is correct.
Does show switch still show it stuck in HA sync in progress?

This suggests a dual-active scenario took place:
Last switchover reason = active unit removed

Is the VSL fully up?
sh switch virtual link
sh switch virtual dual-active summary
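
A rough way to keep an eye on the state asked about above is to capture `show switch` periodically and parse the member rows. The following is a minimal Python sketch, assuming rows shaped like the ones posted earlier in this thread (switch number, role, MAC address, priority, version, state). The names `parse_show_switch` and `standby_ready` are hypothetical helpers for illustration, not part of any Cisco tooling, and how the output is collected (SSH, console log, etc.) is left out.

```python
import re

# One member row of "show switch", e.g.
#   *1 Active 5061.bf28.9d00 1 V01 Ready
# Column layout is assumed from the output posted in this thread.
ROW = re.compile(
    r"^\s*(?P<current>\*?)(?P<switch>\d+)\s+(?P<role>\S+)\s+"
    r"(?P<mac>[0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4})\s+"
    r"(?P<priority>\d+)\s+(?P<version>\S+)\s+(?P<state>.+?)\s*$"
)


def parse_show_switch(output):
    """Return one dict per switch member found in the captured text."""
    members = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            members.append({
                "switch": int(match.group("switch")),
                "role": match.group("role"),
                "state": match.group("state"),
                # The asterisk marks the switch the session is attached to.
                "current": match.group("current") == "*",
            })
    return members


def standby_ready(output):
    """True once a member with role Standby reports state Ready."""
    return any(
        m["role"] == "Standby" and m["state"] == "Ready"
        for m in parse_show_switch(output)
    )


# Sample taken from the output posted in the first message.
SAMPLE = """\
Switch/Stack Mac Address : 5061.bf28.9d00 - Local Mac Address
Mac persistency wait time: Indefinite
*1 Active 5061.bf28.9d00 1 V01 Ready
2 Standby 5061.bf28.a380 1 V01 HA sync in progress
"""

for member in parse_show_switch(SAMPLE):
    print(member["switch"], member["role"], member["state"])
print("standby ready:", standby_ready(SAMPLE))
```

Logging the time at which the standby moves from "HA sync in progress" to "Ready" makes it easier to line the switch state up against the WLC HA drops and the ping timeouts.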

Hi Mark,

The switch doesn't get stuck at 'HA sync in progress'. The sync completes and the switch then shows as Standby, Ready.

Yes, the SVL is fully up, as the output of show switch is Active-Standby.

Below is the output as requested -

1#show stackwise-virtual bandwidth
Switch Bandwidth
------ ---------
1 10G
2 10G

SHG-DATACENTER-CR01#
SHG-DATACENTER-CR01#
SHG-DATACENTER-CR01#show stackwise-virtual link
Stackwise Virtual Link(SVL) Information:
----------------------------------------
Flags:
------
Link Status
-----------
U-Up D-Down
Protocol Status
---------------
S-Suspended P-Pending E-Error T-Timeout R-Ready
-----------------------------------------------
Switch  SVL  Ports                     Link-Status  Protocol-Status
------  ---  -----                     -----------  ---------------
1       1    TenGigabitEthernet1/0/39  U            R
             TenGigabitEthernet1/0/40  U            R
2       1    TenGigabitEthernet2/0/39  U            R
             TenGigabitEthernet2/0/40  U            R


#show stackwise-virtual bandwidth
Switch Bandwidth
------ ---------
1 20G
2 20G


1#show stackwise-virtual dual-active-detection
Dual-Active-Detection Configuration:
-------------------------------------
Switch Dad port Status
------ ------------ ---------
1 TenGigabitEthernet1/0/38 up
2 TenGigabitEthernet2/0/38 up


#show stackwise-virtual dual-active-detection pagp
Pagp dual-active detection enabled: No
In dual-active recovery mode: No

1#show stackwise-virtual neighbors
Stackwise Virtual Link(SVL) Neighbors Information:
--------------------------------------------------
Switch  SVL  Local Port                Remote Port
------  ---  ----------                -----------
1       1    TenGigabitEthernet1/0/39  TenGigabitEthernet2/0/39
             TenGigabitEthernet1/0/40  TenGigabitEthernet2/0/40
2       1    TenGigabitEthernet2/0/39  TenGigabitEthernet1/0/39
             TenGigabitEthernet2/0/40  TenGigabitEthernet1/0/40


Thanks,
Laukik.


OK, so even though everything looks good from the switch side, when you force a failover, either from the CLI or by pulling the plug, your HA doesn't fully work and you lose the wireless users connected to the standby switch.

Is this only happening to wireless users? Are there any switches dual-connected to the pair that you could run a constant ping from, to see whether the same issue appears? That would rule out the WLC. I'm wondering whether it has something to do with how the 9k communicates with the WLC when the failover occurs, or whether it is something general in how the 9k fails over that affects all dual-homed systems.
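
To compare results between a wired dual-homed device and a wireless client, it can help to put a number on each outage rather than eyeballing the ping window. Below is a minimal Python sketch that counts the longest run of consecutive lost replies in saved ping output. It assumes Windows-style lines ("Reply from ..." and "Request timed out."); other ping implementations print different text and would need different matching. The function name `longest_outage` is made up for illustration.

```python
def longest_outage(ping_lines):
    """Return (longest run of consecutive losses, total lost, total probes).

    Assumes Windows-style ping output: a success line starts with
    "Reply from", a loss is "Request timed out." or an "unreachable" reply.
    Lines that match neither pattern (headers, statistics) are ignored.
    """
    longest = current = lost = total = 0
    for line in ping_lines:
        text = line.strip().lower()
        if text.startswith("reply from") and "unreachable" not in text:
            total += 1
            current = 0  # a good reply ends the current outage
        elif "timed out" in text or "unreachable" in text:
            total += 1
            lost += 1
            current += 1
            longest = max(longest, current)
    return longest, lost, total


# Hypothetical capture, for illustration only.
SAMPLE = [
    "Pinging 8.8.8.8 with 32 bytes of data:",
    "Reply from 8.8.8.8: bytes=32 time=12ms TTL=117",
    "Request timed out.",
    "Request timed out.",
    "Request timed out.",
    "Reply from 8.8.8.8: bytes=32 time=11ms TTL=117",
    "Request timed out.",
    "Reply from 8.8.8.8: bytes=32 time=12ms TTL=117",
]

longest, lost, total = longest_outage(SAMPLE)
print(f"longest outage: {longest} probes, lost {lost} of {total}")
```

Running the same capture from a wired host and a wireless host during a standby reload shows whether both see an outage of similar length, which is the comparison suggested above.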

Hi Mark,

 

Is this only happening to wireless users ?

 

No, not just wireless users. Wired users are also affected, but only while the switch is trying to take the Standby role and not before that.

 

But there is one interesting/weird behavior we did observe [from both LAN and wireless users]: when we keep a continuous ping running to 8.8.8.8, we lose the ping responses as soon as we reload the Standby switch [the Telnet/SSH connection is also lost and has to be reset in PuTTY], and once the switch is back in the Standby role, the ping responses return. During this whole time, however, the users' internet access keeps working, even while we see the pings to 8.8.8.8 dropping.

 

But as the WLC HA also fails, all the APs get disconnected and take time to reconnect [at least 15-20 "Request timed out" responses], and as most of the users are connected via wireless, we are getting a lot of complaints.

 

If this is how the Catalyst 9500 works, it seems it needs to be fixed as soon as possible, as 9500s are used as core switches.

Also, I don't think this can be the expected behavior of Catalyst 9500 switches when configured as SVL. But if it is, please let us know.

 

Catalyst 9500 IOS image - 16.9.1

 

Thanks,

Laukik

Hi Laukik, did you find and fix the issue?

This is a bit late/old, but -
What is the config on the ports leading to your WLC?
I am wondering if it is EtherChannel mode "on"; if so, you should use mode active.

(For a transition, you could set the switch to mode passive, then reconfigure the WLC to active, but then you want to set the switch to active too.)

It sounds to me like load balancing kicks in when the link/line comes up and the switch isn't actually ready.

(Fully a S.W.A.G. without the interface config)

robertoriostt
Level 1

Hi guys,

 

I had the same issue on my 9500 switch. According to Cisco support, this message is only informational and does not indicate a reload problem. I heard from Cisco that this informational message will be removed in versions newer than the 16.6.x release I'm currently using.

Hi, we are running 17.3.3 and this message is still there...

 

9500_CORE#redundancy reload peer
Stack is in Half ring setup; Reloading a switch might cause stack split
Reload peer [confirm]

 

Regards,

Gavin
