07-10-2017 05:19 AM - edited 03-08-2019 11:15 AM
Hello,
We have 2 Cisco 4500X switches configured as a VSS. The VSL is a port channel of 2 FO links.
One of the ports in the port channel went down a couple of days ago, although the port is still blinking green. The port channel shows this port as waiting to be aggregated.
We tried shutdown/no shutdown on the port, and removed it from the port channel and added it back, but it is still in the notconnected (up/down) state and the port channel still shows it as waiting to be aggregated.
We replaced the FO link, patch cords and SFP, all with the same result.
Port configuration is below
interface Port-channel10
 description VSS-Link-10
 switchport
 switch virtual link 1
interface TenGigabitEthernet1/1/16
 description VSS-PO10
 no lldp transmit
 no lldp receive
 no cdp enable
 channel-group 10 mode on
 service-policy output VSL-Queuing-Policy
interface TenGigabitEthernet1/1/15
 description VSS-PO10
 no lldp transmit
 no lldp receive
 channel-group 10 mode on
 service-policy output VSL-Queuing-Policy
interface Port-channel20
 description VSS-Link-20
 switchport
 switch virtual link 2
interface TenGigabitEthernet2/1/16
 description VSS-PO20
 no lldp transmit
 no lldp receive
 no cdp enable
 channel-group 20 mode on
 service-policy output VSL-Queuing-Policy
interface TenGigabitEthernet2/1/15
 description VSS-PO20
 no lldp transmit
 no lldp receive
 channel-group 20 mode on
 service-policy output VSL-Queuing-Policy
Any ideas? Are we hitting some kind of a bug?
Thank you
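The member-port configuration in the question can be compared mechanically. Below is a minimal Python sketch, assuming the config is available as plain text; the function names `parse_interfaces` and `member_differences` are made up for this illustration. It groups interfaces by `channel-group` number and lists commands that are not present on every member. A difference found this way is not necessarily the cause of a port failing to bundle; it is only something to rule out.

```python
from collections import defaultdict

def parse_interfaces(config_text):
    """Split a config into {interface name: [commands]} (indentation is ignored)."""
    blocks = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line == "!":
            continue
        if line.startswith("interface "):
            current = line[len("interface "):]
            blocks[current] = []
        elif current is not None:
            blocks[current].append(line)
    return blocks

def member_differences(blocks):
    """For each channel-group, list commands that only some members carry."""
    groups = defaultdict(dict)
    for name, commands in blocks.items():
        for command in commands:
            if command.startswith("channel-group "):
                groups[command.split()[1]][name] = set(commands)
    report = {}
    for group, members in groups.items():
        common = set.intersection(*members.values())
        extras = {n: sorted(c - common) for n, c in members.items() if c - common}
        if extras:
            report[group] = extras
    return report

# The two Po10 members, copied from the post.
CONFIG = """\
interface TenGigabitEthernet1/1/16
description VSS-PO10
no lldp transmit
no lldp receive
no cdp enable
channel-group 10 mode on
service-policy output VSL-Queuing-Policy
interface TenGigabitEthernet1/1/15
description VSS-PO10
no lldp transmit
no lldp receive
channel-group 10 mode on
service-policy output VSL-Queuing-Policy
"""

print(member_differences(parse_interfaces(CONFIG)))
# {'10': {'TenGigabitEthernet1/1/16': ['no cdp enable']}}
```

For the posted config the only difference is that `no cdp enable` appears on the /16 ports but not on the /15 ports.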
07-10-2017 05:31 AM
Hard to tell if you're hitting a bug without your software version. It could be, as there are a few bad versions of code on the 4500X. What are you on currently? You can check by going to the caveats section of the release notes for the image you downloaded; it shows all open bugs for that specific release. If it's not listed there it could still be a bug, just not one known for that image yet. You can also check the bug toolkit:
https://bst.cloudapps.cisco.com/bugsearch/?referring_site=shp
What does the interface itself show? Are there any errors on it, and if so, are they incrementing?
Do the logs show anything other than the interface going down?
If you swap the cable from the working interface to the non-working one, is it still the same?
Your config looks fine, so maybe it's software if the actual interfaces are clean.
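Checking whether errors are incrementing comes down to comparing two readings of the interface counters taken some time apart. A minimal Python sketch of that comparison follows; the counter names and values are made up for illustration and are not taken from this thread or from any particular command output.

```python
def incrementing(before, after):
    """Return the counters that grew between two snapshots, with their deltas."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }

# Hypothetical snapshots taken a few minutes apart.
first = {"input_errors": 12, "crc": 12, "output_errors": 0}
second = {"input_errors": 15, "crc": 15, "output_errors": 0}

print(incrementing(first, second))
# {'input_errors': 3, 'crc': 3}
```

An empty result means the errors are historical rather than ongoing, which points away from a live physical-layer fault.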
07-10-2017 05:37 AM
Hello Mark,
Thank you for your time
There are just a few errors on one of the interfaces; the other one is clean.
I cannot swap the interfaces now since we have only 1 working link. However, we tried another FO link with the same interface, with no luck. It seems like the port is stuck in the (w) state and is waiting for the switch to be reloaded, just like when configuring VSS from scratch.
07-10-2017 05:53 AM
Hi
I just checked the bug toolkit for VSL bugs on the 4500 and there are quite a few related to different images. What version are you running?
A couple as examples; pages of them came up, and you can narrow it down to your version:
CSCur43040 - VSL ports stay in 'w' state with VSL encryption
CSCun34823 - Cat4500x trunking problem after VSL down
07-10-2017 05:54 AM
Cisco IOS Software, IOS-XE Software, Catalyst 4500 L3 Switch Software (cat4500e-UNIVERSALK9-M), Version 03.06.03.E RELEASE SOFTWARE (fc3)
07-10-2017 06:04 AM
A better/recommended image for that platform is 3.6.6; 3.6.3 is a couple of years old now.
I see a couple of bug IDs that could relate to it. I'm not TAC, so I can't be certain without all the information. You have a few choices:
Take the whole show tech and run it yourself through the Cisco CLI Analyzer to see if it gives you the correct ID relating to your issue.
Open a TAC case to get the exact bug ID; they will provide it along with a release that it's not in.
Or take a chance and move to the latest recommended IOS, the 3.6.6 c version, and see if that resolves it without going through TAC, since if they don't have a bug ID that's what they will advise you to do anyway.
You could also root through the pages of bugs in the toolkit to see if anything else looks similar, but it doesn't show anything specific for your version when I run it.
VSL link stuck in W state, when OIR is done on the SFP of the VSL link
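The running release is reported as `03.06.03.E` while the recommended one is written as `3.6.6`, so comparing them by eye is error-prone. Here is a small Python sketch that normalises both forms into comparable tuples; `parse_release` is a name made up for this example. It only handles this dotted IOS-XE style and does not translate between different version numbering schemes.

```python
import re

def parse_release(text):
    """Turn '03.06.03.E' or '3.6.6' into a tuple of the first three numbers."""
    numbers = re.findall(r"\d+", text)
    return tuple(int(n) for n in numbers[:3])

running = parse_release("03.06.03.E")   # from the show version line in this thread
recommended = parse_release("3.6.6")    # the release suggested above

print(running, recommended, running < recommended)
# (3, 6, 3) (3, 6, 6) True
```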
07-10-2017 06:12 AM
07-10-2017 06:56 AM
We will reload the switches to see if it helps
07-10-2017 06:58 AM
If you're going to reload you may as well put it on the current recommended image. You can always roll back to 3.6.3 anyway. Your choice though.
07-12-2017 09:07 AM
Hello Mark,
We reloaded the VSS switches one by one (not the entire shelf at once), but it did not help.
This bug perfectly describes what we are experiencing, but the images are not suitable for our model.
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCut09985/?referring_site=bugquickviewredir
Known fixed releases are
But what I see in the download section for our model is that 3.9.2 is the highest available release.
Btw, the bug was last updated on 28th of June, so it looks like it was discovered recently. So, logically, we need to install the latest release, dated 02 May, which is 3.9.2. However, I am not sure whether the bug is fixed in that release.
What do you think?
07-12-2017 09:53 AM
Personally I would try 3.6.6, as it's an MD not an ED release, it's the most stable release, and it's not listed in the bug as affected. You're taking a shot either way. We're using 3.6.6 across all our IOS-XE platforms and it has been working well for the last few months with no issues; we moved off earlier and later ones for this version. The choice is yours though. I often avoid the latest image, as you're the guinea pig, the first to run it, and anything could be wrong with it. Bugs get added to the caveats section as they're found per image, so rolling to the latest is, in my opinion, never wise unless you have to.
07-12-2017 10:04 AM
We will definitely try 3.6.6, but what bothers me is that this bug was discovered a couple of weeks ago and 3.6.6 was released in 2016 (I guess).
BTW, one thing I don't understand is that in the bug description the fixed-version numbering is different from what I can potentially download.
It looks like this bug is not related to 4500X versions, although it perfectly describes what we are facing.
07-12-2017 10:11 AM
Yeah, there may be newer images, but if you look at the Cisco website they're advising you to use 3.6.6. That will change over time as more issues are found, but it's been up there for a good part of a year now, still with the star.
And yes, that's the problem with bugs; that's why I was saying TAC it if you have support. In reality they have the tools and the dev team behind them to pinpoint the exact bug ID, and if there isn't one they will raise a new one if you're the first to hit it on that version/platform, and find you a fix. Otherwise you're just checking docs and crossing your fingers that it's not in that release, but with your fault you'll know pretty quickly.
I would get a window, upload 2 or 3 images you think might do, set them in the bootvar, and bounce through them in reloads until you hit one that's not seeing the issue.
07-13-2017 01:53 AM
UPDATE
Yesterday, when restarting the VSS, we restarted the same switch twice by mistake.
Today we restarted the second one and the problem is gone
07-13-2017 01:56 AM
That's good it's gone :) I would still look at moving off that image as it may trigger again, but at least for now it's back up.