CSCvo49699 - Tesla SG550x-48P SG350-52 SG350-52MP

r1ncew1nd
Level 1

We are experiencing the same system reboots on several of these switches, as described in the bug report.

 

Affected Hardware:

SG350-52

SG350-52MP

 

Tested firmware:

2.4.0.94

2.5.0.78

 

Log entry:

%PSET-F-ILLEGAL_IFINDEX: PSETG_add_port_to_set: Illegal ifIndex 0


***** FATAL ERROR *****
Reporting Task: BRMN.
Software Version: 2.4.5.71 (date Nov 4 2018 time 19:41:09)
ros[0x2e6070]
ros(HOSTG_fatal_error+0x8)[0x2e84d8]
ros(OSSYSG_fatal_error+0x270)[0x887714]
ros[0x69f9e0]
ros(PSETG_add_port_to_set+0x50)[0x7dfa58]
ros(SW2G_tx_to_vlan_subset+0xac)[0xb3bcd0]
ros(NETG_l2_bridge_forward+0x4e0)[0xa4754c]
ros(NETG_l2_snoop_app_gcall+0xa4)[0xa68b90]
ros[0x7bf080]
/lib/libp2linux.so.1(task_run+0xf4)[0xb6e93818]

 

We have more than 50 switches from one delivery in operation.

Switches affected by these errors have in common that more than two ether-channels are active.

 

Does somebody have a clue how to solve this?

Just tried to downgrade to 2.3.5.63 ...

 

2 Replies

Willem_B
Level 1

We are experiencing the same problem with the SG350X series at two customers. At the first customer the issue is rare: it has occurred twice in 6 months. The second customer sees it every week; luckily it always occurs in the middle of the night, outside production hours. We have many more customers with these switches where we don't see this issue. The difference in frequency, and the fact that most customers are not affected, suggest the problem is triggered externally (I would guess by a PoE device, as we don't see the issue on non-PoE switches).

 

The log entry suggests it's a software bug in the switch. We have disabled Bonjour, LLDP, CDP and Auto Voice VLAN, but that does not prevent the problem from happening. I think the error should point someone at Cisco with insight into the software to a likely cause and possibly a workaround. The error is too cryptic for me to know what is going on. Apparently some process tries to add an interface, but what kind of interface, and to what, is unclear to me.
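For anyone who wants to read the crash trace more easily, here is a minimal Python sketch that pulls the symbol names out of the trace quoted in the first post and prints them as a call chain, outermost caller first. It assumes every frame has the form `module(symbol+offset)[address]` or `module[address]`, which holds for the lines in this thread but may not hold for other crash dumps; the names `parse_trace` and `call_chain` are my own and not part of any Cisco tool.

```python
import re

# Trace lines copied verbatim from the crash log quoted in this thread.
TRACE = """\
ros[0x2e6070]
ros(HOSTG_fatal_error+0x8)[0x2e84d8]
ros(OSSYSG_fatal_error+0x270)[0x887714]
ros[0x69f9e0]
ros(PSETG_add_port_to_set+0x50)[0x7dfa58]
ros(SW2G_tx_to_vlan_subset+0xac)[0xb3bcd0]
ros(NETG_l2_bridge_forward+0x4e0)[0xa4754c]
ros(NETG_l2_snoop_app_gcall+0xa4)[0xa68b90]
ros[0x7bf080]
/lib/libp2linux.so.1(task_run+0xf4)[0xb6e93818]
"""

# Assumed frame format: module, optional "(symbol+0xOFFSET)", then "[0xADDRESS]".
FRAME = re.compile(
    r"^(?P<module>\S+?)"
    r"(?:\((?P<symbol>\w+)\+0x[0-9a-f]+\))?"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_trace(text):
    """Return (module, symbol or None, address) for each recognised frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append(
                (match.group("module"), match.group("symbol"), match.group("addr"))
            )
    return frames


def call_chain(frames):
    """Symbol names from outermost caller to innermost, skipping unnamed frames."""
    return [symbol for _, symbol, _ in reversed(frames) if symbol]


frames = parse_trace(TRACE)
chain = call_chain(frames)
print(" -> ".join(chain))
```

Judging only by the function names, which is a guess on my part, the chain looks like the L2 bridge forwarding path transmitting a frame to a subset of a VLAN's ports and failing when it adds a port with ifIndex 0 to a port set. That would match the `%PSET-F-ILLEGAL_IFINDEX` message, but only Cisco can confirm what these functions really do.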

 

I also suspect the issue was introduced somewhere in the 2.3 or 2.4 firmware: the first customer did not see this issue while running 2.2.x, and it occurred for the first time after they upgraded to 2.4.x.

 

I created a ticket with Cisco support. They first asked us to upgrade to the latest firmware, and we did. A few months later the issue occurred again, so I contacted support again, and they wanted me to upgrade to the latest firmware once more (a new version had come out between the two occurrences). However, the release notes did not indicate a fix for this issue, so I told them no and sent in a support package with logs etc. They looked at it and asked for a packet capture covering the few seconds before the issue. But it occurs rarely and at a random time, and I don't know which of the 52 ports has the device that triggers it, so I was stuck. As it was the only customer with the problem, I decided to leave it and hope Cisco would fix it, since I had seen the CSCvo49699 bug report online.

 

We now have a second customer with this issue, with more frequent occurrences. I will ask my colleague, who is their IT contact, to open a case with Cisco; hopefully they can find the issue.

So far we have tracked this issue down to the following:

The problem occurs on switches where more than one EtherChannel is active,
on all 2.4 firmware versions. The switches' configuration is basically the same.
Of a total of 20 SG350-52 and SG350-52MP switches, only the six with multiple
EtherChannels defined were affected: one SG350-52 and five SG350-52MP.
It is not an issue with PoE, as at that time no PoE consumer was
connected to any PoE switch (future WLAN planning).

All affected devices were downgraded to firmware 2.3.5.63, which fixes
the problem; we have seen this issue on the last three 2.4 versions.
All other devices are running firmware 2.4.5.71, and we haven't seen a reboot
on these.

It would be nice if Cisco would look into this problem.