
VSS Catalyst 4500X-16 SFP+ / crashing on cat4500e-universalk9.SPA.03.05.03.E.152-1.E3.bin / radius / dot1x

2colin-cant
Level 1

Hi guys,

I am not sure whether I am hitting IOS bug CSCtx61557.

According to the bug tool, this is the info:

-----------------------------

crash after authc result 'success' from 'dot1x' for client (Unknown MAC)
CSCtx61557
Description
Symptoms: The switch crashes after logging "success" from "dot1x" for client
(Unknown MAC).

Conditions: The symptom is observed with the following conditions:

1. A switchport is configured with both of the following:

authentication event server dead action authorize...
authentication event server alive action reinitialize

2. The radius server was down previously, and a port without traffic (for
example: a hub with no devices attached) was authorized into the inaccessible
authentication bypass (IAB) VLAN without an associated MAC address.
3. The radius server becomes available again, and a dot1x client
attempts to authenticate.

Workaround: There is no workaround.


--------------------------------------------------------------

 

I am running the following IOS on my 4500X-16 SFP+:


cat4500e-universalk9.SPA.03.05.03.E.152-1.E3.bin

 

-------------------------------------------------------------
This is what I configured, and what happened:

 

HOSTNAME(config)#aaa group server radius rad_eap
HOSTNAME(config-sg-radius)# server name ACS1
HOSTNAME(config-sg-radius)# server name ACS2
HOSTNAME(config-sg-radius)# server name ACS3

HOSTNAME(config-sg-radius)#$ication login default group radius local
HOSTNAME(config)#aaa authentication login CONSOLE local
HOSTNAME(config)#aaa authentication enable default group radius enable
HOSTNAME(config)#aaa authentication ppp default local group radius
HOSTNAME(config)#aaa authentication dot1x default group radius
HOSTNAME(config)#aaa authorization exec default if-authenticated
HOSTNAME(config)#aaa authorization network default group radius
HOSTNAME(config)#aaa accounting update newinfo
HOSTNAME(config)#aaa accounting dot1x default start-stop group radius
HOSTNAME(config)#aaa accounting network default start-stop group

eption to IOS Thread:
Frame pointer 897BAE38, PC = 1C03EECC

IOSD-EXT-SIGNAL: Aborted(6), Process = Exec
-Traceback= 1#49176b00b95a50f3145e3825de17d470  c:1C008000+36ECC c:1C008000+3BE50 c:1C008000+3BF48 :1F679000+201A18C :1F679000+31CEE2C :1F679000+2C22958 :1F679000+2C293E4 :1F679000+1166260 :1F679000+2C3C20C

Fastpath Thread backtrace:
-Traceback= 1#49176b00b95a50f3145e3825de17d470  uld:1F224000+2DE8 uld:1F224000+2DE4 iosd_unix:1C3ED000+186A0 pthread:1AA69000+6450

Auxiliary Thread backtrace:
-Traceback= 1#49176b00b95a50f3145e3825de17d470  pthread:1AA69000+BB8C pthread:1AA69000+BB6C c:1C008000+F61E4 iosd_unix:1C3ED000+21270 pthread:1AA69000+6450

Buffered messages: (last 8192 bytes only)
6 left the port-channel Port radius

HOSTNAME(config)#aaa accounting system default start-stop group radius
HOSTNAME(config)#
HOSTNAME(config)#
HOSTNAME(config)#no authentication logging verbose
HOSTNAME(config)#
HOSTNAME(config)#
HOSTNAME(config)#login block-for 300 attempts 5 within 60
-channel1
*Aug 28 01:08:47.873 UTC: %C4K_IOSINTF-5-LMPHWSESSIONSTATE: Lmp HW session DOWN on slot 11 port 12.
*Aug 28 01:08:48.056 UTC: %SYS-6-LOGGINGHOST_STARTSTOP: Logging to host 172.16.5.98 port 514 started - CLI initiated
*Aug 28 01:08:48.571 UTC: %FASTHELLO-2-FH_DOWN:  Fast-Hello interface Te2/1/12 lost dual-active detection capability

*Aug 28 01:08:49.099 UTC: %PIM-5-DRCHG: DR change from neighbor 0.0.0.0 to 172.16.250.61 on interface Vlan250
*Aug 28 01:15:08.753 UTC: %C4K_IOSINTF-5-LMPHWSESSIONSTATE: Lmp HW session UP on slot 11 port 1.
*Aug 28 01:15:24.759 UTC: %VSLP-5-VSL_UP:  Ready for control traffic

*Aug 28 01:15:27.760 UTC: %VSLP-5-RRP_ROLE_RESOLVED: Role resolved as ACTIVE  by VSLP
*Aug 28 01:15:27.760 UTC: %EC-5-BUNDLE: Interface TenGigabitEthernet2/1/1 joined port-channel Port-channel2
*Aug 28 01:15:28.049 UTC: %C4K_REDUNDANCY-6-DUPLEX_M
<Thu Aug 28 01:18:32 2014> Message from sysmgr: Reason Code:[2] Reset Reason:Service [iosd] pid:[6813] terminated abnormally [6].
Details:
--------
Service: IOSd service
Description: IOS daemon
Executable: /tmp/sw/mount/cat4500e-universalk9.SPA.152-1.E.pkg//usr/binos/bin/iosd

Started at Wed Aug 27 22:27:48 2014 (647795 us)
Stopped at Thu Aug 28 01:18:32 2014 (115506 us)
Uptime: 2 hours 50 minutes 44 seconds

Start type: SRV_OPTION_RESTART_STATELESS (23)
Death reason: SYSMGR_DEATH_REASON_FAILURE_SIGNAL (2)
Last heartbeat 0.00 secs ago

PID: 6813
Exit code: signal 6 (no core)

CWD: /var/sysmgr/work


PID: 6813
UUID: 512
FAILURE: syslogd shutdown


I had an ICMP ping running, and it was not affected: the standby VSS chassis took over while the previous active chassis reloaded.

 

--------------------------------------------------------
Second time it happened:

This time, I waited until the previous active chassis was back up and running as Standby Hot.

I pasted the same config once again, and bang, it happened a second time, now on the second chassis, which was acting as the active supervisor.

Once again, the continuous ICMP ping was not interrupted: the other chassis remained up while the "new" active crashed after I entered the same configuration in a slightly different order.

 

HOSTNAME(config)#radius server ACS2
HOSTNAME(config-radius-server)#$5.22 auth-port 1812 acct-port 1813
HOSTNAME(config-radius-server)# timeout 1
HOSTNAME(config-radius-server)# key 0 XXXX
HOSTNAME(config-radius-server)#!
HOSTNAME(config-radius-server)#radius server ACS3
HOSTNAME(config-radius-server)#$xxxx auth-port 1812 acct-port 1813
HOSTNAME(config-radius-server)# timeout 1
HOSTNAME(config-radius-server)# key 0 xxxxxxx
HOSTNAME(config-radius-server)#
HOSTNAME(config-radius-server)#aaa group server radius rad_eap
HOSTNAME(config-sg-radius)# server name XXXX
HOSTNAME(config-sg-radius)# server name XXXX
HOSTNAME(config-sg-radius)# server name XXXX
HOSTNAME(config-sg-radius)#
HOSTNAME(config-sg-radius)#
PER-3-S

Exception to IOS Thread:
Frame pointer 89455E38, PC = 1CC27ECC

IOSD-EXT-SIGNAL: Aborted(6), Process = Exec
-Traceback= 1#e495ba4f9346cc1496eecd01ebf1814a  c:1CBF1000+36ECC c:1CBF1000+3BE50 c:1CBF1000+3BF48 :20276000+201B18C :20276000+31D0DA8 :20276000+2C24800 :20276000+2C2B28C :20276000+11671B0 :20276000+2C3E0B4

Fastpath Thread backtrace:
-Traceback= 1#e495ba4f9346cc1496eecd01ebf1814a  iosd_unix:1CFD6000+1C230 iosd_unix:1CFD6000+1C284 iosd_unix:1CFD6000+18854 pthread:1B653000+6450

Auxiliary Thread backtrace:
-Traceback= 1#e495ba4f9346cc1496eecd01ebf1814a  pthread:1B653000+BB8C pthread:1B653000+BB6C c:1CBF1000+F61E4 iosd_unix:1CFD6000+21270 pthread:1B653000+6450

Buffered messages: (last 8192 bytes only)
INTF-5-TRANSCEIVERINSERTED: Slot=11 Port=3: Transceiver hasW-9(config-sg-radius)#
HOSTNAME(config-sg-radius)#no authentication logging verbose
HOSTNAME(config)#
HOSTNAME(config)#
HOSTNAME(config)#login block-for 300 attempts 5 within 60
 been inserted
*Aug 28 01:26:03.864 UTC: %C4K_IOSINTF-5-TRANSCEIVERINSERTED: Slot=11 Port=4: Transceiver has been inserted
*Aug 28 01:26:03.864 UTC: %C4K_IOSINTF-5-TRANSCEIVERINSERTED: Slot=11 Port=5: Transceiver has been inserted
*Aug 28 01:26:03.864 UTC: %C4K_IO
<Thu Aug 28 01:28:10 2014> Message from sysmgr: Reason Code:[2] Reset Reason:Service [iosd] pid:[6770] terminated abnormally [6].
Details:
--------
Service: IOSd service
Description: IOS daemon
Executable: /tmp/sw/mount/cat4500e-universalk9.SPA.152-1.E3.pkg//usr/binos/bin/iosd

Started at Thu Aug 28 01:13:52 2014 (60006 us)
Stopped at Thu Aug 28 01:28:10 2014 (993041 us)
Uptime: 14 minutes 18 seconds

Start type: SRV_OPTION_RESTART_STATELESS (23)
Death reason: SYSMGR_DEATH_REASON_FAILURE_SIGNAL (2)
Last heartbeat 0.00 secs ago

PID: 6770
Exit code: signal 6 (no core)

CWD: /var/sysmgr/work
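
To compare the two crashes side by side, the frame offsets can be pulled out of the "-Traceback=" lines with a small script. This is only a minimal sketch, not a Cisco tool: the helper names `frame_offsets` and `libc_offsets` are made up for illustration, and it assumes every frame is printed as `library:BASE+OFFSET` in hex, as in the two Exec tracebacks above.

```python
import re

# A frame looks like "c:1C008000+36ECC" or ":1F679000+201A18C":
# library name (possibly empty), load base, and offset, all in hex.
FRAME_RE = re.compile(r"(\w*):([0-9A-Fa-f]+)\+([0-9A-Fa-f]+)")


def frame_offsets(traceback_line):
    """Return (library, offset) pairs for every frame on one -Traceback= line."""
    return [(lib, int(off, 16)) for lib, _base, off in FRAME_RE.findall(traceback_line)]


def libc_offsets(traceback_line):
    """Keep only the offsets of frames in the library labelled "c"."""
    return [off for lib, off in frame_offsets(traceback_line) if lib == "c"]


# The two "Process = Exec" tracebacks copied from the crash output above.
first_crash = (
    "-Traceback= 1#49176b00b95a50f3145e3825de17d470  "
    "c:1C008000+36ECC c:1C008000+3BE50 c:1C008000+3BF48 "
    ":1F679000+201A18C :1F679000+31CEE2C :1F679000+2C22958 "
    ":1F679000+2C293E4 :1F679000+1166260 :1F679000+2C3C20C"
)
second_crash = (
    "-Traceback= 1#e495ba4f9346cc1496eecd01ebf1814a  "
    "c:1CBF1000+36ECC c:1CBF1000+3BE50 c:1CBF1000+3BF48 "
    ":20276000+201B18C :20276000+31D0DA8 :20276000+2C24800 "
    ":20276000+2C2B28C :20276000+11671B0 :20276000+2C3E0B4"
)

# Load bases differ between the two boots, so compare offsets only.
print([hex(o) for o in libc_offsets(first_crash)])
print(libc_offsets(first_crash) == libc_offsets(second_crash))
```

For the two tracebacks in this post, the `c:` frames carry the same offsets in both crashes, while the offsets in the unnamed frames differ. That is consistent with both crashes taking the same path, but it does not prove it; only TAC can decode the traceback properly.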

 


------------------------------------------------

Are these symptoms related to CSCtx61557?

I tested this in a test environment where no ACS was reachable!

Thanks

Colin

 

4 Replies

Reza Sharifi
Hall of Fame

Hi,

It appears you are hitting a bug in the OS.  I would open a ticket with TAC and have them help you in finding a resolution.

HTH
 

Search the bug tool for "Login Block" and 4500 platforms, and you will find my case notes :)

 

We hit a bug by configuring the following line twice in a row:

 

login block-for 300 attempts 3 within 60

 

This basically replaces a "reload" command if you're running dual SUPs :)

 

 

2colin-cant
Level 1

Just a quick update:

 

The active SUP reloads while the secondary takes over; the data plane is not affected.

 

HOSTNAME#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
HOSTNAME(config)#
HOSTNAME(config)#login block-for 120 attempts 4 within 60

 

The interesting thing is that the following line was already in the config:

 

login block-for 300 attempts 5 within 60

 

But as soon as I "re-configure" that line, the switch crashes!

 

I am raising a TAC case and will let you guys know the outcome.

 

Colin

 

 

 

 

Another update:

 

It seems that not only the 4500X platform is affected, but also the 4510R+E:

 

WS-C4510R+E
WS-X45-SUP8-E

IOS-XE (cat4500es8-UNIVERSALK9-M), Version 03.03.01.XO

 

 

4510R+E#sh redundancy /| i    | i state
        Current Software state = ACTIVE
       Uptime in current state = 2 hours, 39 minutes
        Current Software state = STANDBY HOT
       Uptime in current state = 6 minutes

 

4510R+E(config)#login block-for 300 attempts 3 within 60

Exception to IOS Thread:
Frame pointer 8D104E28, PC = C9C0FF4

IOSD-EXT-SIGNAL: Aborted(6), Process = Exec
-Traceback= 1#9492282023e5ef761bd83af205155966  c:C98A000+36FF4 c:C98A000+3C2B0 c:C98A000+3C3A8 :10000000+201B994 :10000000+31CA4E4 :10000000+2C1DC54 :10000000+2C246E0 :10000000+116A3F0 :10000000+2C37508

Fastpath Thread backtrace:
-Traceback= 1#9492282023e5ef761bd83af205155966  c:C98A000+E29C0 c:C98A000+E29A0 iosd_unix:CD74000+1877C pthread:B3FE000+647C

Auxiliary Thread backtrace:
-Traceback= 1#9492282023e5ef761bd83af205155966  pthread:B3FE000+BBB4 pthread:B3FE000+BB94 c:C98A000+FA4E8 iosd_unix:CD74000+21270 pthread:B3FE000+647C

Buffered messages: (last 8192 bytes only)

 

 

At least one can now trigger a "redundancy failover" directly from config mode :)

 

 

 
