
9400 series NAC issue with 17.9.6 code (code is pulled now)

RVTim
Level 1

I was intending to post today looking for help, but I may have answered my own question well enough to satisfy me. I thought I'd still post, though, because this could be useful information for someone else about to upgrade.

Somewhere around 9/27/2024 I downloaded the 17.9.6 code for the 9300/9400 series switches, right after their security vulnerabilities were made public.

I just upgraded to 17.9.6 and the next morning was immediately hit with issues. In general the code seems OK; however, we use dot1x NAC with Clearpass and certificate authentication for our laptops and PCs.

What is/was happening is this:   The PC connects, and all of the dot1x process goes fine. Clearpass logs show the PC being allowed, and the switch shows AUTH when you do 'show auth sessions'.   So everything is good.  Except, the PC can't get an IP address from DHCP, and, if you put a static IP on it, it won't talk to anything either.   The mac address-table gets populated but it shows as STATIC, even on a DYNAMIC port/device.  The arp in our case is done on a firewall for all vlans, and there is NO arp entry that ends up on the firewall for that mac address.  If you do "show interface" on that interface, you'll see 0 (zero) packets input.  You will see some packets output, but not too many.

Additionally, we have some devices that use mac authentication, and those seem to be working fine.  So it's just the dot1x stuff that blew up.  It's almost like the layer 2 side isn't being connected once dot1x succeeds, or, that an "open the port" Dynamic default ACL isn't being applied.

I was able to get all the ports functional by removing all of the NAC config.  I probably could have turned it off at the global level but was hoping I could troubleshoot and fix it.  Turns out the troubleshooting wasn't giving me any real results.  Debug logs clearly show the authentication successful.

So my next step was to come here and post for help, but before I did that, I wanted to leave a one-star review of the code on the download page, and comment there, to keep others from hitting this issue.

When I went to the download page, that 17.9.6 version is nowhere to be seen anymore.  This was the original data block on it:

Description : CAT9300/9400/9500/9600 Universal
Release : Cupertino-17.9.6
Release Date : 16-Sep-2024
FileName : cat9k_iosxe.17.09.06.SPA.bin
Min Memory : DRAM 8192 Flash 16384
Size : 1199.43 MB ( 1257688537 bytes)

So if you are running NAC, definitely avoid this code. I'm not sure what other bugs they must have found, so do your research before you upgrade to that release!

 


balaji.bandi
Hall of Fame

This could be a bug. That is why it is worth comparing the configuration before and after an upgrade; sometimes features break with the upgrade. Cisco is still suggesting 17.9.5 as the best working version so far for production.
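One way to make that before/after comparison routine is to diff saved copies of the running config. The sketch below is only an illustration using Python's standard `difflib`; the function name `config_diff` and the two sample snippets are assumptions made up for the example, not output from any real switch.

```python
import difflib


def config_diff(before, after):
    """Return unified-diff lines between two saved config snapshots.

    Trailing whitespace is stripped so purely cosmetic differences are ignored.
    """
    before_lines = [line.rstrip() for line in before.splitlines()]
    after_lines = [line.rstrip() for line in after.splitlines()]
    return list(
        difflib.unified_diff(
            before_lines,
            after_lines,
            fromfile="before-upgrade",
            tofile="after-upgrade",
            lineterm="",
        )
    )


# Hypothetical snapshots: the second one has lost a line during the upgrade.
BEFORE = """interface GigabitEthernet1/0/1
switchport mode access
authentication port-control auto
dot1x pae authenticator
"""
AFTER = """interface GigabitEthernet1/0/1
switchport mode access
dot1x pae authenticator
"""

for line in config_diff(BEFORE, AFTER):
    print(line)
```

An empty result means the two snapshots match, which at least rules out a silently dropped or rewritten command as the cause.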

 

BB


Can I see the dot1x port config?

Also the output of:

show mac address-table interface x

show authentication sessions interface x details

MHM

RVTim
Level 1

Sure, here's some info:  (some info has been changed.  vlan assigned to the port gets changed by clearpass too)

 

Dot1X port config:

switchport access vlan 22
switchport mode access
authentication host-mode multi-auth
authentication order dot1x
authentication priority dot1x
authentication port-control auto
authentication periodic
authentication timer reauthenticate server
dot1x pae authenticator
dot1x timeout server-timeout 30
dot1x timeout tx-period 10
dot1x timeout supp-timeout 15
dot1x max-req 3
dot1x max-reauth-req 5

 

For some reason, this says STATIC when I think it should say dynamic, no?  It's a DHCP client as well:

MY-9410#sh mac address-table int g8/0/6
Mac Address Table
-------------------------------------------

Vlan Mac Address Type Ports
---- ----------- -------- -----
10 8cae.4ccd.3722 STATIC Gi8/0/6
Total Mac Addresses for this criterion: 1

 

MY-9410#sh auth session int g8/0/6 det
Interface: GigabitEthernet8/0/6
IIF-ID: 0x13D86FB9
MAC Address: 8cae.4ccd.3722
IPv6 Address: Unknown
IPv4 Address: Unknown
User-Name: host/BobsMachine.ourcompany.com
Status: Authorized
Domain: DATA
Oper host mode: multi-auth
Oper control dir: both
Session timeout: 259200s (server), Remaining: 259183s
Timeout action: Reauthenticate
Common Session ID: 0D01630A0000003C4971C2FB
Acct Session ID: 0x00000023
Handle: 0xdb00003c
Current Policy: POLICY_Gi8/0/6


Local Policies:
Service Template: DEFAULT_LINKSEC_POLICY_SHOULD_SECURE (priority 150)
Security Policy: Should Secure

Server Policies:
Session-Timeout: 259200 sec
Vlan Group: Vlan: 33


Method status list:
Method State
dot1x Authc Success
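For anyone who wants to check several ports at once, the two outputs above can be cross-checked with a small script. This is only a sketch: the function names (`parse_mac_table`, `session_mac`, `server_vlan`, `vlan_mismatches`) are made up for illustration, the input is assumed to be the plain text of the two show commands, and the parsing is based only on the sample output in this thread, so other releases may format things differently.

```python
import re

# One row of 'show mac address-table': VLAN, MAC, type, port.
MAC_ROW = re.compile(
    r"^\s*(\d+)\s+([0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4})\s+(\S+)\s+(\S+)\s*$",
    re.IGNORECASE,
)


def parse_mac_table(text):
    """Return one dict per MAC row found in the MAC table text."""
    rows = []
    for line in text.splitlines():
        match = MAC_ROW.match(line)
        if match:
            vlan, mac, kind, port = match.groups()
            rows.append(
                {"vlan": int(vlan), "mac": mac.lower(), "type": kind, "port": port}
            )
    return rows


def session_mac(text):
    """Return the client MAC from the session detail text, or None."""
    match = re.search(r"MAC Address:\s*(\S+)", text)
    return match.group(1).lower() if match else None


def server_vlan(text):
    """Return the VLAN from the 'Vlan Group: Vlan: N' line, or None."""
    match = re.search(r"Vlan Group:\s*Vlan:\s*(\d+)", text)
    return int(match.group(1)) if match else None


def vlan_mismatches(mac_text, session_text):
    """Return MAC rows for the session's client whose VLAN differs from the server's."""
    mac = session_mac(session_text)
    assigned = server_vlan(session_text)
    if mac is None or assigned is None:
        return []
    return [
        row
        for row in parse_mac_table(mac_text)
        if row["mac"] == mac and row["vlan"] != assigned
    ]


# Output copied from the post above (session detail trimmed to the relevant lines).
MAC_OUTPUT = """Vlan Mac Address Type Ports
---- ----------- -------- -----
10 8cae.4ccd.3722 STATIC Gi8/0/6
Total Mac Addresses for this criterion: 1
"""
SESSION_OUTPUT = """Interface: GigabitEthernet8/0/6
MAC Address: 8cae.4ccd.3722
Status: Authorized
Server Policies:
Session-Timeout: 259200 sec
Vlan Group: Vlan: 33
"""

for row in vlan_mismatches(MAC_OUTPUT, SESSION_OUTPUT):
    print(f"{row['mac']} learned in VLAN {row['vlan']}, "
          f"server assigned VLAN {server_vlan(SESSION_OUTPUT)}")
```

Run against the (partly obscured) output as posted, this flags VLAN 10 in the MAC table versus VLAN 33 from the server, so a difference like that is worth explaining before digging further.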

 

 

And some console lines:

Oct 1 13:57:24: %LINK-3-UPDOWN: Interface GigabitEthernet8/0/6, changed state to down
MY-9410#
Oct 1 13:57:25: %SESSION_MGR-5-START: R0/0: sessmgrd: Starting 'dot1x' for client (8cae.4ccd.3722) on Interface GigabitEthernet8/0/6 AuditSessionID 0D01630A0000003C4971C2FB
MY-9410#
Oct 1 13:57:27: %LINK-3-UPDOWN: Interface GigabitEthernet8/0/6, changed state to up
Oct 1 13:57:28: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet8/0/6, changed state to up
MY-9410#
Oct 1 13:57:42: %DOT1X-5-SUCCESS: R0/0: sessmgrd: Authentication successful for client (8cae.4ccd.3722) on Interface Gi8/0/6 AuditSessionID 0D01630A0000003C4971C2FB
Oct 1 13:57:42: %SESSION_MGR-5-SUCCESS: R0/0: sessmgrd: Authorization succeeded for client (8cae.4ccd.3722) on Interface GigabitEthernet8/0/6 AuditSessionID 0D01630A0000003C4971C2FB

 

It doesn't matter whether you wait a long time or not: both Clearpass and the switch show it as successful. But you get zero input packets if you do "show interface".
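The "authorized but nothing coming in" signature can be checked mechanically across ports. The following is a rough sketch, not a tested tool: `packet_counters` and `looks_stuck` are hypothetical helper names, and the sample text is illustrative (made-up counter values laid out the way `show interfaces` usually prints them), not a capture from the affected switch.

```python
import re


def packet_counters(show_interface_text):
    """Extract the 'packets input' and 'packets output' counters as ints.

    A counter that cannot be found is reported as None.
    """
    counters = {}
    for direction in ("input", "output"):
        match = re.search(rf"(\d+) packets {direction}", show_interface_text)
        counters[direction] = int(match.group(1)) if match else None
    return counters


def looks_stuck(show_interface_text, authorized):
    """True when the session is authorized, output is flowing, but input is zero."""
    counters = packet_counters(show_interface_text)
    return bool(
        authorized
        and counters["input"] == 0
        and (counters["output"] or 0) > 0
    )


# Illustrative sample only; the numbers are invented for the example.
SAMPLE = """GigabitEthernet8/0/6 is up, line protocol is up (connected)
     0 packets input, 0 bytes, 0 no buffer
     41 packets output, 5248 bytes, 0 underruns
"""

print(packet_counters(SAMPLE))
print(looks_stuck(SAMPLE, authorized=True))
```

The `authorized` flag is passed in separately on purpose, so the check only fires for ports where the session really did succeed; a port that is simply idle or unauthenticated is not reported.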

 

 

 

Comment back to BB: Yeah, they are still recommending 17.9.5. The hard part is, I am in an audited environment that isn't too interested in seeing slow patching when vulnerabilities come out. So I try to stay current once they are publicized. Usually no problem; this time it bit me. Plus, they were also slow to get out release notes for this version, so I couldn't read them ahead of time. I could have waited a couple of weeks, but it's often many months before they update the "recommended" code version, and release notes really should be put out when the code is put out, in my opinion. But I've had such good luck over a couple of decades that I decided to move forward. It looks like in this case we'll be moving back to 17.9.5 for a bit. It's better than leaving NAC off.

Sure, I suggest moving back to 17.9.5, or trying 17.12.x if you would like to test a higher version.

 

BB


The MAC is learned in VLAN 10, but the session is authorized with VLAN group 33?

Which VLAN is dynamically assigned by the server to this user?

MHM

Sorry MHM.  VLAN 10 was one of the items I changed to obscure a little.  So if you see 10, in context of my reply, think 33.   i.e. 33 is what I put in this thread for real VLAN 10.  So yes, in truth, vlan 10 is dynamically assigned.  In the config above, I had vlan 22 on the port, and when the policy gets applied, this particular client moves to vlan 10.

BB:  Yes, we came from 17.9.5, which was working fine.  I believe we implemented NAC back in the early 17.3 versions, moved up to 17.3.5, and then made a jump to the 17.9 series after that.  So this worked for sure on 17.9.4, 17.9.4a, 17.9.5, and other versions.

Sorry for the confusion but I was trying to obfuscate what I could just because...good security practice.  At least there were no passwords in it.

Enable portfast on one interface and check whether the DHCP server assigns an IP to the authenticated user.

If that works, apply portfast + bpduguard to all dot1x ports.

MHM

We use portfast on all of our host ports, so that was actually part of the config.  I only provided the NAC related stuff above.

So, I did try to add bpduguard, but that had no effect whatsoever for fixing the issue.

 

interface GigabitEthernet8/0/6
description PC NAC
switchport access vlan 22
switchport mode access
authentication host-mode multi-auth
authentication order dot1x
authentication priority dot1x
authentication port-control auto
authentication periodic
authentication timer reauthenticate server
dot1x pae authenticator
dot1x timeout server-timeout 30
dot1x timeout tx-period 10
dot1x timeout supp-timeout 15
dot1x max-req 3
dot1x max-reauth-req 5
spanning-tree portfast
spanning-tree bpduguard enable

So I tried it with this config and it still didn't work.

 

As I see it, this is not a dot1x issue; it is a DHCP issue.

Can you use a port without dot1x, in the same VLAN that AAA assigns dynamically?

Then check whether the endpoint gets an IP from DHCP.

MHM

Nope, not a DHCP issue, definite NAC issue.   We tested with static IP addresses as well, and the port authorizes but doesn't pass traffic.   Additionally, last night we rolled back to 17.9.5 code, and using this same port config, everything is now working fine again.

DHCP does indeed work, with that port, if you remove NAC from it.

interface GigabitEthernet8/0/6
description PC NAC
switchport access vlan 22
switchport mode access
authentication host-mode multi-auth
authentication order dot1x
authentication priority dot1x
authentication port-control auto
authentication periodic
authentication timer reauthenticate server
dot1x pae authenticator
dot1x timeout server-timeout 30
dot1x timeout tx-period 10
dot1x timeout supp-timeout 15
dot1x max-req 3
dot1x max-reauth-req 5
spanning-tree portfast
end

It is true that the machine won't get an IP address via DHCP with the bug, but that is not the cause. The cause is definitely a bug, unless there is some new *feature* that they rolled out that needs to be enabled for this to work.

But now that we're reverted to 17.9.5, everything is good again, and we'll have to wait to see when they re-release code. Hopefully they won't call it 17.9.6, but 17.9.6a, because I won't be downloading anything named 17.9.6 again.

""DHCP does indeed work, with that port, if you remove NAC from it.""

Given everything you mentioned above:

Try configuring just one port as an access port in VLAN 33; this forces the switch to add VLAN 33.

Then configure another port with dot1x (using the same config you used previously) and check whether the endpoint gets an IP and forwards traffic.

MHM

I think my previous reply was not so clear, so let me clarify.

AAA pushes VLAN 33 to the switch to use on the port of the authenticated user.

If the switch doesn't have VLAN 33, this process will fail.

That is why we configure one interface as an access port in this VLAN (VLAN 33), or allow this VLAN on a trunk and add the VLAN manually to the switch's VLAN database.

I think your issue comes from the switch not having VLAN 33.

Try my suggestion and check.

MHM

No, I'm sorry but you're not reading all of the above information I've posted.

First, this worked on all previous IOS versions.  The VLAN is definitely on the switch.   Also, we tested DHCP and Static IPs, and I also tried manually assigning the VLAN as well as letting radius assign it.  And, as soon as we remove the NAC commands and turn dot1x off, the same clients work fine, using DHCP and/or static IPs, and traffic flows.

The other thing is, like I said, Cisco released this 17.9.6 code and has now pulled it off the download site. So it is no longer available. They would not have done that if the code did not have issues.

And, running the exact same config, with the same VLANs, and everything else the same on the switch, this works fine with 17.9.5.

Additionally, this is not limited to one switch: it was failing on all 3 of the 9410s that we have.

I have reverted the code back to 17.9.5, so I cannot test anything else, and I will not upgrade to 17.9.6 or variants until Cisco acknowledges that they had an issue and releases code that comes with full release notes.

So until then, we may as well consider the thread paused.  I'll reply to it later once they have a new release, and confirm it works.
