05-25-2022 03:05 AM - edited 05-25-2022 03:06 AM
Hi all,
I have a strange one for you.
We deployed a two-node ISE 2.7 Patch 6 cluster and rolled out NAC successfully about 7 months ago. We have had a few issues now and again, but overall it is working well.
Recently, NAC has stopped working at a number of desks, regardless of who uses the desk. NAC works for the same users at other desks without any problems. When they go back to "Desk X", they don't get any connectivity.
In the ISE logs, I can see that the user logged in successfully at Desk 1 at 09:25 (see screenshot "NAC1.jpg") and got network access. When the user went to "Desk X", I can see entries in ISE at 09:52, 09:59 and 10:05. These three entries show a successful authentication with a session, but the user's laptop reports that there is no internet. The three entries correspond to our attempts to eliminate the docking station, the network cables and the switchport as a root cause.
Troubleshooting steps at Desk X:
GigabitEthernet3/0/4 is up, line protocol is down (notconnect)
Hardware is Gigabit Ethernet, address is 00bc.6094.2404 (bia 00bc.6094.2404)
Description: ### User Access Port ###
GigabitEthernet3/0/9 is up, line protocol is down (notconnect)
Hardware is Gigabit Ethernet, address is 00bc.6094.2409 (bia 00bc.6094.2409)
Description: ### User Access Port ###
Switch config:
!
template PORT-AUTH-TEMPLATE
dot1x pae authenticator
mab
access-session host-mode multi-domain
access-session control-direction in
access-session closed
access-session port-control auto
authentication periodic
authentication timer reauthenticate server
service-policy type control subscriber INT-AUTH-POLICY
!
!
interface GigabitEthernet3/0/9
description ### User Access Port ###
switchport access vlan XX
switchport mode access
switchport voice vlan YY
srr-queue bandwidth share 10 10 60 20
srr-queue bandwidth shape 10 0 0 0
queue-set 2
priority-queue out
mls qos trust device cisco-phone
mls qos trust cos
dot1x timeout tx-period 60
dot1x max-reauth-req 3
auto qos voip cisco-phone
storm-control broadcast level 30.00 25.00
storm-control action shutdown
storm-control action trap
source template PORT-AUTH-TEMPLATE
spanning-tree portfast edge
spanning-tree bpduguard enable
service-policy input AutoQoS-Police-CiscoPhone
end
Regardless of which user is at that desk, it will not work. The workaround is to remove the NAC configuration from the port and re-add it.
The switch is a 2960X-48FPD-L and is a member of a stack which has 3 switches in it. The software version is: 15.2(4)E6.
This is occurring randomly across desks across numerous sites for the customer.
Any ideas?
05-26-2022 09:54 PM
Hi @Anthony O'Reilly ,
1st ... about the aaa accounting update newinfo periodic 5 command, consider using 1440 or 2880 instead.
2nd ... while the issue is occurring, what is the output of the following commands:
#show mac address-table interface G3/0/4
#show ip device tracking interface G3/0/4
#show authentication sessions interface G3/0/4
Regards
05-26-2022 10:57 PM
I think device-tracking may be missing from the config. As Marcelo mentioned, let's see the output of those commands.
05-27-2022 01:09 AM
#show mac address-table interface G3/0/4
There was nothing showing in the mac address-table for this interface. If I moved the device to another interface, it was in the mac address-table.
#show ip device tracking interface G3/0/4
I didn't run this at the time; I will next time.
#show authentication sessions interface G3/0/4
There was nothing in the output for this command, it was as if there was nothing patched into the switch port.
If I plugged a phone into the port, the phone would get power, but there was no entry for the phone in the #sh cdp nei output.
I ran a shut and no shut on the ports that were having issues. I will wait for the next interface that has this issue and report back. I will test the command aaa accounting update newinfo periodic 1440 on three 2960 switches.
@Arne Bier Device tracking is enabled on this switch. It is configured globally using the command ip device tracking probe delay 10
This is the result from the tracking on the switch.
STACK01#sh ip device tracking all
Global IP Device Tracking for clients = Enabled
Global IP Device Tracking Probe Count = 3
Global IP Device Tracking Probe Interval = 30
Global IP Device Tracking Probe Delay Interval = 10
-----------------------------------------------------------------------------------------------
IP Address MAC Address Vlan Interface Probe-Timeout State Source
-----------------------------------------------------------------------------------------------
1.1.1.2 xxxx.xxxx.xxx 10 GigabitEthernet2/0/3 30 ACTIVE ARP
1.1.1.4 xxxx.xxxx.xxx 10 GigabitEthernet2/0/33 30 ACTIVE ARP
1.1.2.2 xxxx.xxxx.xxx 24 GigabitEthernet2/0/14 30 ACTIVE ARP
1.1.1.8 xxxx.xxxx.xxx 10 GigabitEthernet1/0/1 30 ACTIVE ARP
05-27-2022 06:09 AM
Hi @Anthony O'Reilly ,
1st: the "periodic" keyword of aaa accounting update newinfo periodic [1440 or 2880] ensures that a RADIUS Interim Accounting Update is sent to the ISE Node every 1 or 2 days, regardless of whether the SW observes a change for the Active Session. If ISE fails to receive an Interim Accounting Message for an Endpoint Session for more than 5 days, ISE will stop maintaining the Session for that Endpoint. The value is in minutes, so "periodic = 5" (every 5 minutes) generates a lot of Interim Accounting Updates.
2nd: "device tracking" is enabled via (config)# ip device tracking (although it is enabled by default in 15.x+). The (config)# ip device tracking probe delay 10 command stops the SW from sending a Probe for 10 sec after it detects a Link UP/Flap (a good practice).
3rd: when you said " ... There was nothing showing in the mac address-table for this interface ... ", if you check the #show logging output, does something like this appear? I mean the "Unknown MAC":
May 27 XX:XX:XX.XXX: %AUTHMGR-5-START: Starting 'dot1x' for client ("Unknown Mac") on Interface G3/0/4 AuditSessionID XXXXXXXXXXXXXXXXXXXXXXXX
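On the 1st point above (the accounting interval), the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes the "periodic" value is in minutes (which is what 1440 = 1 day implies), and it takes the 5-day ISE session retention figure from this thread rather than from any verified source. The function names are made up for the example.

```python
# Assumption: the "periodic" value is expressed in minutes (1440 = 1 day).
MINUTES_PER_DAY = 24 * 60
# Assumption: 5-day retention figure as quoted in this thread, not verified here.
ISE_SESSION_RETENTION_DAYS = 5


def updates_per_day(periodic_minutes: int) -> float:
    """Interim accounting updates that one session generates per day."""
    return MINUTES_PER_DAY / periodic_minutes


def within_retention(periodic_minutes: int) -> bool:
    """True if updates arrive more often than the quoted retention window."""
    return periodic_minutes < ISE_SESSION_RETENTION_DAYS * MINUTES_PER_DAY


for minutes in (5, 1440, 2880):
    print(minutes, updates_per_day(minutes), within_retention(minutes))
```

With these assumptions, a value of 5 works out to 288 updates per session per day, while 1440 and 2880 work out to 1 and 0.5 per day and still fall well inside the 5-day window.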
Regards
05-27-2022 06:37 AM
Thanks for your quick response.
1. The switches at one site are now configured with aaa accounting update newinfo periodic 2880 and I am currently monitoring them.
2. The ip device tracking command has not changed.
3. There is definitely nothing in the logs for Unknown for our testing on Gi3/0/4. I've searched logs for Unknown and unknown across all our switches looking for other examples and I don't have any entries for it.
Is there anything else I can check?
Thanks
Anthony.
05-27-2022 10:46 AM
Hi @Anthony O'Reilly ,
in other words:
1. the MAC Addr Table is empty for G3/0/4
2. the Starting 'dot1x' for client on G3/0/4 is not appearing on the show logging
3. you are having problems with DeskX, even if you change the SW Port (G3/0/4 to G3/0/9).
My question:
At NAC1.JPG (your 1st image), ISE received an EAP-TLS packet from DeskX, but there is no Starting 'dot1x' for client on G3/0/4 in show logging and no MAC in the MAC Addr Table on G3/0/4 ... did you check whether the Endpoint sent the packet to the SW (debug dot1x all) and the SW sent the packet to ISE (debug radius)?
Regards
05-27-2022 01:41 PM
1. "the MAC Addr Table is empty for G3/0/4": This is correct.
2. "the Starting 'dot1x' for client on G3/0/4 is not appearing on the show logging": I am not sure on this. At the time the log buffer was full and I cleared it to save me scrolling down. There were four ports in total on this switch; I checked the logs for Gi3/0/4, Gi3/0/9, Gi2/0/6 and Gi2/0/16.
3. "you are having problems with DeskX, even if you change the SW Port (G3/0/4 to G3/0/9)": This desk has two ports that had the same symptoms.
My question:
"At NAC1.JPG (your 1st image), ISE received an EAP-TLS packet from DeskX, but there is no Starting 'dot1x' for client on G3/0/4 on show logging and no MAC on the MAC Addr Table on G3/0/4 ... did you check if the Endpoint sent the packet to the SW (debug dot1x all) and the SW sent the packet to ISE (debug radius)?": Unfortunately, I didn't run any debugs; I will for the next one. A lot of people are due on site on Monday and I will be ready to collect logs and debugs. There was no Starting 'dot1x' for client in the logs.
I will hopefully have a live example for you on Monday, fingers crossed.
Thanks for all your help, much appreciated.
05-30-2022 03:14 AM
I have a good example this morning. I was able to replicate the issue whether the laptop was on a docking station or had the network cable plugged directly into it.
The laptop is in port Gi1/0/18. These two commands were run about 50 seconds after the device was plugged into the port.
SW2#sh access-session int gi1/0/18
No sessions match supplied criteria.
Runnable methods list:
Handle Priority Name
6 5 dot1x
17 10 mab
15 15 webauth
SW#sh mac address-table int gi1/0/18
Mac Address Table
-------------------------------------------
Vlan Mac Address Type Ports
---- ----------- -------- -----
!
interface GigabitEthernet1/0/18
description ### User Access Port ###
switchport access vlan XX
switchport mode access
switchport voice vlan YY
srr-queue bandwidth share 10 10 60 20
srr-queue bandwidth shape 10 0 0 0
queue-set 2
priority-queue out
mls qos trust device cisco-phone
mls qos trust cos
dot1x timeout tx-period 60
dot1x max-reauth-req 3
auto qos voip cisco-phone
storm-control broadcast level 30.00 25.00
storm-control action shutdown
storm-control action trap
source template PORT-AUTH-TEMPLATE
spanning-tree portfast edge
spanning-tree bpduguard enable
service-policy input AutoQoS-Police-CiscoPhone
end
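The symptom shown in the two outputs above (no access session and an empty MAC address table on a port with a device attached) could be checked automatically across many ports. The following Python is a minimal sketch under assumptions: the function names are hypothetical, the sample strings are copied from this post, and collecting the output from the switch is left out entirely.

```python
import re
from typing import List

# Cisco-style dotted MAC address, e.g. 00bc.6094.2404
MAC_RE = re.compile(r"\b[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}\b", re.IGNORECASE)


def has_session(access_session_output: str) -> bool:
    """False when the switch reports that no session exists on the port."""
    return "No sessions match supplied criteria" not in access_session_output


def learned_macs(mac_table_output: str) -> List[str]:
    """All MAC addresses that appear in 'show mac address-table' output."""
    return MAC_RE.findall(mac_table_output)


def looks_stuck(access_session_output: str, mac_table_output: str) -> bool:
    """True when the port has neither a session nor a learned MAC."""
    return not has_session(access_session_output) and not learned_macs(mac_table_output)


SESSION_OUT = """No sessions match supplied criteria.
Runnable methods list:
Handle Priority Name
6 5 dot1x
17 10 mab
15 15 webauth"""

MAC_OUT = """Mac Address Table
-------------------------------------------
Vlan Mac Address Type Ports
---- ----------- -------- -----"""

print(looks_stuck(SESSION_OUT, MAC_OUT))
```

Note that an unplugged port produces the same two outputs, so a check like this only means something for ports where a device is known to be connected.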
Debug dot1x all and debug radius are attached.
I can do another test again if you wish.
Thanks
Anthony.
05-30-2022 06:41 AM
This is a good one for TAC, since it sounds like you are under time pressure to get it fixed.
It sounds like it could be a switch bug... consider upgrading the switch.
05-30-2022 07:18 AM
I think we may have hit this bug.
All of the issues so far have been on member switches of a switch stack.
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvv93417
What do you think?
Thanks
Anthony.
05-30-2022 11:21 AM
Hi @Anthony O'Reilly ,
Interesting. CSCvv93417 ("2960x stack Member Switch fails wired dot1x; MasterSwitch passes dot1x using the same configs")
could be the cause. Are you able to test your Endpoint on a SW without a stack configuration?
As you said: " ... The Switch is a 2960X-48FPD-L and is a Member of a stack which has 3 Switches in it. The software version is: 15.2(4)E6... "
I checked the EAP and RADIUS debug:
ISE IP "10.10.10.100"
NAS IP "10.10.1.1"
Framed-MTU "1500"
Endpoint IP "10.10.10.10"
EndPoint MAC "48-2A-E3-3E-72-04"
I'm able to verify the Access-Request and Access-Challenge from/to NAD to/from ISE.
Attention to:
May 30 10:40:03.385: RADIUS/ENCODE(00000000):Orig. component type = Invalid
Please take a look at CSCuu75107 ("2960 - traceback with EAP Authentication timeout"), although there is no RESULT_OVERRIDE in your logs.
Attention to:
May 30 10:40:02.839: RADIUS/DECODE: EAP-Message fragments, 253+253+253+249, total 1008 bytes
It is always good to double-check Fragmentation/MTU end-to-end (especially if you have a FW in between) ... just in case : )
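As a side note on the "253+253+253+249" figures in the RADIUS/DECODE line: a single RADIUS attribute can carry at most 253 bytes of value (255 bytes minus the type and length octets), so a longer EAP message is carried in several EAP-Message attributes. The small Python sketch below, with a hypothetical function name, reproduces the split for a 1008-byte message.

```python
# A RADIUS attribute is at most 255 bytes, including 1 byte of type and
# 1 byte of length, which leaves 253 bytes for the value.
MAX_ATTR_VALUE = 253


def split_eap_message(total_bytes: int) -> list:
    """Sizes of the EAP-Message attributes needed to carry total_bytes."""
    sizes = []
    remaining = total_bytes
    while remaining > 0:
        chunk = min(remaining, MAX_ATTR_VALUE)
        sizes.append(chunk)
        remaining -= chunk
    return sizes


print(split_eap_message(1008))  # [253, 253, 253, 249]
```

So that log line on its own is consistent with normal attribute splitting. It says nothing either way about IP-level fragmentation, which is why the end-to-end MTU check is still worth doing.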
Regards
05-30-2022 01:24 PM - edited 05-30-2022 01:25 PM
This is excellent.
The next steps:
no dot1x timeout tx-period 60
no dot1x max-reauth-req 3
no source template PORT-AUTH-TEMPLATE
shut
no shut
dot1x timeout tx-period 60
dot1x max-reauth-req 3
source template PORT-AUTH-TEMPLATE
shut
no shut
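Because this is happening on ports across numerous sites, the sequence above could be generated per interface instead of typed by hand. The Python below is a rough sketch only: the function name is hypothetical, the "configure terminal", "interface" and "end" wrapper lines are an assumption about how the commands are entered, and the three port settings are the ones from the interface config posted in this thread.

```python
def bounce_nac_commands(interface, tx_period=60, max_reauth_req=3,
                        template="PORT-AUTH-TEMPLATE"):
    """Build the remove / bounce / re-add / bounce sequence for one port."""
    port_cfg = [
        f"dot1x timeout tx-period {tx_period}",
        f"dot1x max-reauth-req {max_reauth_req}",
        f"source template {template}",
    ]
    # Assumption: the commands are entered from global config mode.
    commands = ["configure terminal", f"interface {interface}"]
    commands += [f"no {line}" for line in port_cfg]  # remove the NAC settings
    commands += ["shutdown", "no shutdown"]          # bounce the port
    commands += port_cfg                             # re-add the NAC settings
    commands += ["shutdown", "no shutdown"]          # bounce the port again
    commands.append("end")
    return commands


for line in bounce_nac_commands("GigabitEthernet1/0/18"):
    print(line)
```

The output is just text to review before pasting or pushing it to a switch; nothing here connects to a device.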
Just to point out that if a switchport starts behaving like this, it doesn't matter who logs into the network on these ports, they will not get network access. This includes printers and phones (MAB).
Hopefully an IOS upgrade will resolve this.
Thanks
Anthony.
07-26-2022 05:52 AM
Just to let you know that the IOS upgrade from 15.2(7)E3 to 15.2(7)E5 on the Cisco 2960X switches worked. We haven't encountered this issue since.
Thanks for all your help. Much appreciated.