In this article we will go through a quick example of how Cisco’s Embedded Event Manager (EEM) in Catalyst 9800 Wireless Controllers allows scripting of actions and functions, in this case to activate a PSK based backup wireless network if the RADIUS server for our main 802.1X based SSID fails. This is similar to the critical VLAN feature (also referred to as “inaccessible authentication bypass”) for wired 802.1X networks, but now on wireless. Although such a feature does not natively exist over Wi-Fi, EEM on C9800 platforms allows us to implement a customized version of it.
EEM is a software feature supported in Cisco IOS, IOS-XR, NX-OS and IOS-XE platforms that can match events and trigger actions. An event could be an IP SLA result or a console log message, for example, and actions could be sequences of command lines. Through EEM we could for instance detect if a specific interface goes administratively down, in case of a manual configuration error for example, by monitoring console log messages such as “GigabitEthernet1/0/45, changed state to administratively down”. Through the same script, we can then launch a series of configuration commands to bring that interface back, like “enable”, “conf t”, “interface gi 1/0/45” and “no shut”: these in the end are the same commands that we would manually enter for the same purpose, but now automated by our EEM script. Such a script would look similar to the following in IOS:
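As a minimal sketch, such an applet could be written as shown below. The applet name Interface_Shutdown is a hypothetical choice, and the exact syslog text to match may differ between platforms and software releases, so it should be checked against the actual console output.

```
! Hypothetical applet name; the pattern is matched against syslog messages
event manager applet Interface_Shutdown
 event syslog pattern "GigabitEthernet1/0/45, changed state to administratively down"
 ! Same commands we would enter manually to bring the interface back up
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 action 3.0 cli command "interface GigabitEthernet1/0/45"
 action 4.0 cli command "no shutdown"
 action 5.0 cli command "end"
```

Each action line carries a label (1.0, 2.0, and so on) that determines the order of execution, and each “cli command” action runs exactly one command line.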
The newly announced Catalyst 9800 wireless controllers run on IOS-XE and support EEM, which allows customization for even more use cases. Some wireless deployments need endpoints and users to always have network services, such as sites where employees mainly connect over Wi-Fi and where Ethernet ports may not be available for everyone. These enterprise WLANs are often configured for 802.1X authentication, hence they rely on RADIUS servers in the back end. Wi-Fi networks configured for 802.1X do not allow clients to connect through any other method, such as Pre-Shared Key (PSK) or open mode. If RADIUS services become unavailable, already connected clients can stay connected until the next 802.1X authentication (triggered by a timeout, a new association, a disconnection, etc.), but that next 802.1X authentication will fail. Newly connecting clients will of course fail authentication as well. For 802.1X on wired networks, switches can offer backup services, such as automatically assigning clients to a critical VLAN if RADIUS servers are unavailable. In wireless networks, however, clients cannot associate to an 802.1X enabled SSID without going through 802.1X. This means that, if RADIUS services stay unavailable for long enough, clients that normally connect to an 802.1X enabled SSID will likely lose network access.
Thanks to EEM on C9800 wireless controllers we can detect when RADIUS services become unavailable and, if so, automatically enable a pre-configured “backup” SSID secured with a PSK, through which clients can keep connecting and getting network services. The backup SSID is pre-configured for ease of use, so that the EEM script does not have to run all the commands to create it each time. It should also be based on PSK, precisely because it serves as a backup when RADIUS services are not available. Another detail to note is that we may not want to configure such a backup/critical SSID with the same name as our standard 802.1X enabled one. Clients may get “confused” about which SSID configuration to pick from their configured list, the 802.1X or the PSK enabled one, even if just one of the two is available at any given time. We would rather pre-configure our clients with two different SSIDs, the standard 802.1X one and the backup PSK one, and they will keep probing for both of these SSIDs all the time. On the C9800 side, however, only one of these two SSIDs will be active at any given time, so clients do not risk “jumping” between wireless networks. We explicitly referred to pre-configuring clients with the 802.1X based SSID and the PSK based one, because such options are available for the Windows native wireless supplicant through Group Policy Objects (GPOs) in Active Directory, or even by distributing common configuration profiles for the Network Access Module (NAM) of Cisco AnyConnect.
To summarize, the main ingredients for our EEM use case are the following:
Our standard 802.1X based SSID, which we will call “Enterprise_SSID” in the next configuration examples, already configured and enabled in the C9800.
Our backup PSK based SSID, which we will call “Backup_SSID” in the next configuration examples, already configured and disabled in the C9800.
Clients pre-configured to automatically connect to both the Enterprise_SSID and the Backup_SSID when available.
Options to monitor RADIUS services and detect when they are not available anymore.
After having configured it through the graphical user interface (GUI), the Enterprise_SSID settings should look similar to the following in the command line interface (CLI) of the C9800 (where ‘1’ is its associated index, in our specific example):
wlan Enterprise_SSID 1 Enterprise_SSID
 …
 no shutdown
The Backup_SSID should not have the “no shutdown” command explicitly displayed in the CLI, because the shutdown status is the default setting. An example of a sequence of commands that we would manually type to enable the Backup_SSID and disable the Enterprise_SSID is the following:
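As a sketch, assuming the two WLAN profile names used in this article, the manual sequence could look like this:

```
enable
configure terminal
 ! Bring the pre-configured PSK backup WLAN up
 wlan Backup_SSID
  no shutdown
 ! Take the 802.1X WLAN down
 wlan Enterprise_SSID
  shutdown
 end
```

These are the same command lines that the EEM applet will later issue on our behalf, one per action.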
To monitor the RADIUS server and detect when it is not available anymore, the following options can be configured:
! Needed to enable RADIUS server testing capabilities
aaa new-model
! Based on the detection "speed" and needs, you may want to modify these values
radius-server dead-criteria time 20 tries 3
! Deadtime is also needed to keep a RADIUS server in the "dead" state for some time and not to risk redeclaring it back "alive" too early
radius-server deadtime 1
! Replace <YOUR_RADIUS_SERVER> with the name of your RADIUS server
radius server <YOUR_RADIUS_SERVER>
 automate-tester username RADIUS_AUTOMATE_TESTER idle-time 1
We should note here that testing a RADIUS server in this way, with a user called “RADIUS_AUTOMATE_TESTER” in the example, may by default result in an authentication failure on the RADIUS server side, depending on its configuration. That should not be a concern, because the purpose of this test user is not to authenticate successfully, but simply to get a response from the RADIUS server, so as to determine that it is still available and not “dead”. If these tests fail and the RADIUS server does not reply, in the IOS-XE console we should start seeing log messages similar to the following:
%RADIUS-4-RADIUS_DEAD: RADIUS server 10.150.20.220:1812,1813 is not responding
If a RADIUS server becomes available again and is marked as alive, the corresponding log message in the console should look like the following:
%RADIUS-4-RADIUS_ALIVE: RADIUS server 10.150.20.220:1812,1813 is being marked alive
With all these settings in place we are now ready to build our EEM script, which should behave as follows:
The script looks for log messages containing “%RADIUS-4-RADIUS_DEAD”.
If such a message is detected, the Backup_SSID is enabled.
The script then pauses for 30 seconds, to allow clients already connected to the Enterprise_SSID to stay connected, while also giving them time to start detecting the Backup_SSID. You may of course want to modify such a timer at your convenience, or even not configure it at all.
After the pause, the Enterprise_SSID is disabled, so that only one of the two SSIDs stays active.
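The steps above can be sketched as the following applet. The applet name RADIUS_DEAD is a hypothetical choice. The last two WLAN commands disable the Enterprise_SSID once the pause has elapsed, in line with the goal of having only one of the two SSIDs active at any given time. The “maxrun 60” value is also an assumption: EEM applets are limited to 20 seconds of run time by default, so a 30 second wait needs a higher limit.

```
! Hypothetical applet name; maxrun raised above the 20 second default
! so that the 30 second wait below does not abort the applet
event manager applet RADIUS_DEAD
 event syslog pattern "%RADIUS-4-RADIUS_DEAD" maxrun 60
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 ! Enable the PSK backup WLAN
 action 3.0 cli command "wlan Backup_SSID"
 action 4.0 cli command "no shutdown"
 ! Grace period for clients to start detecting the Backup_SSID
 action 5.0 wait 30
 ! Disable the 802.1X WLAN
 action 6.0 cli command "wlan Enterprise_SSID"
 action 7.0 cli command "shutdown"
 action 8.0 cli command "end"
```

If the wait timer is changed, the maxrun value should be kept comfortably above it; if the wait is removed altogether, the default run time limit is likely sufficient.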
As a further step, we should also configure another EEM script, which would reactivate the Enterprise_SSID when the RADIUS server becomes available again, and put the Backup_SSID back in the disabled state. This second script should be similar to the following:
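A minimal sketch of this second applet, mirroring the first one, could be the following. The applet name RADIUS_ALIVE is hypothetical, and no pause is included here, although one could be added in the same way as for the failover case.

```
! Hypothetical applet name; triggers when a RADIUS server is marked alive again
event manager applet RADIUS_ALIVE
 event syslog pattern "%RADIUS-4-RADIUS_ALIVE"
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 ! Re-enable the 802.1X WLAN
 action 3.0 cli command "wlan Enterprise_SSID"
 action 4.0 cli command "no shutdown"
 ! Put the PSK backup WLAN back in the disabled state
 action 5.0 cli command "wlan Backup_SSID"
 action 6.0 cli command "shutdown"
 action 7.0 cli command "end"
```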
All these examples can of course be customized even further to fit some more specific needs. For instance, matching just the log “%RADIUS-4-RADIUS_DEAD” would mean that, if the C9800 has multiple RADIUS servers configured and just one fails, the Backup_SSID might get enabled even if other RADIUS servers are still available. In such a case, we could rather match the log “%RADIUS-4-RADIUS_DEAD: RADIUS server 10.150.20.220:1812,1813 is not responding”, where 10.150.20.220 should be the last RADIUS server used by the C9800 in a sequence of multiple ones (meaning that all the others have already failed before). If the RADIUS servers are behind a load balancer, however, we would need to configure just the load balancer’s IP as the only RADIUS server for the C9800.
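As a sketch of this more specific match, only the event line of the failover applet would change. The applet name RADIUS_DEAD and the “maxrun 60” value are assumptions carried over for illustration, and 10.150.20.220 stands for the last RADIUS server in the sequence.

```
event manager applet RADIUS_DEAD
 ! Match only the "dead" message of the last RADIUS server in the sequence
 event syslog pattern "%RADIUS-4-RADIUS_DEAD: RADIUS server 10.150.20.220:1812,1813 is not responding" maxrun 60
```

The pattern is evaluated as a regular expression, so the dots in the IP address match any character; for a message this specific that is normally harmless.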
We hope that such a brief configuration example could provide you with some ideas on how to implement this scenario for a backup SSID, as well as on how to take advantage of the EEM features on the C9800 wireless controllers for more use cases to come. Please do not hesitate to post your comments/questions or provide additional feedback on this content.