If you saw my recent blog in InfoWorld, you will have seen the rationale for on-box Python and some of the use cases. This blog series dives deeper into the Python code for the example scripts I published.
You might recall I mentioned three use cases for on-box Python. They are:
- Scale: as the device can process information locally, reducing latency and volume of operational data sent back to a central management station.
- Security: as the device can run its own scripts without requiring external logins from "management" accounts.
- Autonomy: as the device can make decisions when disconnected from a central management station.
Today I want to dive a little deeper into an example application for use case one, which is scale. We will take a look at the script and, more importantly, how that script gets executed on the device.
Why Scale?
If I have a “sanity” script that I need to run regularly and it takes six seconds per device, then running it serially across 1,000 devices takes 6,000 seconds, or one hour and 40 minutes. I could run the checks in parallel, but that still consumes resources on my management station to process the data. It also potentially transports lots of data back to the management station, only to discard much of it. An alternative is to distribute the work to the devices and get them to provide an update when they’re done.
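The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are mine, and the 20-worker figure is a made-up number to show what parallelism buys on the management station.

```python
import math


def sweep_seconds(devices, seconds_per_device, workers=1):
    """Wall-clock seconds for a central station to visit every device."""
    # Devices are handled in batches of `workers` at a time
    batches = int(math.ceil(devices / float(workers)))
    return batches * seconds_per_device


def as_hours_minutes(seconds):
    """Split a number of seconds into whole hours and whole minutes."""
    hours, remainder = divmod(seconds, 3600)
    return hours, remainder // 60


serial = sweep_seconds(1000, 6)                # one device at a time
parallel = sweep_seconds(1000, 6, workers=20)  # hypothetical 20 workers

print(as_hours_minutes(serial))    # (1, 40)
print(as_hours_minutes(parallel))  # (0, 5)
```

Even in the parallel case the management station still has to collect and process every device's output, which is the cost the on-box approach avoids.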
There are lots of examples of simple use cases for distributed processing on the device.
Scale Use Case: PCI Compliance
Here’s an example of a use case that I think you’ll find interesting. One of our customers had a PCI requirement to ensure that any switch ports that were unused for more than seven days were disabled. (This was to prevent people from plugging in unauthorized devices.)
The script looks at all interfaces on the switch, and those that have been inactive (no traffic sent or received) for more than seven days are shut down. All interfaces that were shut down are logged in a Cisco Spark room. The interface description is updated with a message indicating the time and date it was shut down by the PCI-check application.
How does it work?
The code is published at https://github.com/aradford123/on-box-python.git. I will leave the details of how to install it to another post.
The first thing I need to do is collect the current state of the interfaces. I am going to take advantage of the inbuilt "cli" module in on-box Python to do this. Here is a quick example from the Python REPL (Read, Evaluate, Print, Loop).
[guestshell@guestshell ~]$ python
Python 2.7.5 (default, Jun 17 2014, 18:11:42)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from cli import cli
>>> output = cli("show int | inc Last in|line pro")
>>> print (output)
Vlan1 is administratively down, line protocol is down
Last input never, output never, output hang never
GigabitEthernet0/0 is up, line protocol is up
Last input 00:00:01, output 00:00:00, output hang never
TwoGigabitEthernet1/0/1 is administratively down, line protocol is down (disabled)
Last input never, output never, output hang never
TwoGigabitEthernet1/0/2 is down, line protocol is down (notconnect)
Last input never, output never, output hang never
<SNIP>
This tells me that interface GigabitEthernet0/0 should not be shut down, interface TwoGigabitEthernet1/0/1 is already PCI disabled, and interface TwoGigabitEthernet1/0/2 should be PCI shut down.
The next step is to parse the output. I could look to use NETCONF/YANG, except that last input/output statistics are not currently returned as operational data. Instead I am going to use TextFSM to parse this using regular expressions. I will not go into the details of TextFSM here, but here is the definition in "show_int.textfsm".
Value Interface (\S*)
Value Admin (\S*)
Value Input (\S*)
Value Output (\S*)

Start
  ^${Interface} is ${Admin}
  ^\s*Last input ${Input}, output ${Output}, -> Record
To use this I need to load it, then apply it to the output of the CLI command.
import textfsm

template = open("/flash/gs_script/src/pci-tool/show_int.textfsm")
re_table = textfsm.TextFSM(template)
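To make the shape of the parsed records concrete, here is a minimal sketch that mimics the two template rules with the standard `re` module instead of TextFSM. It is not how the published script works; `parse_show_int` and the trimmed `SAMPLE` text are stand-ins for illustration, and they let you try the patterns on a laptop without TextFSM installed.

```python
import re

# The same two patterns as show_int.textfsm, written as plain regular expressions
INTERFACE_RE = re.compile(r"^(\S*) is (\S*)")
LAST_IO_RE = re.compile(r"^\s*Last input (\S*), output (\S*),")

# Trimmed copy of the CLI output shown earlier (illustrative only)
SAMPLE = """\
Vlan1 is administratively down, line protocol is down
  Last input never, output never, output hang never
GigabitEthernet0/0 is up, line protocol is up
  Last input 00:00:01, output 00:00:00, output hang never
TwoGigabitEthernet1/0/2 is down, line protocol is down (notconnect)
  Last input never, output never, output hang never
"""


def parse_show_int(text):
    """Return one [interface, admin, last_input, last_output] row per interface."""
    rows = []
    name = admin = None
    for line in text.splitlines():
        match = INTERFACE_RE.match(line)
        if match:
            # Remember the interface until its "Last input" line arrives
            name, admin = match.groups()
            continue
        match = LAST_IO_RE.match(line)
        if match and name is not None:
            rows.append([name, admin, match.group(1), match.group(2)])
            name = admin = None  # record emitted, start afresh
    return rows


for row in parse_show_int(SAMPLE):
    print(row)
```

Note that the second field comes back as "administratively", "up," or "down," (with the trailing comma, because `\S*` stops only at whitespace). That is fine here, since only the value "administratively" is tested later.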
I then apply the FSM to the output of the command, which returns a list of records, one per interface. I skip non-Ethernet interfaces and those that are already shut down, and if the "apply_change" flag is set (i.e. not in test mode), I add the relevant commands to a list to execute.
fsm_results = re_table.ParseText(output)
for interface in fsm_results:
    # skip non Ethernet
    if is_idle(interface[2], interface[3]) and ("Ethernet" in interface[0]):
        if interface[1] != "administratively":
            if apply_change:
                exec_commands.extend(['interface ' + interface[0], description, 'shutdown'])
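The `is_idle` helper is not shown in this post, so here is one way it could be written. Treat it as a hypothetical sketch rather than the code in the repository: I am assuming the timers appear as "never", as hh:mm:ss, or in the compact forms such as 1d02h, 2w3d and 1y10w, and I am treating "never" as idle, which matches how TwoGigabitEthernet1/0/2 was read above.

```python
import re

IDLE_LIMIT_SECONDS = 7 * 24 * 3600  # seven days, per the PCI requirement

# Seconds per unit letter in the compact timer forms (assumed formats)
UNIT_SECONDS = {"y": 365 * 86400, "w": 7 * 86400, "d": 86400, "h": 3600}


def idle_seconds(timer):
    """Convert a 'Last input/output' timer to seconds, or None for 'never'."""
    if timer == "never":
        return None
    # hh:mm:ss form, e.g. 00:00:01
    clock = re.match(r"^(\d+):(\d+):(\d+)$", timer)
    if clock:
        hours, minutes, seconds = (int(group) for group in clock.groups())
        return hours * 3600 + minutes * 60 + seconds
    # Compact form, e.g. 1d02h, 2w3d, 1y10w
    if re.match(r"^(?:\d+[ywdh])+$", timer):
        parts = re.findall(r"(\d+)([ywdh])", timer)
        return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)
    raise ValueError("unrecognised timer: " + timer)


def is_idle(last_input, last_output, limit=IDLE_LIMIT_SECONDS):
    """True when neither direction has seen traffic within the limit."""
    for timer in (last_input, last_output):
        seconds = idle_seconds(timer)
        if seconds is not None and seconds <= limit:
            return False  # recent traffic in at least one direction
    return True
```

With these assumptions, `is_idle("00:00:01", "00:00:00")` is False and `is_idle("never", "never")` is True. An unrecognised timer raises an error instead of being guessed at, so an unexpected format cannot silently shut a port.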
I can then apply the list of commands to the device. This time I use the "configure" function from the cli module. This will execute a series of configuration commands and give me the output from them.
I can log the output to local syslog, and to Spark.
def apply_commands(commands):
    response = configure(commands)
    for r in response:
        log(r.__str__(), 5)
    if len(response) > 1:
        spark('\n'.join([r.__str__() for r in response]))
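Before handing a list to `configure`, it can be handy to inspect it off-box, where the cli module is not available. The sketch below builds the command list from already-parsed rows and simply returns it, much like running without the apply flag. `build_commands`, the wording of the description line and the simplified idle test are all hypothetical; the published script has its own versions.

```python
import time


def build_commands(rows, is_idle, now=None):
    """Return config lines that shut down idle, still-enabled Ethernet ports."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    # Hypothetical description text; the real script's wording may differ
    description = "description PCI-check shutdown " + stamp
    commands = []
    for name, admin, last_in, last_out in rows:
        if "Ethernet" not in name:
            continue  # skip VLAN and other non-Ethernet interfaces
        if admin == "administratively":
            continue  # already shut down
        if is_idle(last_in, last_out):
            commands.extend(["interface " + name, description, "shutdown"])
    return commands


# Rows in the shape produced by the TextFSM template
rows = [
    ["Vlan1", "administratively", "never", "never"],
    ["GigabitEthernet0/0", "up,", "00:00:01", "00:00:00"],
    ["TwoGigabitEthernet1/0/1", "administratively", "never", "never"],
    ["TwoGigabitEthernet1/0/2", "down,", "never", "never"],
]


def never_used(last_in, last_out):
    """Simplified idle test for this demo only."""
    return last_in == "never" and last_out == "never"


for line in build_commands(rows, never_used):
    print(line)
```

Only TwoGigabitEthernet1/0/2 ends up in the list, which matches the reading of the CLI output earlier in the post.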
The logging to Spark is optional. It requires an environment variable, "SPARKTOKEN" (my authentication token), to be set. This approach keeps sensitive information like my Spark token out of the code.
def spark(message):
    '''
    If there is a spark token in the environment, send a message to Cisco Spark
    :param message:
    :return:
    '''
    sparktoken = os.environ.get("SPARKTOKEN")
    if sparktoken is not None:
        roomId = getRoomId(SPARKROOM, sparktoken)
        postMessage('\n```\n' + message +'\n```', roomId, sparktoken)
To set the SPARKTOKEN variable, I could add the following line to my ~/.bashrc file in the guest shell.
export SPARKTOKEN="Bearer ZXXXXXXXXXXXXXXXXXX"
How to run it?
To run the script, we’ll use the Embedded Event Manager. EEM is a really powerful piece of IOS infrastructure that can be used to schedule the Python script to run. The EEM cron job runs the Python script at 15 minutes past the hour, Monday to Friday.
event manager applet PCI-check
event timer cron cron-entry "15 * * * 1-5"
action 1.0 cli command "enable"
action 1.1 cli command "guestshell run python bootflash:gs_script/src/pci-tool/pci_check.py --apply"
This script can now be run hourly (instead of weekly). It’s a great example of scaling using on-box Python.
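If the cron entry "15 * * * 1-5" is unfamiliar, this small sketch lists the next few times it would fire. It handles only this one schedule (minute 15, every hour, Monday to Friday) and is not a general cron parser; `next_runs` is a name I made up for the illustration.

```python
from datetime import datetime, timedelta


def next_runs(start, count=3):
    """Next `count` times matching minute 15, any hour, Monday to Friday."""
    runs = []
    # Begin at the first whole minute after `start`
    moment = start.replace(second=0, microsecond=0) + timedelta(minutes=1)
    while len(runs) < count:
        # datetime.weekday(): Monday is 0 and Friday is 4
        if moment.minute == 15 and moment.weekday() < 5:
            runs.append(moment)
        moment += timedelta(minutes=1)
    return runs


# Friday 5 January 2024, 23:20: the weekend is skipped entirely
for run in next_runs(datetime(2024, 1, 5, 23, 20)):
    print(run.strftime("%a %Y-%m-%d %H:%M"))
```

Starting late on a Friday, the first run lands at 00:15 on the following Monday, then hourly after that.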
Conclusion
This blog post looked at a simple scale use case for running Python scripts on-box in IOS-XE. It is a deeper dive from my earlier InfoWorld article, where I highlighted three categories of use case, namely:
- Scale: due to distributed execution.
- Security: as “management” accounts aren’t required to “log in” to a device to perform tasks.
- Autonomy: as the device can react to losing network connectivity.
The Embedded Event Manager is a powerful IOS tool that aids the execution of the Python scripts. This use case takes advantage of the “Cron” or pre-scheduled mechanism for running a script. This is for scripts that need to run at a certain time of day, hour, week, etc. Other EEM mechanisms include countdown timers and event-driven triggers.
My next blog will cover the Security use case with a neat little DNS_update script to dynamically update an ACL based on DNS lookups. It will use the "EEM Countdown" method to fire the script.
You can learn more about on-box Python and Guestshell on the Cisco DevNet Python Network Automation site.