on 12-19-2012 09:42 AM
A SMU is a software maintenance update, or simply put a patch, that can be loaded on the XR device you are running. The concept of a SMU applies to all XR devices, although this article focuses on the ASR9000 primarily.
When the system runs into a software deficiency (a.k.a. a bug), Cisco can provide a patch for that particular problem so that you can keep running your base release but be free of the problem at hand. This is a substantial difference from classic IOS, which has no capability to apply a single fix to a single component on top of the running base release.
A SMU is a patch that is provided on a per-release and per-component basis and is specific to the platform. This means that a SMU can’t be transported onto different hardware (e.g. a SMU delivered for CRS can’t be applied to ASR9K) and is also particular to the release that you are running. For instance, a SMU provided for XR release 4.2.1 can’t be applied to a system running XR 4.2.3.
SMU’s are “PIE” files (package installation envelopes), similar to feature PIE’s such as MGBL, MPLS and multicast, and they are installed in a similar fashion.
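To make the platform and release binding concrete, here is a minimal Python sketch that reads both out of a SMU file name and checks them against a target system. The file-name pattern is an assumption inferred only from the example name used in this article (asr9k-p-4.0.3.CSCea12345.pie), and the names SmuFile, parse_smu_filename and smu_applies are hypothetical helpers, not part of any Cisco tool.

```python
import re
from typing import NamedTuple, Optional


class SmuFile(NamedTuple):
    platform: str  # e.g. "asr9k"
    release: str   # e.g. "4.0.3"
    ddts: str      # e.g. "CSCea12345"


# Pattern inferred from the single example file name in this article (assumption).
_SMU_NAME = re.compile(
    r"^(?P<platform>[a-z0-9]+)-.*?"
    r"(?P<release>\d+\.\d+\.\d+)\."
    r"(?P<ddts>CSC[a-z]{2}\d{5})\.pie$"
)


def parse_smu_filename(name: str) -> Optional[SmuFile]:
    """Split a SMU file name into platform, release and DDTS id."""
    match = _SMU_NAME.match(name)
    if match is None:
        return None
    return SmuFile(match["platform"], match["release"], match["ddts"])


def smu_applies(name: str, platform: str, release: str) -> bool:
    """A SMU only applies when both platform and release match exactly."""
    smu = parse_smu_filename(name)
    return smu is not None and smu.platform == platform and smu.release == release
```

A check like this only catches obvious mismatches before you copy a file to the router; the install process on the device remains the authority on whether a package is accepted.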
The 3 operation steps to apply a SMU are: install add, install activate and install commit.
In this example we’re going to install a dummy OSPF SMU of the process restart type.
RP/0/RSP0/CPU0:A9K-TOP#admin install add tftp://3.0.0.1/xthuijs/asr9k-p-4.0.3.CSCea12345.pie
Wed Dec 19 12:05:38.639 EDT
Install operation 82 '(admin) install add /tftp://3.0.0.1/xthuijs/asr9k-p-4.0.3.CSCea12345.pie' started by user 'root' via CLI at 12:05:38 EDT Wed Dec 19 2012.
RP/0/RSP0/CPU0:Dec 19 12:05:38.820 : instdir[206]: %INSTALL-INSTMGR-6-INSTALL_OPERATION_STARTED : Install operation 82 '(admin) install add /tftp://3.0.0.1/xthuijs/asr9k-p-4.0.3.CSCea12345.pie' started by user 'root'
The install operation will continue asynchronously.
Couple of notes to that:
Now that the SMU is loaded on the system, after you see this message:
RP/0/RSP0/CPU0:A9K-TOP#Info: The following package is now available to be activated:
Info:
Info: disk0:asr9k-p-4.0.3.CSCea12345-1.0.0
Info:
Info: The package can be activated across the entire router.
Info:
RP/0/RSP0/CPU0:Dec 19 12:05:54.469 : instdir[206]: %INSTALL-INSTMGR-6-INSTALL_OPERATION_COMPLETED_SUCCESSFULLY : Install operation 82 completed successfully
Install operation 82 completed successfully at 12:05:54 EDT Wed Dec 19 2012.
You can verify the ability to activate this smu via the command “admin show install inactive”
RP/0/RSP0/CPU0:A9K-TOP#admin show install inactive
Wed Dec 19 12:08:58.125 EDT
Secure Domain Router: Owner
Node 0/RSP0/CPU0 [RP] [SDR: Owner]
Boot Device: disk0:
Inactive Packages:
disk0:asr9k-p-4.0.3.CSCtr31747-1.0.0
disk0:asr9k-mini-p-4.0.1
disk0:asr9k-k9sec-p-4.0.3
disk0:asr9k-mpls-p-4.0.1
disk0:asr9k-p-4.0.3.CSCea12345-1.0.0
Note that the directory listed is now disk0. On the ASR9K, that is always where the packages that are to be activated and run are stored.
Effectively the “install add” operation copies the file from the source location onto disk0, but it is not yet running in the load path. That is what the next step will take care of.
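If you collect this output with a script, the check for "is my SMU sitting on disk0 and ready to activate" can be sketched as below. This is a minimal Python sketch that only assumes the output layout shown above; inactive_packages is a hypothetical helper name.

```python
from typing import List


def inactive_packages(show_output: str) -> List[str]:
    """Return the package names listed under 'Inactive Packages:'."""
    packages = []
    in_list = False
    for line in show_output.splitlines():
        text = line.strip()
        if text.startswith("Inactive Packages:"):
            in_list = True
            continue
        # Package lines in the sample output all start with the boot device.
        if in_list and text.startswith("disk0:"):
            packages.append(text)
    return packages


sample = """Node 0/RSP0/CPU0 [RP] [SDR: Owner]
Boot Device: disk0:
Inactive Packages:
disk0:asr9k-p-4.0.3.CSCtr31747-1.0.0
disk0:asr9k-mini-p-4.0.1
disk0:asr9k-p-4.0.3.CSCea12345-1.0.0
"""
ready = "disk0:asr9k-p-4.0.3.CSCea12345-1.0.0" in inactive_packages(sample)
```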
If you don’t know the precise package, you can always use this command:
RP/0/RSP0/CPU0:A9K-TOP#admin install activate ?
disk0:asr9k-cpp-4.0.1 Package to activate
disk0:asr9k-cpp-4.0.3.CSCtr31747-1.0.0 Package to activate
disk0:asr9k-k9sec-p-4.0.3 Package to activate
disk0:asr9k-mini-p-4.0.1 Package to activate
disk0:asr9k-mpls-p-4.0.1 Package to activate
disk0:asr9k-p-4.0.3.CSCea12345-1.0.0 Package to activate
disk0:asr9k-p-4.0.3.CSCtr31747-1.0.0 Package to activate
disk0:iosxr-diags-4.0.1 Package to activate
disk0:iosxr-fwding-4.0.3.CSCtr31747-1.0.0 Package to activate
disk0:iosxr-routing-4.0.3.CSCea12345-1.0.0 Package to activate
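One thing to highlight: the line disk0:asr9k-p-4.0.3.CSCea12345-1.0.0 (shown in red in the original) is the SMU that we just added. But now a second line, disk0:iosxr-routing-4.0.3.CSCea12345-1.0.0 (shown in green in the original), appears as well.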
Basically this means that the SMU we uploaded apparently is going to patch something in the “routing” component of the system, which is what OSPF falls under, since this is an OSPF demo smu.
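The same observation can be made mechanically by grouping the package names by the DDTS id they carry. The sketch below is an illustration in Python; the regular expression is an assumption based on the names in the listing above, and packages_by_ddts is a hypothetical helper.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def packages_by_ddts(package_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group package names by the CSC id embedded in them."""
    groups = defaultdict(list)
    for name in package_names:
        match = re.search(r"\.(CSC[a-z]{2}\d{5})-", name)
        if match:  # base and feature packages carry no DDTS id and are skipped
            groups[match.group(1)].append(name)
    return dict(groups)


listing = [
    "disk0:asr9k-cpp-4.0.3.CSCtr31747-1.0.0",
    "disk0:asr9k-mini-p-4.0.1",
    "disk0:asr9k-p-4.0.3.CSCea12345-1.0.0",
    "disk0:asr9k-p-4.0.3.CSCtr31747-1.0.0",
    "disk0:iosxr-fwding-4.0.3.CSCtr31747-1.0.0",
    "disk0:iosxr-routing-4.0.3.CSCea12345-1.0.0",
]
groups = packages_by_ddts(listing)
```

For the demo SMU this yields the top-level asr9k-p package plus the iosxr-routing package, which is the component it patches.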
RP/0/RSP0/CPU0:A9K-TOP#admin install activate disk0:asr9k-p-4.0.3.CSCea12345-1.0.0
Wed Dec 19 12:14:03.083 EDT
Install operation 83 '(admin) install activate disk0:asr9k-p-4.0.3.CSCea12345-1.0.0' started by user 'root' via CLI at 12:14:03 EDT Wed Dec 19 2012.
RP/0/RSP0/CPU0:Dec 19 12:14:03.288 : instdir[206]: %INSTALL-INSTMGR-6-INSTALL_OPERATION_STARTED : Install operation 83 '(admin) install activate disk0:asr9k-p-4.0.3.CSCea12345-1.0.0' started by user 'root'
Info: Install Method: Parallel Process Restart
The install operation will continue asynchronously.
RP/0/RSP0/CPU0:A9K-TOP#
After that completes, messages similar to the ones below appear:
LC/0/0/CPU0:Dec 19 12:14:38.365 : sysmgr[87]: %OS-SYSMGR-7-INSTALL_NOTIFICATION : notification of software installation received
LC/0/3/CPU0:Dec 19 12:14:38.370 : sysmgr[87]: %OS-SYSMGR-7-INSTALL_NOTIFICATION : notification of software installation received
LC/0/0/CPU0:Dec 19 12:14:38.381 : sysmgr[87]: %OS-SYSMGR-7-INSTALL_FINISHED : software installation is finished
LC/0/3/CPU0:Dec 19 12:14:38.385 : sysmgr[87]: %OS-SYSMGR-7-INSTALL_FINISHED : software installation is finished
LC/0/6/CPU0:Dec 19 12:14:38.529 : sysmgr[90]: %OS-SYSMGR-7-INSTALL_NOTIFICATION : notification of software installation received
LC/0/6/CPU0:Dec 19 12:14:38.546 : sysmgr[90]: %OS-SYSMGR-7-INSTALL_FINISHED : software installation is finished
RP/0/RSP0/CPU0:Dec 19 12:14:53.145 : sysmgr[95]: %OS-SYSMGR-7-INSTALL_NOTIFICATION : notification of software installation received
RP/0/RSP0/CPU0:Dec 19 12:14:53.184 : sysmgr[95]: %OS-SYSMGR-7-INSTALL_FINISHED : software installation is finished
Info: The changes made to software configurations will not be persistent across system reloads. Use the command
Info: '(admin) install commit' to make changes persistent.
Info: Please verify that the system is consistent following the software change using the following commands:
Info: show system verify
Info: install verify packages
RP/0/RSP0/CPU0:Dec 19 12:15:04.165 : instdir[206]: %INSTALL-INSTMGR-4-ACTIVE_SOFTWARE_COMMITTED_INFO : The currently active software is not committed. If the system reboots then the committed software will be used. Use 'install commit' to commit the active software.
RP/0/RSP0/CPU0:Dec 19 12:15:04.166 : instdir[206]: %INSTALL-INSTMGR-6-INSTALL_OPERATION_COMPLETED_SUCCESSFULLY : Install operation 83 completed successfully
Install operation 83 completed successfully at 12:15:04 EDT Wed Dec 19 2012.
Take note of the highlighted lines in the output above. The one stating that the changes are not persistent until you run “install commit” we’ll discuss in step 3.
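Because the install runs asynchronously, a script that drives it has to watch the log for the completion message. The following is a minimal Python sketch that assumes only the message wording shown in the output above; install_status is a hypothetical helper, and real logs may contain other outcomes (for example failures) that it does not handle.

```python
import re
from typing import Dict


def install_status(log_text: str, operation_id: int) -> Dict[str, bool]:
    """Report whether an install operation finished and still needs a commit."""
    # Collapse terminal line wraps so a phrase split across lines still matches.
    flat = " ".join(log_text.split())
    completed = re.search(
        rf"Install operation {operation_id} completed successfully", flat
    ) is not None
    # The router reminds you to run 'install commit' while changes are uncommitted.
    needs_commit = "install commit" in flat
    return {"completed": completed, "needs_commit": needs_commit}


log = """Info: The changes made to software configurations will not be persistent across system reloads. Use the command
Info: '(admin) install commit' to make changes persistent.
Install operation 83 completed successfully at 12:15:04 EDT Wed Dec 19 2012."""
status = install_status(log, 83)
```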
Now the SMU is active. This change happened inline: the system continued to function as normal and, even though OSPF was restarted, the neighbors were not lost:
RP/0/RSP0/CPU0:A9K-TOP#show ospf ne
Wed Dec 19 12:16:29.892 EDT
* Indicates MADJ interface
Neighbors for OSPF CORE
Neighbor ID Pri State Dead Time Address Interface
2.2.2.2 1 FULL/DR 00:00:36 200.200.1.2 Bundle-Ether200
Neighbor is up for 7w1d
We should be able to see the effects of this demo smu now:
RP/0/RSP0/CPU0:A9K-TOP#show ospf
Wed Dec 19 12:17:09.059 EDT
DEMO SMU: If you see this line the SMU was installed correctly
Routing Process "ospf 111" with ID 12.12.1.2
NSR (Non-stop routing) is Disabled
Supports only single TOS(TOS0) routes
Supports opaque LSA
Router is not originating router-LSAs with maximum metric
Looking at the “DEMO SMU” line in the output, I say our exercise succeeded.
If the chassis were to reload right now, the SMU would be lost from the running software, which means that our fix would disappear. To make the application of this SMU persistent across reloads, you need to commit the SMU via this command:
admin install commit
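Putting the three steps together, the command sequence for a given SMU can be sketched as follows. This is a minimal Python sketch: the commands are the ones used in this article, while the "-1.0.0" suffix on the package name is an assumption taken from the example output (check “admin show install inactive” for the real name), and smu_install_commands is a hypothetical helper.

```python
from typing import List


def smu_install_commands(source_url: str, device: str = "disk0:") -> List[str]:
    """Build the add / activate / commit sequence for one SMU pie file."""
    filename = source_url.rsplit("/", 1)[-1]
    if not filename.endswith(".pie"):
        raise ValueError(f"not a pie file: {filename}")
    # Assumption: the package name is the file name without '.pie' plus '-1.0.0',
    # as seen in the example output of this article.
    package = f"{device}{filename[:-len('.pie')]}-1.0.0"
    return [
        f"admin install add {source_url}",
        f"admin install activate {package}",
        "admin install commit",
    ]


commands = smu_install_commands("tftp://3.0.0.1/xthuijs/asr9k-p-4.0.3.CSCea12345.pie")
```

Each step should only be issued after the previous install operation has reported that it completed successfully.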
In theory, every software issue can be converted into a SMU. However we apply a set of “rules” to what can be put in a SMU.
For starters, features or CLI changes are not eligible for a SMU. To leverage a new functionality, it is advised to upgrade to a new release.
Cosmetic issues we generally don’t smu either.
Only mission critical issues, those that do not allow you as a user to operate the device under “normal” circumstances, are to be SMU’d.
On occasion when we uncover a critical issue we might create a SMU on a (set of) release(s), such as when a PSIRT is in order.
If you have an issue that appears to be a software bug and you have a TAC case open and there is a known DDTS associated with the issue, then it is eligible for a SMU. Work with your TAC engineer to see if this particular issue can be put into a SMU and what the timelines are for delivery.
Sometimes it may make sense to consider a sw maintenance upgrade rather than filing an individual SMU request. Your TAC or Advanced Services representative will be able to advise you on the best course of action.
Note that SMU delivery and commitment comes from Cisco engineering. Make sure that your support representative communicates the commitment as set by the Cisco engineering group.
SMU’s are categorized as follows:
Furthermore these SMU’s have a particular “restart type”.
We try to make SMU’s of the process restart kind, as in the example you have seen above, which effectively means that the component this SMU touches is restarted inline without affecting the operation of the device.
However, sometimes SMU’s touch key base components so deep in the OS that a reload of the device is required. Examples of that are changes in the MBI (boot image), kernel or Network Processor ucode.
Sometimes a particular SMU would require too many processes to restart. While this should theoretically be possible, for safety reasons we make the SMU a reload type if the fan-out of process restarts exceeds 10, to maintain system stability.
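The rules above boil down to a small decision function. The Python sketch below is only an illustration of that reasoning: the component labels ("mbi", "kernel", "ucode") and the function name restart_type are hypothetical, and the real classification is made by Cisco engineering when the SMU is built.

```python
from typing import Iterable

# Base components named in the text as requiring a reload (labels are illustrative).
RELOAD_COMPONENTS = {"mbi", "kernel", "ucode"}
# Fan-out threshold from the text: more than 10 process restarts becomes a reload.
MAX_PROCESS_RESTARTS = 10


def restart_type(components: Iterable[str], process_restarts: int) -> str:
    """Classify a SMU as 'reload' or 'process restart'."""
    touched = {name.lower() for name in components}
    if touched & RELOAD_COMPONENTS:
        return "reload"
    if process_restarts > MAX_PROCESS_RESTARTS:
        return "reload"
    return "process restart"
```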
There are 2 key repositories where SMU’s are located. That is either on CCO found in the section of
https://upload.cisco.com/cgi-bin/swc/fileexg/main.cgi?CONTYPES=IOS-XR
Note that SMU’s on file exchange require special permission and you may need to ask for access to download a specific SMU from that location.
*File exchange may disappear in the near future to be “superseded” by a new software delivery platform. We’ll update documentation here as that gets more shape and form.
Sometimes a TAC engineer in association with a development engineer may provide you what is called an engineering SMU for a fix verification to be run temporarily.
An ENG-SMU is really like any other regular SMU, however it is built from a private workspace. This means that if you have other SMU’s running in the same component, their changes may be negated. Likewise, future SMU’s in the same component may not inherit the changes in this particular eng SMU, so those changes will be negated when a new SMU in this component is run.
For that reason Cisco can NOT allow you to run eng-smu’s in a production environment for a prolonged period of time.
Also, the availability of the eng SMU does not imply a commitment that a production SMU will follow. These eng SMU’s are solely for fix verification, to make sure that the intended changes do fix the problem.
We try to minimize the need for such events as much as possible, however if Cisco TAC or Development is not able to reproduce a problem, this may be a potential solution to verify the fix in a real scenario and make sure there is no collateral.
Once Cisco TAC has provided you the official confirmation that a SMU will be provided it can take some time before the final SMU will be posted or when you get it in your hand.
A SMU will undergo various stages before it is released and they are the following:
Only after step 2 can you have the confirmation that a SMU will be provided.
The timeline from step 1 to step 7 can range from 6 to 8 weeks.
Most of the delay we experience is in step 6. This is integration-level testing whereby the SMU is not only tested against the particular issue it fixes, but also in a multidimensional test scenario to make sure there is no collateral damage in other components.
One important concept to understand about SMU’s is that they are committed to a software lineup particular to that release.
This means that if we have a SMU “x” that fixes, let’s say, LSA flooding in OSPF, SMU “x” will basically contain the new OSPF process and libraries. Now when we have a SMU “y” that fixes a crash in OSPF due to the reception of a malformed hello, and “y” was delivered AFTER “x”, then “y” contains the fixes for both issue “x” and issue “y”.
Coming back to the eng-smu discussion earlier: if you were to be given an eng-smu for “y”, the changes from “x” wouldn’t be there. Subsequently, if there were a SMU “z”, also in OSPF, then “z” may not contain the changes of “y” if these changes were not committed to the SMU lineup. So loading “z” would negate the changes applied by “y”.
This story also immediately shows the concept of a SMU “supersede”.
What a supersede means is that if there are 2 SMU’s in the same overlapping component, there is no need to run them both at the same time. In the example above, SMU “y”, being committed to the lineup, inherently carries the changes from “x” already. So by running “y” you don’t need the SMU for “x” anymore.
If you are already running it, you can remove it to save some space. If you want the changes from both X and Y, just running “Y” is good enough.
In some cases a SMU includes 2 components at the same time. For instance, SMU “x” contains a change in OSPF and, let’s say, some library.
If SMU “y” is another OSPF change as in the example above, but has no lib changes, we say that SMU “y” is a partial supersede over “x”. That is, there are some overlapping components but NOT ALL of them.
It supersedes the OSPF changes of “x”, but “y” itself doesn’t include the lib changes, hence in this case they will need to be run together.
The way the SMU is built by Cisco SW tools will include that dependency, so while you are installing “y”, it will mention to you that “x” is needed as well.
In this example above we can also state that “y” has a prerequisite of smu “x”. That means in order to run “y” we need “x” as well.
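The full versus partial supersede distinction can be expressed as a comparison of the component sets that two SMU’s touch. The following Python sketch is a simplified model under two assumptions taken from the text: both SMU’s are committed to the same lineup, and the "newer" one was delivered later. The function name supersede_kind and the component labels are hypothetical.

```python
from typing import Iterable


def supersede_kind(older: Iterable[str], newer: Iterable[str]) -> str:
    """Compare the components of an older SMU with those of a newer one.

    Returns 'full' when the newer SMU covers every component of the older one,
    'partial' when only some components overlap (the older SMU is then a
    prerequisite of the newer one), and 'none' when nothing overlaps.
    """
    older_set, newer_set = set(older), set(newer)
    if not older_set & newer_set:
        return "none"
    return "full" if older_set <= newer_set else "partial"


# "x" and "y" both touch only OSPF: running "y" alone is enough.
full_case = supersede_kind({"ospf"}, {"ospf"})
# "x" touches OSPF and a library, "y" only OSPF: "y" needs "x" as a prerequisite.
partial_case = supersede_kind({"ospf", "lib"}, {"ospf"})
```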
Generally our SW delivery team removes SMU’s from CCO and file exchange that are fully superseded, so you won’t be as confused. But that inherently means that if you were looking for “x”, how do you know that now you need to install “y” to get whatever you need?
Well, we recognize that difficulty and right now there is no other way than asking your AS or TAC rep what you need to do if you want the changes for “x” and you don’t find “x” anywhere. But read on, as we have some software tools for you to use to simplify all this in terms of SMU management.
A combo SMU or pack is a collection of individual DDTS’s included in what we call an “umbrella” DDTS.
This DDTS is a new number which is basically an aggregation of a set of DDTS’s under that umbrella number.
This simplifies the SW delivery model because now you don’t have to install individual SMU’s to get fixes for individual issues; instead there is a single file to download and install to patch everything up.
The terminology of combo, umbrella and pack are used interchangeably but in the end they all refer to the same thing.
The ASR9000 team is making use of SMU packs heavily for what we call platform specific fixes.
Starting with XR 4.2, the ASR9000 team has provided SMU packs for platform specific fixes that we consider to be mission critical. The packs, or what Microsoft would call service packs, are collections of fixes that you will want to pick up in order to maintain stability on the base release you are running.
• How are critical bugs which fall between C-SMUs handled?
- We make an assessment based on the impact, severity and workaround if present; if needed we deliver an individual SMU between SMU packs
• Does the C-SMU supersede all other previous PD SMUs?
- The C-SMU will supersede the previous C-SMU as well as any other SMUs that are released and are a reload type, or have some sort of dependency to be included in the C-SMU
• Will this policy cover all SMUs?
- This policy is for Platform Dependent (PD) fixes at this point. That is PRM and ucode fixes
• Will there still be individual SMUs released?
- Yes, we will still release individual non-reload PI or PD SMUs; the C-SMU process is for PD reload SMUs
• What is the benefit of the SMU Packs and why are we doing that?
- To provide maintenance updates specific to asr9k
- With timed delivery
- To simplify it for the customer and reduce the complexity of maintaining SMU supersedes and pre-requisites
• I have a DDTS that I want for my customer, can this get in the SMU pack?
- It depends. General SMU guidelines apply. If the fix is critical enough it can get included
• What happens if I have a last minute DDTS/SMU requirement?
- Once we enter Dev Test phase we will NOT make an exception to integrate it and you will need to wait until the next smu pack.
• My customer is not happy with the SMU Pack as he claims he now has to test for more fixes because the pack includes more fixes
- That is a misperception. Every PRM/UCODE smu will pick up changes from a preceding one, so “clean” smu’s with only the one fix don’t really exist. In fact this pack methodology makes it easier to understand what the customer is going to get. Comprehensive test coverage is always needed
• Why are the packs reload?
- Because they contain driver, prm and ucode changes which inherently require a reload. Moving forward we will get more ISSU like behavior for ucode changes.
• Do SMU packs affect general timelines?
- Not really. Before you had to wait already anyway 6-8 weeks for a SMU delivery. Now it is the same way effectively.
• How many DDTS’s are integrated in a PACK?
- There is no defined number of fixes in there, but we try to keep it limited.
• Are SMU packs the new maintenance releases?
- If you want to call it that, sure. SMU packs are our way to provide added stability through critical fixes to the release in question without the need to upgrade to new minor releases. This policy remains in effect for as long as needed.
• How many packs per release?
- Depends on the need, expect 2 to 3 packs.
We understand that as part of the above story it is pretty hard to keep up with what is recommended, what smu’s are available, which ones you need to run, installing them and keeping track of all those dependencies.
Fortunately we have some good news.
As part of the ASR9000 Craft Tool (ACT), we have a SMU manager capability that is getting more and more enhanced as we move forward, but will help in managing these SMU’s for you on your devices.
ACT is free for download from CCO. You need version 1.5 to have the SMU mgmt capability.
ACT is a java applet based tool that helps you manage the ASR9000 from an operational standpoint, and now also its software.
ACT requires the MGBL pie active on the device under mgmt so that the XML capability is running.
It requires a telnet or SSH connection from the device where ACT is running to the ASR9000.
After you have installed ACT and are ready to launch the application, a window similar to the one below will appear.
Click on
The next thing you need to do is to specify the location of where the SMU’s will be located on your local repository.
This needs to be a tftp server that is accessible by the station that you run the ACT app on as well as by the ASR9000’s whose packages you want to manage.
At the same time, you will need to provide the META FILE location.
This meta file, made available for download on a per-release basis, knows everything about all SMU’s for that release and their dependencies.
This META file is the brain of the ACT smu management, so it is important that this file is current.
Today, ACT can’t download SMU’s or META files automatically from CCO, but that is something we are working on, so right now you will need to pull in these files manually, the META file in any case.
Based on the SMU’s you want to run, you may need to pull them in, but ACT will guide you through it.
Picture showing the directory path dialog as to where the SMU’s can be found on your local repository.
Right now, we only support TFTP, but there are plans also to support SFTP and FTP.
You may not have the need to run all SMU’s that the META file knows about because certain features are not being used for instance. Also depending on the functionality of the device and where it sits in your network, a SMU may not be applicable to that device. For instance you have an L2VPN design and you run STP. You may want to run an STP related SMU on your edge/customer facing devices, but not necessarily on your core devices.
The Custom Meta file option provides you an ability to pre-select certain SMU’s and save these custom sets to a file.
Select the
You can then run a compliance report against a device under management to see if the system is running the set of SMU’s that you want it to run based on the specified custom meta file.
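Conceptually, such a compliance report is a set comparison between the SMU’s you want and the SMU’s that are active on the device. The Python sketch below illustrates the idea only; it does not reflect how ACT implements the report, and compliance_report and the SMU ids in the example are hypothetical.

```python
from typing import Dict, Iterable


def compliance_report(wanted: Iterable[str], active: Iterable[str]) -> Dict[str, object]:
    """Compare the desired SMU set against what is active on a device."""
    wanted_set, active_set = set(wanted), set(active)
    return {
        "compliant": wanted_set <= active_set,      # every wanted SMU is active
        "missing": sorted(wanted_set - active_set),  # wanted but not active
        "extra": sorted(active_set - wanted_set),    # active but not in the custom set
    }


# Hypothetical ids: an edge device is expected to run both SMU's.
report = compliance_report(
    wanted={"CSCea12345", "CSCtr31747"},
    active={"CSCtr31747"},
)
```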
When you are about to select SMU’s for a custom meta file or to be applied to the device under management, there are a few things that are interesting from the SMU selection page.
The scroll down list either displays ALL smu’s available, or a subset if a custom META file was specified.
This page is very important as it also shows you the SMU impact, whether it is recommended or not, and what the implications are of installing it, whether it is a hitless SMU or a reload requirement.
When you select a particular SMU to be installed onto the system, you can click the “install add” option, which basically pushes the file from your local repository (TFTP server) to the ASR9000 being managed.
If the tool detects that there is a dependency (eg pre-requisite requirement) the tool will identify that and alert you of that necessity and provide you the option to “fix” that dependency by applying the pre-req smu automatically if it is available on your local repository.
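The dependency handling described here amounts to walking the prerequisite chain and installing prerequisites first, provided they are present in the local repository. The following is a minimal Python sketch of that idea, not ACT’s actual logic; install_order, the prerequisite map and the SMU names "x" and "y" are hypothetical.

```python
from typing import Dict, List, Set


def install_order(target: str, prereqs: Dict[str, List[str]], repository: Set[str]) -> List[str]:
    """Return the SMU's to install, prerequisites first, ending with the target."""
    order: List[str] = []
    visiting: Set[str] = set()

    def visit(smu: str) -> None:
        if smu in order:
            return
        if smu in visiting:
            raise ValueError(f"circular prerequisite involving {smu}")
        if smu not in repository:
            raise LookupError(f"{smu} is not available in the local repository")
        visiting.add(smu)
        for dependency in prereqs.get(smu, []):
            visit(dependency)
        visiting.discard(smu)
        order.append(smu)

    visit(target)
    return order


# "y" has "x" as a prerequisite, and both files are on the TFTP server.
plan = install_order("y", {"y": ["x"]}, {"x", "y"})
```

When a prerequisite is missing from the repository the sketch raises an error instead of guessing, which mirrors the point that the prerequisite can only be applied automatically if it is available locally.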
When you double click a row in the smu selection page, the “internal” smu details are being presented to you.
The dependencies as well as some other details in terms of component it touches are being detailed out.
This is merely for information and you can pretty much ignore this page, but it is nice to know in case you like to track the SMU internals.
You can see that everything with a DDTS number is hyperlinked, which allows you to open the bug toolkit that reports the DDTS details, headline, release notes and applicable workarounds.
It also shows you the versions in which the fix is integrated.
The bug tool kit spawned when clicking on the hyperlink of the DDTS:
Xander Thuijs CCIE #6775
Principal Engineer, ERBU
ASR9000
Hi Xander,
Have a query based on a recent observation with respect to SMU's and upgrades.
Step (1) The test lab router was running 5.2.4 packages + SMU's (17 SMU's), from which an upgrade was carried out to 5.3.4 packages + SMU's using a tar file. No issue was observed during the upgrade. Note that there were many inactive 5.3.4 packages present before the upgrade from 5.2.4 to 5.3.4, as the upgrade had been performed many times before. The inactive 5.3.4 packages were not removed.
Step (2) The same test lab router was running 5.2.4 packages + SMU's (16 SMU's), from which an upgrade was carried out to 5.3.4 packages + SMU's using the same tar file. The install activation failed with a message saying that SMU "X" is required for SMU "Y" to be installed. Note that all inactive packages were cleared before the install add/activation, unlike in Step (1).
We were trying to analyse why during test Step(1) the install activation was not failing. The two differences between Step (1) and Step (2) are
a. Step (2) had One less SMU in 5.2.4 compared to Step (1) before the upgrade.
b. During Step (2), when the router was running 5.2.4, there were no inactive packages of 5.3.4.
Could you please help in understanding which of these two would have triggered the failure.
yikes, this is very vague right :)
it could be that the missing smu in the 16 vs 17 is that "y" being asked for...
it would help to know what the install list is and what x and y were so I can make a better recommendation as to what has happened here.
so question "back" is, what is the smu delta between item 1 and 2.
cheers!
xander
Hi Xander,
Figured out the reason.
During Step (1), we had installed an Eng SMU for a fix in 5.3.4 during earlier upgrades, and that SMU was in the inactive list, which we did not clear. For the latest upgrade from 5.2.4 to 5.3.4 we installed a tar ball that contained the production SMU for the same fix. Unfortunately the Eng SMU and the Production SMU had the same name. The Eng SMU never enforced the pre-requisite, and because the names were the same, the Prod SMU did not take effect to enforce the pre-requisite.
Once we cleared the inactive packages and tried install add/activating the tar ball, we could see the pre-requisite is enforced.
Thanks for your help
hi visb,
nice going, thanks for the closure on this. it is a bit interesting that the eng smu has the same version as the production smu. even if that is the case, the pre-req should have been satisfied.
one thing to note is indeed that if there is a smu on the disk, but inactive, an addition or overwrite of that same version smu will fail as it thinks it is present already. it depends a bit on the way the eng smu is built.
anyways water under the bridge now :)
cheers
xander
Hi Xander,
Doing a post upgrade task/verification with list:
admin install commit.
admin show install active summary -> displays only active smus and pies
admin show install committed summary -> displays persistent pies and smus that will be used during next reboots but includes superseded smus.
Why are superseded considered or marked as persistent since they are clearly not active??
Regards