
Improving install experience in IOS XR (ASR9K/CRS)

Eddie Chami
Cisco Employee

This time it's a question to our customers:

If you insisted on keeping SMUs, SPs, Install rollback and the things you have gotten used to today, how would you change the install process to make it simpler, but still provide what we do now?

Or, asked differently: what do you struggle with today when it comes to installing/upgrading software, and what would you like to see improved?

We want to hear your feedback. You can also send a message; please don't hold back.

Eddie.

 


Mathieu, inline:

>For example we were running 4.3.4 on our ASR9K and faced a few bugs. Most of them were fixed in SP3. While upgrading a router from 4.3.4 to SP3, we ran into a bug that completely crashed our router. We had to turboboot everything. We raised a ticket with TAC and they told us we had to install asr9k-px-4.3.4.CSCul58246.tar before upgrading to SP3. Unfortunately, this fix needs asr9k-px-4.3.4.CSCug75299.tar and asr9k-px-4.3.4.CSCui94441.tar. This means that to upgrade a router from 4.3.4 to SP3, you actually need to reboot it 3 times!!! Even when we optimize this process as much as we can, this means a maintenance window of ~50min, which is a lot of downtime for our customers!

Mathieu, the SP Readme has been updated to include these prerequisites, so hopefully this won't catch anyone out in the future. Prerequisite SMUs for an SP are no longer needed from 5.1.3 on.
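To make the pre-5.1.3 dependency chain quoted above concrete, here is a minimal Python sketch that derives a valid install order from the prerequisites Mathieu listed. The SMU names come from his post; "SP3" is only a label for the service pack (not a real file name), and the sketch does not model which steps can share a reboot.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each key depends on the packages in its set.
# "SP3" is a placeholder label, not an actual package file name.
prerequisites = {
    "SP3": {"asr9k-px-4.3.4.CSCul58246"},
    "asr9k-px-4.3.4.CSCul58246": {
        "asr9k-px-4.3.4.CSCug75299",
        "asr9k-px-4.3.4.CSCui94441",
    },
}

# static_order() yields every package after all of its prerequisites.
install_order = list(TopologicalSorter(prerequisites).static_order())

for step, package in enumerate(install_order, start=1):
    print(step, package)
```

The two leaf SMUs come out first (in either order), then CSCul58246, then the service pack.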

>Why not release a 4.3.4-SP image each time you release an SP? At least providers could turboboot it already patched?

Turboboot is the last thing we want you guys to do. It's slow and painful, and it should rarely be used.

>Concerning turboboot, the transfer speed through an integrated management port is catastrophic. We can't specify a large block size to speed things up, so even if we have a TFTP server directly connected to the port, transfers are way too slow. A good way would be to be able to transfer files with HTTP / FTP during a turboboot.

We don't have a TCP stack in ROMMON and we don't plan to add one. We will support things like ONIE and iPXE in the future.

>Then if transfers are done in a fast & efficient way, we could save time by directly sending an uncompressed image over the network instead of waiting for the router to decompress the archive.

I like the uncompressed image idea; we are exploring that.

Thanks for the feedback and we'll keep you posted.

Eddie.

 

>Turboboot is the last thing we want you guys to do, it's slow and painful, turboboot should rarely be used.

Then you really have to do some education within Cisco. During the presales period as well as after buying the ASR9K, we've always been told to turboboot when going from one major release to another. The explanation is that it clears all the previously installed SMUs that could wrongly interact with the new install (a bit like reinstalling Windows from scratch in the old days instead of trying a risky upgrade).

 

>We don't have a TCP stack in ROMMON and we don't plan to support it. We will support things like ONIE and IPXE in the future.

Then do you plan to support a larger block size for TFTP? The CLI accepts it but gives you an error when you start the transfer. The idea is really to have a good transfer rate to minimize downtime.

 

Peter L
Level 1

Hi

A couple of things that could be improved.

 

1. Specified release date for Service pack

It would be nice to know when a service pack is due for a given release, so you can decide whether you need to install a SMU or can wait for the service pack. Is there a rule today for when you create and release a service pack?

 

2. Download all IOS-XR software in CSM.

I would like to be able to download all software through CSM, and create a .tar directly from CSM containing the "system" .pie you want and the SMU/SP you want to run.

 

3. Reboot time.

It would be nice if you tried to optimize this. Reboot time isn't bad on the ASR9K compared to some other devices, but if it's possible to shorten it, that would be nice. Less downtime when doing an upgrade equals happier customers :)

 

Regards Peter

Totally agree, above all with point #2.

Thanks Diego, we are working on number 2.
 

Hi Peter,

>1. Specified release date for Service pack

We will take care of that. I'll start posting this to the support forum.

>2. Download all IOS-XR software in CSM.

We are working through that.

>3. Reboot time.

Peter, we have improved this dramatically; I'll show you data on that soon. Which release is your data based on?

 

Eddie.

Hi Eddie

Regarding Reboot time.

I have been upgrading from 4.3.2 to 5.1.3 and from 5.1.3 to 5.2.2. If I remember correctly, it took about 7-8 minutes to boot to 5.2.2 from 5.1.3. As I stated before, that's not bad compared to other devices. But if it's possible to reduce the reboot time, that would be great. For me, under 5 minutes would be something to aim for if it's technically possible.

 

 

Thanks Peter. We will get back to you.

dfranjoso
Level 1

Hello All,

 

A simple upgrade procedure on the NV Edge cluster with minimal packet loss (on the order of milliseconds) would be a very interesting feature. "Simple" is the key word in this sentence.

David

Hi David,


>A simple upgrade procedure on the NV Edge cluster with minimal packet loss (in the milliseconds order) would be a very interesting feature.  Being simple a keyword on this sentence.

With an L2 network + NV Edge, we can do msec upgrades; it's fairly easy with L2. With a large Italian operator we upgraded a system in 250 msec.

If L3/MPLS is also configured it gets tricky. For a network which just has IGP/MPLS/L2VPN/L2, we can achieve an upgrade with under 10 seconds of outage on NV Edge.

For L3 with a full Internet table (500K) we can do an upgrade with 110 seconds of outage. For a partial table (i.e. 200K), it can be done in 60 sec. The formula for the L3 outage is simple: number of routes divided by 5000, plus the number of seconds for BGP to get established. Here is how I got 110 above: 520,000 (BGP) / 5000 = 104, plus say 6 seconds for BGP to get established, so 110.
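As a quick illustration of that rule of thumb, here is a minimal Python sketch. The function name and parameters are made up for this example; the 5000 routes per second rate and the 6 second BGP establishment time are simply the figures quoted above, not measured constants.

```python
def estimated_outage_seconds(route_count,
                             routes_per_second=5000,
                             bgp_setup_seconds=6):
    """Rough L3 outage estimate: routes / rate + BGP establishment time.

    The defaults are the numbers quoted in this thread, treated as assumptions.
    """
    return route_count / routes_per_second + bgp_setup_seconds


# Full Internet table example from the post: 520,000 BGP routes.
print(estimated_outage_seconds(520_000))  # 110.0
```

Plugging in your own route count and an observed BGP establishment time gives a first-order estimate only; actual outage depends on the topology being dual homed as described in this reply.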

The above can be achieved provided both chassis in the cluster are dual homed north and south.

The methods above are also improved/simplified if NV Satellite is deployed with NV Edge.

Eddie.

brluers
Level 1

For the most part (not always), if you install a service pack you cannot install individual SMUs. This means that if my customer installs a service pack and a few weeks later runs into a bug for which there is a SMU, the SMU can't be applied because the service pack is installed. My customer then has to wait up to 3 months for the next service pack to come out with the fix included. This is the main reason I do not recommend service packs to my customers. However, without service packs, the SMU list can be quite large and cumbersome to manage.

Very good feedback, some comments:

The above only applies to about 10% of SMUs, where a SMU cannot sit on top of an SP. Those 10% are SMUs which touch the OS-MBI pkg, which is sacred. The remaining 90% can install just fine.

 

To address the 10% above, we have some choices, and I would like to hear what you folks think about them.

 

1) Release SPs more regularly (maybe every 4-6 weeks) so you don't need to wait 8 weeks for the fix.

2) Support SMUs + SPs together.

 

I prefer 1) so you don't need to worry about managing SMU+SP on the same system. We developed SPs to provide an alternative, so I want to take the SMU management burden off operators who want simplicity.

 

Thoughts? Others, please chime in.

 

Eddie.

Hi

For a new box, or when you upgrade a box, the SP is really good to use. There is no need to check all the SMUs and see what you need. You can just take the latest SP and be pretty sure that all the latest patches are included.

 

The problem comes when the box is live in the network. Say that you have encountered a minor bug that has been fixed with a hitless SMU, and it's important for you to apply a fix for this bug in your network.

In that case I would like to install only that SMU and be sure that it isn't a problem that I'm running a release with an SPx installed.

The SMU will probably be included in the next SP, but the problem is that, as far as I have seen, all SPs need a reboot. And we can't reboot a box for a minor bug.

So supporting SMUs and SPs together would be good. Maybe you can just mark the SMUs that don't work with an SP in CSM and put a note about it in the SMU release notes.

 

/Peter

Peter, thanks for the feedback on SP. It matches what I've been hearing from customers.

We will look at how we can support mixing of SMUs and SPs together, without going overboard.

Eddie Chami
Cisco Employee

>Then you really have to do some education within Cisco. During the presales period as well as after

We will work on the education.

>buying the ASR9K, we've always been told to do turboboot from one major release to the other. The explanation is that it would clear all the previously installed SMU that could wrongly interact with the

I'm sorry that you were told this, and the fact that you had to do that is even more painful for me. There is zero intersection here between what you had before and what you want to upgrade to.

>new install (a bit like reinstalling windows from scratch in the old days instead of trying a risky upgrade).

I get your analogy, but it's not the case; please don't turboboot unless it's for DR (Disaster Recovery). If anyone tells you otherwise, tell them to go back and check their facts, or use this discussion board to talk about it. We will do our bit to change that myth. Thanks for your feedback and please keep it coming.