
Nexus 5596 OS upgrade delay

rmeans
Level 3

I am opening this post about an OS upgrade on my Nexus 5k devices.  The upgrade completed successfully, but I am not comfortable with the events during the upgrade.

I have a pair of 5596s with 17 fex connected to both.  Prior to issuing the “install all” command, I set up an extended ping to both 5596s.  I began the upgrade with the primary vPC device.  After the initial check, I answered “yes” to begin the disruptive upgrade, and both 5596s stopped responding to ping.  After 10 minutes, the other 5596 (not install all) began responding to ping.  I connected to that device and began monitoring vPC and fex status.  After another 5 to 10 minutes, the original 5596 (install all) began responding to ping, and I was now connected to both 5596s.

I monitored the status of the fex for the next hour.  The status of each fex varied between online, offline, connected, downloading, and AA mismatch.  After waiting an hour without any apparent progress in the fex upgrade, I decided to “install all” the 2nd 5596.  Again, neither 5596 responded to ping for 10 minutes, and then the original 5596 (the first install all) began responding.  After another 5 to 10 minutes, the 2nd 5596 (second install all) began responding to ping.  At that point I connected to both 5596s and monitored the fex status.  Within 20 to 30 minutes, all 17 fex had come “online” on both 5596s.

In addition to the loss of ping, I also experienced a significant loss of server resources.  Services connected to the 5596s and n2k were not available.

 

Is what I experienced normal?

 

What I was expecting…

  • Install all the first 5596
    • Loss of ping during upgrade of 5596 (about 10 minutes)
  • Once 5596 is up
    • Each fex upgrades sequentially
    • No network outage.  90+% of my infrastructure is dual connected.
  • During the fex upgrade, I expect to see:
    • Upgraded 5596 – shows each fex downloading then “online”.
    • The not yet upgraded 5596 – shows each fex as AA mismatch
  • Once all fex have stabilized
    • Assuming 30 minutes but I don’t know how long it takes the fex to upgrade (17 total).
    • Upgrade the second 5596
      • No network impact
      • Expect loss of ping to the upgrading 5596

Muthurajeshwaran Natesan
Cisco Employee

Hi,

Can you please tell me where the extended ping was running from and to? If you have a topology diagram, can you please share it?

 

Thanks,

Muthu.

 

 

I upgraded from 6.0.2.n2.2 to 6.0.2.n2.6.  My ping test was to the SVI (not mgmt).

 

I have discovered a couple of items since the upgrade.  First, each n2k takes 10 to 12 minutes to upgrade.  I have almost 20 Nexus 2k, so that's over 3 hours.  I waited only about 1 hour.  It is likely the upgrade was proceeding as expected, but I didn't wait for it to finish.  Second, the interface I was testing with (SVI) can become non-responsive during upgrades while production traffic still passes through the Nexus.
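
For anyone planning a similar maintenance window, here is a rough sketch of that arithmetic in Python. It assumes the fex upgrade one at a time (which is what I expected, not something I have confirmed) and uses the 10 to 12 minutes per fex I observed. The function name is just for illustration.

```python
def fex_upgrade_minutes(fex_count, per_fex_min=(10, 12)):
    """Return (low, high) total minutes, assuming strictly sequential fex upgrades."""
    low, high = per_fex_min
    return fex_count * low, fex_count * high


# 17 is the fex count on my pair of 5596s; 20 is the rounded-up figure.
for count in (17, 20):
    low, high = fex_upgrade_minutes(count)
    print(f"{count} fex: {low}-{high} min ({low / 60:.1f}-{high / 60:.1f} h)")
```

Either way, the one hour I waited covers only a fraction of the fex under that assumption.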

I think the big concern here is the ping test: why was it non-responsive during the primary switch upgrade, even though the ping traffic is initiated from the secondary switch? Please correct me if I am wrong.

 

As you said, you are pinging an SVI interface. Is that SVI configured on the switch itself, or is the address you are pinging behind a fex or some other device? Please let me know.

Thanks,

Muthu.
 

glen.grant
VIP Alumni

What version did you start at and what version were you going to?  You can't just jump from, say, 5.x code to 7.x code.  You have to go to 6.x code first and then on to 7.x.  We tried and saw all kinds of funky stuff like what you saw.  I find it ridiculous that you have to upgrade multiple times just to get to a newer version.  If you don't follow the upgrade paths in the release notes, you can bet it will be disruptive, and you should account for that.  I haven't seen an upgrade yet on a 5K or a 7K that hasn't been disruptive, though the 7Ks seemed to be somewhat better, probably because we stayed within the same 6.x code.  Fexes do take a long time to upgrade, and according to Cisco's documentation the upgrade is supposed to be nondisruptive while they do.  But we saw the same thing: you start the upgrade, you can't ping either supervisor anymore, and it freaked us out.
