
Number of ISR4K devices not recognizing upgraded flash under ROMMON

Hello All, 

I am facing what seems like a unique problem: I have a number of ISR4K devices (4331s and 4351s) that will not boot properly from flash after we upgrade the flash. 

When I boot from USB into IOS-XE, the new flash is visible, and I can write the image to it and confirm it was written with no issues. When I then set the boot command to boot from the new flash, the boot fails and the router drops into ROMMON. In ROMMON, any attempt to boot from flash or DIR the flash just returns "media drive is unreadable".

I have tried multiple versions of IOS-XE with no luck. I have tried formatting and running FSCK on the flash and rewriting the image, with the same result. These devices cannot be downgraded to older ROMMON, so they are on 17.6.1. We have swapped flash multiple times as well to confirm this is NOT flash related. We moved known-working flash from another upgraded 4300 into a problematic chassis and replicated the problem; the reverse swap, flash from a problematic chassis into a working one, booted fine. 

Has anyone run into this before that could lend some insight? TAC wants me to RMA all the routers, which I am hesitant to do at this very moment as I feel this is software related in some fashion. 

20 Replies

ROMMON can be downgraded. 

I suspect ROMMON 17.6.1 is bugged.  Make sure TAC replicates the issue, and do not back off until they have provided a Bug ID for this behaviour.  

Please share the Bug ID and/or the TAC Case #.

I also would like to add the following:

  1. Depending on which region of the world the TAC case is opened in, TAC might attempt to replicate the issue with Polaris-over-Virtual (aka IOS-over-Virtual).  That will not work, because Polaris-over-Virtual cannot see ROMMON issues.  TAC must get their hands on a real router.  
  2. Depending on the region of TAC, they may be motivated to RMA the appliance.  If they are, ask for an EFA (Engineering Failure Analysis).  This is a very difficult and time-consuming process in which the TAC hardware team disassembles the unit and tests each component.  Offer the TAC agent a choice:  replicate the ROMMON issue (easy process) or perform an EFA (very difficult process).  Never accept "RMA the appliance" and let this potential bug be swept under the rug.  
  3. Notify your management and rope in your company's Cisco AM/SE.  If the ROMMON is indeed at fault, this is a very serious case and Cisco will need to pull that firmware file off the repository (another time-consuming process for Cisco).  Your Cisco AM/SE will work with TAC on all the necessary paperwork to remove the offending file.

Please keep us updated with the development.

And I second what @Leo Laohoo is saying there too - if they just do a standard RMA then the faulty unit(s) will get refurbished or scrapped by logistics/warehouse (which is outsourced to 3rd parties) and they will not take the problem any further and progress the fix.  So you have to be firm with them to make sure they do something about it.

Rich R
VIP

You'll have to wait for a fix, I think.  This isn't the first time they've done this.  Just a few years ago they managed to completely brick the routers with 16.9(1r) and 16.12(1r) ROMMON - see https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvr18589
That was even worse - the router would not boot at all - not even to ROMMON, not even from USB - RMA was the only option.
Rolling back to an earlier ROMMON will not be an option on the newer hardware because the old ROMMON won't support it.  If TAC haven't already escalated this to the BU then they MUST do so immediately.  As Leo says, INSIST on them opening a bug for it (no delays/excuses), which must then go to the BU straight away; otherwise this will not get fixed quickly enough.  Remind them about the mess they caused with CSCvr18589 if you have to.
They will need to release either a new standalone ROMMON with the fix or a new IOS with the new ROMMON built in.

Thanks guys. Trust me, I pushed with them on Friday and reminded them that I just purchased $1.5M in ISR4Ks because about $4M worth of 8200Ls I have had on order for about a year still have not been fulfilled, and I needed a stopgap to hold me over for projects. 

I said from the very first call this felt ROMMON related, but the engineer disregarded me until I confirmed it on my own. 

Luckily, at least 85 of the 110 ISR4Ks I have sitting on pallets can be downgraded, which gets me by and gets them out in the field. I still have to check another 25 I have sitting on a pallet. So this may be limited to a smaller set of hardware for me. 

This is exactly why I did not want to proceed with a blind RMA, as I hate guessing. They mentioned the whole EFA process as well, but that takes time. I asked if a bug report was being created from this and they stated there was a lengthy process to follow to get one opened. I told them my fear was that they would blindly take RMAs and just fire the units back out into the field for the next guy to have an issue. 


@pietro manicioto wrote:

I said from the very first call this felt ROMMON related, but the engineer disregarded me until I confirmed it on my own. 


Get your Cisco AM/SE involved. 

If it is possible to share the TAC Case #, please do.  


@pietro manicioto wrote:
This is exactly why I did not want to proceed with a blind RMA, as I hate guessing. They mentioned the whole EFA process as well, but that takes time. I asked if a bug report was being created from this and they stated there was a lengthy process to follow to get one opened. I told them my fear was that they would blindly take RMAs and just fire the units back out into the field for the next guy to have an issue. 

How are you corresponding with TAC, is it email?  If it is, check whether the TAC agent's email address or name has an "-X" at the end.  If it does, you are corresponding with a contractor (red tag), not a permanent (blue tag) Cisco employee.  Contractors have very little training in how to troubleshoot, and raising a Bug ID is one of the things they have no knowledge of doing.  

If the TAC agent is indeed a red tag, get your Cisco AM/SE involved because they can get the case re-assigned to someone more "professional".