Ask the Expert: IOS-XR Architecture and Troubleshooting

Monica Lluis
Level 9

This session provides an opportunity to learn and ask questions about the Cisco IOS XR Software architecture. IOS XR is a modular, fully distributed network operating system used by most of the leading service providers in the industry, and it runs on large platforms such as the GSR, CRS, NCS and ASR9K. You will also learn about troubleshooting scenarios and use cases covering Cisco IOS XR infrastructure, configuration management, XR monitoring and operations, process crash decoding, and more.

To participate in this event, please use the Reply Button to ask your question.

Ask questions from Monday June 6 to Friday June 17, 2016

Featured Experts

Raj Pathak is a customer support engineer in High-Touch Technical Services at Cisco, specializing in service provider technologies and platforms. He serves as a support engineer for technical issues, supporting Cisco IOS XR Software customers on Cisco CRS, ASR 9K and Cisco XR 12000 Series Routers. Raj has more than 10 years of experience in the IT industry and holds a double CCIE certification (38760) in Routing and Switching and Data Center. He has previously delivered webcasts and other Ask the Expert sessions on the Cisco Support Community.

Sudhir Kumar is a customer support engineer in High-Touch Technical Services at Cisco, specializing in service provider technologies and platforms. His areas of expertise include Cisco GSR, CRS, ASR 9K and Cisco XR 12000 Series Routers. Sudhir has more than 12 years of experience in the IT industry and holds CCIE certification (35219) in Service Provider, Routing and Switching, and Data Center. He has previously delivered webcasts and other Ask the Expert sessions on the Cisco Support Community.


Find other events at https://supportforums.cisco.com/expert-corner/events.

** Ratings Encourage Participation! **
Please be sure to rate the Answers to Questions

We look forward to your participation. This event is open to all, including partners. Please Share this event in your social channels. Have a technical question? Get answers here before opening a TAC case by visiting the Cisco Support Community. 

I hope you and your loved ones are safe and healthy
Monica Lluis
Community Manager Lead

Charles - As Sudhir explained, we cannot simply remove the disk from a running RP. However, with disk mirroring, we can prepare a backup disk that may be used to boot up a new system with the required base image.

For more information on disk mirroring, please refer to the following link.

http://www.cisco.com/c/en/us/td/docs/routers/crs/software/crs_r4-2/system_management/configuration/guide/b_sysman_cg42crs/b_sysman_cg42crs_chapter_011.html

The following document explains the concept of a graceful disk upgrade procedure. Although this document is for the XR12k platform, the same concept applies to the CRS RP that uses a flash disk.

http://www.cisco.com/c/en/us/td/docs/routers/xr12000/xr_line_cards/flashdisk/flashdisk.html

An additional reference on Pre-staged Migration:

http://www.cisco.com/c/en/us/td/docs/ios_xr_sw/iosxr_r3-8/migration/user/guide/up38Book/up38upinst.pdf

Thanks Osman.

addelanto
Level 1

Hi

We're having problems replacing an FP40 card on a CRS-3 running IOS XR 5.3.3.

Once inserted, the new FP cycles through several states and ends up in the MBI-RUNNING state:

Tue Jun  7 15:20:52 2016      MBI RUNNING              MBI Active
Tue Jun  7 15:20:01 2016      MBI BOOTING              MBI Hello rcvd
Tue Jun  7 15:19:35 2016      ROMMON                   Boot Reply Sent
Tue Jun  7 15:19:35 2016      PRESENT                  Boot Req Received
Tue Jun  7 15:17:40 2016      PRESENT                  TBRINGDOWN Timeout
Tue Jun  7 15:17:29 2016      PRESENT                  POWERON RELAY
Tue Jun  7 15:17:29 2016      BROUGHTDOWN              POWERON
Tue Jun  7 15:17:29 2016      XR FAIL GRACEFUL         TXR_FAIL_GRACEFUL Timeout
Tue Jun  7 15:17:24 2016      MBI RUNNING              Bringdown
Tue Jun  7 15:04:16 2016      MBI RUNNING              MBI Active
Tue Jun  7 15:03:25 2016      MBI BOOTING              MBI Hello rcvd
Tue Jun  7 15:02:59 2016      ROMMON                   Boot Reply Sent
Tue Jun  7 15:02:59 2016      PRESENT                  Boot Req Received
Tue Jun  7 15:01:04 2016      PRESENT                  TBRINGDOWN Timeout
Tue Jun  7 15:00:53 2016      PRESENT                  POWERON RELAY
Tue Jun  7 15:00:53 2016      BROUGHTDOWN              POWERON
Tue Jun  7 15:00:53 2016      XR FAIL GRACEFUL         TXR_FAIL_GRACEFUL Timeout
Tue Jun  7 15:00:48 2016      MBI RUNNING              Bringdown
Tue Jun  7 14:47:40 2016      MBI RUNNING              MBI Active
Tue Jun  7 14:46:49 2016      MBI BOOTING              MBI Hello rcvd
Tue Jun  7 14:46:23 2016      ROMMON                   Boot Reply Sent
Tue Jun  7 14:46:23 2016      PRESENT                  Boot Req Received
Tue Jun  7 14:46:23 2016      NOT PRESENT              OIR insertion
Tue Jun  7 14:44:02 2016      NOT PRESENT              POWEROFF RELAY
Tue Jun  7 14:44:02 2016      NOT PRESENT              POWEROFF
Tue Jun  7 14:44:02 2016      MBI RUNNING              OIR removal

Could the card be defective?

Thanks a lot

Regards

Antonello

Very likely. However, you may want to reseat the card in a different slot, if you have an empty one available, to rule out any issue with the slot.

A visual inspection for damaged connectors and bent pins is also recommended.
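
One way to read a state history like the one posted above is to measure how long the card stays in MBI RUNNING before each Bringdown. The following is a minimal sketch, assuming the three-column layout shown in the post (a 24-character timestamp, then state and event separated by runs of spaces) and an English locale for the day and month names. The function names are hypothetical and not part of any Cisco tool.

```python
import re
from datetime import datetime

# A few lines copied from the state history in the post (newest first).
HISTORY = """\
Tue Jun  7 15:20:52 2016      MBI RUNNING              MBI Active
Tue Jun  7 15:17:24 2016      MBI RUNNING              Bringdown
Tue Jun  7 15:04:16 2016      MBI RUNNING              MBI Active
Tue Jun  7 15:00:48 2016      MBI RUNNING              Bringdown
Tue Jun  7 14:47:40 2016      MBI RUNNING              MBI Active
"""

def parse_history(text):
    """Return (timestamp, state, event) tuples, oldest first."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumption: the timestamp occupies the first 24 characters.
        stamp_text = " ".join(line[:24].split())
        stamp = datetime.strptime(stamp_text, "%a %b %d %H:%M:%S %Y")
        # State and event are separated by two or more spaces.
        state, event = re.split(r"\s{2,}", line[24:].strip(), maxsplit=1)
        rows.append((stamp, state, event))
    # Input is newest first; reverse, then sort stably by timestamp.
    return sorted(reversed(rows), key=lambda row: row[0])

def mbi_dwell_times(rows):
    """Seconds spent in MBI RUNNING before each Bringdown event."""
    dwell = []
    entered = None
    for stamp, state, event in rows:
        if state == "MBI RUNNING" and event == "MBI Active":
            entered = stamp
        elif event == "Bringdown" and entered is not None:
            dwell.append(int((stamp - entered).total_seconds()))
            entered = None
    return dwell

print(mbi_dwell_times(parse_history(HISTORY)))
```

For the sample lines, both cycles last the same 788 seconds (13 minutes 8 seconds). Identical dwell times would be consistent with a timer expiring rather than a random failure, though that is an inference from the timestamps and not something stated in this thread.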

Hi All

Sorry for the late reply

A Cisco TAC engineer diagnosed a faulty PLIM, and we're now waiting for the replacement through RMA.

I'll keep you up to date.

Thanks a lot

Regards

Antonello

Hello Antonello,

Glad to know your issue has been resolved. Feel free to ask more questions if you have any.

Thanks

Raj Pathak

Hello Antonello,

There could be multiple reasons for a card failing to boot or getting stuck in the MBI state. The first step is to move the card to a known working slot and see whether the problem follows the card.

If the issue turns out not to be with the card, could you paste the relevant logs from the router syslog for further investigation?

Thanks

Raj Pathak

Hi Antonello,

Adding to Raj's point, here are descriptions of the card states, which should help you understand what the card is doing during troubleshooting.

Card States:

NOT-PRESENT
The node is not present in the system, and Shelf Manager is waiting for OIRd. This indicates that the card is not properly seated, is not getting power, or has another low-level issue.

PRESENT
When a node is powered on and Shelf Manager detects the presence of the card through the OIRd register, the state changes from NOT_PRESENT to PRESENT.
Shelf Manager is then waiting to receive a boot request.

ROMMON
When Shelf Manager receives a boot request and the card type is known, the state changes from PRESENT to ROMMON.

MBI-BOOT
Shelf Manager validates the boot request and sends a boot reply. To validate the boot image, Shelf Manager calls the mbimgr API.
If the node's bootflash does not have the boot image, the image is downloaded via the Control Ethernet TFTP server. The state changes from ROMMON to MBI-BOOTING.

MBI-RUNNING
The mbi-hello process is initialized during MBI Band by the init process and starts sending MBI-HELLO messages to Shelf Manager. On receiving the first MBI-HELLO message (MBI-HELLO HB), Shelf Manager transitions the node from MBI-BOOT to MBI-RUN, and the node state changes to RUNNING_MBI.


XR-RUN
The hbagent process is initialized by the sysmgr process and starts the hand-off operation with the mbi-hello process. After the hand-off is completed, the mbi-hello process exits and the hbagent process starts sending XR-HELLO messages to Shelf Manager. On receiving the first XR-HELLO message, Shelf Manager transitions the node from MBI-RUNNING to XR-RUN. The node then sends heartbeat messages every second to stay in RUNNING_ENA (IOS-XR RUN).


XR_MBI_FAIL_GRACEFUL
The node is about to be reset or powered down because of admin config commands, alarms, other internal requests, and so on, while it was in the XR_RUNNING or MBI_RUNNING state. The FSM parks a node in this state for 5 seconds so that applications get a chance to learn about the impending failure and do any required cleanup on their side. The difference between this and the 

BRINGDOWN
Once a node reaches this state, it is reset (powered down and then powered up).

IN_RESET
If a node hits the BRINGDOWN state 5 consecutive times without being able to transition to the XR_RUNNING state, the node is deemed to have failed and its power is cut off. This state is referred to as IN_RESET. After this, the node no longer auto-boots.

UNPOWERED
If a node is powered off by configuration or programmatically, it enters the UNPOWERED state. (A node can be programmatically powered off when an incompatibility is found, for example the wrong card in a chassis.)
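
The boot path and the five-strikes rule described above can be pictured as a small state machine. The following is only an illustrative sketch based on the descriptions in this post, not Shelf Manager's actual implementation. The class and method names are hypothetical, and returning to PRESENT after a bringdown is a simplification.

```python
# Boot-path transitions as described in the list above. State names follow
# the post; the table layout and helper names are assumptions.
BOOT_PATH = {
    "NOT_PRESENT": "PRESENT",      # card detected through OIRd
    "PRESENT": "ROMMON",           # boot request received, card type known
    "ROMMON": "MBI_BOOTING",       # boot request validated, boot reply sent
    "MBI_BOOTING": "MBI_RUNNING",  # first MBI-HELLO received
    "MBI_RUNNING": "XR_RUN",       # first XR-HELLO received
}

# Consecutive bringdowns tolerated before the node is parked, per the post.
MAX_BRINGDOWNS = 5


class NodeTracker:
    """Tracks one node's state along the boot path (illustrative only)."""

    def __init__(self):
        self.state = "NOT_PRESENT"
        self.bringdowns = 0

    def advance(self):
        """Move to the next boot state; terminal states stay unchanged."""
        self.state = BOOT_PATH.get(self.state, self.state)
        if self.state == "XR_RUN":
            # Reaching XR_RUN breaks the run of consecutive failures.
            self.bringdowns = 0
        return self.state

    def bringdown(self):
        """Record a reset; park the node in IN_RESET after too many."""
        self.bringdowns += 1
        if self.bringdowns >= MAX_BRINGDOWNS:
            self.state = "IN_RESET"  # power cut, no further auto-boot
        else:
            self.state = "PRESENT"   # simplification: reset, then boot again
        return self.state


node = NodeTracker()
for _ in range(4):
    node.advance()
print(node.state)  # the card has reached MBI_RUNNING but not XR_RUN
```

A card that repeatedly reaches MBI_RUNNING and is then brought down, as in the history posted earlier, would in this model end up in IN_RESET on the fifth consecutive bringdown.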

Thanks
Sudhir Kumar

jameel noori
Level 1

Hello Raj/Sudhir,

Thanks for this knowledge sharing event.

I often upgrade and downgrade SMUs on XR nodes (ASR9K, CRS) using a standard procedure that was given to me. Is there any way to know whether an SMU is a restart SMU or a non-restart SMU? Is there a quick way to check?

Regards,

Jameel

Hi Jameel,

There are several ways to get this information. It is normally available in the readme file for the tar. You can also use a few commands to get the SMU restart details.

If you have NOT yet done the install add step, use "admin show install pie-info <> detail" and look for the restart information.

If you have already done install add, run "admin show install package <> detail".

Example:

disk0:asr9k-doc-px-4.3.1
asr9k-doc-px V4.3.1[Default] Asr9k DOC Composite
[composite package]
[root package, grouped contents]
Vendor : Cisco Systems
Desc : Asr9k DOC Composite
Build : Built on Sat May 11 21:56:02 UTC 2013
Source : By iox-bld2 in /auto/srcarchive7/production/4.3.1/all/workspace for pie
Card(s): RP, CRS-RP-X86, CRS8-RP-x86, CRS16-RP-x86, ASR9001-RP
Restart information:<<<<<<<<<<
Default:
parallel impacted processes restart
Size Compressed/Uncompressed: 5470KB/24MB (21%)
Components in package disk0:asr9k-doc-px-4.3.1, package asr9k-doc-px:
disk0:asr9K-doc-supp-4.3.1
asr9K-doc-supp V4.3.1[Default] asr9k doc package
Vendor : Cisco Systems
Desc : asr9k doc package
Build : Built on Sat May 11 21:55:55 UTC 2013
Source : By iox-bld2 in /auto/srcarchive7/production/4.3.1/all/workspace for pie
Card(s): RP, CRS-RP-X86, CRS8-RP-x86, CRS16-RP-x86, ASR9001-RP
Restart information:
Default:
parallel impacted processes restart
Size Compressed/Uncompressed: 5470KB/24MB (21%)
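
If you have saved this output to a file, the restart details can be pulled out with a short script. This is a minimal sketch that assumes the layout of the example above: a package line beginning with a disk name such as "disk0:", a "Restart information:" line, and a following "Size" line that ends the block. The function name is hypothetical, and real output may be laid out differently on other releases.

```python
import re

# Shortened copy of the example output above.
SAMPLE = """\
disk0:asr9k-doc-px-4.3.1
asr9k-doc-px V4.3.1[Default] Asr9k DOC Composite
Vendor : Cisco Systems
Restart information:
Default:
parallel impacted processes restart
Size Compressed/Uncompressed: 5470KB/24MB (21%)
"""

def restart_info(text):
    """Map each package line to the lines under 'Restart information:'."""
    result = {}
    package = None
    collecting = False
    for raw in text.splitlines():
        line = raw.strip()
        if re.match(r"disk\d+:\S+$", line):
            # Assumption: a bare 'diskN:<name>' line starts a new package.
            package = line
            collecting = False
        elif line.startswith("Restart information"):
            collecting = True
            result.setdefault(package, [])
        elif collecting:
            # Assumption: the block ends at the 'Size' line or a blank line.
            if not line or line.startswith("Size"):
                collecting = False
            else:
                result[package].append(line)
    return result

for name, lines in restart_info(SAMPLE).items():
    print(name, "->", "; ".join(lines))
```

The script only extracts the text; deciding what the restart wording means for a given SMU is still left to the operator and the SMU readme.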

Thanks
Sudhir Kumar

Thanks Cisco Team!

Could you please help me understand why we need a restart SMU? Can't we just install and run the package?

Regards,

Jameel


Hi Jameel,

I understand your question is: why do we need a restart SMU?

The new binary has to be executed for the fix to take effect, and a restart lets that happen without disrupting the rest of the system. A restart should complete in under a second, whereas a reload takes more time.

You can also refer to the links below for more SMU details.

https://supportforums.cisco.com/document/121401/asr9000xr-concept-smu-and-managing-them

http://www.cisco.com/c/en/us/support/docs/ios-nx-os-software/ios-xr-software/116332-maintain-ios-xr-smu-00.html

Thanks
Sudhir Kumar

Jameel,

On top of what Sudhir has provided, the easiest and quickest way to know the SMU impact is through CSM. You have the option of the CSM client, which does SMU management, and the CSM server, which does SMU management and also automates installing and upgrading software (including SMUs). Please take a look at the CSM server or client here.

CSM server: (LINUX)

https://software.cisco.com/download/release.html?mdfid=282414851&flowid=2137&softwareid=284777134&release=1.0&relind=AVAILABLE&rellifecycle=&reltype=latest

CSM Client: (windows based)

https://software.cisco.com/download/release.html?mdfid=282414851&flowid=2137&softwareid=284777134&release=1.0&relind=AVAILABLE&rellifecycle=&reltype=latest

Regards

Eddie.

suhail.malik
Level 1

Hello

Can you please tell me where RSA keys get stored in IOS-XR?

Hi Suhail,

Your question is acknowledged. I'll check and get back to you.

Thanks
Sudhir Kumar