shoriuch
Cisco Employee

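Overview

This document explains an issue on ASR 9000 Series routers in which repeatedly running admin reload causes a module to enter the IN-RESET state, and describes how to recover from it.

Difference between admin reload location <loc> and hw-module location <loc> reload

Both admin reload location <loc> and hw-module location <loc> reload can reload a module, but the internal processing that leads up to the reload differs.

The key difference is whether the reload goes through the shelfmgr process, which is responsible for module initialization and operation.

With the admin reload command, the reload does not go through shelfmgr.

With the hw-module reload command, by contrast, the module is reloaded through shelfmgr.

What is the IN-RESET state of a module?

A module can be in many different states. IN-RESET is the state in which shelfmgr holds the module powered down.

When a module keeps reloading, for example because of a fault, shelfmgr detects this and, to minimize the impact on the network, prevents the module from attempting to boot again. That state is IN-RESET.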

RP/0/RP0/CPU0:ASR9K#show plat

Node            Type                      State            Config State
-----------------------------------------------------------------------------
0/RP0/CPU0      A99-RP2-TR(Active)        IOS XR RUN       PWR,NSHUT,MON
0/RP1/CPU0      A99-RP2-TR(Standby)       IOS XR RUN       PWR,NSHUT,MON
0/0/CPU0        A9K-MOD400-TR             IOS XR RUN       PWR,NSHUT,MON
0/0/1           A9K-MPA-4X10GE            OK               PWR,NSHUT,MON
0/1/CPU0        A9K-MOD400-TR             IOS XR RUN       PWR,NSHUT,MON
0/1/0           A9K-MPA-2X100GE           OK               PWR,NSHUT,MON
0/9/CPU0        A9K-40GE-SE               IN-RESET         PWR,NSHUT,MON
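
When many line cards are installed, it can help to pick out the IN-RESET nodes programmatically. The following is a minimal sketch, assuming the show platform output has already been captured as text (for example from a saved session log). The function name nodes_in_state and the column-splitting rule (columns separated by two or more spaces) are assumptions made for this illustration, not part of IOS XR.

```python
import re

# Sample text copied from the "show platform" output above (abridged).
SHOW_PLATFORM = """\
Node            Type                      State            Config State
-----------------------------------------------------------------------------
0/RP0/CPU0      A99-RP2-TR(Active)        IOS XR RUN       PWR,NSHUT,MON
0/0/CPU0        A9K-MOD400-TR             IOS XR RUN       PWR,NSHUT,MON
0/0/1           A9K-MPA-4X10GE            OK               PWR,NSHUT,MON
0/9/CPU0        A9K-40GE-SE               IN-RESET         PWR,NSHUT,MON
"""

# Assumption: columns are separated by two or more spaces, while the
# State column itself may contain single spaces ("IOS XR RUN").
ROW = re.compile(r"^(\d+/\S+)\s{2,}(\S+)\s{2,}(.+?)\s{2,}(\S+)\s*$")


def nodes_in_state(output, state="IN-RESET"):
    """Return the node IDs whose State column equals `state`."""
    nodes = []
    for line in output.splitlines():
        match = ROW.match(line)
        if match and match.group(3) == state:
            nodes.append(match.group(1))
    return nodes


print(nodes_in_state(SHOW_PLATFORM))
```

The header and separator lines are skipped because they do not start with a slot number.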

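The admin reload command can put a module into IN-RESET

As described above, shelfmgr is not involved in the reload processing of the admin reload command.

For that reason, from the shelfmgr point of view, reloads triggered by admin reload within a short period are subject to the transition to IN-RESET.

Specifically, if a given module is reloaded 5 or more times within 1 hour with the admin reload command, it enters the IN-RESET state.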
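
The rule above can be pictured as a simple counter. The sketch below is a toy model only, not the shelfmgr implementation: the class name BringdownCounter and its methods are hypothetical. It assumes the counter goes up by 1 per admin reload, goes down by 1 per elapsed hour (as the annotated output in this document notes), latches IN-RESET when it reaches the maximum of 5, and is cleared by hw-module reload. The exact timer behavior inside shelfmgr is not described in this document.

```python
from dataclasses import dataclass


@dataclass
class BringdownCounter:
    """Toy model of the Bringdown Count shown by 'show shelfmgr status'."""
    max_allowed: int = 5   # "Max Allowed: 5" in the command output
    count: int = 0
    in_reset: bool = False

    def hours_elapsed(self, hours):
        # Assumed decay: one decrement per full hour, never below zero.
        # Whether decay continues while IN-RESET is not stated in the
        # document, so this model leaves the IN-RESET state latched.
        self.count = max(0, self.count - hours)

    def admin_reload(self):
        if self.in_reset:
            return "IN-RESET"  # further admin reloads do not boot the module
        self.count += 1
        if self.count >= self.max_allowed:
            self.in_reset = True
            return "IN-RESET"
        return "reloading"

    def hw_module_reload(self):
        # Recovery through shelfmgr: counter cleared and module boots again.
        self.count = 0
        self.in_reset = False
        return "reloading"


card = BringdownCounter()
print([card.admin_reload() for _ in range(5)])
print(card.hw_module_reload(), card.count)
```

In this model the fifth admin reload within the hour is the one that moves the module to IN-RESET, which matches the behavior described in this document.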

You can check how many more reloads remain before the module goes IN-RESET with the show shelfmgr status command.

RP/0/RP0/CPU0:ASR9K#show shelfmgr status location 0/9/CPU0 
Mon Jun 19 10:21:01.670 JST
Nodeid 0x8b1, inst 1

Platform Node Status for 0/9/CPU0 (0_8b_1)
----------------------------------
Current State: IOS XR RUN
Current Substate: IMDR_STATE_NONE
Configuration:
Power is enabled
Bootup enabled.
Monitoring enabled
Boot Requests: 0 Max Allowed: 10
Bringdown Count: 0 Max Allowed: 5 <<<<<<<
Card Reset Count: 0 Max Allowed: 4

Last Reset Code: 14, CPU Reset
Card In Reset: FALSE Shutdown Reason: 12
No FSM timers are set

Slot supports CBC processor. Card is Present andCBC state is Online
CBC reset reason is 0x35
CBC reports card type as: 0x4302a0
Estimated 275 Watts of power required. Power is Reserved.



RP/0/RP0/CPU0:ASR9K#admin reload location 0/9/CPU0

Preparing system for backup. This may take a few minutes especially for large configurations.
[Done]
RP/0/RP0/CPU0:Jun 19 10:22:33.043 JST: reload[65940]: %MGBL-SCONBKUP-6-INTERNAL_INFO : Reload debug script successfully spawned
Proceed with reload? [confirm]
LC/0/9/CPU0:Jun 19 10:22:33.064 JST: mbi-hello[67]: %PLATFORM-MBI_HELLO-6-NODE_RELOAD : Reload request received. Reloading in 5 secs
LC/0/9/CPU0:Jun 19 10:22:38.068 JST: mbi-hello[67]: %PLATFORM-MBI_HELLO-6-NODE_PWRCYL : Power Cycle Request Received. Power Cycling now !
RP/0/RP1/CPU0:Jun 19 10:22:38.209 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:22:38.213 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:22:38.214 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_CPU_RESET : Node 0/9/CPU0 CPU reset detected.
RP/0/RP0/CPU0:Jun 19 10:22:38.216 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/9/CPU0 A9K-40GE-SE state:BRINGDOWN
RP/0/RP0/CPU0:Jun 19 10:22:38.229 JST: mibd_entity[341]: %HA-HA_EM-7-FMFD_CONNECTION_FAIL : Could not connect to /dev/fm/fd_wdsysmon.d/node0_9_CPU0 : No such file or directory
RP/0/RP0/CPU0:Jun 19 10:22:38.253 JST: invmgr[269]: %PLATFORM-INV-6-NODE_STATE_CHANGE : Node: 0/9/CPU0, state: BRINGDOWN
RP/0/RP0/CPU0:Jun 19 10:22:39.231 JST: mibd_entity[341]: %HA-HA_EM-7-FMFD_CONNECTION_FAIL : Could not connect to /dev/fm/fd_wdsysmon.d/node0_9_CPU0 : No such file or directory
RP/0/RP0/CPU0:Jun 19 10:22:41.232 JST: mibd_entity[341]: %HA-HA_EM-7-FMFD_CONNECTION_FAIL : Could not connect to /dev/fm/fd_wdsysmon.d/node0_9_CPU0 : No such file or directory
RP/0/RP1/CPU0:Jun 19 10:22:44.232 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_POST_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:22:44.237 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_POST_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:22:44.239 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/9/CPU0 A9K-40GE-SE state:ROMMON
RP/0/RP0/CPU0:Jun 19 10:22:45.234 JST: mibd_entity[341]: %HA-HA_EM-7-FMFD_CONNECTION_FAIL : Could not connect to /dev/fm/fd_wdsysmon.d/node0_9_CPU0 : No such file or directory

RP/0/RP0/CPU0:ASR9K#show shelfmgr status location 0/9/CPU0
Mon Jun 19 10:23:01.017 JST
Nodeid 0x8b1, inst 1

Platform Node Status for 0/9/CPU0 (0_8b_1)
----------------------------------
Current State: ROMMON
Current Substate: IMDR_STATE_NONE
Configuration:
Power is enabled
Bootup enabled.
Monitoring enabled
Boot Requests: 1 Max Allowed: 10
Bringdown Count: 1 Max Allowed: 5 <<< The counter has incremented. It decrements by 1 every hour (the timer is named PCTL_FSM_CRESET_TIMEOUT_PULSE in the code).
Card Reset Count: 0 Max Allowed: 4

Last Reset Code: 14, CPU Reset
Card In Reset: FALSE Shutdown Reason: 12
Timer BOOTREQ_SANITY 10 set. 134 of 150 seconds remaining.

Slot supports CBC processor. Card is Present andCBC state is Online
CBC reset reason is 0x37
CBC reports card type as: 0x4302a0
Estimated 275 Watts of power required. Power is Reserved.


Heartbeat monitoring is disabled.
Last rx sequence number: 166399 Last tx sequence number: 166399
Missed ticks: 0 Remote HB Missed Count: 0
Last HB TX status: 0

MBI Reset Pending Type: 64
SysMgr Node Band State: 0x800000 FINAL
SysMgr IMDR Node substate: 0

Boot request card type: 0x4302a0. Boot-up was allowed.
Last FSM Shutdown Reason: 1
Rommon Version 3.03

Line card memory mode is mixed. Card mode --BB-------E----------
Same RSPs
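
To read the remaining margin from this output without counting by eye, the counter lines can be parsed. This is a minimal sketch, assuming the command output is available as text. The helper names parse_counters and reloads_until_in_reset are hypothetical and exist only for this illustration.

```python
import re

# Counter lines copied from the "show shelfmgr status" output above (abridged).
SHELFMGR_STATUS = """\
Current State: ROMMON
Boot Requests: 1 Max Allowed: 10
Bringdown Count: 1 Max Allowed: 5
Card Reset Count: 0 Max Allowed: 4
Card In Reset: FALSE Shutdown Reason: 12
"""

COUNTER = re.compile(
    r"^(Boot Requests|Bringdown Count|Card Reset Count):\s*(\d+)"
    r"\s+Max Allowed:\s*(\d+)",
    re.MULTILINE,
)


def parse_counters(output):
    """Map each counter name to a (current, max_allowed) tuple."""
    return {
        name: (int(current), int(maximum))
        for name, current, maximum in COUNTER.findall(output)
    }


def reloads_until_in_reset(output):
    """Number of further admin reloads after which the module is IN-RESET.

    The last of these reloads is the one that triggers IN-RESET, so the
    number of reloads that still boot normally is one less.
    """
    current, maximum = parse_counters(output)["Bringdown Count"]
    return max(0, maximum - current)


print(parse_counters(SHELFMGR_STATUS))
print(reloads_until_in_reset(SHELFMGR_STATUS))
```

With Bringdown Count at 1 and Max Allowed at 5, the fourth further admin reload would bring the count to 5, provided no hourly decrement happens in between.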

Example output when admin reload causes a transition to IN-RESET, and how to recover

The following is example output when a module is reloaded 5 times with admin reload.

RP/0/RP0/CPU0:ASR9K#show shelfmgr status location 0/9/CPU0
Nodeid 0x8b1, inst 1

Platform Node Status for 0/9/CPU0 (0_8b_1)
----------------------------------
Current State: BRINGDOWN
Current Substate: IMDR_STATE_NONE
Configuration:
Power is enabled
Bootup enabled.
Monitoring enabled
Boot Requests: 3 Max Allowed: 10
Bringdown Count: 4 Max Allowed: 5    <<<< 4 reloads have been performed so far.
Card Reset Count: 0 Max Allowed: 4

Last Reset Code: 14, CPU Reset
Card In Reset: FALSE Shutdown Reason: 12
Timer BOOTREQ_SANITY 10 set. 145 of 150 seconds remaining.

Slot supports CBC processor. Card is Present andCBC state is Online
CBC reset reason is 0x3c
CBC reports card type as: 0x4302a0
Estimated 275 Watts of power required. Power is Reserved.


Heartbeat monitoring is disabled.
Last rx sequence number: 116 Last tx sequence number: 116
Missed ticks: 0 Remote HB Missed Count: 0
Last HB TX status: 0

MBI Reset Pending Type: 64
SysMgr Node Band State: 0x800000 FINAL
SysMgr IMDR Node substate: 0

Boot request card type: 0x4302a0. Boot-up was allowed.
Last FSM Shutdown Reason: 12
Rommon Version 3.03

Line card memory mode is mixed. Card mode --BB-------E----------
Same RSPs

RP/0/RP0/CPU0:ASR9K#admin reload location 0/9/CPU0 <<< Performing the 5th reload

Preparing system for backup. This may take a few minutes especially for large configurations.
[Done]
Proceed with reload? [confirm]
RP/0/RP0/CPU0:Jun 19 10:45:54.538 JST: reload[65940]: %MGBL-SCONBKUP-6-INTERNAL_INFO : Reload debug script successfully spawned
LC/0/9/CPU0:Jun 19 10:45:54.554 JST: mbi-hello[67]: %PLATFORM-MBI_HELLO-6-NODE_RELOAD : Reload request received. Reloading in 5 secs
LC/0/9/CPU0:Jun 19 10:45:59.559 JST: mbi-hello[67]: %PLATFORM-MBI_HELLO-6-NODE_PWRCYL : Power Cycle Request Received. Power Cycling now !
RP/0/RP0/CPU0:Jun 19 10:45:59.720 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:45:59.721 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-0-MAX_RESET_BRINGDOWN : Can not boot node 0/9/CPU0 A9K-40GE-SE due to multiple resets, putting it IN_RESET state. The probable cause is an unexpected event on the node or a failure in communication with the node. Please refer to the Cisco ASR 9000 System Error Message Reference Guide for further information if needed.
RP/0/RP0/CPU0:Jun 19 10:45:59.721 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_CPU_RESET : Node 0/9/CPU0 CPU reset detected.
RP/0/RP1/CPU0:Jun 19 10:45:59.724 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:45:59.727 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/9/CPU0 A9K-40GE-SE state:IN-RESET
RP/0/RP0/CPU0:Jun 19 10:45:59.736 JST: mibd_entity[341]: %HA-HA_EM-7-FMFD_CONNECTION_FAIL : Could not connect to /dev/fm/fd_wdsysmon.d/node0_9_CPU0 : No such file or directory
RP/0/RP0/CPU0:ASR9K#RP/0/RP0/CPU0:Jun 19 10:46:00.737 JST: mibd_entity[341]: %HA-HA_EM-7-FMFD_CONNECTION_FAIL : Could not connect to /dev/fm/fd_wdsysmon.d/node0_9_CPU0 : No such file or directory
RP/0/RP0/CPU0:Jun 19 10:46:02.739 JST: mibd_entity[341]: %HA-HA_EM-7-FMFD_CONNECTION_FAIL : Could not connect to /dev/fm/fd_wdsysmon.d/node0_9_CPU0 : No such file or directory

RP/0/RP0/CPU0:ASR9K#
RP/0/RP0/CPU0:ASR9K#
RP/0/RP0/CPU0:ASR9K#show shelfmgr status location 0/9/CPU0
Nodeid 0x8b1, inst 1

Platform Node Status for 0/9/CPU0 (0_8b_1)
----------------------------------
Current State: IN-RESET
Current Substate: IMDR_STATE_NONE
Configuration:
Power is enabled
Bootup enabled.
Monitoring enabled
Boot Requests: 4 Max Allowed: 10
Bringdown Count: 5 Max Allowed: 5
Card Reset Count: 0 Max Allowed: 4

Last Reset Code: 14, CPU Reset
Card In Reset: TRUE Shutdown Reason: 2
No FSM timers are set

Slot supports CBC processor. Card is Present andCBC state is Online
CBC reset reason is 0x3d
CBC reports card type as: 0x4302a0
Estimated 275 Watts of power required. Power is off..


Heartbeat monitoring is disabled.
Last rx sequence number: 193 Last tx sequence number: 193
Missed ticks: 0 Remote HB Missed Count: 0
Last HB TX status: 0

MBI Reset Pending Type: 64
SysMgr Node Band State: 0x800000 FINAL
SysMgr IMDR Node substate: 0

Boot request card type: 0x4302a0. Boot-up was denied.
Last FSM Shutdown Reason: 2
Rommon Version 3.03

Line card memory mode is mixed. Card mode --BB-------E----------
Same RSPs
RP/0/RP0/CPU0:ASR9K#
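
The transition is announced by the PLATFORM-SHELFMGR-0-MAX_RESET_BRINGDOWN message seen in the log above, so collected syslog can be scanned for it. The following is a minimal sketch, assuming the log is available as text. The function name nodes_forced_in_reset is hypothetical, and the sample message is trimmed after "IN_RESET state." for brevity.

```python
import re

# Two log lines copied from the output above (first message trimmed).
LOG = """\
RP/0/RP0/CPU0:Jun 19 10:45:59.721 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-0-MAX_RESET_BRINGDOWN : Can not boot node 0/9/CPU0 A9K-40GE-SE due to multiple resets, putting it IN_RESET state.
RP/0/RP0/CPU0:Jun 19 10:45:59.721 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_CPU_RESET : Node 0/9/CPU0 CPU reset detected.
"""

MAX_RESET = re.compile(
    r"%PLATFORM-SHELFMGR-0-MAX_RESET_BRINGDOWN : "
    r"Can not boot node (\S+) (\S+) due to multiple resets"
)


def nodes_forced_in_reset(log_text):
    """Return (node, card_type) pairs named in MAX_RESET_BRINGDOWN messages."""
    return [(m.group(1), m.group(2)) for m in MAX_RESET.finditer(log_text)]


print(nodes_forced_in_reset(LOG))
```

Other shelfmgr messages, such as NODE_CPU_RESET, are ignored because they do not by themselves indicate IN-RESET.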


Once a module is in IN-RESET, running the admin reload command again does not bring the module back up.

To reset the counter and boot the module, run the hw-module location <loc> reload command.

RP/0/RP0/CPU0:ASR9K#hw-module location 0/9/CPU0 reload
WARNING: This will take the requested node out of service.
Do you wish to continue?[confirm(y/n)]u y

RP/0/RP0/CPU0:Jun 19 10:47:40.907 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-USER_RESET : Node 0/9/CPU0 is reset due to user reload request
RP/0/RP0/CPU0:Jun 19 10:47:40.909 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/9/CPU0 A9K-40GE-SE state:IOS XR FAILURE
RP/0/RP0/CPU0:Jun 19 10:47:47.629 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_POST_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:47:47.631 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/9/CPU0 A9K-40GE-SE state:ROMMON
RP/0/RP1/CPU0:Jun 19 10:47:47.632 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_POST_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:47:47.989 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:47:47.992 JST: shelfmgr[427]: %PLATFORM-SHELFMGR-6-NODE_STATE_CHANGE : 0/9/CPU0 A9K-40GE-SE state:BRINGDOWN
RP/0/RP1/CPU0:Jun 19 10:47:47.992 JST: canb-server[156]: %PLATFORM-CANB_SERVER-7-CBC_PRE_RESET_NOTIFICATION : Node 0/9/CPU0 , Power Cycle (0x05000000)
RP/0/RP0/CPU0:Jun 19 10:47:47.995 JST: invmgr[269]: %PLATFORM-INV-6-NODE_STATE_CHANGE : Node: 0/9/CPU0, state: BRINGDOWN


RP/0/RP0/CPU0:ASR9K#show shelfmgr status location 0/9/CPU0
Nodeid 0x8b1, inst 1

Platform Node Status for 0/9/CPU0 (0_8b_1)
----------------------------------
Current State: BRINGDOWN
Current Substate: IMDR_STATE_NONE
Configuration:
Power is enabled
Bootup enabled.
Monitoring enabled
Boot Requests: 1 Max Allowed: 10
Bringdown Count: 0 Max Allowed: 5
Card Reset Count: 0 Max Allowed: 4

Last Reset Code: 14, CPU Reset
Card In Reset: FALSE Shutdown Reason: 4
Timer BOOTREQ_SANITY 10 set. 149 of 150 seconds remaining.

Slot supports CBC processor. Card is Present andCBC state is Online
CBC reset reason is 0x40
CBC reports card type as: 0x4302a0
Estimated 275 Watts of power required. Power is Reserved.


Heartbeat monitoring is disabled.
Last rx sequence number: 193 Last tx sequence number: 193
Missed ticks: 0 Remote HB Missed Count: 0
Last HB TX status: 0

MBI Reset Pending Type: 64
SysMgr Node Band State: 0x800000 FINAL
SysMgr IMDR Node substate: 0

Boot request card type: 0x4302a0. Boot-up was allowed.
Last FSM Shutdown Reason: 4
Rommon Version 3.03

Line card memory mode is mixed. Card mode --BB-------E----------
Same RSPs
RP/0/RP0/CPU0:ASR9K#
RP/0/RP0/CPU0:ASR9K#