Problem
The RP and FC do not reload when PLATFORM-CROSSBAR-2-ACCESS_FAILURE is seen.
Challenges
An EEM script can be used to reload the RP and FC after detecting the syslog. Writing such a script is challenging for the following reasons:
1. The RP reload causes an RP failover, so the FC reload has to be performed from the standby RP (the new active RP after the failover).
2. An EEM script can run only on the active RP, so the script has to be triggered twice: first to reload the RP, then to reload the FC after the switchover.
3. The FC reload should be done only if the RP failover was caused by the RP reload that the script triggered after detecting PLATFORM-CROSSBAR-2-ACCESS_FAILURE.
Solution
There are two ways to achieve this:
1. Use two scripts: one to reload the RP and another to reload the FC. The FC reload script runs after it detects the RP failover caused by the RP reload.
2. Use a single script that performs both the RP reload and, after the RP switchover, the FC reload.
This document describes the second approach.
The following line in the script is the key:
::cisco::eem::event_register_syslog pattern "PLATFORM-CROSSBAR-2-ACCESS_FAILURE : Set.*fab_xbar_sp1.* on XBAR_0_Slot_13|INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_POWERED_OFF.*for card 0/RP.*" maxrun_sec 600
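The script monitors two syslogs:
- PLATFORM-CROSSBAR-2-ACCESS_FAILURE : Set.*fab_xbar_sp1.* on XBAR_0_Slot_13
- INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_POWERED_OFF.*for card 0/RP.*
The script reloads the RP when the "PLATFORM-CROSSBAR-2-ACCESS_FAILURE : Set.*fab_xbar_sp1.* on XBAR_0_Slot_13" syslog is seen.
The script reloads the FC when the "INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_POWERED_OFF.*for card 0/RP.*" syslog is seen. Because this second match happens after the switchover, the FC reload runs on the new active RP.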
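The two-way dispatch described above can be sketched in plain Tcl. This is only an illustration of the matching logic, not the attached script. The two patterns are copied from the event_register_syslog line. The helper name classify_syslog, the action labels, and the sample messages are assumptions made up for this example. The sketch leaves out the EEM library calls and the reload commands so that it can be tried in a standalone tclsh.

```tcl
# Sketch only: plain Tcl, no EEM library calls, no reload commands.
# The two patterns are copied from the event_register_syslog line.
set rp_pattern {PLATFORM-CROSSBAR-2-ACCESS_FAILURE : Set.*fab_xbar_sp1.* on XBAR_0_Slot_13}
set fc_pattern {INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_POWERED_OFF.*for card 0/RP.*}

# classify_syslog is a hypothetical helper name, not taken from the attached script.
# It returns a label for the action that the matched syslog should lead to.
proc classify_syslog {msg} {
    global rp_pattern fc_pattern
    if {[regexp -- $rp_pattern $msg]} {
        # Crossbar access failure: first trigger, reload the RP.
        return "reload_rp"
    }
    if {[regexp -- $fc_pattern $msg]} {
        # RP powered off: second trigger, reload the FC from the new active RP.
        return "reload_fc"
    }
    return "ignore"
}

# Made-up sample messages, shaped only to match the patterns above.
set samples [list \
    "PLATFORM-CROSSBAR-2-ACCESS_FAILURE : Set x fab_xbar_sp1 y on XBAR_0_Slot_13" \
    "INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_POWERED_OFF, sample text for card 0/RP0" \
    "SOME-OTHER-6-MSG : unrelated message"]

foreach msg $samples {
    puts "[classify_syslog $msg] <- $msg"
}
```

In an EEM policy, the message text would come from the event that triggered the script rather than from a hard-coded list, and each branch would run the corresponding reload instead of returning a label.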
Configuration
•Create the directory on the router harddisk in which the scripts will be saved.
RP/0/RP0/CPU0:ASR9912-A#mkdir harddisk:/scripts
RP/0/RP0/CPU0:ASR9912-A#mkdir harddisk:/scripts location 0/RP1/CPU0
•Copy the attached scripts into the scripts directory of both the active and standby RP.
RP/0/RP0/CPU0:ASR9912-A#copy ftp://x.x.x.x/pcie_failure_rp_script.tcl harddisk:/scripts
RP/0/RP0/CPU0:ASR9912-A#copy ftp://x.x.x.x/pcie_failure_rp_script.tcl harddisk:/scripts location 0/RP1/CPU0
RP/0/RP0/CPU0:ASR9912-A#copy ftp://x.x.x.x/pcie_failure_fc_script.tcl harddisk:/scripts
RP/0/RP0/CPU0:ASR9912-A#copy ftp://x.x.x.x/pcie_failure_fc_script.tcl harddisk:/scripts location 0/RP1/CPU0
•Apply the following EEM-related configuration.
event manager directory user policy harddisk:/scripts/
event manager policy pcie_failure_rp_script.tcl username event_manager_user persist-time 3600
event manager policy pcie_failure_fc_script.tcl username event_manager_user persist-time 3600
username event_manager_user
username event_manager_user group root-lr
username event_manager_user group cisco-support
aaa authorization eventmanager default local
Verification
Use "show event manager policy registered" to check whether the scripts are registered successfully.