10-23-2012 04:31 AM
Hello Guys,
One of our 12406 IOS XR routers performed an internal switchover early this morning.
I can't find anything in the logs to explain the cause of this switchover.
All I can see on the router is the following:
"
Last switch-over Tue Oct 23 08:04:47 2012: 6 hours, 22 minutes ago
"
This caused a connectivity loss of around six hours, from 0153 hours to 0804 hours.
My questions are:
a) Why does it take so long for the router to decide to fail over between the Supervisor engines?
b) Is there a way I can shorten this failover time to minimize the downtime?
c) What could have caused this switchover, and is there anything else I can do on the router to find the root cause?
Thanks,
Nelson Swai
10-23-2012 04:44 AM
Hello Nelson,
a) It should not take that long. When SSO is operational, a switchover takes a matter of seconds, if there is any interruption at all (features like NSR keep routing protocols up). Something unusual must have happened.
b) Hard to say, we need to find out what happened.
c) Many things:
- crash of the primary
- loss of internal communication
- manual intervention (OIR)
Ideally this should be handled via a TAC service request but here is the data you can collect:
show log / syslog server output
show reboot last crashinfo location [RP that failed, e.g. 0/8/CPU0 for slot 8]
show reboot last syslog location [RP]
show reboot last trace location [RP]
show redundancy
show redundancy trace
and check:
dir harddisk:/dumper
dir harddisk:/dumper location [RP]
for any file saved there with a timestamp matching the time of the crash
(if no harddisk: like PRP2, check disk0:/dumper)
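When matching file timestamps against the event, it can help to pin down the exact window first. The following is only a rough sketch in Python, not a Cisco tool: it parses the "Last switch-over" line quoted in the original post and compares it with the reported outage start of 0153 hours. The function name parse_switchover and the regular expression are assumptions made for illustration, and the line format is assumed to match the one shown above.

```python
import re
from datetime import datetime

def parse_switchover(line):
    """Extract the timestamp from a 'Last switch-over ...' line (format assumed from the post)."""
    match = re.search(
        r"Last switch-over\s+(\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})", line
    )
    if match is None:
        raise ValueError("no switchover timestamp found in line")
    return datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")

# Line copied from the original post.
line = "Last switch-over Tue Oct 23 08:04:47 2012: 6 hours, 22 minutes ago"
switchover = parse_switchover(line)

# Outage start as reported in the post (0153 hours on the same day).
outage_start = switchover.replace(hour=1, minute=53, second=0)

# Time between the reported start of the outage and the switchover itself.
outage = switchover - outage_start
print(outage)
```

With the values from the post, the switchover lands at the end of the reported outage rather than at its start, so files and log entries from around both 0153 and 0804 are worth a look.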
I hope this helps,
Max
10-31-2012 10:41 PM
You should have a case open with TAC and attach the outputs suggested to the case. This will be the next logical step.
Hash