kinshaun0512 · ‎11-23-2016

Hi,

I received the logging error messages about IPC and RPC shown below. I have checked CPU and memory, and both look fine.

Nov 23 19:17:45 AEST: %IPC-SW1_STBY-5-WATERMARK: 2882 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3090000

Nov 23 19:18:24 AEST: %IPC-SW1_STBY-5-WATERMARK: 6191 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3030000

Nov 23 19:21:27 AEST: %RPC-SW2-4-CORE_SAT_RPC_FAIL: RPC between Core and Remote Switchvs_rp_environmental:env_clear_alarm_rp failed (non-fatal). Expected when VSL goes down.

Nov 23 19:23:54 AEST: %RPC-SW2-4-CORE_SAT_RPC_FAIL: RPC between Core and Remote Switchvs_rp_environmental:env_raise_alarm_rp failed (non-fatal). Expected when VSL goes down.

Could this be related to bug ID CSCso14087? I couldn't find a bug related to the RPC failure, though. Here is a cropped show version output.

SWITCH1#sh ver
Cisco IOS Software, s2t54 Software (s2t54-ADVIPSERVICESK9-M), Version 15.1(2)SY6, RELEASE SOFTWARE (fc4)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2015 by Cisco Systems, Inc.
Compiled Thu 10-Sep-15 01:36 by prod_rel_team

ROM: System Bootstrap, Version 12.2(50r)SYS3, RELEASE SOFTWARE (fc1)

If you need further output to troubleshoot, please let me know.

Thanks in advance. I will rate helpful answers.

Paul Chapman · ‎11-23-2016

Hi -

Post results of the following commands.

show redundancy states
show platform hardware pfc mode
show switch virtual
show switch virtual role
show switch virtual link

PSC

nmajid · ‎11-23-2016

Paul, here is the requested output from the device where we picked up the logs showing these errors.

Nov 24 03:38:41 AEST: %RPC-SW2-4-CORE_SAT_RPC_FAIL: RPC between Core and Remote Switchvs_rp_environmental:env_raise_alarm_rp failed (non-fatal). Expected when VSL goes down.
Nov 24 03:29:59 AEST: %IPC-SW1_STBY-5-WATERMARK: 6374 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3080000
Nov 24 03:30:40 AEST: %IPC-SW1_STBY-5-WATERMARK: 2968 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3090000
Nov 24 03:31:12 AEST: %IPC-SW1_STBY-5-WATERMARK: 6377 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3010000
Nov 24 03:32:26 AEST: %IPC-SW1_STBY-5-WATERMARK: 6375 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3080000
Nov 24 03:32:59 AEST: %IPC-SW1_STBY-5-WATERMARK: 6375 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3030000
Nov 24 03:33:39 AEST: %IPC-SW1_STBY-5-WATERMARK: 6077 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3050000
Nov 24 03:34:53 AEST: %IPC-SW1_STBY-5-WATERMARK: 6376 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3080000
Nov 24 03:35:26 AEST: %IPC-SW1_STBY-5-WATERMARK: 6376 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3030000
Nov 24 03:36:00 AEST: %IPC-SW1_STBY-5-WATERMARK: 2969 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3090000
Nov 24 03:37:21 AEST: %IPC-SW1_STBY-5-WATERMARK: 6377 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3080000
Nov 24 03:37:54 AEST: %IPC-SW1_STBY-5-WATERMARK: 6377 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3030000
Nov 24 03:38:33 AEST: %IPC-SW1_STBY-5-WATERMARK: 6079 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3050000
Nov 24 03:42:21 AEST: %RPC-SW2-4-CORE_SAT_RPC_FAIL: RPC between Core and Remote Switchvs_rp_environmental:env_clear_alarm_rp failed (non-fatal). Expected when VSL goes down.
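To see which IPC seats are backing up most often, the WATERMARK lines above can be tallied per seat. The following is only a minimal sketch: the regular expression is written against the message layout as it appears in this thread, and `summarize_watermarks` and `SAMPLE` are hypothetical names introduced for illustration, not part of any Cisco tool.

```python
import re
from collections import defaultdict

# Matches the %IPC-...-5-WATERMARK lines as they are quoted in this thread.
# Groups: pending message count, port name, seat ID.
WATERMARK_RE = re.compile(
    r"%IPC-\S+-5-WATERMARK: (\d+) messages pending in rcv "
    r"for the port (\S+) seat (\w+)"
)


def summarize_watermarks(log_text):
    """Return {seat: (number_of_log_lines, highest_pending_count)}."""
    stats = defaultdict(lambda: [0, 0])
    for line in log_text.splitlines():
        match = WATERMARK_RE.search(line)
        if not match:
            continue  # skip RPC and any other non-WATERMARK lines
        pending = int(match.group(1))
        seat = match.group(3)
        stats[seat][0] += 1
        stats[seat][1] = max(stats[seat][1], pending)
    return {seat: tuple(values) for seat, values in stats.items()}


# Three lines copied from the log excerpt above.
SAMPLE = """\
Nov 24 03:29:59 AEST: %IPC-SW1_STBY-5-WATERMARK: 6374 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3080000
Nov 24 03:30:40 AEST: %IPC-SW1_STBY-5-WATERMARK: 2968 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3090000
Nov 24 03:32:26 AEST: %IPC-SW1_STBY-5-WATERMARK: 6375 messages pending in rcv for the port Card22/0:Request(3060000.C) seat 3080000
"""

if __name__ == "__main__":
    for seat, (lines, peak) in sorted(summarize_watermarks(SAMPLE).items()):
        print(f"seat {seat}: {lines} log lines, peak {peak} pending")
```

Feeding the full log buffer to the same function would show whether the backlog is concentrated on a few seats or spread across all of them.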

HS-EN-XD-1#show redundancy states
my state = 13 -ACTIVE
peer state = 1 -DISABLED
Mode = Duplex
Unit = Secondary
Unit ID = 38

Redundancy Mode (Operational) = sso
Redundancy Mode (Configured) = sso
Redundancy State = sso
Maintenance Mode = Disabled
Manual Swact = disabled (the peer unit is still initializing)
Communications = Up

client count = 143
client_notification_TMR = 30000 milliseconds
keep_alive TMR = 9000 milliseconds
keep_alive count = 0
keep_alive threshold = 18
RF debug mask = 0x0

HS-EN-XD-1#show platform hardware pfc mode
PFC operating mode : PFC4
Configured PFC operating mode : non-XL

HS-EN-XD-1#show switch virtual
Switch mode : Virtual Switch
Virtual switch domain number : 20
Local switch number : 2
Local switch operational role: Virtual Switch Active
Peer switch number : 1
Peer switch operational role : Virtual Switch Standby

HS-EN-XD-1#show switch virtual role
RRP information for Instance 2

--------------------------------------------------------------------
Valid  Flags   Peer      Preferred  Reserved
               Count     Peer       Peer

--------------------------------------------------------------------
TRUE    V        1           1          1

Switch  Switch Status  Priority     Role     Local   Remote
        Number         Oper(Conf)            SID     SID
--------------------------------------------------------------------
LOCAL   2     UP       100(100)     ACTIVE   0       0
REMOTE  1     UP       200(200)     STANDBY  8918    7005

Peer 0 represents the local switch

Flags : V - Valid

In dual-active recovery mode: No

HS-EN-XD-1#show switch virtual link
VSL Status : UP
VSL Uptime : 2 years, 7 weeks, 3 days, 18 hours, 52 minutes
VSL SCP Ping : Pass
VSL ICC Ping : Pass
VSL Control Link : Te2/5/4
VSL Encryption : Configured Mode - On, Operational Mode - On

Paul Chapman · ‎11-23-2016

Looks like you have failed over to the supervisor on chassis 2 and the sup on chassis 1 is having a problem. You should connect a console cable to the sup on chassis 1 and see what is happening. Is it constantly rebooting?

PSC

nmajid · ‎11-23-2016

I don't think so. See the redundancy status below.

Strangely, I saw a switch environment alert much earlier, but it then resolved by itself.

Still checking. Any useful suggestions or input are very welcome.

Switch environment alert resolved:

HS-EN-XD-1#sh environment alarm switch 2
environmental alarms:
no alarms

HS-EN-XD-1#sh environment alarm switch 1
environmental alarms:
no alarms

HS-EN-XD-1#sh clo
09:25:54.401 AEST Thu Nov 24 2016

HS-EN-XD-1#sh environment alarm switch 2
environmental alarms:
system minor alarm on switch 2 power-supply 1 power-output-fail (raised 00:01:59 ago)

HS-EN-XD-1#sh clo
01:29:13.785 AEST Thu Nov 24 2016

---------------------------------------

HS-EN-XD-1#show redundancy switchover
Switchovers this system has experienced : 5
Last switchover reason : active unit failed
Uptime since this supervisor switched to active : 29 weeks, 1 day, 14 hours, 16 minutes
Total system uptime from reload : 2 years, 7 weeks, 4 days, 50 minutes

HS-EN-XD-1#sh clo
09:30:41.447 AEST Thu Nov 24 2016

Paul Chapman · ‎11-24-2016

Hi -

Based on your output, I'm not really changing my opinion about your standby chassis.

Switchover occurred 29 weeks ago
Switchover reason: "active unit failed"
Chassis 2 is active, not standby (atypical)
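The "29 weeks ago" figure can be turned into an approximate calendar date, which helps when searching older logs for the original failure. This is a rough sketch that assumes the `sh clo` reading taken right after `show redundancy switchover` (09:30:41 AEST, Thu Nov 24 2016) is close enough to the moment the uptime was reported; `switchover_time` is a hypothetical helper name, and time zone handling is ignored.

```python
from datetime import datetime, timedelta


def switchover_time(observed_at, weeks, days, hours, minutes):
    """Subtract the reported 'uptime since switched to active' from the clock reading."""
    uptime = timedelta(weeks=weeks, days=days, hours=hours, minutes=minutes)
    return observed_at - uptime


# Clock reading from "sh clo" in the post above (local AEST time, no tz applied).
observed = datetime(2016, 11, 24, 9, 30, 41)

# "Uptime since this supervisor switched to active : 29 weeks, 1 day, 14 hours, 16 minutes"
estimate = switchover_time(observed, weeks=29, days=1, hours=14, minutes=16)

if __name__ == "__main__":
    print(estimate.strftime("%Y-%m-%d %H:%M:%S"))
```

Under those assumptions the switchover lands on the evening of 3 May 2016, accurate only to about a minute because the uptime is reported at minute granularity.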

If you do a "show module switch all" what is the status of the supervisor in chassis 1?

Can you connect to the standby supervisor using "remote login standby"?

PSC

IPC-SW1_STBY-5-WATERMARK and RPC-SW2-4-CORE_SAT_RPC_FAIL logs