6779 Views · 15 Helpful · 3 Replies

VPC-2-PEER_KEEP_ALIVE_RECV_FAIL log message on Nexus 3064

Arthur Kant
Level 1

We have a pair of Nexus 3064 switches running vPC. They have been in production for 3 years now, running the same code version since the initial deploy. We recently (this year) started getting log messages for keepalive failures. I enabled debug logging, and they show the following:

 

2020 Sep 18 05:51:40.427 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 100, VPC peer-keepalive received on interface Vlan10
2020 Sep 18 05:51:40.427 sw1.dc24 %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 100, VPC peer keep-alive receive has failed
2020 Sep 18 05:53:16.462 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 100, VPC peer-keepalive received on interface Vlan10
2020 Sep 18 05:53:16.462 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_SUCCESS: In domain 100, vPC peer keep-alive receive is successful
2020 Sep 18 15:14:46.102 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 100, VPC peer-keepalive received on interface Vlan10
2020 Sep 18 15:14:46.102 sw1.dc24 %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 100, VPC peer keep-alive receive has failed
2020 Sep 18 15:14:57.108 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 100, VPC peer-keepalive received on interface Vlan10
2020 Sep 18 15:14:57.108 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_SUCCESS: In domain 100, vPC peer keep-alive receive is successful
2020 Sep 19 07:40:14.931 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 100, VPC peer-keepalive received on interface Vlan10
2020 Sep 19 07:40:14.931 sw1.dc24 %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 100, VPC peer keep-alive receive has failed
2020 Sep 19 07:40:25.936 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 100, VPC peer-keepalive received on interface Vlan10
2020 Sep 19 07:40:25.936 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_SUCCESS: In domain 100, vPC peer keep-alive receive is successful
2020 Sep 21 06:14:06.268 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 100, VPC peer-keepalive received on interface Vlan10
2020 Sep 21 06:14:06.268 sw1.dc24 %VPC-2-PEER_KEEP_ALIVE_RECV_FAIL: In domain 100, VPC peer keep-alive receive has failed
2020 Sep 21 06:14:34.281 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_INT_LATEST: In domain 100, VPC peer-keepalive received on interface Vlan10
2020 Sep 21 06:14:34.281 sw1.dc24 %VPC-5-PEER_KEEP_ALIVE_RECV_SUCCESS: In domain 100, vPC peer keep-alive receive is successful

 

-----

 

show vpc peer-keepalive

vPC keep-alive status : peer is alive
--Peer is alive for : (180507) seconds, (803) msec
--Send status : Success
--Last send at : 2020.09.25 10:47:01 390 ms
--Sent on interface : Vlan10
--Receive status : Success
--Last receive at : 2020.09.25 10:47:01 390 ms
--Received on interface : Vlan10
--Last update from peer : (0) seconds, (188) msec

vPC Keep-alive parameters
--Destination : 10.0.0.2
--Keepalive interval : 1000 msec
--Keepalive timeout : 5 seconds
--Keepalive hold timeout : 3 seconds
--Keepalive vrf : keepalive
--Keepalive udp port : 3200
--Keepalive tos : 192

 

------


vlan 10
name KEEPALIVE

 

interface Vlan10
description vPC Keepalive
no shutdown
vrf member keepalive
ip address 10.0.0.1/30

 

-----


interface Ethernet1/43
description Keepalive Connection to sw2.dc24 e1/43
switchport mode trunk
switchport trunk allowed vlan 10

 

Ethernet1/43 is up
admin state is up, Dedicated Interface
Hardware: 100/1000/10000 Ethernet, address: fc99.4755.1b72 (bia fc99.4755.1b72)
Description: Keepalive Connection to sw2.dc24 e1/43
MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, medium is broadcast
Port mode is trunk
full-duplex, 10 Gb/s, media type is 10G
Beacon is turned off
Auto-Negotiation is turned off, FEC mode is Auto
Input flow-control is off, output flow-control is off
Auto-mdix is turned off
Rate mode is dedicated
Switchport monitor is off
EtherType is 0x8100
EEE (efficient-ethernet) : n/a
Last link flapped 109week(s) 4day(s)
Last clearing of "show interface" counters 29w0d
0 interface resets
30 seconds input rate 1232 bits/sec, 0 packets/sec
30 seconds output rate 944 bits/sec, 0 packets/sec
Load-Interval #2: 5 minute (300 seconds)
input rate 960 bps, 0 pps; output rate 680 bps, 0 pps
RX 17630663 unicast packets 9688393 multicast packets 962 broadcast packets
27320018 input packets 2447072885 bytes
0 jumbo packets 0 storm suppression packets
0 runts 0 giants 0 CRC 0 no buffer
0 input error 0 short frame 0 overrun 0 underrun 0 ignored
0 watchdog 0 bad etype drop 0 bad proto drop 0 if down drop
0 input with dribble 0 input discard
0 Rx pause
TX 17629610 unicast packets 880725 multicast packets 1073 broadcast packets
18511408 output packets 1812251192 bytes
0 jumbo packets
0 output error 0 collision 0 deferred 0 late collision
0 lost carrier 0 no carrier 0 babble 0 output discard
0 Tx pause

 

 

*** I am not sure what the strategy was in making this a tagged VLAN on a trunk, but it reported no logs for 2 years. The VLAN is not trunked or used for access anywhere else on the switch.

Software
BIOS: version 4.0.0
NXOS: version 7.0(3)I4(6)
BIOS compile time: 12/06/2016
NXOS image file is: bootflash:///nxos.7.0.3.I4.6.bin
NXOS compile time: 3/9/2017 22:00:00 [03/10/2017 01:05:18]

 

- sw2 reports the same errors but at different times, which don't sync up with sw1's events.

- We have another deployment at a test site which has the same configuration and has never reported an error.  The difference between the sites is that the test site is running older code:

Software
BIOS: version 2.6.0
loader: version N/A
kickstart: version 6.0(2)U6(8)
system: version 6.0(2)U6(8)


and the test site also carries less traffic.

 

Any thoughts? My only direction at this time is to reconfigure: take the trunking out and use a straight routed port, etc., or increase the keepalive interval above 1000 msec.
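For reference, the second option would look something like the sketch below. The interval and timeout values here are examples only, not recommendations, and the full peer-keepalive line has to be re-entered because NX-OS replaces it as a whole:

```
! Hypothetical sketch: raising the keepalive interval/timeout for vPC domain 100.
! Addresses and VRF name taken from the running config above; timer values are examples.
vpc domain 100
  peer-keepalive destination 10.0.0.2 source 10.0.0.1 vrf keepalive interval 2000 timeout 10
```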

2 Accepted Solutions

balaji.bandi
Hall of Fame

If the kit has been running for the last 3 years without any issue and you suddenly see this one, make sure the cables are intact and there is no other Layer 2 problem; then it looks like a bug:

 

https://bst.cloudapps.cisco.com/bugsearch/bug/CSCve06744/?rfs=iqvred

 

BB

***** Rate All Helpful Responses *****

How to Ask The Cisco Community for Help


Reza Sharifi
Hall of Fame

Hi,

Not sure if this has anything to do with the issue you are having, but the keepalive is usually not configured in a VLAN on the Nexus series. You just connect the two Nexus switches to a third switch and put the third switch's ports in a VLAN.
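As a sketch, that design would look something like this on the third switch (port numbers and descriptions are assumptions; each Nexus keepalive port cables to one of these access ports):

```
! Hypothetical third-switch config: one access port per Nexus, both in the same VLAN.
vlan 10
  name KEEPALIVE
interface Ethernet1/1
  description keepalive link to sw1.dc24
  switchport access vlan 10
interface Ethernet1/2
  description keepalive link to sw2.dc24
  switchport access vlan 10
```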

HTH


3 Replies



Arthur Kant
Level 1

Thank you for the insight... I am unsure if we are running into the bug, and why would it spring up after 2 years? Next action on our end is to reconfigure the interfaces to a standard routed port and see if that shakes something loose.
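A minimal sketch of that conversion on sw1, assuming the existing Vlan10 SVI and trunk config are removed afterwards (sw2 would mirror this with 10.0.0.1/30 swapped for 10.0.0.2/30):

```
! Hypothetical routed-port conversion, replacing the trunk + SVI design.
! "no switchport" wipes the L2 config, and "vrf member" clears any IP, so
! the address is re-applied after the VRF assignment.
interface Ethernet1/43
  description vPC keepalive to sw2.dc24 e1/43
  no switchport
  vrf member keepalive
  ip address 10.0.0.1/30
  no shutdown
```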
