
Placing ports into an existing VDC caused major crash

Racso
Beginner

Greetings,

So we just had a situation where moving a set of ports from one VDC to another caused a major crash on all the VDCs except the default one.

The ports were originally in VDC A, where they didn't have any configuration, so we have no idea what caused the crash when placing them in VDC B.

Also keep in mind that VDC B already had about 12 ports from the same module as the migrating ports, so I don't think it was a module incompatibility.

Any guesses? Or how would I even begin to troubleshoot this?

1 Accepted Solution

Hey Andrea, thanks for your analysis, greatly appreciated!

We found out the crash was related to removing all the M3 ports from VDC A, which took down the L3 network that we use to reach all the other VDC environments. It wasn't a full VDC crash as I suspected.

Thanks once again for your input!

6 Replies

Andrea Testino
Cisco Employee

Hi Racso,

Could you run the following CLI (save your session as .txt) and attach it here or unicast it to me? I can take a look. Definitely should not happen.

#### In Admin VDC ####

term width 511
term length 0
show run vdc
show vdc
show module
show cores
show version
show logging log
show logging nvram
show accounting log | i i allocate
show module internal exceptionlog

#### In VDC "B" ####

term width 511
term length 0
show cores
show logging log
show logging nvram

- Andrea, CCIE #56739 R&S

Hey Andrea, thanks for your assistance.

Let me gather that information and post it here.

OK, VDC B is the Server VDC.

We are trying to assign ports 6/1-6/4 to this Server VDC.

Hi,

Thanks for the outputs -- Can you describe the VDC crashes in more detail? I ask because I'm not seeing any signs of VDC crashes anywhere in these logs: no core files, and no syslogs for a VDC going up, down, or reloading.
What I do see is the interfaces being allocated to the SERVER VDC and then moved to the CORE VDC, where they currently reside. I did notice some EIGRP flaps during the change -- is that what you meant by VDC crash?

Allocation of Eth6/1-4 to AGG-SERVER-1:

Thu Aug 11 12:47:31 2022:type=update:id=10.200.24.254@pts/5:user=adminmpc:cmd=configure terminal ; vdc AGG-SERVER-1 ; allocate interface Ethernet6/1-4 (SUCCESS)
2022 Aug 11 12:47:31 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_ADD: vdc_mgr: Interface Ethernet6/1 has been added to this vdc
2022 Aug 11 12:47:31 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_ADD: vdc_mgr: Interface Ethernet6/2 has been added to this vdc
2022 Aug 11 12:47:31 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_ADD: vdc_mgr: Interface Ethernet6/3 has been added to this vdc
2022 Aug 11 12:47:31 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_ADD: vdc_mgr: Interface Ethernet6/4 has been added to this vdc
2022 Aug 11 12:52:26 Admin-1 %VDC_MGR-5-VDC_STATE_CHANGE: vdc 2 state changed to updating
2022 Aug 11 12:52:26 Admin-1 %VDC_MGR-5-VDC_STATE_CHANGE: vdc 3 state changed to updating
2022 Aug 11 12:52:26 Admin-1-AGG-SERVER-1 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet6/1 is down (Interface removed)
2022 Aug 11 12:52:26 Admin-1-AGG-SERVER-1 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet6/2 is down (Interface removed)
2022 Aug 11 12:52:26 Admin-1-AGG-SERVER-1 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet6/3 is down (Interface removed)
2022 Aug 11 12:52:26 Admin-1-AGG-SERVER-1 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet6/4 is down (Interface removed)
2022 Aug 11 12:52:28 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_DELETE: vdc_mgr: Interface Ethernet6/1 has been removed from this vdc
2022 Aug 11 12:52:28 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_DELETE: vdc_mgr: Interface Ethernet6/2 has been removed from this vdc
2022 Aug 11 12:52:28 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_DELETE: vdc_mgr: Interface Ethernet6/3 has been removed from this vdc
2022 Aug 11 12:52:28 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_DELETE: vdc_mgr: Interface Ethernet6/4 has been removed from this vdc

Allocation of Eth6/1-4 to the CORE VDC instead, a few seconds later:

Thu Aug 11 12:52:32 2022:type=update:id=10.200.24.254@pts/5:user=adminmpc:cmd=configure terminal ; vdc CORE-1 ; allocate interface Ethernet6/1-4 (SUCCESS)

2022 Aug 11 12:52:32 Admin-1 %VDC_MGR-5-VDC_STATE_CHANGE: vdc 2 state changed to active
2022 Aug 11 12:52:32 Admin-1 %VDC_MGR-5-VDC_STATE_CHANGE: vdc 3 state changed to active
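
For anyone who wants to repeat this kind of review on their own captures, here is a minimal Python sketch that groups the VDC_MGR membership messages by timestamp and VDC. The regular expression is an assumption based only on the syslog lines quoted above, and `parse_syslog` and `membership_timeline` are hypothetical helper names, not part of any Cisco tool.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the lines quoted in this thread:
# <year> <mon> <day> <time> <host> %<FACILITY>-<severity>-<MNEMONIC>: <message>
SYSLOG_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<message>.*)$"
)
INTERFACE_RE = re.compile(r"Interface (Ethernet\d+/\d+)")


def parse_syslog(lines):
    """Return one dict per line that matches the assumed syslog layout."""
    events = []
    for line in lines:
        match = SYSLOG_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


def membership_timeline(events):
    """Group VDC membership changes by (timestamp, host, mnemonic)."""
    timeline = defaultdict(list)
    for ev in events:
        if ev["facility"] != "VDC_MGR":
            continue
        if not ev["mnemonic"].startswith("VDC_MEMBERSHIP_"):
            continue
        found = INTERFACE_RE.search(ev["message"])
        if found:
            key = (ev["ts"], ev["host"], ev["mnemonic"])
            timeline[key].append(found.group(1))
    return dict(timeline)


# A few lines copied from the output above, shortened to two interfaces.
SAMPLE = [
    "2022 Aug 11 12:47:31 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_ADD: vdc_mgr: Interface Ethernet6/1 has been added to this vdc",
    "2022 Aug 11 12:47:31 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_ADD: vdc_mgr: Interface Ethernet6/2 has been added to this vdc",
    "2022 Aug 11 12:52:26 Admin-1 %VDC_MGR-5-VDC_STATE_CHANGE: vdc 3 state changed to updating",
    "2022 Aug 11 12:52:26 Admin-1-AGG-SERVER-1 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet6/1 is down (Interface removed)",
    "2022 Aug 11 12:52:28 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_DELETE: vdc_mgr: Interface Ethernet6/1 has been removed from this vdc",
    "2022 Aug 11 12:52:28 Admin-1-AGG-SERVER-1 %VDC_MGR-5-VDC_MEMBERSHIP_DELETE: vdc_mgr: Interface Ethernet6/2 has been removed from this vdc",
]

if __name__ == "__main__":
    for (ts, host, mnemonic), ports in membership_timeline(parse_syslog(SAMPLE)).items():
        print(ts, host, mnemonic, ", ".join(ports))
```

Grouping by timestamp makes the five-minute gap between the add (12:47:31) and the delete (12:52:28) easy to spot, which is the window to compare against any routing flaps.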

Noticed some correctable errors a few weeks back on Module 6 but nothing out of the ordinary (yet):

********* Exception info for module 6 ********

exception information --- exception instance 1 ----
Module Slot Number: 6
Device Id : 172
Device Name : Garuda
Device Errorcode : 0xcac0061b
Device ID : 172 (0xac)
Device Instance : 00 (0x00)
Dev Type (HW/SW) : 06 (0x06)
ErrNum (devInfo) : 27 (0x1b)
System Errorcode : 0x4244009b non-fatal error
Error Type : Informational
PhyPortLayer : Ethernet
Port(s) Affected : Ethernet6/1-12
Error Description : GRD_EFC_EFC_INT2__2_FLD_L3_1588_PRS_ERR__67
DSAP : 0 (0x0)
UUID : 0 (0x0)
Time : Sun Jul 31 17:30:42 2022
(Ticks: 62E6F482 jiffies)

exception information --- exception instance 2 ----
Module Slot Number: 6
Device Id : 80
Device Name : Eureka
Device Errorcode : 0xc5000206
Device ID : 80 (0x50)
Device Instance : 00 (0x00)
Dev Type (HW/SW) : 02 (0x02)
ErrNum (devInfo) : 06 (0x06)
System Errorcode : 0x411c001e EEM Event Correctable ECC Interrupt
Error Type : Informational
PhyPortLayer : Ethernet
Port(s) Affected : Ethernet6/1-12
Error Description : EU_FT_INT_B0_CORR_ECC_ERR_INT
DSAP : 0 (0x0)
UUID : 0 (0x0)
Time : Fri May 20 04:21:47 2022
(Ticks: 62874F9B jiffies)
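
If the exception log is long, a small script can help pick out entries that are not marked Informational. The following is only a sketch: it assumes the "key : value" layout shown above, `parse_exception_log` and `non_informational` are hypothetical names, and treating "Informational" as benign mirrors the reading in this post rather than any official guidance.

```python
def parse_exception_log(text):
    """Split exception log text into one dict of fields per exception instance."""
    instances = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("exception information"):
            # Each "exception information --- exception instance N ----" line starts a new record.
            current = {}
            instances.append(current)
        elif current is not None and ":" in line and not line.startswith("("):
            # Split on the first colon only, so timestamps keep their own colons.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return instances


def non_informational(instances):
    """Return instances whose Error Type is anything other than Informational."""
    return [inst for inst in instances if inst.get("Error Type") != "Informational"]


# Fields copied from the two instances quoted above (abbreviated).
SAMPLE_LOG = """\
********* Exception info for module 6 ********

exception information --- exception instance 1 ----
Module Slot Number: 6
Device Name : Garuda
System Errorcode : 0x4244009b non-fatal error
Error Type : Informational
Port(s) Affected : Ethernet6/1-12
Time : Sun Jul 31 17:30:42 2022
(Ticks: 62E6F482 jiffies)

exception information --- exception instance 2 ----
Module Slot Number: 6
Device Name : Eureka
System Errorcode : 0x411c001e EEM Event Correctable ECC Interrupt
Error Type : Informational
Port(s) Affected : Ethernet6/1-12
Time : Fri May 20 04:21:47 2022
(Ticks: 62874F9B jiffies)
"""

if __name__ == "__main__":
    parsed = parse_exception_log(SAMPLE_LOG)
    for inst in parsed:
        print(inst["Device Name"], inst["Error Type"], inst["Time"])
    print("entries needing a closer look:", len(non_informational(parsed)))
```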


I did notice your Module 3 is having some port loopback failures:

2022 May 27 23:14:58 Admin-1 %DIAG_PORT_LB-2-PORTLOOPBACK_TEST_FAIL: Module:3 Test:PortLoopback failed 10 consecutive times. Faulty module: affected ports:39 Error:Loopback test failed. Unable to analyze the reason for failure <<<

Admin-1# show vdc


Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3

vdc_id  vdc_name      state   mac                type      lc
------  ------------  ------  -----------------  --------  ----------------
1       Admin-1       active  50:87:89:49:03:41  Admin     None
2       CORE-1        active  50:87:89:49:03:42  Ethernet  m1 m1xl m2xl f2e
3       AGG-SERVER-1  active  50:87:89:49:03:43  Ethernet  m1 m1xl m2xl f2e

Admin-1# show module

Mod  Ports  Module-Type                          Model               Status
---  -----  -----------------------------------  ------------------  ----------
1    0      Supervisor Module-2                  N7K-SUP2            active *
2    0      Supervisor Module-2                  N7K-SUP2            ha-standby
3    48     1/10 Gbps BASE-T Ethernet Module     N7K-F248XT-25E      ok
4    48     1/10 Gbps BASE-T Ethernet Module     N7K-F248XT-25E      ok
6    24     10 Gbps Ethernet Module              N7K-M224XP-23L      ok

<snip>
Mod Online Diag Status
--- ------------------
1 Pass
2 Pass
3 Fail <<<<

It's worth re-running the loopback test (non-disruptive) to see whether this is a false positive or whether that module needs to be replaced (assuming that's possible). The test runs every 15 minutes:

N7K(config)# no diagnostic monitor module 3 test 6
N7K(config)# diagnostic clear result module 3 test 6
N7K(config)# diagnostic monitor module 3 test 6
N7K(config)# diagnostic start module 3 test 6

I'll wait for your comments/details with regards to the VDC crashes!

- Andrea, CCIE #56739 R&S

Hey Andrea, thanks for your analysis, greatly appreciated!

We found out the crash was related to removing all the M3 ports from VDC A, which took down the L3 network that we use to reach all the other VDC environments. It wasn't a full VDC crash as I suspected.

Thanks once again for your input!

That will definitely do it :). Makes sense.

Happy to help.

- Andrea, CCIE #56739 R&S