03-30-2019 11:00 AM
hello everyone
I have two Nexus 6001 switches, each with five Nexus 2000 FEXes connected.
Each Nexus 2000 has 48 ports.
Sometimes, for no apparent reason, 25 ports on two of the Nexus 2000 go down. The device logs the error "Link not connected". After two minutes all the ports come back up.
It is not a link failure: we have tested all the cables, and nobody is physically disconnecting the ports.
Has any of you had the same failure and know how to fix it?
Thank you in advance.
Regards.
03-31-2019 04:43 AM
Hello,
It sounds like your FEX is set up with static pinning, meaning the 48 host ports are divided into 2 groups, each statically using only 1 of the fabric uplinks. When an uplink fails, the FEX ports pinned to it go down.
Issue the following command
show fex detail | inc pinning
If Max-links equals 2 or more, you are in static pinning mode with 2 or more groups. In that case, check the corresponding uplink for an issue at the time of your FEX port problem:
show int <N6K_FEX_Port> | inc flapped
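As a rough illustration of the static-pinning arithmetic, here is a toy sketch (assuming the parent evenly divides the host ports into contiguous groups; the real NX-OS pinning order depends on configuration and uplink bring-up, so treat this as illustrative only):

```python
def pinned_uplink(port, total_ports=48, max_links=1):
    """Toy model: which fabric uplink (1-based) a FEX host port is
    statically pinned to, assuming contiguous, evenly sized groups."""
    group_size = total_ports // max_links
    return min((port - 1) // group_size + 1, max_links)

# With Max-links 2, losing uplink 1 would take down host ports 1-24;
# with Max-links 1 there is a single group, so all 48 host ports
# depend on one uplink.
affected = [p for p in range(1, 49) if pinned_uplink(p, 48, 2) == 1]
```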
...
Remi Astruc
03-31-2019 10:15 AM
Hello Remi Astruc
After executing the command these are the outputs
nexus6001# show fex detail | inc pinning
pinning-mode: static Max-links: 1
pinning-mode: static Max-links: 1
pinning-mode: static Max-links: 1
pinning-mode: static Max-links: 1
nexus6001# sh int Ethernet160/1/17 | inc fla
Last link flapped 2d02h
04-01-2019 03:08 AM
Hello,
What kind of devices are connected to these 25 ports? Are they the same kind of servers/hosts? Do all the links on both FEXes go down at exactly the same time (check the logs for timestamps)?
Also, check the ethpc and ethpm logs:
1. sh system internal ethpm event-history interface eth x/y/z
2. sh platform software ethpc event-history interface eth x/y/z
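To check whether flaps on different ports are truly simultaneous, the timestamps in these event-history outputs can be compared offline. A minimal sketch (the regex is written against the "at N usecs after <date>" format shown in the outputs in this thread; adjust it if your output differs):

```python
import re
from datetime import datetime, timedelta

# Matches e.g. "at 671609 usecs after Sun Apr 7 23:19:33 2019"
TS = re.compile(r"at (\d+) usecs after (\w{3} \w{3} +\d+ [\d:]+ \d{4})")

def parse_events(text):
    """Extract absolute event timestamps from event-history output."""
    events = []
    for usecs, stamp in TS.findall(text):
        t = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        events.append(t + timedelta(microseconds=int(usecs)))
    return events

def simultaneous(events_a, events_b, window=timedelta(seconds=2)):
    """True if any event on port A falls within `window` of one on port B."""
    return any(abs(a - b) <= window for a in events_a for b in events_b)
```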
BR,
AK
04-08-2019 04:23 AM
Hello,
Thank you for your answer.
The devices on the other side of the N2K are not all the same. Yes, all the ports go down at the same time.
I attached the output for just one interface because, as you can see, it is very extensive.
N6K1# show system internal ethpm event-history interface Ethernet154/1/42
>>>>FSM: <Ethernet154/1/42> has 584 logged transitions<<<<<
1) Event:ESQ_REQ length:38, at 671609 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_ETH_PORT_SEC(191), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
2) Event:ESQ_REQ length:38, at 671624 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_L2FM(221), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
3) Event:ESQ_REQ length:38, at 671626 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_ELTM(192), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
4) Event:ESQ_REQ length:38, at 671628 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_ENM(614), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
5) Event:ESQ_REQ length:38, at 671661 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_ETH_SPAN(174), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
RRtoken:0x26908806
6) Event:ESQ_RSP length:38, at 672868 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_RX] Src:MTS_SAP_ETH_SPAN(174), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
RRtoken:0x26908806
7) Event:ESQ_REQ length:38, at 672916 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_VIM(403), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
8) Event:ESQ_REQ length:38, at 672954 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_EVB(1243), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
9) Event:ESQ_REQ length:38, at 673005 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_FWM(602), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
RRtoken:0x2690880A
10) Event:ESQ_RSP length:38, at 835929 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_RX] Src:MTS_SAP_FWM(602), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
RRtoken:0x2690880A
11) Event:ESQ_REQ length:38, at 836203 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_TX] Dst:MTS_SAP_QD(612), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
RRtoken:0x2690885C
12) Event:ESQ_RSP length:38, at 836265 usecs after Sun Apr 7 23:19:33 2019
Instance:530123328, Seq Id:0x1, Ret:SUCCESS
[E_MTS_RX] Src:MTS_SAP_QD(612), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)
RRtoken:0x2690885C
...............................
N6K1# show platform software ethpc event-history interface Ethernet154/1/42
1) Event IF_PCFG_RSP, len: 8, at 643894 usecs after Mon Apr 8 09:52:16 2019
Sent port cfg message response to ethpm - Id: 0x2734bff5, Status: SUCCESS
2) Event IF_PCFG_RSP, len: 8, at 233368 usecs after Mon Apr 8 09:52:16 2019
Sent port cfg message response to ethpm - Id: 0x2734be64, Status: SUCCESS
3) Event IF_PCFG_RSP, len: 8, at 835729 usecs after Mon Apr 8 09:50:02 2019
Sent port cfg message response to ethpm - Id: 0x27345b6e, Status: SUCCESS
4) Event IF_PCFG_RSP, len: 8, at 467430 usecs after Mon Apr 8 09:50:02 2019
Sent port cfg message response to ethpm - Id: 0x273459c3, Status: SUCCESS
5) Event IF_PCFG_RSP, len: 8, at 62341 usecs after Mon Apr 8 09:33:52 2019
Sent port cfg message response to ethpm - Id: 0x27303f84, Status: SUCCESS
6) Event IF_PCFG_RSP, len: 8, at 606336 usecs after Mon Apr 8 09:33:51 2019
Sent port cfg message response to ethpm - Id: 0x27303db5, Status: SUCCESS
7) Event IF_PCFG_RSP, len: 8, at 945852 usecs after Sun Apr 7 23:19:33 2019
Sent port cfg message response to ethpm - Id: 0x26908973, Status: SUCCESS
8) Event IF_PCFG_RSP, len: 8, at 564196 usecs after Sun Apr 7 23:19:33 2019
Sent port cfg message response to ethpm - Id: 0x269087ee, Status: SUCCESS
9) Event IF_PCFG_RSP, len: 8, at 964514 usecs after Sun Apr 7 23:17:58 2019
Sent port cfg message response to ethpm - Id: 0x26904ec6, Status: SUCCESS
10) Event IF_PCFG_RSP, len: 8, at 526033 usecs after Sun Apr 7 23:17:58 2019
Sent port cfg message response to ethpm - Id: 0x26904d06, Status: SUCCESS
11) Event IF_PCFG_RSP, len: 8, at 753944 usecs after Sun Apr 7 23:17:23 2019
Sent port cfg message response to ethpm - Id: 0x26903892, Status: SUCCESS
12) Event IF_PCFG_RSP, len: 8, at 348974 usecs after Sun Apr 7 23:17:23 2019
Sent port cfg message response to ethpm - Id: 0x269035c1, Status: SUCCESS
13) Event IF_PCFG_RSP, len: 8, at 174383 usecs after Fri Apr 5 15:50:15 2019
Sent port cfg message response to ethpm - Id: 0x2335c654, Status: SUCCESS
14) Event IF_PCFG_RSP, len: 8, at 769797 usecs after Fri Apr 5 15:50:14 2019
Sent port cfg message response to ethpm - Id: 0x2335c43e, Status: SUCCESS
15) Event IF_PCFG_RSP, len: 8, at 852022 usecs after Fri Apr 5 15:47:04 2019
Sent port cfg message response to ethpm - Id: 0x23347841, Status: SUCCESS
16) Event IF_PCFG_RSP, len: 8, at 485003 usecs after Fri Apr 5 15:47:04 2019
Sent port cfg message response to ethpm - Id: 0x2334768c, Status: SUCCESS
17) Event IF_PCFG_RSP, len: 8, at 986446 usecs after Fri Apr 5 15:46:42 2019
Sent port cfg message response to ethpm - Id: 0x23346ab7, Status: SUCCESS
18) Event IF_PCFG_RSP, len: 8, at 586683 usecs after Fri Apr 5 15:46:42 2019
Sent port cfg message response to ethpm - Id: 0x23346909, Status: SUCCESS
19) Event IF_PCFG_RSP, len: 8, at 781876 usecs after Sat Mar 30 22:53:00 2019
Sent port cfg message response to ethpm - Id: 0x1aa97b9f, Status: SUCCESS
20) Event IF_PCFG_RSP, len: 8, at 116063 usecs after Sat Mar 30 22:53:00 2019
Sent port cfg message response to ethpm - Id: 0x1aa979c3, Status: SUCCESS
21) Event IF_PCFG_RSP, len: 8, at 504186 usecs after Sat Mar 30 22:51:50 2019
Sent port cfg message response to ethpm - Id: 0x1aa77d0d, Status: SUCCESS
22) Event IF_PCFG_RSP, len: 8, at 244622 usecs after Sat Mar 30 22:51:49 2019
Sent port cfg message response to ethpm - Id: 0x1aa77271, Status: SUCCESS
23) Event IF_PCFG_RSP, len: 8, at 41596 usecs after Thu Mar 28 15:16:23 2019
Sent port cfg message response to ethpm - Id: 0x1540f1e3, Status: SUCCESS
24) Event IF_PCFG_RSP, len: 8, at 547761 usecs after Thu Mar 28 15:16:22 2019
Sent port cfg message response to ethpm - Id: 0x1540f055, Status: SUCCESS
25) Event IF_PCFG_RSP, len: 8, at 186147 usecs after Thu Mar 28 15:14:07 2019
Sent port cfg message response to ethpm - Id: 0x1540a0bc, Status: SUCCESS
Thank you all for your time.
Regards.
04-07-2019 11:06 PM
Hello Team,
I would like to know a few things before we proceed further to find the cause.
Let me know whether the HIF or the NIF ports went down.
Did all the HIF ports go down? What was the status of the interfaces, e.g. hardware failure?
What is connected to those ports?
Was the interface part of a port-channel? If yes, was it configured as LACP or mode on?
You can verify whether the parent switch or the remote device is the cause; take the event-history outputs below and match them.
If "PROTOCOL_DOWN" was triggered first and "PHYSICAL_DOWN" later, the neighbor is initiating the link down.
show system internal ethpm event-history interface ethernet <if_num>
show lacp internal event-history errors <<<< in this output, look for "PARTNER_PDU_OUT_OF_SYNC" for the interface you are concerned about.
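The ordering check described above can be automated against saved event-history text. A minimal sketch (assuming events are printed oldest-first, as in the ethpm output shown earlier in this thread; if your output lists the newest event first, reverse the comparison):

```python
def link_down_initiator(event_history_text):
    """Heuristic: if PROTOCOL_DOWN is logged before PHYSICAL_DOWN,
    the neighbor likely initiated the link down; the reverse order
    suggests a local or physical cause."""
    proto = event_history_text.find("PROTOCOL_DOWN")
    phys = event_history_text.find("PHYSICAL_DOWN")
    if proto == -1 or phys == -1:
        return "inconclusive"
    return "neighbor" if proto < phys else "local"
```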
Thanks,
Shivakumar Hulipalled
04-08-2019 07:49 AM
Hello,
These are all HIF ports. No NIF port goes down when it happens.
Yes, when it occurs, the message is "Link not connected".
Different types of devices are connected to those ports, not just one provider or one machine.
They aren't part of a port-channel; they are individual ports.
We have provided the output of show system internal ethpm event-history interface ethernet <if_num> for one of the ports in another comment.
Thank you for your time.
Best regards.
04-08-2019 06:00 PM
Hello Team,
Thanks for providing the information.
Considering the logs, I don't find either of the events I mentioned ("PROTOCOL_DOWN" triggered first and "PHYSICAL_DOWN" later, or vice versa), so I cannot conclude which side is causing this.
If you confirm that even the NIF ports are going down, then the FEX would get isolated from the parent switch.
Which syslog messages are generated before the NIF ports go down?
Which model of parent switch is the FEX connected to? Is it connected to an N7K? If yes, to which module type is it connected?
What is the software version of the FEX? How frequently do the HIF ports go to "not connected"?
Do you see high CPU on the parent switch [show processes cpu history], or has any crash taken place [show cores]?
show system reset-reason fex <XYZ>
Thanks,
Shivakumar Hulipalled