Ports go down without any reason on Nexus 2000

Hello everyone,

 

I have two Nexus 6001 switches, and each of them has five Nexus 2000 FEXes connected.

Each Nexus 2000 has 48 ports.

 

Sometimes, without any apparent reason, 25 ports on two of the Nexus 2000 FEXes go down. The device logs an error that says "Link not connected". After two minutes all the ports come back up.

 

There's no link failure, because we have tested all the cables, and nobody is physically disconnecting the ports.

 

Has any of you had the same failure, and do you know how to fix it?

 

Thank you in advance.

Regards.

 

 

7 Replies

Remi Astruc
Level 1

Hello,

It sounds like your FEX is configured with static pinning, meaning the 48 ports are divided into groups, and each group statically uses only one of the fabric uplinks. When an uplink fails, the FEX ports pinned to it go down.

Issue the following command:

show fex detail | inc Pinning

If Max-links equals 2 or more, you are in static pinning mode with two or more groups. Then check the uplink for an issue at the time of your FEX port problem:

show int <N6K_FEX_Port> | inc flapped

...
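For illustration, here is a small Python sketch of how static pinning divides HIF ports among fabric uplinks. The contiguous grouping and the 48-port default are simplifying assumptions for the sketch; on a real FEX the pinning order follows the uplink configuration order.

```python
# Simplified sketch of static pinning: with max-links = N, the 48 HIF
# ports are split into N contiguous groups, each pinned to one fabric
# uplink. (Assumption: real pinning order depends on the order the
# uplinks were configured; this only illustrates the grouping.)

def pinned_uplink(hif_port: int, max_links: int, total_ports: int = 48) -> int:
    """Return the 1-based uplink group a HIF port is pinned to."""
    group_size = total_ports // max_links
    return (hif_port - 1) // group_size + 1

# With max-links = 2, ports 1-24 pin to uplink 1 and ports 25-48 to
# uplink 2, so a single uplink failure takes down about half the ports.
print(pinned_uplink(10, 2))  # group 1
print(pinned_uplink(30, 2))  # group 2
```

This is why a single flapping uplink can explain a large block of HIF ports going down at once.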

 

Remi Astruc

 

Hello Remi Astruc,

 

After executing the commands, these are the outputs:

 

nexus6001# show fex detail | inc pinning
pinning-mode: static Max-links: 1
pinning-mode: static Max-links: 1
pinning-mode: static Max-links: 1
pinning-mode: static Max-links: 1

 

nexus6001# sh int Ethernet160/1/17 | inc fla
Last link flapped 2d02h

 

 

 

 

akdhingr
Level 1

Hello,

 

What kind of devices are connected to these 25 ports? Are they going to the same kind of servers/hosts? Are all the links on both FEXes going down at the exact same time? (Check the logs for timestamps.)

 

Also, check the ethpc and ethpm event histories:

 

1. sh system internal ethpm event-history interface eth x/y/z
2. sh platform software ethpc event-history interface eth x/y/z
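As a side note, the timestamp check can be scripted. Below is a minimal Python sketch that groups "Link not connected" syslog lines by timestamp; the sample log format is an assumption based on typical NX-OS %ETHPORT messages, not output from this device.

```python
# Group syslog "down" lines by timestamp to see whether ports on both
# FEXes went down at the same moment. (SAMPLE_LOG is illustrative; the
# message format is assumed, not taken from the affected switch.)
import re
from collections import defaultdict

SAMPLE_LOG = """\
2019 Apr  7 23:19:33 N6K1 %ETHPORT-5-IF_DOWN_LINK_FAILURE: Interface Ethernet154/1/42 is down (Link not connected)
2019 Apr  7 23:19:33 N6K1 %ETHPORT-5-IF_DOWN_LINK_FAILURE: Interface Ethernet154/1/43 is down (Link not connected)
2019 Apr  7 23:21:40 N6K1 %ETHPORT-5-IF_UP: Interface Ethernet154/1/42 is up
"""

def group_down_events(log: str) -> dict:
    """Map each timestamp to the interfaces reported down at that time."""
    pattern = re.compile(
        r"^(?P<ts>\d{4} \w{3}\s+\d+ \d{2}:\d{2}:\d{2}) .*IF_DOWN.*"
        r"Interface (?P<intf>\S+) is down"
    )
    events = defaultdict(list)
    for line in log.splitlines():
        m = pattern.match(line)
        if m:
            events[m.group("ts")].append(m.group("intf"))
    return dict(events)

# A timestamp with many interfaces listed points at a shared cause
# (e.g. a fabric uplink) rather than individual cable faults.
for ts, intfs in group_down_events(SAMPLE_LOG).items():
    print(ts, len(intfs), "ports down:", intfs)
```

If the same timestamp shows up with dozens of interfaces across two FEXes, that strongly suggests a common upstream trigger rather than per-port cabling problems.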

 

BR,

AK

Hello,

 

Thank you for your answer.

The devices on the other side of the N2K ports are not all the same. Yes, all the ports go down at the same time.

 

I attached the output from just one interface because, as you can see, the information is very extensive.

 

 

N6K1# show system internal ethpm event-history interface Ethernet154/1/42

 

 

>>>>FSM: <Ethernet154/1/42> has 584 logged transitions<<<<<

 

1) Event:ESQ_REQ length:38, at 671609 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_ETH_PORT_SEC(191), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

 

2) Event:ESQ_REQ length:38, at 671624 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_L2FM(221), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

 

3) Event:ESQ_REQ length:38, at 671626 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_ELTM(192), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

 

4) Event:ESQ_REQ length:38, at 671628 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_ENM(614), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

 

5) Event:ESQ_REQ length:38, at 671661 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_ETH_SPAN(174), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

    RRtoken:0x26908806

 

6) Event:ESQ_RSP length:38, at 672868 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_RX] Src:MTS_SAP_ETH_SPAN(174), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

    RRtoken:0x26908806

 

7) Event:ESQ_REQ length:38, at 672916 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_VIM(403), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

 

8) Event:ESQ_REQ length:38, at 672954 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_EVB(1243), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

 

9) Event:ESQ_REQ length:38, at 673005 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_FWM(602), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

    RRtoken:0x2690880A

 

10) Event:ESQ_RSP length:38, at 835929 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_RX] Src:MTS_SAP_FWM(602), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

    RRtoken:0x2690880A

 

11) Event:ESQ_REQ length:38, at 836203 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_TX] Dst:MTS_SAP_QD(612), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

    RRtoken:0x2690885C

 

12) Event:ESQ_RSP length:38, at 836265 usecs after Sun Apr  7 23:19:33 2019

    Instance:530123328, Seq Id:0x1, Ret:SUCCESS

    [E_MTS_RX] Src:MTS_SAP_QD(612), Opc:MTS_OPC_ETHPM_PORT_PHY_CLEANUP(61446)

    RRtoken:0x2690885C

............................... 

 

          

               

N6K1# show platform software ethpc event-history interface Ethernet154/1/42

 

 

1) Event IF_PCFG_RSP, len: 8, at 643894 usecs after Mon Apr  8 09:52:16 2019

     Sent port cfg message response to ethpm - Id: 0x2734bff5, Status: SUCCESS

 

 

2) Event IF_PCFG_RSP, len: 8, at 233368 usecs after Mon Apr  8 09:52:16 2019

     Sent port cfg message response to ethpm - Id: 0x2734be64, Status: SUCCESS

 

 

3) Event IF_PCFG_RSP, len: 8, at 835729 usecs after Mon Apr  8 09:50:02 2019

     Sent port cfg message response to ethpm - Id: 0x27345b6e, Status: SUCCESS

 

 

4) Event IF_PCFG_RSP, len: 8, at 467430 usecs after Mon Apr  8 09:50:02 2019

     Sent port cfg message response to ethpm - Id: 0x273459c3, Status: SUCCESS

 

 

5) Event IF_PCFG_RSP, len: 8, at 62341 usecs after Mon Apr  8 09:33:52 2019

     Sent port cfg message response to ethpm - Id: 0x27303f84, Status: SUCCESS

 

 

6) Event IF_PCFG_RSP, len: 8, at 606336 usecs after Mon Apr  8 09:33:51 2019

     Sent port cfg message response to ethpm - Id: 0x27303db5, Status: SUCCESS

 

 

7) Event IF_PCFG_RSP, len: 8, at 945852 usecs after Sun Apr  7 23:19:33 2019

     Sent port cfg message response to ethpm - Id: 0x26908973, Status: SUCCESS

 

 

8) Event IF_PCFG_RSP, len: 8, at 564196 usecs after Sun Apr  7 23:19:33 2019

     Sent port cfg message response to ethpm - Id: 0x269087ee, Status: SUCCESS

 

 

9) Event IF_PCFG_RSP, len: 8, at 964514 usecs after Sun Apr  7 23:17:58 2019

     Sent port cfg message response to ethpm - Id: 0x26904ec6, Status: SUCCESS

 

 

10) Event IF_PCFG_RSP, len: 8, at 526033 usecs after Sun Apr  7 23:17:58 2019

     Sent port cfg message response to ethpm - Id: 0x26904d06, Status: SUCCESS

 

 

11) Event IF_PCFG_RSP, len: 8, at 753944 usecs after Sun Apr  7 23:17:23 2019

     Sent port cfg message response to ethpm - Id: 0x26903892, Status: SUCCESS

 

 

12) Event IF_PCFG_RSP, len: 8, at 348974 usecs after Sun Apr  7 23:17:23 2019

     Sent port cfg message response to ethpm - Id: 0x269035c1, Status: SUCCESS

 

 

13) Event IF_PCFG_RSP, len: 8, at 174383 usecs after Fri Apr  5 15:50:15 2019

     Sent port cfg message response to ethpm - Id: 0x2335c654, Status: SUCCESS

 

 

14) Event IF_PCFG_RSP, len: 8, at 769797 usecs after Fri Apr  5 15:50:14 2019

     Sent port cfg message response to ethpm - Id: 0x2335c43e, Status: SUCCESS

 

 

15) Event IF_PCFG_RSP, len: 8, at 852022 usecs after Fri Apr  5 15:47:04 2019

     Sent port cfg message response to ethpm - Id: 0x23347841, Status: SUCCESS

 

 

16) Event IF_PCFG_RSP, len: 8, at 485003 usecs after Fri Apr  5 15:47:04 2019

     Sent port cfg message response to ethpm - Id: 0x2334768c, Status: SUCCESS

 

 

17) Event IF_PCFG_RSP, len: 8, at 986446 usecs after Fri Apr  5 15:46:42 2019

     Sent port cfg message response to ethpm - Id: 0x23346ab7, Status: SUCCESS

 

 

18) Event IF_PCFG_RSP, len: 8, at 586683 usecs after Fri Apr  5 15:46:42 2019

     Sent port cfg message response to ethpm - Id: 0x23346909, Status: SUCCESS

 

 

19) Event IF_PCFG_RSP, len: 8, at 781876 usecs after Sat Mar 30 22:53:00 2019

     Sent port cfg message response to ethpm - Id: 0x1aa97b9f, Status: SUCCESS

 

 

20) Event IF_PCFG_RSP, len: 8, at 116063 usecs after Sat Mar 30 22:53:00 2019

     Sent port cfg message response to ethpm - Id: 0x1aa979c3, Status: SUCCESS

 

 

21) Event IF_PCFG_RSP, len: 8, at 504186 usecs after Sat Mar 30 22:51:50 2019

     Sent port cfg message response to ethpm - Id: 0x1aa77d0d, Status: SUCCESS

 

 

22) Event IF_PCFG_RSP, len: 8, at 244622 usecs after Sat Mar 30 22:51:49 2019

     Sent port cfg message response to ethpm - Id: 0x1aa77271, Status: SUCCESS

 

 

23) Event IF_PCFG_RSP, len: 8, at 41596 usecs after Thu Mar 28 15:16:23 2019

     Sent port cfg message response to ethpm - Id: 0x1540f1e3, Status: SUCCESS

 

 

24) Event IF_PCFG_RSP, len: 8, at 547761 usecs after Thu Mar 28 15:16:22 2019

     Sent port cfg message response to ethpm - Id: 0x1540f055, Status: SUCCESS

 

 

25) Event IF_PCFG_RSP, len: 8, at 186147 usecs after Thu Mar 28 15:14:07 2019

     Sent port cfg message response to ethpm - Id: 0x1540a0bc, Status: SUCCESS

 

 

 

Thank you all for your time.

Regards.

shulipal
Cisco Employee

Hello Team,

 

I would like to know a few things before we proceed further to find the reason.

Let me know whether the HIF or the NIF ports went down.

Did all the HIF ports go down? What was the status of the interfaces, e.g. hardware failure?

What is connected to those ports?

Was the interface part of a port channel? If yes, was it configured as LACP or mode on?

You can verify whether the parent switch or the remote device is the cause; kindly take the event history below and match it against this rule:

"If PROTOCOL_DOWN was triggered first and PHYSICAL_DOWN later, it means the neighbor is initiating the link down."

 

show system internal ethpm event-history interface ethernet <if_num>
show lacp internal event-history errors   <<<< in this output, look for "PARTNER_PDU_OUT_OF_SYNC" for the interface you are concerned about.
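For what it's worth, the ordering rule above can be sketched in a few lines of Python. This assumes the event-history text is listed oldest-first (confirm the direction from the timestamps, since some event histories print newest-first), and the sample string is purely illustrative.

```python
# Sketch of the check described above: scan the ethpm event-history
# text (assumed oldest-first) and report which of PROTOCOL_DOWN /
# PHYSICAL_DOWN appears first. The sample string is illustrative,
# not real command output.

def link_down_initiator(event_history: str) -> str:
    """Return who likely initiated the link down, per the rule above."""
    proto = event_history.find("PROTOCOL_DOWN")
    phys = event_history.find("PHYSICAL_DOWN")
    if proto == -1 or phys == -1:
        return "inconclusive"
    # PROTOCOL_DOWN first, PHYSICAL_DOWN later => neighbor-initiated.
    return "neighbor" if proto < phys else "local"

sample = "Event: PROTOCOL_DOWN ...\nEvent: PHYSICAL_DOWN ..."
print(link_down_initiator(sample))  # neighbor
```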

 

Thanks,

Shivakumar Hulipalled

 

Hello,

These are all HIF ports. No NIF ports go down when it happens.

Yes, when it occurs, the message is "Link not connected".

Different types of devices are connected to those ports, not just one provider or one machine.

They aren't part of a port channel; they are individual ports, independent of each other.

We have provided the output of show system internal ethpm event-history interface ethernet <if_num> for one of the ports in another comment.
 
Thank you for your time.

Best regards.

Hello Team,


Thanks for providing the information.

Looking at the logs, I don't find the event I mentioned, i.e. whether "PROTOCOL_DOWN" was triggered first and "PHYSICAL_DOWN" later, or vice versa, so I cannot conclude what the reason for this is.

 

If you confirm that even the NIF ports are going down, then in that case your FEX will get isolated from the parent switch.

What syslog is generated just before the NIF port goes down?

 

Which model of parent switch is the FEX connected to? Is it connected to an N7K? If yes, to which module type is it connected?

What is the software version of the FEX? How frequently do the HIF ports go to not connected?

 

Do you see any high CPU on the parent switch [show processes cpu history], or has any crash taken place [show cores]?

show system reset-reason fex <XYZ>

 

Thanks,

Shivakumar Hulipalled