
lacp over vpc on nx7702

tiwang
Level 3

hi out there

 

We ran into a problem where a port-channel to an IBM host suddenly went down. The host is connected with a two-link port-channel (Ethernet101/1/11 on nx1 and Ethernet102/1/11 on nx2).

The links are built as a port-channel that spans the two Nexus switches through a vPC.

nx #1

interface port-channel111
  description SAP
  switchport
  switchport access vlan 2
  vpc 111

 

vPC status
------------------------------------------------------
id    Port         Status Consistency Active VLANs
----- ------------ ------ ----------- ----------------
111   Po111        up     success     2

 

interface Ethernet101/1/11
  description SAP
  switchport
  switchport access vlan 2
  channel-group 111 mode active
  no shutdown

 

 sh int po 111
port-channel111 is up
admin state is up
 vPC Status: Up, vPC number: 111
  Hardware: Port-Channel, address: 0041.d228.374c (bia 0041.d228.374c)
  Description: SAP
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec
  reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, medium is broadcast
  Port mode is access
  full-duplex, 10 Gb/s
  Input flow-control is off, output flow-control is on
  Auto-mdix is turned off
  Switchport monitor is off
  EtherType is 0x8100
  Members in this channel: Eth101/1/11
  Last clearing of "show interface" counters never
  6 interface resets
  Load-Interval #1: 30 seconds
    30 seconds input rate 2216 bits/sec, 2 packets/sec
    30 seconds output rate 2091560 bits/sec, 742 packets/sec
    input rate 2.22 Kbps, 2 pps; output rate 2.09 Mbps, 742 pps
  Load-Interval #2: 5 minute (300 seconds)
    300 seconds input rate 2120 bits/sec, 2 packets/sec
    300 seconds output rate 3123960 bits/sec, 793 packets/sec
    input rate 2.12 Kbps, 2 pps; output rate 3.12 Mbps, 793 pps
  RX
    4282567481 unicast packets  690254 multicast packets  128062 broadcast packets
    4283385797 input packets  2388294822502 bytes
    0 jumbo packets  0 storm suppression packets
    0 runts  0 giants  0 CRC/FCS  0 no buffer
    0 input error  0 short frame  0 overrun   0 underrun  0 ignored
    0 watchdog  0 bad etype drop  0 bad proto drop  0 if down drop
    0 input with dribble  0 input discard
    0 Rx pause
  TX
    9456756464 unicast packets  76216532 multicast packets  6493734 broadcast packets
    9539466730 output packets  4672927026231 bytes
    0 jumbo packets
    0 output error  0 collision  0 deferred  0 late collision
    0 lost carrier  0 no carrier  0 babble  0 output discard
    0 Tx pause

 

The output is similar on the other Nexus.

 

Here is the log:

2018 Sep  2 17:10:14 DCDISTSW01 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel111: Ethernet101/1/11 is down
2018 Sep  2 17:10:14 DCDISTSW01 %ETH_PORT_CHANNEL-5-FOP_CHANGED: port-channel111: first operational port changed from Ethernet101/1/11 to none
2018 Sep  2 17:10:14 DCDISTSW01 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel111 is down (No operational members)
2018 Sep  2 17:10:15 DCDISTSW01 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface port-channel111,bandwidth changed to 100000 Kbit
2018 Sep  2 17:10:15 DCDISTSW01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface Ethernet101/1/11 is down (Initializing)
2018 Sep  2 17:10:15 DCDISTSW01 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel111 is down (No operational members)
2018 Sep  2 17:10:15 DCDISTSW01 %ETHPORT-5-SPEED: Interface port-channel111, operational speed changed to 10 Gbps
2018 Sep  2 17:10:15 DCDISTSW01 %ETHPORT-5-IF_DUPLEX: Interface port-channel111, operational duplex mode changed to Full
2018 Sep  2 17:10:15 DCDISTSW01 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface port-channel111, operational Receive Flow Control state changed to off
2018 Sep  2 17:10:15 DCDISTSW01 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface port-channel111, operational Transmit Flow Control state changed to on
2018 Sep  2 17:10:25 DCDISTSW01 %ETH_PORT_CHANNEL-5-PORT_SUSPENDED: Ethernet101/1/11: Ethernet101/1/11 is suspended by protocol, no LACP PDUs received
2018 Sep  2 17:31:11 DCDISTSW01 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel111: Ethernet101/1/11 is up
2018 Sep  2 17:31:11 DCDISTSW01 %ETH_PORT_CHANNEL-5-FOP_CHANGED: port-channel111: first operational port changed from none to Ethernet101/1/11
2018 Sep  2 17:31:11 DCDISTSW01 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface port-channel111,bandwidth changed to 10000000 Kbit
2018 Sep  2 17:31:11 DCDISTSW01 %ETHPORT-5-IF_UP: Interface Ethernet101/1/11 is up in mode access
2018 Sep  2 17:31:11 DCDISTSW01 %ETHPORT-5-IF_UP: Interface port-channel111 is up in mode access
2018 Sep  2 17:47:23 DCDISTSW01 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel111: Ethernet101/1/11 is down
2018 Sep  2 17:47:23 DCDISTSW01 %ETH_PORT_CHANNEL-5-FOP_CHANGED: port-channel111: first operational port changed from Ethernet101/1/11 to none
2018 Sep  2 17:47:23 DCDISTSW01 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel111 is down (No operational members)
2018 Sep  2 17:47:23 DCDISTSW01 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface port-channel111,bandwidth changed to 100000 Kbit
2018 Sep  2 17:47:23 DCDISTSW01 %ETHPORT-5-IF_DOWN_INITIALIZING: Interface Ethernet101/1/11 is down (Initializing)
2018 Sep  2 17:47:23 DCDISTSW01 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel111 is down (No operational members)
2018 Sep  2 17:47:24 DCDISTSW01 %ETHPORT-5-SPEED: Interface port-channel111, operational speed changed to 10 Gbps
2018 Sep  2 17:47:24 DCDISTSW01 %ETHPORT-5-IF_DUPLEX: Interface port-channel111, operational duplex mode changed to Full
2018 Sep  2 17:47:24 DCDISTSW01 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface port-channel111, operational Receive Flow Control state changed to off
2018 Sep  2 17:47:24 DCDISTSW01 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface port-channel111, operational Transmit Flow Control state changed to on
2018 Sep  2 17:47:34 DCDISTSW01 %ETH_PORT_CHANNEL-5-PORT_SUSPENDED: Ethernet101/1/11: Ethernet101/1/11 is suspended by protocol, no LACP PDUs received
2018 Sep  2 18:44:15 DCDISTSW01 %ETHPORT-5-IF_DOWN_CFG_CHANGE: Interface Ethernet101/1/11 is down(Config change)
2018 Sep  2 18:44:16 DCDISTSW01 %ETHPORT-5-IF_DOWN_ADMIN_DOWN: Interface Ethernet101/1/11 is down (Administratively down)
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-IF_ADMIN_UP: Interface Ethernet101/1/11 is admin up .
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-SPEED: Interface Ethernet101/1/11, operational speed changed to 10 Gbps
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-IF_DUPLEX: Interface Ethernet101/1/11, operational duplex mode changed to Full
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface Ethernet101/1/11, operational Receive Flow Control state changed to off
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface Ethernet101/1/11, operational Transmit Flow Control state changed to on
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-SPEED: Interface port-channel111, operational speed changed to 10 Gbps
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-IF_DUPLEX: Interface port-channel111, operational duplex mode changed to Full
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-IF_RX_FLOW_CONTROL: Interface port-channel111, operational Receive Flow Control state changed to off
2018 Sep  2 18:44:24 DCDISTSW01 %ETHPORT-5-IF_TX_FLOW_CONTROL: Interface port-channel111, operational Transmit Flow Control state changed to on
2018 Sep  2 18:44:34 DCDISTSW01 %ETH_PORT_CHANNEL-5-PORT_SUSPENDED: Ethernet101/1/11: Ethernet101/1/11 is suspended by protocol, no LACP PDUs received
2018 Sep  2 18:46:09 DCDISTSW01 %ETH_PORT_CHANNEL-5-PORT_UP: port-channel111: Ethernet101/1/11 is up
2018 Sep  2 18:46:09 DCDISTSW01 %ETH_PORT_CHANNEL-5-FOP_CHANGED: port-channel111: first operational port changed from none to Ethernet101/1/11
2018 Sep  2 18:46:09 DCDISTSW01 %ETHPORT-5-IF_BANDWIDTH_CHANGE: Interface port-channel111,bandwidth changed to 10000000 Kbit
2018 Sep  2 18:46:09 DCDISTSW01 %ETHPORT-5-IF_UP: Interface Ethernet101/1/11 is up in mode access
2018 Sep  2 18:46:09 DCDISTSW01 %ETHPORT-5-IF_UP: Interface port-channel111 is up in mode access
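To see how long the member was actually out of the bundle, the PORT_DOWN and PORT_UP messages in a log like the one above can be paired up. The following is only a rough sketch in Python, not a Cisco tool: the function names `parse_ts` and `outage_windows` are made up for illustration, and it assumes the syslog lines have exactly the "YYYY Mon D HH:MM:SS host %FACILITY-SEV-MNEMONIC: text" shape shown in this post.

```python
import re
from datetime import datetime

# Assumed line shape (taken from the log in this post):
# 2018 Sep  2 17:10:14 DCDISTSW01 %ETH_PORT_CHANNEL-5-PORT_DOWN: port-channel111: Ethernet101/1/11 is down
LINE_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"%(?P<mnemonic>[A-Z0-9_]+-\d-[A-Z0-9_]+): (?P<msg>.*)$"
)


def parse_ts(text):
    """Parse '2018 Sep  2 17:10:14' (day may be space padded)."""
    return datetime.strptime(" ".join(text.split()), "%Y %b %d %H:%M:%S")


def outage_windows(lines, member):
    """Return (down, up, duration) tuples for one port-channel member.

    Hypothetical helper: pairs each PORT_DOWN with the next PORT_UP
    that mentions the same member interface name.
    """
    windows = []
    down_at = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m or member not in m.group("msg"):
            continue
        event = m.group("mnemonic")
        ts = parse_ts(m.group("ts"))
        if event == "ETH_PORT_CHANNEL-5-PORT_DOWN" and down_at is None:
            down_at = ts
        elif event == "ETH_PORT_CHANNEL-5-PORT_UP" and down_at is not None:
            windows.append((down_at, ts, ts - down_at))
            down_at = None
    return windows
```

Fed with the DCDISTSW01 lines above and the member "Ethernet101/1/11", this yields two windows: 17:10:14 to 17:31:11 and 17:47:23 to 18:46:09. The PORT_SUSPENDED messages in between are ignored here on purpose; they could be collected the same way if the "no LACP PDUs received" times are of interest.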
 
The main problem is that when the link went down, traffic did not fail over to the other interface.
As the log shows, we had to shut the interface down and bring it back up to restore connectivity.
 
Can someone tell me whether there are any known issues with the fairly simple setup we have here?
 
br ti

 

6 Replies

ADP_89
Level 1

Hello,

 

Was the port-channel up on the other side? Can you provide the output for the following command on both devices?

 

"show lacp internal event-history interface ethernet 101/1/11"

 

Thanks,

ADP

Let's take the first box, DCDISTSW01:

182) FSM:<Ethernet101/1/11> Transition at 202572 usecs after Sat Sep  1 12:38:07 2018
    Previous state: [LACP_ST_WAIT_PORT_CHANNEL_SELECT]
    Triggered event: [LACP_EV_PARTNER_PDU_IN_SYNC_COLLECT_ENABLED_DISTRIBUTING_DISABLED]
    Next state: [FSM_ST_NO_CHANGE]
183) FSM:<Ethernet101/1/11> Transition at 202589 usecs after Sat Sep  1 12:38:07 2018
    Previous state: [LACP_ST_WAIT_PORT_CHANNEL_SELECT]
    Triggered event: [LACP_EV_INTEROP_MODE_PROCESS_PDU]
    Next state: [FSM_ST_NO_CHANGE]
184) FSM:<Ethernet101/1/11> Transition at 207925 usecs after Sat Sep  1 12:38:08 2018
    Previous state: [LACP_ST_WAIT_PORT_CHANNEL_SELECT]
    Triggered event: [LACP_EV_PORT_SELECTED_MEMBER_FOR_AGGREGATION]
    Next state: [LACP_ST_ATTACHED_TO_AGGREGATOR]
185) FSM:<Ethernet101/1/11> Transition at 212492 usecs after Sat Sep  1 12:38:08 2018
    Previous state: [LACP_ST_ATTACHED_TO_AGGREGATOR]
    Triggered event: [LACP_EV_PARTNER_PDU_IN_SYNC_COLLECT_ENABLED_DISTRIBUTING_DISABLED]
    Next state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_RECEIVE_PATH]
186) FSM:<Ethernet101/1/11> Transition at 352263 usecs after Sat Sep  1 12:38:08 2018
    Previous state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_RECEIVE_PATH]
    Triggered event: [LACP_EV_PORT_RECEIVE_PATH_ENABLED_AS_CHANNEL_MEMBER_MESSAGE]
    Next state: [LACP_ST_PORT_MEMBER_RECEIVE_ENABLED]
187) FSM:<Ethernet101/1/11> Transition at 352393 usecs after Sat Sep  1 12:38:08 2018
    Previous state: [LACP_ST_PORT_MEMBER_RECEIVE_ENABLED]
    Triggered event: [LACP_EV_PARTNER_PDU_IN_SYNC_COLLECT_ENABLED_DISTRIBUTING_DISABLED]
    Next state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_TRANSMIT_PATH]
188) FSM:<Ethernet101/1/11> Transition at 358757 usecs after Sat Sep  1 12:38:08 2018
    Previous state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_TRANSMIT_PATH]
    Triggered event: [LACP_EV_PORT_HW_PATH_ENABLED]
    Next state: [LACP_ST_PORT_MEMBER_COLLECTING_AND_DISTRIBUTING_ENABLED]
 
This looks a bit strange to me: there are no events at all for the date in question. The neighbour DCDISTSW02 looks similar:
 

88) FSM:<Ethernet102/1/11> Transition at 211597 usecs after Sat Aug 11 12:03:30 2018
Previous state: [LACP_ST_WAIT_PORT_CHANNEL_SELECT]
Triggered event: [LACP_EV_PORT_SELECTED_MEMBER_FOR_AGGREGATION]
Next state: [LACP_ST_ATTACHED_TO_AGGREGATOR]

89) FSM:<Ethernet102/1/11> Transition at 338197 usecs after Sat Aug 11 12:03:30 2018
Previous state: [LACP_ST_ATTACHED_TO_AGGREGATOR]
Triggered event: [LACP_EV_PARTNER_PDU_IN_SYNC_COLLECT_ENABLED_DISTRIBUTING_DISABLED]
Next state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_RECEIVE_PATH]

90) FSM:<Ethernet102/1/11> Transition at 470911 usecs after Sat Aug 11 12:03:30 2018
Previous state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_RECEIVE_PATH]
Triggered event: [LACP_EV_PORT_RECEIVE_PATH_ENABLED_AS_CHANNEL_MEMBER_MESSAGE]
Next state: [LACP_ST_PORT_MEMBER_RECEIVE_ENABLED]

91) FSM:<Ethernet102/1/11> Transition at 471082 usecs after Sat Aug 11 12:03:30 2018
Previous state: [LACP_ST_PORT_MEMBER_RECEIVE_ENABLED]
Triggered event: [LACP_EV_PARTNER_PDU_IN_SYNC_COLLECT_ENABLED_DISTRIBUTING_DISABLED]
Next state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_TRANSMIT_PATH]

92) FSM:<Ethernet102/1/11> Transition at 479018 usecs after Sat Aug 11 12:03:30 2018
Previous state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_TRANSMIT_PATH]
Triggered event: [LACP_EV_PORT_HW_PATH_ENABLED]
Next state: [LACP_ST_PORT_MEMBER_COLLECTING_AND_DISTRIBUTING_ENABLED]


Curr state: [LACP_ST_PORT_MEMBER_COLLECTING_AND_DISTRIBUTING_ENABLED]
 
Ethernet101 is the FEX on box #1, whereas Ethernet102 is the FEX on box #2.
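A quick way to confirm that the event history really has no transitions on the day of the outage is to pull the timestamps out of the pasted output and filter by date. This is just a sketch in Python under the assumption that every entry has the "Transition at N usecs after <weekday> <month> <day> <time> <year>" header seen above; `transition_times` and `transitions_on` are illustrative names, not part of any Cisco tooling.

```python
import re
from datetime import datetime, timedelta

# Assumed header shape (from the event-history output in this post):
# 184) FSM:<Ethernet101/1/11> Transition at 207925 usecs after Sat Sep  1 12:38:08 2018
TRANSITION_RE = re.compile(
    r"FSM:<(?P<port>[^>]+)> Transition at (?P<usecs>\d+) usecs after "
    r"(?P<when>\w{3} \w{3}\s+\d+ \d{2}:\d{2}:\d{2} \d{4})"
)


def transition_times(text):
    """Return the timestamp of every FSM transition found in the text."""
    times = []
    for m in TRANSITION_RE.finditer(text):
        # Collapse the double space in e.g. "Sep  1" before parsing.
        when = datetime.strptime(
            " ".join(m.group("when").split()), "%a %b %d %H:%M:%S %Y"
        )
        times.append(when + timedelta(microseconds=int(m.group("usecs"))))
    return times


def transitions_on(text, day):
    """Return only the transitions that happened on the given date."""
    return [t for t in transition_times(text) if t.date() == day]
```

For the DCDISTSW01 output above, `max(transition_times(text))` lands on Sat Sep 1 and `transitions_on(text, ...)` for Sep 2 comes back empty, which is the oddity described in this post.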
 

Can you post the output of "show vpc" and "show port-channel summary" from both Nexus switches?

Hi nazimkha, I think we have found the root cause: it looks like a driver problem on the server we are servicing here. As you can see in my next post, it looks promising (but the vPC is nicely converged, etc.).

br ti

Hello,

 

Are you sure you scrolled down to the very bottom on DCDISTSW01?

The output should terminate with "Curr state", as you can see on DCDISTSW02.

 

ADP

hi again

Thanks for your suggestions and input. We may have found the root cause: the server team got some "fixes" for the IBM server, and now it looks much nicer. We have added a new port-channel just to play it safe, and the last entries are now:

 

 

68) FSM:<Ethernet101/1/16> Transition at 428408 usecs after Sat Sep  8 00:46:45 2018
    Previous state: [LACP_ST_PORT_MEMBER_RECEIVE_ENABLED]
    Triggered event: [LACP_EV_PARTNER_PDU_IN_SYNC_COLLECT_ENABLED_DISTRIBUTING_DISABLED]
    Next state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_TRANSMIT_PATH]
69) FSM:<Ethernet101/1/16> Transition at 439049 usecs after Sat Sep  8 00:46:45 2018
    Previous state: [LACP_ST_WAIT_FOR_HW_TO_PROGRAM_TRANSMIT_PATH]
    Triggered event: [LACP_EV_PORT_HW_PATH_ENABLED]
    Next state: [LACP_ST_PORT_MEMBER_COLLECTING_AND_DISTRIBUTING_ENABLED]

 

A single day's uptime is no proof of stability, but the state here looks promising.