12-01-2013 01:08 PM - edited 03-07-2019 04:52 PM
Hello,
We have a problem with Ethernet links from a stack of two 3750-X switches that are bundled into port channels to 2960-X access switches (there are several 2960-X's).
The setup is as follows:
Cat3750-X_1 GE 1/0/3 <--------> Cat2960-X GE 1/0/47
Cat3750-X_2 GE 2/0/3 <--------> Cat2960-X GE 1/0/48
Configuration:
3750-X side:
!
interface Port-channel23
description Trunk to sw-access3
switchport trunk encapsulation dot1q
switchport mode trunk
switchport nonegotiate
end
!
interface GigabitEthernet1/0/3
description Trunk to sw-access3
switchport trunk encapsulation dot1q
switchport mode trunk
switchport nonegotiate
channel-group 23 mode active
end
!
interface GigabitEthernet2/0/3
description Trunk to sw-access3
switchport trunk encapsulation dot1q
switchport mode trunk
switchport nonegotiate
channel-group 23 mode active
end
2960-X side:
!
interface Port-channel1
description Trunk sw-core
switchport mode trunk
switchport nonegotiate
end
!
interface GigabitEthernet1/0/47
description Trunk sw-core
switchport mode trunk
switchport nonegotiate
channel-group 1 mode active
end
!
interface GigabitEthernet1/0/48
description Trunk sw-core
switchport mode trunk
switchport nonegotiate
channel-group 1 mode active
end
Port Channel config to the other 3 access switches is identical.
Everything was working great for the last 5 weeks and now we have these problems:
Logs:
3750-X:
Nov 29 10:30:30.434 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/3, changed state to down
Nov 29 10:30:30.711 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/1, changed state to down
Nov 29 10:30:32.044 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/2, changed state to down
Nov 29 10:30:33.806 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/2, changed state to down
Nov 29 10:31:18.716 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/3, changed state to down
Nov 29 10:31:18.733 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel23, changed state to down
Nov 29 10:31:19.748 CET: %LINK-3-UPDOWN: Interface Port-channel23, changed state to down
Nov 29 10:31:20.050 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/3, changed state to down
Nov 29 10:31:22.054 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/2, changed state to down
Nov 29 10:31:22.071 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel22, changed state to down
Nov 29 10:31:23.086 CET: %LINK-3-UPDOWN: Interface Port-channel22, changed state to down
Nov 29 10:31:23.388 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to down
Nov 29 10:32:25.286 CET: %PM-4-ERR_RECOVER: Attempting to recover from loopback err-disable state on Gi2/0/21 (sw-core-2)
Nov 29 10:32:29.747 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/21, changed state to up
Nov 29 10:32:32.666 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/3, changed state to up
Nov 29 10:32:35.744 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/21, changed state to up
Nov 29 10:32:35.946 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/2, changed state to up
Nov 29 10:32:38.730 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/3, changed state to up
Nov 29 10:32:39.653 CET: %LINK-3-UPDOWN: Interface Port-channel23, changed state to up
Nov 29 10:32:40.685 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel23, changed state to up
Nov 29 10:32:42.195 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/2, changed state to up
Nov 29 10:32:42.874 CET: %LINK-3-UPDOWN: Interface Port-channel22, changed state to up
Nov 29 10:32:43.914 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel22, changed state to up
Nov 29 10:33:22.584 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/3, changed state to up
Nov 29 10:33:25.939 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to up
Nov 29 10:33:29.386 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/3, changed state to up
Nov 29 10:33:31.702 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/2, changed state to up
2960-X:
Nov 29 10:56:08.535 CET: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet1/0/48.
Nov 29 10:56:08.535 CET: %PM-4-ERR_DISABLE: loopback error detected on Gi1/0/48, putting Gi1/0/48 in err-disable state
Nov 29 10:56:09.538 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/48, changed state to down
Nov 29 10:56:10.548 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/48, changed state to down
Nov 29 10:58:08.551 CET: %PM-4-ERR_RECOVER: Attempting to recover from loopback err-disable state on Gi1/0/48
Nov 29 10:58:13.378 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/48, changed state to up
Nov 29 10:58:20.096 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/48, changed state to up
Nov 29 11:29:49.744 CET: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet1/0/48.
Nov 29 11:29:49.747 CET: %PM-4-ERR_DISABLE: loopback error detected on Gi1/0/48, putting Gi1/0/48 in err-disable state
Nov 29 11:29:50.754 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/48, changed state to down
Nov 29 11:29:51.768 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/48, changed state to down
Nov 29 11:29:59.744 CET: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet1/0/47.
Nov 29 11:29:59.748 CET: %PM-4-ERR_DISABLE: loopback error detected on Gi1/0/47, putting Gi1/0/47 in err-disable state
No errors on any interfaces...:
GigabitEthernet1/0/48 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is c025.5c64.c8b0 (bia c025.5c64.c8b0)
Description: 802.1Q Trunk sw-core
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:02, output hang never
Last clearing of "show interface" counters never
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 241000 bits/sec, 154 packets/sec
5 minute output rate 146000 bits/sec, 67 packets/sec
328302508 packets input, 249207555791 bytes, 0 no buffer
Received 27119257 broadcasts (14200187 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 14200187 multicast, 0 pause input
0 input packets with dribble condition detected
193253722 packets output, 34501328920 bytes, 0 underruns
0 output errors, 0 collisions, 2 interface resets
2 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
Has anyone encountered similar problems? Any help/ideas appreciated.
Best Regards,
Adi
12-01-2013 01:57 PM
Hi Adi,
This is an interesting issue. May I ask you a couple of questions?
The loopback error you are seeing is caused by a port receiving its own LOOP frames that are sent each 10 seconds. Normally, this should never happen because a correctly behaved neighboring switch would never send a frame back through the port it came in. However, if the MAC learning is disabled, or if there is an unblocked loop in the network, the frame may eventually loop back to the port where it was originated. Such a port will be err-disabled. I am therefore looking for any reason that would allow the LOOP frame to get back to its originating port.
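As a rough sketch, the following standard IOS commands show which ports the keepalive loop detection has shut down and whether they recover automatically. The recovery settings at the end are an assumption, inferred only from the 2960-X log in the first post, where Gi1/0/48 recovers exactly two minutes after being err-disabled; the actual configuration was not posted in this thread.

```
! List ports currently in err-disable state, with the reason
show interfaces status err-disabled
! Show which err-disable causes recover automatically, and the timer
show errdisable recovery
!
! Assumed, not shown in the thread: automatic recovery for the loopback cause
configure terminal
 errdisable recovery cause loopback
 errdisable recovery interval 120
end
```

Automatic recovery only brings the port back up; if the looped frame path still exists, the port will be err-disabled again at the next detection.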
Best regards,
Peter
12-01-2013 03:02 PM
Try using "channel-group # mode ON".
What IOS are you using for both the 3750X and 2960X?
12-01-2013 03:03 PM
Hi,
I know port channels are locally significant and it shouldn't matter that you have Po23 on the 3750s and Po1 on the 2960s, but just to eliminate port-channel ID inconsistencies, can you configure both the 3750s and the 2960 with the same Po ID and test again?
Also, what if you configure one side as active and one side as passive?
HTH
12-01-2013 10:59 PM
Leo and Reza,
Thanks for joining, but I respectfully disagree with both suggestions. I am afraid neither of them will help. Please don't take the following lines as bashing either of you; nothing could be further from my intent. I am just thinking out loud so you can correct me if I am wrong at any point.
Reza, just as you mentioned: the Port-channel ID is purely a local number and is never transmitted to a neighbor. In addition, the 2960 may not support the same range of Port-channel IDs as the 3750, so making the Port-channel ID identical on both switches may not be possible. Based on how EtherChannel and LACP/PAgP operate, I see no way a change of Port-channel ID could have an impact. Configuring one side as active and the other as passive will have no influence either, as the active side will still make the passive side engage in creating an EtherChannel.
Leo, I have always strongly opposed the use of channel-group mode on unless absolutely and vitally necessary. I have gone to great lengths on CSC to explain why I consider this mode of EtherChannel operation dangerous. It has its own set of failure scenarios that can compound the issue we are trying to solve here. This command could theoretically have an impact only if one switch considered the ports bundled while the other did not. I believe we should investigate the state of the EtherChannel using show etherchannel summary before making this (in my humble opinion) dangerous change.
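A minimal sketch of the read-only checks meant here, to be run on both ends before any channel mode is changed. The group numbers 23 and 1 are taken from the configuration in the first post.

```
! On the 3750-X stack
show etherchannel summary
show etherchannel 23 port-channel
show lacp 23 neighbor
!
! On the 2960-X
show etherchannel summary
show lacp 1 neighbor
```

If both sides list their member ports with the (P) flag and each sees the other as its LACP neighbor, both switches agree that the ports are bundled.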
Best regards,
Peter
12-02-2013 01:14 PM
Thanks Peter.
12-01-2013 11:13 PM
Hello everyone,
thank you for taking interest in my problem.
@Peter
1. Yes, it happens with all the 2960-X's (3 of them). Here is the log from this morning (note that the last entry from last night is me configuring the switch):
Dec 1 22:17:56.599 CET: %SYS-5-CONFIG_I: Configured from console by admin on vty0 (192.168.249.11)
Dec 2 07:19:14.705 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/1, changed state to down
Dec 2 07:19:15.468 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/3, changed state to down
Dec 2 07:19:15.468 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel23, changed state to down
Dec 2 07:19:15.761 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/1, changed state to down
Dec 2 07:19:16.156 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/3, changed state to down
Dec 2 07:19:16.407 CET: %LINK-3-UPDOWN: Interface Port-channel23, changed state to down
Dec 2 07:19:16.516 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/3, changed state to down
Dec 2 07:19:17.179 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/3, changed state to down
Dec 2 07:19:18.823 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/2, changed state to down
Dec 2 07:19:19.855 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/2, changed state to down
Dec 2 07:19:24.116 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/1, changed state to down
Dec 2 07:19:24.133 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel21, changed state to down
Dec 2 07:19:25.156 CET: %LINK-3-UPDOWN: Interface Port-channel21, changed state to down
Dec 2 07:19:25.861 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/1, changed state to down
Dec 2 07:21:18.027 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/1, changed state to up
Dec 2 07:21:18.505 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/2, changed state to down
Dec 2 07:21:18.522 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel22, changed state to down
Dec 2 07:21:19.034 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/1, changed state to up
Dec 2 07:21:19.277 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/3, changed state to up
Dec 2 07:21:19.545 CET: %LINK-3-UPDOWN: Interface Port-channel22, changed state to down
Dec 2 07:21:19.847 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to down
Dec 2 07:21:20.283 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/3, changed state to up
Dec 2 07:21:22.414 CET: %LINK-3-UPDOWN: Interface GigabitEthernet2/0/2, changed state to up
Dec 2 07:21:23.421 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet2/0/2, changed state to up
Dec 2 07:21:27.967 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/1, changed state to up
Dec 2 07:21:30.886 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/1, changed state to up
Dec 2 07:21:31.784 CET: %LINK-3-UPDOWN: Interface Port-channel21, changed state to up
Dec 2 07:21:32.790 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel21, changed state to up
Dec 2 07:21:34.585 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/3, changed state to up
Dec 2 07:21:37.504 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/3, changed state to up
Dec 2 07:21:38.469 CET: %LINK-3-UPDOWN: Interface Port-channel23, changed state to up
Dec 2 07:21:39.475 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel23, changed state to up
Dec 2 07:23:22.398 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/2, changed state to up
Dec 2 07:23:25.938 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/2, changed state to up
Dec 2 07:23:26.902 CET: %LINK-3-UPDOWN: Interface Port-channel22, changed state to up
Dec 2 07:23:27.909 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel22, changed state to up
Last night I effectively broke the PortChannels by removing one link from each, leaving only one link active (here is the view from the access switch):
sw-access3#sh etherchannel summary
Flags: D - down P - bundled in port-channel
I - stand-alone s - suspended
H - Hot-standby (LACP only)
R - Layer3 S - Layer2
U - in use f - failed to allocate aggregator
M - not in use, minimum links not met
u - unsuitable for bundling
w - waiting to be aggregated
d - default port
Number of channel-groups in use: 1
Number of aggregators: 1
Group Port-channel Protocol Ports
------+-------------+-----------+-----------------------------------------------
1 Po1(SU) LACP Gi1/0/47(P)
sw-access3#sh spanning-tree root
Root Hello Max Fwd
Vlan Root ID Cost Time Age Dly Root Port
---------------- -------------------- --------- ----- --- --- ------------
VLAN0001 4097 7c69.f6a2.5480 4 2 20 15 Gi1/0/48
VLAN0005 4101 7c69.f6a2.5480 4 2 20 15 Gi1/0/48
...
So the one standalone port is now the STP root port, and the one in the PortChannel is Alternate.
2. STP version is RSTP and no ports are in inconsistent state at any switch
3. MAC address aging time is the default of 300 seconds throughout the network
4. All switches are learning on all the vlans
5. I'm sure there are no loops (at least there were none when I was last there; this is a customer's network), but I will double-check!
6. We may try this as a last option tonight if we don't figure it out
@Leo
IOS versions are:
3750-X's: 15.0(2)SE4
2960-X's: 15.0(2)EX1
These are the stock images that came with the switches. I've checked, and there are some newer revisions, but no mention of similar problems, so there is no plan for a software upgrade yet.
I'm strongly against static port-channels and I'm not considering using them. Since I broke the PortChannels last night and the same problem is still happening, it seems they are not the cause.
@Reza:
I've had too many configurations that use inconsistent port-channel IDs to think that could be the problem. Again, since I broke the PortChannels last night and the same problem is still happening, it seems this is not the cause.
Currently thinking that this is a Layer 1 problem; will investigate cabling first.
Thanks guys!
Best Regards,
Adi
12-01-2013 11:48 PM
Hello,
As Peter stated earlier, the error messages seem to appear because these keepalive frames are being received back by the switch that originated them (the 3750-X in this case).
These 'LOOP' frames are quite interesting in how they are built: they are sent to and from the local MAC address of the originating switch, and they are not meant to be forwarded by the receiving switch. The adjacent, receiving switch should simply consume the frame and not pass it on.
Apart from everything Peter suggested, I would also recommend enabling mac-move notifications. This should help detect any loops in the Layer 2 topology. I'd also suggest ruling out high CPU on the switches: if the CPU is pegged, the frame might not be consumed and could be wrongly forwarded instead. This is just a theory; I have not actually seen this happen in my experience yet.
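A short sketch of the two checks suggested above, assuming the platform and release support MAC-move notification (availability and output format vary between releases):

```
! Log MAC addresses that move from one port to another
configure terminal
 mac address-table notification mac-move
end
! Confirm that the feature is enabled
show mac address-table notification mac-move
!
! Rule out a pegged CPU
show processes cpu sorted
show processes cpu history
```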
Regards,
Aninda
12-02-2013 01:00 AM
Hello Aninda,
I've just noticed this in the log of one access switch:
Dec 2 07:14:19.212 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/27, changed state to down
Dec 2 07:14:20.211 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/27, changed state to down
Dec 2 07:14:25.402 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/27, changed state to up
Dec 2 07:14:26.405 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/27, changed state to up
Dec 2 07:14:30.680 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/27, changed state to down
Dec 2 07:14:32.687 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/27, changed state to up
Dec 2 07:14:33.732 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/27, changed state to down
Dec 2 07:14:34.731 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/27, changed state to down
Dec 2 07:15:00.521 CET: %LINK-3-UPDOWN: Interface GigabitEthernet1/0/27, changed state to up
Dec 2 07:15:01.524 CET: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0/27, changed state to up
Dec 2 07:19:12.598 CET: %SW_MATM-4-MACFLAP_NOTIF: Host c89c.dc7f.e653 in vlan 1 is flapping between port Gi1/0/21 and port Gi1/0/19
Dec 2 07:19:12.941 CET: %SW_MATM-4-MACFLAP_NOTIF: Host 00c0.ee94.27b9 in vlan 1 is flapping between port Gi1/0/21 and port Gi1/0/19
Dec 2 07:19:13.112 CET: %SW_MATM-4-MACFLAP_NOTIF: Host eca8.6b2b.30b6 in vlan 1 is flapping between port Gi1/0/21 and port Gi1/0/19
Dec 2 07:19:13.343 CET: %SW_MATM-4-MACFLAP_NOTIF: Host 5cf3.fc09.a3c8 in vlan 1 is flapping between port Gi1/0/21 and port Gi1/0/19
Dec 2 07:19:13.776 CET: %SW_MATM-4-MACFLAP_NOTIF: Host 7c69.f6a2.54e4 in vlan 100 is flapping between port Gi1/0/48 and port Gi1/0/19
Dec 2 07:19:14.307 CET: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet1/0/47.
Dec 2 07:19:14.307 CET: %PM-4-ERR_DISABLE: loopback error detected on Gi1/0/47, putting Gi1/0/47 in err-disable state
I missed it this morning, so we will now investigate what's connected to ports 19 and 21.
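One possible way to see what sits behind those two ports from the CLI, sketched with the port numbers and one of the flapping MAC addresses from the log above:

```
! MAC addresses currently learned on each suspect port
show mac address-table interface gigabitEthernet 1/0/19
show mac address-table interface gigabitEthernet 1/0/21
! Where one of the flapping hosts is learned right now
show mac address-table address c89c.dc7f.e653
! Any Cisco device directly attached to the suspect ports
show cdp neighbors gigabitEthernet 1/0/19 detail
show cdp neighbors gigabitEthernet 1/0/21 detail
```

Many MAC addresses on a port that is supposed to serve a single host would hint at an unmanaged switch or hub behind it. Such a device does not speak CDP, so an empty neighbor list does not rule it out.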
Thanks for all the ideas.
Br,
Adi
12-02-2013 01:05 PM
Adi,
The updated logs about the flapping MAC very strongly suggest that a switching loop exists in your network. I wonder why STP has not prevented it from occurring. Is there perhaps some server, perhaps MS Windows, using aggregated links? Sometimes a misconfigured station with multiple NICs can cause switching loops (if some kind of bridging or link bundling is activated on it).
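To see what STP is doing on the ports named in the MAC flap messages, something like the following could be run on the access switch. This is only a sketch: the port and VLAN numbers come from the log above, and whether BPDU filtering is configured anywhere is unknown.

```
! STP role and state of every port in the affected VLAN
show spanning-tree vlan 1
! Per-port detail, including counters of BPDUs sent and received
show spanning-tree interface gigabitEthernet 1/0/19 detail
show spanning-tree interface gigabitEthernet 1/0/21 detail
! Look for per-port settings that suppress BPDUs, such as
! "spanning-tree bpdufilter enable"
show running-config interface gigabitEthernet 1/0/19
show running-config interface gigabitEthernet 1/0/21
```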
Best regards,
Peter
12-02-2013 01:19 PM
"Currently thinking that this is a Layer 1 problem, will investigate cabling first."
Adi,
If you think this is a Layer 1 issue, can you post the output to the command "sh interface
Another thing: do you have any other switch models (like a 2960S)? The reason I ask is that this is beginning to smell like an IOS bug. Everything you are doing (with Peter's help) is right; it just doesn't make any sense. If you have a switch that can run IOS version 12.2(55)SE8 or 15.0(2)SE4, I'd be keen to know how it behaves with the same cable going to the 3750X and the same ports, so that the only difference is the IOS version. EXn-series IOS is an interim software version.
12-09-2013 03:22 AM
Hello everyone,
The problem was traced to a number of small hubs used to "extend the network" without our knowledge :).
Thanks to everyone involved in problem solving for providing tips and giving ideas.
Sorry for not replying sooner, I've been really busy.
Best Regards,
Adi
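For readers who hit the same symptom: one common way to make access ports react to an unexpected loop behind them is BPDU guard combined with storm control. The sketch below is not something the thread says was deployed; the interface range and the 1.00 percent threshold are purely illustrative assumptions and would need adapting to the real port layout and traffic.

```
configure terminal
 ! Hypothetical range of host-facing access ports (uplinks excluded)
 interface range gigabitEthernet 1/0/1 - 46
  ! Treat the port as an edge port
  spanning-tree portfast
  ! Err-disable the port if any BPDU arrives on it, for example the
  ! switch's own BPDU coming back through a hub that bridges two ports
  spanning-tree bpduguard enable
  ! Illustrative threshold: limit broadcast traffic to 1% of bandwidth
  storm-control broadcast level 1.00
 end
```

BPDU guard only helps if BPDUs actually come back through the unauthorized device, so it complements rather than replaces a physical audit of what is plugged into the access ports.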