08-13-2013 01:19 AM - edited 03-07-2019 02:53 PM
We recently had an issue on our C2940s and C3560CGs. Their uplinks (each switch has only one uplink) went down during transitions on the core switches that caused loops.
On the C2940s the cause is quite clear: they run an older IOS on which keepalive is set by default. We do not have errdisable recovery configured for the loopback cause on any of them (the default).
However, it is not clear on the WS-C3560CG-8PC-S. Some of them run 15.0(2)SE and others 15.0(2)SE2, but we saw uplinks going down with %ETHCNTR-3-LOOP_BACK_DETECTED on both versions.
An obvious fix would be to set errdisable recovery, but the question remains why this occurred on the WS-C3560CG-8PC-S, which runs a newer IOS with keepalive not set by default.
I tried to reproduce it in my lab and was able to. Here is the scenario.
I am using a WS-C3560CG-8PC-S with 15.0(2)SE, plus a device behind this switch on which I create loops.
It has one copper uplink, g0/10, and one fiber (1000BaseSX SFP) uplink, g0/9.
The configs are
interface GigabitEthernet0/10
 switchport trunk encapsulation dot1q
 switchport mode trunk
 ip arp inspection trust
 srr-queue bandwidth share 10 80 5 5
 srr-queue bandwidth shape 0 0 0 0
 priority-queue out
 udld port aggressive
 mls qos trust dscp
 spanning-tree bpdufilter enable
 ip dhcp snooping trust
interface GigabitEthernet0/9
end
As can be seen, keepalive is not set on either port:
Switch#sh int g0/9
GigabitEthernet0/9 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is a418.7515.6609 (bia a418.7515.6609)
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 20/255, rxload 62/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 1000Mb/s, link type is auto, media type is 1000BaseSX SFP
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:01, output 00:00:06, output hang never
Last clearing of "show interface" counters never
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 245779000 bits/sec, 260361 packets/sec
5 minute output rate 80629000 bits/sec, 85412 packets/sec
595340814 packets input, 70250668350 bytes, 0 no buffer
Received 593274873 broadcasts (1696 multicasts)
0 runts, 0 giants, 0 throttles
4474 input errors, 65 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 1696 multicast, 0 pause input
0 input packets with dribble condition detected
187392751 packets output, 22112377670 bytes, 0 underruns
0 output errors, 0 collisions, 2 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
Switch#sh int g0/10
GigabitEthernet0/10 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is a418.7515.660a (bia a418.7515.660a)
MTU 1500 bytes, BW 100000 Kbit/sec, DLY 100 usec,
reliability 255/255, txload 216/255, rxload 205/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 100Mb/s, link type is auto, media type is 10/100/1000BaseTX
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:02, output hang never
Last clearing of "show interface" counters never
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 535645705
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 80752000 bits/sec, 85539 packets/sec
5 minute output rate 84982000 bits/sec, 90021 packets/sec
720547706 packets input, 85023739696 bytes, 0 no buffer
Received 720351746 broadcasts (23151 multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 23151 multicast, 0 pause input
0 input packets with dribble condition detected
198816885 packets output, 23460558468 bytes, 0 underruns
0 output errors, 0 collisions, 1 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
I artificially created loops and reached a state where both uplinks went down because of %ETHCNTR-3-LOOP_BACK_DETECTED:
*Mar 1 01:39:12.645: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet0/9.
*Mar 1 01:39:12.645: %PM-4-ERR_DISABLE: loopback error detected on Gi0/9, putting Gi0/9 in err-disable state
*Mar 1 01:39:12.661: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan1, changed state to down
*Mar 1 01:39:12.692: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet0/10.
*Mar 1 01:39:12.692: %PM-4-ERR_DISABLE: loopback error detected on Gi0/10, putting Gi0/10 in err-disable state
*Mar 1 01:39:13.693: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/10, changed state to down
*Mar 1 01:39:14.716: %LINK-3-UPDOWN: Interface GigabitEthernet0/10, changed state to down
Switch#sh int status err
Port Name Status Reason Err-disabled Vlans
Gi0/9 err-disabled loopback
Gi0/10 err-disabled loopback
Does anyone have an explanation for this? Is it a bug where keepalive is active even though it is shown as not set? Or can some other Cisco feature trigger the process, shut the port down, and produce the %ETHCNTR-3-LOOP_BACK_DETECTED error message?
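For anyone checking the same symptom, here is a minimal sketch of the show commands that report the err-disable state and whether loopback detection and automatic recovery are active. As far as I know these are standard on Catalyst 3560-class IOS, but the output format varies by release, so no output is reproduced here.

```
! List ports that are currently err-disabled, with the reason
! (long form of the "sh int status err" used above)
show interfaces status err-disabled
! Show which err-disable causes have detection enabled (look for "loopback")
show errdisable detect
! Show which causes have automatic recovery enabled, and the recovery timer
show errdisable recovery
```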
Thanks,
Vlad
08-13-2013 01:29 AM
Hi,
The problem occurs because the keepalive packet is looped back to the port that sent the keepalive.
This mechanism was introduced to detect loops. Disabling keepalive by using the 'no keepalive' interface command will prevent the port from being errdisabled. In case there are loops, we have STP disabled on the CE facing ports which acts as loop avoidance.
This is the behavior on Catalyst 2940, 2950, 2950-LRE, 2955, 2970, 3550, 3560, and 3750 switches.
Keepalives are sent on ALL interfaces by default in 12.1EA based software. Starting in 12.2SE based releases, keepalives are NO longer sent by default on fiber and uplink interfaces.
As per CSCea46385 for "%ETHCNTR-3-LOOP_BACK_DETECTED: Keepalive packet loop-back detected", work around the problem by configuring "no keepalive" on UABU Catalyst switches.
HTH
Regards
Inayath.
*Please rate if this info is helpful.
08-13-2013 01:50 AM
The switch in question is a WS-C3560CG-8PC-S with 15.0(2)SE or 15.0(2)SE2, where keepalives are disabled (not set) by default (at least that is my understanding from what I can see on the switch; see above).
The question is why a port on this specific switch goes down because of %ETHCNTR-3-LOOP_BACK_DETECTED.
Thanks,
Vlad
08-13-2013 01:58 AM
Did you happen to check the bug which I mentioned?
If you require more info on that bug, you will need to log a TAC case so that we can help by providing in-depth details on it.
Regards
Inayath
08-13-2013 02:17 AM
Hello Inayath,
Thank you. If you mean CSCea46385, I get
So I cannot read it directly, but I can see the explanation on the Cisco Support Community link.
----------
Workaround:
Disable keepalives by using the no keepalive interface command. This will prevent the port from being errdisabled, but it does not resolve the root cause of the problem. Please see section below for more information.
----------
I do not understand how to apply the "no keepalive" command on an interface where it is already in effect by default. My guess is that this all relates to older IOS versions where keepalive was set by default.
Thanks,
Vlad
08-13-2013 02:44 AM
I am now trying to reproduce this with 12.2(55)EX2 on the WS-C3560CG-8PC-S, but I cannot: it works fine and the port does not go down.
It looks like this might be a bug in 15.0(2)SE and 15.0(2)SE2: even though keepalive is shown as not set, it may actually be active and cause the port to go down when there is a loop. This is of course my guess, and it would have to be confirmed by a Cisco developer.
08-13-2013 02:46 AM
Could you please log a TAC case for this so that we can look into it?
regards
Inayath
08-13-2013 02:58 AM
I am going to; it will take me some time, as I need to go through our partner because the device is under shared support (not Smartnet).
08-13-2013 03:03 AM
Could you just try disabling the keepalive on the interface and let me know the result? I know that the command is already in effect by default, but just to make sure, let's add it explicitly on the interface and see what effect it has.
Thanks in advance.
Regards
Inayath
08-13-2013 05:09 AM
It is interesting.
Applied no keepalive
Switch(config)#int g0/9
Switch(config-if)#no keepalive
Switch(config-if)#int g0/10
Switch(config-if)#no keepalive
and was not able to reproduce the issue.
Saved the config
Switch#wr
Building configuration...
[OK]
Restarted
Switch#reload
Proceed with reload? [confirm]
and was able to reproduce it again
*Mar 1 00:01:51.337: %ETHCNTR-3-LOOP_BACK_DETECTED: Loop-back detected on GigabitEthernet0/10.
*Mar 1 00:01:51.337: %PM-4-ERR_DISABLE: loopback error detected on Gi0/10, putting Gi0/10 in err-disable state
*Mar 1 00:01:52.344: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/10, changed state to down
*Mar 1 00:01:53.366: %LINK-3-UPDOWN: Interface GigabitEthernet0/10, changed state to down
It looks like "no keepalive" takes effect only when entered into the running config; when the startup config is read into the running config during boot, it is either not applied or applied incorrectly.
Also, sh int does not seem to show the real status: it says Keepalive not set, but keepalive is somehow enabled.
Switch#sh int g0/9
GigabitEthernet0/9 is up, line protocol is up (connected)
Hardware is Gigabit Ethernet, address is a418.7515.6609 (bia a418.7515.6609)
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 7/255, rxload 27/255
Encapsulation ARPA, loopback not set
Keepalive not set
Full-duplex, 1000Mb/s, link type is auto, media type is 1000BaseSX SFP
input flow-control is off, output flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters never
Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 106682000 bits/sec, 113004 packets/sec
5 minute output rate 29498000 bits/sec, 31249 packets/sec
46021543 packets input, 5430853667 bytes, 0 no buffer
Received 45770751 broadcasts (205 multicasts)
0 runts, 0 giants, 0 throttles
2823 input errors, 44 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 205 multicast, 0 pause input
0 input packets with dribble condition detected
11860922 packets output, 1399597904 bytes, 0 underruns
0 output errors, 0 collisions, 1 interface resets
0 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
Hope this helps.
Thanks,
Vlad
08-13-2013 05:32 AM
I performed one more test.
Rebooted the switch with default values and sniffed the traffic on the other end (through a monitor session). Saw loopback frames.
Applied the "no keepalive" command, sniffed again, and did not see loopback frames.
It seems quite clear to me that the boot code either does not process the default "no keepalive" at all or processes it incorrectly.
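A minimal sketch of the kind of monitor session used for such a sniff test, configured on the switch at the other end of the uplink. The interface names are hypothetical: Gi0/1 is assumed to face the switch under test and Gi0/2 to have the capture PC attached. In the capture, Cisco keepalive (loopback) frames are usually recognizable by EtherType 0x9000 with the source and destination MAC both set to the sending port's own address; treat that as an assumption to verify against your own trace.

```
configure terminal
 ! Mirror frames received from the switch under test (Gi0/1 is an assumed name)
 monitor session 1 source interface GigabitEthernet0/1 rx
 ! Send the mirrored copy to the port with the sniffer (Gi0/2 is an assumed name)
 monitor session 1 destination interface GigabitEthernet0/2
end
! Confirm the session is configured as intended
show monitor session 1
```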
Vlad
08-13-2013 05:51 AM
I have repeated the same sniff test with IOS 12.2(55)EX2.
It is all clear: it behaves correctly and does not send out any loopback frames.
Vlad
08-13-2013 05:57 AM
Yup, I tested this a long time ago and found the same.
Regards
Inayath
08-13-2013 11:22 PM
I have applied the most recent IOS version, 15.0(2)SE4, on the WS-C3560CG-8PC-S. I performed the sniff test and can see that this version is clean and does not send out any loopback keepalives (with default port values).
I tried to find in the release notes which 15.x version fixed this, but there seems to be no mention of it.
I would say it is quite important to be informed about this: with default values it caused quite a big problem in our production, as the uplinks to the WS-C3560CG-8PC-S switches went down, and the only way to recover is through console access, which is not very nice if you have many of them.
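For switches that cannot be upgraded right away, a minimal sketch of the errdisable recovery setting mentioned at the start of the thread, plus the manual bounce from the console. The interval value is only an example (300 seconds is, to my knowledge, the usual default), and Gi0/10 is simply the uplink from the lab setup above. Recovery only brings the port back; if the loop or the stray keepalives persist, the port will be err-disabled again. Note also that in the tests above an explicit "no keepalive" stopped helping after a reload on 15.0(2)SE.

```
configure terminal
 ! Re-enable ports automatically after a loopback err-disable event
 errdisable recovery cause loopback
 ! Recovery timer in seconds (example value)
 errdisable recovery interval 300
end
! Manual recovery of a single port when automatic recovery is not configured
configure terminal
 interface GigabitEthernet0/10
  shutdown
  no shutdown
end
```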