11-23-2012 01:36 AM - edited 03-01-2019 05:37 PM
Hi,
I have a problem with an IPv6 tunnel (ipv6ip) on a Cisco 1841 running 15.1(4)M4 or 15.1(4)M5.
It appears that a bug was introduced into 15.1(4)M4 and it is related to IPv6 tunnels and IP SLA.
interface Tunnel64
description IPv6 Tunnel to x.x.x.x
ipv6 address 2001:XXXX:XXXX:XXXX::2/64
tunnel source ATM0/1/0.1
tunnel mode ipv6ip
tunnel destination x.x.x.x
!
After reloading the router, I can see the size of the input queue slowly increasing ("Input queue: 30/75/0/0"). It appears that specific packets are getting stuck in the input queue while the majority of IPv6 packets are still processed normally. After a short period of time the input queue gets wedged ("Input queue: 76/75/0/0") and IPv6 stops working until I reload the router.
Tunnel64 is up, line protocol is up
Hardware is Tunnel
Description: IPv6 Tunnel to x.x.x.x
MTU 17920 bytes, BW 100 Kbit/sec, DLY 50000 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation TUNNEL, loopback not set
Keepalive not set
Tunnel source x.x.x.x (ATM0/1/0.1), destination x.x.x.x
Tunnel Subblocks:
src-track:
Tunnel64 source tracking subblock associated with ATM0/1/0.1
Set of tunnels with source ATM0/1/0.1, 1 member (includes iterators), on interface <OK>
Tunnel protocol/transport IPv6/IP
Tunnel TTL 255
Tunnel transport MTU 1480 bytes
Tunnel transmit bandwidth 8000 (kbps)
Tunnel receive bandwidth 8000 (kbps)
Last input 00:00:15, output 00:00:15, output hang never
Last clearing of "show interface" counters never
Input queue: 76/75/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/0 (size/max)
30 second input rate 0 bits/sec, 0 packets/sec
30 second output rate 0 bits/sec, 0 packets/sec
2253 packets input, 1691254 bytes, 0 no buffer
Received 0 broadcasts (0 IP multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
1844 packets output, 730645 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 output buffer failures, 0 output buffers swapped out
I also have an IP SLA probe on the router to verify if connectivity is working over the IPv6 tunnel:
ip sla 10
icmp-echo 2001:XXXX:XXXX:XXXX::1
!
ip sla schedule 10 life forever start-time now
It appears that IP SLA return packets are getting stuck in the input queue as the input queue increments every time I receive a response to my IP SLA probe (every 60 seconds). I have tried to change the values in the probe (packet size, tos, etc) without any luck. I am able to ping the same IPv6 address normally from the command line without seeing this behaviour.
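For anyone trying to confirm the same symptom, the probe responses and the queue growth can be correlated from the CLI. A sketch, assuming the operation number (10) and tunnel interface (Tunnel64) from the config above:

```
! Check the SLA probe is completing (return codes / RTT)
show ip sla statistics 10

! Watch the tunnel input queue grow one slot per probe response
show interfaces tunnel 64 | include Input queue
```

If the size field in "Input queue: size/max/drops/flushes" increments in step with each successful probe cycle, the leak described here is the likely cause.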
Can I deduce that this is a potential buffer leak? I can't find anything in the Bug Toolkit relating to this.
Has anyone come across this issue before and know any workarounds?
Thanks in advance,
Chris
11-24-2012 02:58 AM
Chris,
I ran a few queries against our internal DB and could not find any bug matching what you're describing.
At the same time, I think it would make sense to dump the buffers and check exactly what is there; it should also be quite easily reproducible in our labs.
Can I ask you to open a TAC case?
M.
11-24-2012 11:57 AM
Hi,
Unfortunately I don't have TAC support for this device. I have a workaround by running older code, but I was curious to see if anyone else was hitting this issue, as it seems a fairly common use case.
I have run some debugs and can see the packets mount up in the "show buffers old" output:
r1#show buffers old
Header DataArea Pool Rcnt Size Link Enc Flags Input Output
6615260C F5C0EE84 Middl 1 96 79 31 2000200 Tu64 Tu64
661538FC F5C0FB84 Middl 1 96 79 31 2000200 Tu64 Tu64
66154BEC F5C10884 Middl 1 96 79 31 2000200 Tu64 Tu64
66155A20 F5C11244 Middl 1 96 79 31 2000200 Tu64 Tu64
66E38C50 F60B88E4 Middl 1 96 79 31 2000200 Tu64 Tu64
66F661B0 F60B8264 Middl 1 96 79 31 2000200 Tu64 Tu64
671EA9AC F60B95E4 Middl 1 96 79 31 2000200 Tu64 Tu64
673F7DD8 F60B7F24 Middl 1 96 79 31 2000200 Tu64 Tu64
674BD860 F60B85A4 Middl 1 96 79 31 2000200 Tu64 Tu64
IP SLA appears to pad the packets with random data: although the stuck packets are all the same size, each contains a random payload:
Buffer information for Middle buffer at 0x673F7DD8
data_area 0xF60B7F24, refcount 1, next 0x0, flags 0x2000200
linktype 79 (IPV6), enctype 31 (TUNNEL), encsize 0, rxtype 26
if_input 0x682D4FE0 (Tunnel64), if_output 0x682D4FE0 (Tunnel64)
inputtime 00:04:35.044 (elapsed 00:12:16.016)
outputtime 00:04:15.536 (elapsed 00:12:35.524), oqnumber 65535
datagramstart 0xF60B7F64, datagramsize 96, maximum size 756
mac_start 0xF60B7F56, addr_start 0x0, info_start 0xF60B7F4C
network_start 0xF60B7F78, transport_start 0xF60B7FA0, caller_pc 0x6005D410
F60B7F20: AFACEFAD 0A303030 /,o-.000
F60B7F28: 3031393A 204E6F76 20323420 31393A33 019: Nov 24 19:3
F60B7F38: 393A3531 2E373139 20474D54 3A20254C 9:51.719 GMT: %L
F60B7F48: 494E4B2D 332D5550 444F574E 3A20496E INK-3-UPDOWN: In
F60B7F58: 74657266 00010000 AAAA0300 45000060 terf....**..E..`
F60B7F68: 85BB0000 F829C1C1 C2F66D70 BCDF8DB1 .;..x)AABvmp<_.1
F60B7F78: 68800000 00243A40 2001067C 204CFFFE h....$:@ ..| L.~
F60B7F88: 00000000 00000001 2001067C 204CFFFE ........ ..| L.~
F60B7F98: 00000000 00000002 81002C36 00040001 ..........,6....
F60B7FA8: 01020304 05060708 090A0B0C 0D0E0F10 ................
F60B7FB8: 11121314 15161718 191A1B1C 090A0B0C ................
F60B7FC8: 0D0E0F10 11121314 15161718 191A1B1C ................
F60B7FD8: DFABC9A8 8011246D 1E7F0000 0101080A _+I(..$m........
F60B7FE8: D0999C89 42D790D3 8D86D89B 4B4D431F P...BW.S..X.KMC.
F60B7FF8: FD21727E 5F23DFB0 2B2E4625 36B4F7BC }!r~_#_0+.F%64w<
F60B8008: B06CB1B8 4D2686C2 E1EFF74C 1D763883 0l18M&.BaowL.v8.
F60B8018: E9958265 03505E66 09360727 38482306 i..e.P^f.6.'8H#.
F60B8028: 98C1B8A5 AE02C409 A06FF610 7EF25DDF .A8%..D. ov.~r]_
F60B8038: ACC16983 C4D8E476 A879E327 714D4ECF ,Ai.DXdv(yc'qMNO
F60B8048: 0DDD0597 59311E36 64046A3F 81FD0042 .]..Y1.6d.j?.}.B
F60B8058: 1A807A57 B8938A03 034AF37D AB923EE6 ..zW8....Js}+.>f
F60B8068: 2676EF08 383E1B5F 1C0B2F77 4883B60B &vo.8>._../wH.6.
F60B8078: 64D22FBA E9C01D6B 247AE37D 17F8A1E9 dR/:i@.k$zc}.x!i
F60B8088: 71BB5C7F E6CE02FF 00000000 00000000 q;\.fN..........
F60B8098: 00000000 00000000 00000000 00000000 ................
F60B80A8: 00000000 00000000 00000000 00000000 ................
F60B80B8: 00000000 00000000 00000000 00000000 ................
F60B80C8: 00000000 00000000 00000000 00000000 ................
F60B80D8: 00000000 00000000 00000000 00000000 ................
F60B80E8: 00000000 00000000 00000000 00000000 ................
F60B80F8: 00000000 00000000 00000000 00000000 ................
F60B8108: 00000000 00000000 00000000 00000000 ................
F60B8118: 00000000 00000000 00000000 00000000 ................
F60B8128: 00000000 00000000 00000000 00000000 ................
F60B8138: 00000000 00000000 00000000 00000000 ................
F60B8148: 00000000 00000000 00000000 00000000 ................
F60B8158: 00000000 00000000 00000000 00000000 ................
F60B8168: 00000000 00000000 00000000 00000000 ................
F60B8178: 00000000 00000000 00000000 00000000 ................
F60B8188: 00000000 00000000 00000000 00000000 ................
F60B8198: 00000000 00000000 00000000 00000000 ................
F60B81A8: 00000000 00000000 00000000 00000000 ................
F60B81B8: 00000000 00000000 00000000 00000000 ................
F60B81C8: 00000000 00000000 00000000 00000000 ................
F60B81D8: 00000000 00000000 00000000 00000000 ................
F60B81E8: 00000000 00000000 00000000 00000000 ................
F60B81F8: 00000000 00000000 00000000 00000000 ................
F60B8208: 00000000 00000000 00000000 00 .............
Thanks,
Chris
02-14-2013 02:23 AM
I have exactly the same issue... how did you solve it (if you did)? By running M3?
Gateway#sh int tun 0 | i queue
Input queue: 76/75/100/0 (size/max/drops/flushes); Total output drops: 0
Output queue: 0/0 (size/max)
Gateway#sh buffers old
Header DataArea Pool Rcnt Size Link Enc Flags Input Output
664C15C0 EEA06EA4 Middl 1 96 79 31 200 Tu0 Tu0
664C1A7C EEA071E4 Middl 1 96 79 31 200 Tu0 Tu0
664C1F38 EEA07524 Middl 1 96 79 31 200 Tu0 Tu0
664C23F4 EEA07864 Middl 1 96 79 31 200 Tu0 Tu0
664C28B0 EEA07BA4 Middl 1 96 79 31 200 Tu0 Tu0
664C2D6C EEA07EE4 Middl 1 96 79 31 200 Tu0 Tu0
664C3228 EEA08224 Middl 1 96 79 31 200 Tu0 Tu0
664C36E4 EEA08564 Middl 1 96 79 31 200 Tu0 Tu0
664C3BA0 EEA088A4 Middl 1 96 79 31 200 Tu0 Tu0
664C405C EEA08BE4 Middl 1 96 79 31 200 Tu0 Tu0
664C4518 EEA08F24 Middl 1 96 79 31 200 Tu0 Tu0
664C49D4 EEA09264 Middl 1 96 79 31 200 Tu0 Tu0
664C4E90 EEA095A4 Middl 1 96 79 31 200 Tu0 Tu0
664C534C EEA098E4 Middl 1 96 79 31 200 Tu0 Tu0
664C5808 EEA09C24 Middl 1 96 79 31 200 Tu0 Tu0
66F2BECC EEE92304 Middl 1 96 79 31 200 Tu0 Tu0
66F2C388 EEE92644 Middl 1 96 79 31 200 Tu0 Tu0
66F2D530 EEE90C44 Middl 1 96 79 31 200 Tu0 Tu0
66F40880 EEE8F8C4 Middl 1 96 79 31 200 Tu0 Tu0
6758A5A0 EEE26C64 Middl 1 96 79 31 200 Tu0 Tu0
6758AA5C EEE26FA4 Middl 1 96 79 31 200 Tu0 Tu0
6758AF18 EEE272E4 Middl 1 96 79 31 200 Tu0 Tu0
6758B3D4 EEE27624 Middl 1 96 79 31 200 Tu0 Tu0
6758B890 EEE27964 Middl 1 96 79 31 200 Tu0 Tu0
6758BD4C EEE27CA4 Middl 1 96 79 31 200 Tu0 Tu0
6758C6C4 EEE28324 Middl 1 96 79 31 200 Tu0 Tu0
6758CB80 EEE28664 Middl 1 96 79 31 200 Tu0 Tu0
6758D03C EEE289A4 Middl 1 96 79 31 200 Tu0 Tu0
676597C4 EEE8CB44 Middl 1 96 79 31 200 Tu0 Tu0
6765A13C EEE8D1C4 Middl 1 96 79 31 200 Tu0 Tu0
6765A5F8 EEE8D504 Middl 1 96 79 31 200 Tu0 Tu0
6784118C EEE94A04 Middl 1 96 79 31 200 Tu0 Tu0
67841648 EEE97444 Middl 1 96 79 31 200 Tu0 Tu0
679D2250 EEE8C804 Middl 1 96 79 31 200 Tu0 Tu0
679D2BC8 EEE8DB84 Middl 1 96 79 31 200 Tu0 Tu0
679D3084 EEE8DEC4 Middl 1 96 79 31 200 Tu0 Tu0
679D3540 EEE8E204 Middl 1 96 79 31 200 Tu0 Tu0
68194A08 EEE91C84 Middl 1 96 79 31 200 Tu0 Tu0
6851CBB8 EEE905C4 Middl 1 96 79 31 200 Tu0 Tu0
68520AC0 EEE91944 Middl 1 96 79 31 200 Tu0 Tu0
68526180 EEE91FC4 Middl 1 96 79 31 200 Tu0 Tu0
68528034 EEEAE644 Middl 1 96 79 31 200 Tu0 Tu0
68529800 EEE90F84 Middl 1 96 79 31 200 Tu0 Tu0
6856A69C EEE97784 Middl 1 96 79 31 200 Tu0 Tu0
6856AB58 EEE98B04 Middl 1 96 79 31 200 Tu0 Tu0
685B4A7C EEEAF344 Middl 1 96 79 31 200 Tu0 Tu0
685B53F4 EEEAF9C4 Middl 1 96 79 31 200 Tu0 Tu0
685B6834 EEEB0A04 Middl 1 96 79 31 200 Tu0 Tu0
685B83AC EEE960C4 Middl 1 96 79 31 200 Tu0 Tu0
685B8868 EEE96404 Middl 1 96 79 31 200 Tu0 Tu0
685B8D24 EEE96744 Middl 1 96 79 31 200 Tu0 Tu0
685B969C EEE96DC4 Middl 1 96 79 31 200 Tu0 Tu0
685BA7D4 EEEAFD04 Middl 1 96 79 31 200 Tu0 Tu0
685BC61C EEE92CC4 Middl 1 96 79 31 200 Tu0 Tu0
685BCAD8 EEE93004 Middl 1 96 79 31 200 Tu0 Tu0
685BCF94 EEE93344 Middl 1 96 79 31 200 Tu0 Tu0
685BD450 EEE93684 Middl 1 96 79 31 200 Tu0 Tu0
685C6D74 EEE953C4 Middl 1 96 79 31 200 Tu0 Tu0
685C7230 EEE95704 Middl 1 96 79 31 200 Tu0 Tu0
685C7BA8 EEE95D84 Middl 1 96 79 31 200 Tu0 Tu0
687C2104 EEE92984 Middl 1 96 79 31 200 Tu0 Tu0
687C2A7C EEE97AC4 Middl 1 96 79 31 200 Tu0 Tu0
687C2F38 EEE97E04 Middl 1 96 79 31 200 Tu0 Tu0
687C33F4 EEE98144 Middl 1 96 79 31 200 Tu0 Tu0
6888076C EEEAE984 Middl 1 96 79 31 200 Tu0 Tu0
688E3164 EEE8F244 Middl 1 96 79 31 200 Tu0 Tu0
689C4684 EEE939C4 Middl 1 96 79 31 200 Tu0 Tu0
689C4B40 EEE93D04 Middl 1 96 79 31 200 Tu0 Tu0
689C54B8 EEE94384 Middl 1 96 79 31 200 Tu0 Tu0
689C5974 EEE946C4 Middl 1 96 79 31 200 Tu0 Tu0
689DAA24 EEE8E544 Middl 1 96 79 31 200 Tu0 Tu0
689DAEE0 EEE8E884 Middl 1 96 79 31 200 Tu0 Tu0
689DB39C EEE8EBC4 Middl 1 96 79 31 200 Tu0 Tu0
689DB858 EEE8EF04 Middl 1 96 79 31 200 Tu0 Tu0
68AE11F4 EEE8F584 Middl 1 96 79 31 200 Tu0 Tu0
68AE2358 EEE8FF44 Middl 1 96 79 31 200 Tu0 Tu0
Header DataArea Pool Rcnt Size Original Flags caller_pc
Public particle pools:
02-15-2013 03:09 PM
Your fastest course of action here is to open a TAC case.
Can you reproduce the wedged queue on demand?
Seems like a regression in 15.1(4)M4, then? Rolling back to 15.1(4)M3 fixes it? Is it also present in the latest 15.1(4)M?
02-19-2013 10:21 AM
Yes, it's incredibly easy to reproduce and takes just a few seconds. And yes, correct: 15.1(4)M3 is not affected by this issue. I didn't test with 15.1(4)M, but 15.1(4)M4/M5 are affected, while the T train is not.
The other problem is that I don't have a valid support contract for my device, so I can't open a TAC case directly. Any suggestion is highly welcome.
Andrea
02-20-2013 04:08 PM
Hi Chris and Andrea,
Thanks for testing the various versions.
I was able to reproduce the issue in order to investigate further. Here's what I have been able to figure out:
The input queue leak was introduced by the fix for
CSCtn36227 Alignment correction at ipv6_checksum with IPv6 ping sweep
it is fixed via
CSCto56317 Backward compatibility regarding pak release strategy in ipv6_ping_send
CSCto56317 was committed into 15.2(1)T, but was never committed into the 15.1(4)M throttle.
I have put in a request to get CSCto56317 fixed in 15.1(4)M throttle. The next potential release that can get the fix is 15.1(4)M7 which is due out in October.
Please note that CSCto56317 is currently an internal defect. I will be making it external, but it may take a day or two for that change to propagate.
Unfortunately, I don't see an easy workaround to prevent the input leak in the meantime. Given that you decided to move back to 15.1(4)M3, I would assume IP SLA is a required feature for you both. Another option would be to reduce the frequency of the SLA ICMP probes and increase the size of the input hold queue (hold-queue 24000 in) to allow more time before the input queue fills up. Changing the frequency from 1 minute to 5 minutes and increasing the hold queue to 24000 should allow the device to run for ~83 days before needing a reload to clear the input queue.
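A sketch of that workaround, assuming the operation number and tunnel interface from Chris's original config (frequency is the probe interval in seconds). The arithmetic: one packet is leaked per probe response, so 24000 queue slots x 5 minutes per leak is roughly 120,000 minutes, or about 83 days, before the hold queue wedges.

```
ip sla 10
 icmp-echo 2001:XXXX:XXXX:XXXX::1
 ! probe every 5 minutes instead of the default 60 seconds
 frequency 300
!
interface Tunnel64
 ! enlarge the input hold queue so the leak takes ~83 days to fill it
 hold-queue 24000 in
!
```

Note this only buys time; the leaked buffers are still never freed, so a periodic reload is still required until the fixed release is available.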
02-20-2013 04:14 PM
Fantastic work Brett, thanks! And thanks for handling the administrative work to get the bug in the right place. I presume the release note of CSCto56317 will be updated to mark the tunnel interface symptom observed here?
02-20-2013 04:23 PM
Yes, I have updated the release notes of CSCto56317 to describe the input queue leak/wedge issue. Your suggestion of moving to a version where CSCto56317 is already fixed is also good, but unfortunately the 1841 can't run 15.2(1)T or later, where CSCto56317 is already present. If Andrea is seeing this on an ISR G2 rather than a first-generation ISR like Chris's 1841, that is an option.
11-20-2013 12:45 PM
The fix has been committed and will be available in 15.1(4)M8 due out late March.
11-10-2013 03:16 AM
Looks like it wasn't fixed in M7 ...
http://www.cisco.com/en/US/docs/ios/15_1/release/notes/151-4MCAVS.html#wp60810
Let's hope things change with M8.
11-11-2013 01:41 PM
It looks like the commit to the 15.1(4)M throttle wasn't done. If you need it fixed for M8, it would be best to open a TAC case so the TAC engineer can follow up with development to make sure the fix gets committed into 15.1(4)M8.
02-21-2013 05:03 AM
Thanks Brett, much appreciated. I will stick with 15.1(4)M3 until M7 as I need IP SLA to check my IPv6 tunnel is still working.