Community Manager

Ask the VIP Event: Unicast IP Routing Protocols


With Peter Palúch

Welcome to the Cisco Support Community Ask the Expert conversation. This is an opportunity to get expert insights and learn about Unicast IP routing protocols with Cisco Designated VIP Peter Palúch. Peter Palúch is an assistant professor and a Cisco Networking Academy instructor at the University of Zilina, Slovakia. His fields of interest include routing, switching, and MPLS technologies. He is a seasoned teacher who has cooperated in various educational activities in Slovakia and abroad, focusing particularly on networking and Linux-based network server systems. Peter holds a doctoral degree in the area of VoIP quality degradation factors; he also holds CCIP certification and CCIE certification (#23527) in Routing & Switching.

Remember to use the rating system to let Peter know if you have received an adequate response. 

Peter might not be able to answer each question due to the volume expected during this event. Remember that you can continue the conversation on the Network Management discussion forum shortly after the event. This event lasts through October 21, 2011. Visit this forum often to view responses to your questions and the questions of other community members.

Rising star

Hi Peter,

First of all, glad to have you here so that we can flood you with all our doubts and questions. I am sure we will get some very good answers, as we always do from you. :-) I have been waiting to see you in the "Ask the Expert" series.

I was wondering if I can ask a question about EIGRP over L2TPv3 here, or should I post it in the normal forums? Please advise.

Regards,

Kishore


Hello Kishore,

Thank you so much for your kind words! Sure, you are welcome to post your question about EIGRP over L2TPv3 in this forum. Looking forward to reading it.

Best regards,

Peter


Hi Peter,

I knew you wouldn't say NO.

Now for the problem:

We have an L2TPv3 tunnel set up between two 3845 routers, with roughly 100-150 Mbps or more of traffic flowing across the tunnel. Some large files seem to be transferred over the link, and they are causing heaps of fragmentation and reassembly on the routers, which is spiking the CPU. I read about MTU tuning for L2TPv3 tunnels and added ip pmtu under the tunnel configuration, and I can see a performance improvement in the CPU.

However, EIGRP behind the routers starts flapping madly after a couple of minutes: the neighbor adjacency forms and breaks. I removed the ip pmtu and reset the tunnel, and now it is all fine again. The MTU along the path is 1500 bytes.

The L2TPv3 tunnel stays up/up and does not have any issues.

The network setup is simple.

(3750)sw1----R2(3845)-----R3(3845)----sw4(3750). 

R2-R3   :  L2tpv3 tunnel

sw1-sw4   :  Eigrp neighbors

sw00001#sh ip eigrp neighbors

EIGRP-IPv4:(1) neighbors for process 1

H   Address                 Interface       Hold Uptime   SRTT   RTO  Q  Seq

                                            (sec)         (ms)       Cnt Num

0   10.1.1.1          Fa1/0/1           12 00:00:40    1  5000  1  75857237

1   192.168.1.2     Fa1/0/2           11 00:02:39 1023  5000  1  530779

What I noticed was that I was getting "retry limit exceeded" in the logs, and in sh ip eigrp neighbors the RTO was 5000.

An RTO of 5000 means that the update packets from EIGRP are not getting to the other side, or that ACK packets are not being received back. Normally this could be due to unidirectional links, packets not making it through, or packet loss.

EIGRP tries to send the update packets 16 times using RTP, and if it does not get a reply back it will tear down the session.

Why is ip pmtu causing such havoc? The MTU along the path is 1500 bytes.

Also, I calculated the MTU for the L2TPv3 tunnel, and the tunnel MTU ends up being 1462 bytes (4 bytes for L2TPv3, 20 bytes for IP, 14 bytes for Ethernet), so 1500 - 38 = 1462 bytes.
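To lay the arithmetic out explicitly (assuming an IPv4 delivery header and an L2TPv3 session header with no cookie, which matches the 4-byte figure above):

```
Inner Ethernet header   14 bytes
Outer IPv4 header       20 bytes
L2TPv3 session header    4 bytes  (session ID only, no cookie assumed)
--------------------------------
Total overhead          38 bytes
Tunnel payload MTU      1500 - 38 = 1462 bytes
```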

Now, can I ping between, say, the loopbacks on the switches using 1500 bytes with the df-bit set? All my interfaces are Gigabit. Would the tunnel inherit its MTU from the interface? In GRE it is different, as you know: the tunnel ip mtu itself is 1476, hence a 1500-byte packet with the df-bit set will not go through.

IOS used :

Cisco IOS Software, C3750 Software (C3750-IPSERVICESK9-M), Version 12.2(44)SE2, RELEASE SOFTWARE (fc2)

Cisco IOS Software, 3800 Software (C3845-SPSERVICESK9-M), Version 12.4(20)YA, RELEASE SOFTWARE (fc1)

I would really appreciate it if you can give me some direction in this regard. I have looked around and have not seen many posts on EIGRP over L2TPv3; there are heaps for EIGRP-over-GRE issues, but not L2TPv3.

Looking forward to your reply. Please let me know whatever info you want, and I will provide it.

Regards

Kishore


Hello Kishore,

Let me replicate the issue in the lab and see if I can get any comparative results. In the meantime, could you kindly post the configuration of your L2TPv3 pseudowire?

While seemingly unrelated, what I am thinking about is avoiding the multicasted EIGRP on the link between the two 3750s, and instead configuring both 3750s with the neighbor command to force EIGRP to talk to its peer by unicast. Do you believe you could modify your configuration to peer both 3750s using the neighbor command? I would be interested in seeing if that changes the behavior.
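Just to be concrete, what I have in mind is along these lines on each 3750 (the AS number and the address/interface are taken from your outputs and are illustrative only):

```
router eigrp 1
 ! static unicast peering; note this disables multicast EIGRP on the interface,
 ! so the corresponding neighbor statement must be configured on the other 3750 too
 neighbor 192.168.1.2 FastEthernet1/0/2
```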

Thank you!

Best regards,

Peter


Hi Kishore,

I have performed a simple test in our lab: two 3560 running 15.0(1)SE IP Services were interconnected via an L2TPv3 tunnel configured over a pair of 1841 routers running 15.0(4)M Advanced IP services, i.e.:

3560 ------ (xconnect) 1841 ------ 1841 (xconnect) ------ 3560

All physical interconnections were done using FastEthernet using the default MTU of 1500. The 3560 were running EIGRP over the L2TPv3 tunnel.

In addition, I have tried to put the tunnel under stress: I have connected two PCs to the 3560 switches, one to each switch, and ran the D-ITG traffic generator between the two PCs with a total amount of traffic exceeding the capacity of the interconnection between the 1841 routers (the PCs were connected using GigE while the 1841 were interconnected by a crossover FastE cable).

I was not able to reproduce the EIGRP flapping, even when using the ip pmtu configuration in the pw-class. I occasionally lost a ping while having the link loaded, but apart from that, the EIGRP was stable. I created topology changes and forced the EIGRP to resync - nothing showed any instability. The timers in the show ip eigrp neighbor output were low, and no packets were stuck in the queue. Thus, the flapping you are experiencing is not something that occurs in general.

Let me have a couple of questions:

  1. If I understand you correctly, using the ip pmtu lowers your CPU load on the routers but causes the EIGRP adjacencies to flap. Is that correct?
  2. Is the CPU load on 3750 switches significantly increased during the period of EIGRP flapping?
  3. Your show ip eigrp neighbor shows two peers, 10.1.1.1 and 192.168.1.2 on Fa1/0/1 and Fa1/0/2, respectively. Are both these peers reachable via an L2TPv3 tunnel?
  4. During the period of EIGRP flapping, does any other stream carried over the L2TPv3 appear to be affected as well?
  5. How are your 3845 routers interconnected? What is the link layer technology?
  6. Is it possible that your 3750 switches advertise so many routes in EIGRP that the size of a single EIGRP Update packet somehow overgrows the usable MTU of 1462B over the tunnel?

Also, would you mind performing an extended ping from one 3750 to another, sweeping a range of packet sizes from 1450 to 1500 bytes and using the DF bit set plus the verbose option? The MTU of the tunnel should indeed be 1462 bytes but let us make sure that it really is set to that value.
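If the interactive extended ping is inconvenient, a few one-line pings around the suspected boundary would tell us almost as much, assuming your IOS accepts the one-line extended form (addresses are from your earlier outputs):

```
ping 10.201.32.254 source 10.201.31.254 size 1462 df-bit
ping 10.201.32.254 source 10.201.31.254 size 1463 df-bit
ping 10.201.32.254 source 10.201.31.254 size 1500 df-bit
```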

Looking forward to your reply.

Best regards,

Peter


Hi Peter,

Thank you so much for labbing it up for me; I greatly appreciate that. I know you are one of those engineers who would do anything to help fellow engineers. Hats off to that!

Let me have a couple of questions:

  1. If I understand you correctly, using the ip pmtu lowers your CPU load on the routers but causes the EIGRP adjacencies to flap. Is that correct?
   A. Yep, this is right.
  2. Is the CPU load on 3750 switches significantly increased during the period of EIGRP flapping?
   A. Nope, not really; it stays around 70%.
  3. Your show ip eigrp neighbor shows two peers, 10.1.1.1 and 192.168.1.2 on Fa1/0/1 and Fa1/0/2, respectively. Are both these peers reachable via an L2TPv3 tunnel?
   A. Yes, they both are reachable.
  4. During the period of EIGRP flapping, does any other stream carried over the L2TPv3 appear to be affected as well?
   A. Just IP traffic and the EIGRP routing protocol traffic.
  5. How are your 3845 routers interconnected? What is the link layer technology?
   A. I am pasting the topology for you sans IPs and hostnames.
  6. Is it possible that your 3750 switches advertise so many routes in EIGRP that the size of a single EIGRP Update packet somehow overgrows the usable MTU of 1462B over the tunnel?
   A. Possibly, I have 979 prefixes.

sw00001#sh ip eigrp topology summary

EIGRP-IPv4 Topology Table for AS(1)/ID(172.21.1.1)

Head serial 1, next serial 16153429

979 routes, 0 pending replies, 0 dummies

EIGRP-IPv4:(1) enabled on 5 interfaces, 2 neighbors present on 2 interfaces

Quiescent interfaces:  Fa1/0/1

sw00001#

Also, would you mind performing an extended ping from one 3750 to another, sweeping a range of packet sizes from 1450 to 1500 bytes and using the DF bit set plus the verbose option? The MTU of the tunnel should indeed be 1462 bytes but let us make sure that it really is set to that value.

I am pinging the loopbacks

sw00001#ping

Protocol [ip]:

Target IP address: 10.201.32.254

Repeat count [5]:

Datagram size [100]:

Timeout in seconds [2]:

Extended commands [n]: y

Source address or interface: 10.201.31.254

Type of service [0]:

Set DF bit in IP header? [no]: yes

Validate reply data? [no]:

Data pattern [0xABCD]:

Loose, Strict, Record, Timestamp, Verbose[none]: V

Loose, Strict, Record, Timestamp, Verbose[V]:

Sweep range of sizes [n]: y

Sweep min size [36]: 1450

Sweep max size [18024]: 1540

Sweep interval [1]: 4

Type escape sequence to abort.

Sending 115, [1450..1540]-byte ICMP Echos to 10.199.32.254, timeout is 2 seconds:

Packet sent with a source address of 10.201.31.254

Packet sent with the DF bit set

Reply to request 0 (8 ms) (size 1450)

Reply to request 1 (1 ms) (size 1454)

Reply to request 2 (8 ms) (size 1458)

Reply to request 3 (1 ms) (size 1462)

Reply to request 4 (1 ms) (size 1466)

Reply to request 5 (8 ms) (size 1470)

Reply to request 6 (1 ms) (size 1474)

Reply to request 7 (1 ms) (size 1478)

Reply to request 8 (17 ms) (size 1482)

Reply to request 9 (9 ms) (size 1486)

Reply to request 10 (8 ms) (size 1490)

Reply to request 11 (1 ms) (size 1494)

Reply to request 12 (8 ms) (size 1498)

Request 13 timed out (size 1502)

Request 14 timed out (size 1506)

Request 15 timed out (size 1510)

Request 16 timed out (size 1514)

Request 17 timed out (size 1518)

Success rate is 72 percent (13/18), round-trip min/avg/max = 1/5/17 ms

As you can see, I get 1500 bytes across with the df-bit set. I don't get it; I should only get 1462 bytes through with the df-bit set.

Looking forward to your reply.



Peter,

I am definitely inclined to think that it has something to do with EIGRP. I believe that after the neighbor adjacency is formed, it sends the UPDATE packets using RTP with the largest MTU it can, which is 1500 bytes in our case.

At the moment there is no ip pmtu, so the tunnel is somehow inheriting the interface MTU, and I am able to ping between the switches using 1500 bytes with the df-bit set, which I shouldn't be able to, right? I should only get 1462 bytes through with the df-bit set.

sw00001#ping 10.201.32.254 size 1500 df-bit

Type escape sequence to abort.

Sending 5, 1500-byte ICMP Echos to 10.199.32.254, timeout is 2 seconds:

Packet sent with the DF bit set

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 1/3/8 ms

bstsw00001#

EIGRP is working fine now without the ip pmtu. Once I put it on, I believe the EIGRP UPDATE packets get fragmented, and EIGRP does not seem to like that on the other end. I ran some debugs on EIGRP and couldn't find anything useful.

Regards


Hello Kishore,

Actually, I believe that the situation is kind of reversed.

The fact that you are currently able to ping the other end of the L2TP tunnel with the DF bit set, using a ping size that clearly exceeds the available MTU over the tunnel, suggests that one tunnel endpoint performs the fragmentation as necessary and the other endpoint reassembles the packets. As the fragmentation is most probably performed only after encapsulating the original packet, and the DF bit is not inherited without the ip pmtu command, your packets are fragmented even though they carry the DF bit set. That is the reason why your pings can be carried over the L2TP tunnel even when they are oversized and have the DF bit set. That could theoretically also explain why EIGRP currently works, as its packets can be fragmented and reassembled at the opposite tunnel endpoint.

With the ip pmtu command, the inner DF bit is carried to the outer DF bit. I suspect that with that command, the oversized pings would not work beyond a certain size. Also, the EIGRP may be creating such large packets that would need fragmentation. However, I am not sure if the EIGRP implementation actually honors ICMP Packet-too-big messages, and it is possible that the EIGRP continues sending overlong UPDATE packets that eventually get dropped.
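For reference, the ip pmtu knob we are discussing sits in the pseudowire class, roughly like this (the class name here is illustrative):

```
pseudowire-class pw-test
 encapsulation l2tpv3
 ip local interface Loopback0
 ! reflect the inner DF bit into the delivery header and enable PMTUD for the tunnel
 ip pmtu
```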

I will see if I can test this hypothesis tomorrow by creating a couple of thousand routes on an EIGRP router and trying to run it over an L2TP tunnel.

Best regards,

Peter


Hi Peter,

Thanks heaps for your explanation of the DF bit with ip pmtu. I missed that point; I read it, but somehow I did not pay attention to it as I was fumbling through different blogs. That makes sense as to why the 1500-byte packets with the DF bit get through.

"If you enable the ip pmtu command in the pseudowire class, the Don't Fragment (DF) bit in the tunnel header is set according to the DF bit value received from the customer edge (CE) router, or statically if the ip dfbit set option is enabled." -- from the Cisco website.

"With the ip pmtu command, the inner DF bit is carried to the outer DF bit. I suspect that with that command, the oversized pings would not work beyond a certain size. Also, the EIGRP may be creating such large packets that would need fragmentation. However, I am not sure if the EIGRP implementation actually honors ICMP Packet-too-big messages, and it is possible that the EIGRP continues sending overlong UPDATE packets that eventually get dropped."


This is most likely the case, and the explanation I am inclined to as well. I was wondering if the UPDATE packets in EIGRP actually use the DF-bit option or respond to ICMP packet-too-big messages. I ran a debug and could not see any of these messages. If this is the case, then to fix it I have three options, I guess:

1. Use a route-map to clear the df-bit and let the router fragment the packets.

2. Not use ip pmtu at all.

3. Lower the TCP MSS and the ip mtu on the 3750 so that EIGRP sends smaller packets. This will lead to more switching overhead, but at least the EIGRP packets will go through.
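For option 1, I imagine something along these lines, applied on the LAN-facing interface where traffic enters the router before encapsulation. The route-map name and the interface are hypothetical, so please treat this as a sketch rather than a tested config:

```
route-map CLEAR-DF permit 10
 ! clear the DF bit so the router may fragment oversized packets
 set ip df 0
!
interface GigabitEthernet0/1
 ! hypothetical LAN-facing interface (not the L2TPv3 attachment circuit)
 ip policy route-map CLEAR-DF
```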

I have just come across another site 10 minutes ago where the EIGRP packets are dropping; however, the tunnel MTU seems to be 1462 bytes there.

The tunnel destination router is a third-party router; however, we have not used any ip pmtu or anything at our end. I am checking with the third-party group to find out about the settings on their pseudowire.

sw000010#ping 10.201.32.254 size 1462 df-bit

Type escape sequence to abort.

Sending 5, 1462-byte ICMP Echos to 10.199.32.254, timeout is 2 seconds:

Packet sent with the DF bit set

!!!!!

Success rate is 100 percent (5/5), round-trip min/avg/max = 1/7/9 ms

sw000010#ping 10.201.32.254 size 1463 df-bit

Type escape sequence to abort.

Sending 5, 1463-byte ICMP Echos to 10.199.32.254, timeout is 2 seconds:

Packet sent with the DF bit set

M.M.M

Success rate is 0 percent (0/5)

It has about the same number of prefixes.

sw000010#sh ip eigrp topology  summ

EIGRP-IPv4 Topology Table for AS(1)/ID

Head serial 1, next serial 425049

979 routes, 0 pending replies, 0 dummies

EIGRP-IPv4:(1) enabled on 2 interfaces, 2 neighbors present on 2 interfaces

Quiescent interfaces:  Gi1/0/1 Po1

Our end's config on the pseudowire is pretty simple:

pseudowire-class xyz
 encapsulation l2tpv3
 protocol l2tpv3 l2-dyn
 ip local interface Loopback0
!
interface GigabitEthernet0/0
 description L2TPV3 Tunnel
 no ip address
 duplex auto
 speed auto
 media-type rj45
 xconnect 10.1.1.2 1002 pw-class xyz

Will wait for your investigation and expert reply. Once again thanks heaps for all your help.

Regards,

Kishore


Hello Kishore,

I have just implemented a similar setup using Dynamips and 2691 12.4T IOS images, and it seems that I have recreated the flapping EIGRP scenario even without the ip pmtu command. I will definitely need more time to investigate what is going on, and preferably to verify it on real gear, too - Wireshark is not very helpful here because it incorrectly dissects the L2TPv3 packets (I guess I will have to modify the Wireshark L2TP dissector as well). However, I have noticed that the L2TP tunnel endpoints occasionally send ICMP Time Exceeded (reassembly timeout) messages to each other, as if some packet fragment was not delivered in due time and the original packet could not be reconstituted.

I will let you know as soon as I have more information. By the way, would you mind running debug ip icmp on your L2TP tunnel endpoints and seeing if any similar messages are being produced or received?

Best regards,

Peter


hi peter,

Thanks for taking all this trouble to help me out. I am happy that you could replicate what I am seeing; it makes things harder when you can't replicate an issue. BTW, may I ask how you generated 1000-odd prefixes in GNS3 for EIGRP? I wanted to do the same, but I was wondering where I was going to generate that many prefixes.

I will run the debugs now on our production router for those ICMPs and see if I can see what you are seeing. I just have to be a bit careful, as the CPU spikes now and then for other reasons, but I can try this during some low-CPU times.

Will keep you posted of the results.

Thanks once again

Kind Regards,

Kishore


Hi Kishore,

Just a quick reply - I am continuing to work on your issue.

Creating 1024 routes was fairly easy using TCL scripts: in older 12.4T IOSes, it can be done as follows:

tclsh
conf t
for {set i 0} {$i < 4} {incr i} {
  for {set j 0} {$j < 256} {incr j} {
    ip route 100.$i.$j.0 255.255.255.0 null0 } }
router eigrp 1
 redistribute static
end
tclquit

Newer 12.4T IOSes do not allow you to enter configuration mode while in the tclsh environment, and instead use a different command to insert commands into the configuration - I don't remember exactly which one, but I can find out if necessary.

Best regards,

Peter


Hello Kishore,

Sadly, I was not able to reproduce the problem using real gear - I will have to double-check the dynamips simulation.

However, can you please verify the output of show system mtu on both of your 3750s interconnected via the L2TP tunnel and make sure that it is set to 1500 bytes, especially the routing MTU? This is very important.

Best regards,

Peter

EDIT: I was able to reproduce it after actually configuring the ip pmtu in the PW class. More details later.


Hello Kishore,

When testing the topology on real gear, I discovered that a particular 12.4 IOS version running on my 2801 routers sets the DF bit in the outer IP header for all packets carried through an L2TP tunnel if ip pmtu is enabled, regardless of the DF bit setting in the inner IP header. This is in direct contrast to the description in the documentation, and it appears to me to be a bug. EIGRP packets are not DF-marked, yet they are encapsulated into DF-marked packets. Naturally, this behavior would explain the problems you are experiencing: the resulting IP packet after encapsulation would be prohibited from being fragmented, resulting in the inability of large EIGRP packets to be carried through the tunnel. In essence, they would not even be able to leave the 3845 router where they were encapsulated.

This problem exhibits itself in two ways, assuming ip pmtu is enabled and ip dfbit set is disabled on both ends of a pseudowire:

  1. When consulting the show l2tp session all or show l2tun session all output, it says: "DF bit on, ToS reflect disabled, ToS value 0, TTL value 255" - the DF bit is reported to be "on" even though the ip dfbit set command is not used.
  2. When pinging one 3750 from the other using a ping size of 1500 bytes, the ping will fail, regardless of the df-bit option setting.

What exact IOS version is run on your 3845 routers?

The earlier idea about the ICMP Time Exceeded messages was wrong - it was a Dynamips problem; please disregard it.

Best regards,

Peter
