Solved: BGP session flapping

Antonio_1_2 · ‎10-20-2011

Hello,

I have a BGP session between routers R1 and R2.

R1 (MTU 9000) --- network --- MTU 4000 --- network --- (MTU 9000) R2

As can be seen above, there are various network devices on the path from R1 to R2, and the bottleneck is an L2 network somewhere in between with a maximum MTU of 4000 bytes.

The BGP session was flapping constantly, every 3 minutes. It turned out that R2 wasn't sending any keepalives to R1, which explains the 3 minutes (the hold time).

After we configured an MTU of 4000 on R1, the BGP session stabilized.

I must say that we already have MTU situations like this elsewhere in the network, but we never had problems with BGP.

The only difference is that R1 is now the only router running IOS version 15.1. R2 has 50 BGP sessions and there are no problems whatsoever with them.

1. According to the documentation, only the BGP UPDATE message uses the MTU of the configured interface. KEEPALIVE and OPEN messages use a smaller MTU.

In my opinion, keepalive messages can't get stuck behind a BGP UPDATE message in the BGP output queue, because the 4000-byte MTU is on the layer 2 network. R2 can't know that there is a bottleneck in the network and should send all BGP messages out of the interface.

2. Has anyone experienced problems like this with BGP?

Is this a bug or normal behavior? If it is normal, what causes R2 not to send keepalives to R1 until R1's MTU is reduced to 4000 bytes?

If anyone has an idea, please help.

Thanks,

A.

Nagendra Kumar Nainar · ‎10-21-2011

Hi Antonio,

1. According to the documentation, only the BGP UPDATE message uses the MTU of the configured interface. KEEPALIVE and OPEN messages use a smaller MTU.

In my opinion, keepalive messages can't get stuck behind a BGP UPDATE message in the BGP output queue, because the 4000-byte MTU is on the layer 2 network. R2 can't know that there is a bottleneck in the network and should send all BGP messages out of the interface.

Whenever a BGP UPDATE is sent, the router treats it as the keepalive for that interval and does not send a separate KEEPALIVE. So if this UPDATE packet is lost, the neighbor receives neither an UPDATE nor a KEEPALIVE and flaps the session.

In your case, BGP comes up because initialization messages such as OPEN are very small, and the session moves to the Established state. When the BGP router then sends an UPDATE, it does not send a keepalive. Since this UPDATE is not received by the remote neighbor, it is not acknowledged at the TCP level, so the BGP router keeps resending the same UPDATE and no keepalive. This results in the session flapping.

2. Has anyone experienced problems like this with BGP?

Is this a bug or normal behavior? If it is normal, what causes R2 not to send keepalives to R1 until R1's MTU is reduced to 4000 bytes?

This behaviour of not sending a keepalive when an UPDATE is sent is per RFC 4271 and is not a bug. Please see the excerpt from the RFC below:

"

      Each time the local system sends a KEEPALIVE or UPDATE message, it
      restarts its KeepaliveTimer, unless the negotiated HoldTime value
      is zero.

"

When the MTU was reduced to 4000, the UPDATE appears to have passed through, and so you didn't see any further issues.
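
To illustrate the timer behaviour, here is a toy Python model (not Cisco's implementation). It assumes a 180-second hold time (the 3 minutes you observed), a 60-second keepalive interval, and that a single oversized packet stalls the in-order TCP stream so that nothing queued behind it is delivered to BGP. The function name and parameters are made up for this sketch.

```python
def first_hold_timer_expiry(update_packet_size, path_mtu,
                            hold_time=180, keepalive_interval=60,
                            duration=600):
    """Return the second at which the receiver's hold timer expires,
    or None if the session survives for `duration` seconds.

    Toy model: if the UPDATE packet is larger than the path MTU it is
    dropped, and nothing queued behind it reaches the receiver.
    """
    stream_blocked = update_packet_size > path_mtu
    last_delivered = 0  # session reaches Established at t = 0

    for now in range(1, duration + 1):
        # A BGP message only reaches the receiver if the stream is not blocked.
        if not stream_blocked and now % keepalive_interval == 0:
            last_delivered = now
        # The receiver tears the session down once hold_time passes in silence.
        if now - last_delivered >= hold_time:
            return now
    return None


print(first_hold_timer_expiry(9000, 4000))  # oversized UPDATE
print(first_hold_timer_expiry(4000, 4000))  # UPDATE fits the path
```

With a 9000-byte packet on a 4000-byte path the model drops the session at the hold time; once the packet fits, the session stays up.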

HTH,

Nagendra


Mahesh Gohil · ‎10-21-2011

Hi A,

Though I have no clear answers to your two queries, if I assume correctly the ip tcp path-mtu-discovery command can solve your problem. It will settle on the lowest MTU in the path. In your case the resulting TCP segment size will be 4000 - 20 (TCP header) - 20 (IP header) = 3960 bytes.

Check the documentation for more details on this command.
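
As a quick sanity check on that arithmetic, here is a small Python sketch. It assumes IPv4 and TCP headers of 20 bytes each with no options; the function name is just for illustration.

```python
def tcp_mss(mtu, ip_header=20, tcp_header=20):
    """Largest TCP payload that fits in one packet of size `mtu`.

    Assumes IPv4 and TCP headers without options (20 bytes each).
    """
    return mtu - ip_header - tcp_header


print(tcp_mss(4000))  # segment size that fits the 4000-byte bottleneck
print(tcp_mss(9000))  # segment size derived from the 9000-byte interface MTU
```

A segment sized for the 9000-byte interface is far larger than what fits through the 4000-byte part of the path.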

Regards

Mahesh

Latchum Naidu · ‎10-21-2011

Hi,

As per the explanation, this could be because of the MTU mismatch.
Make sure the MTU is the same at both ends.

Please rate the helpful posts.
Regards,
Naidu.

Antonio_1_2 · ‎10-21-2011

Hello,

@Mahesh: I thought that path MTU discovery was enabled by default on Cisco routers?

@Naidu: Shouldn't TCP negotiate the MTU down to the minimum of the two endpoints? If so, only one side should need to be set to the minimum MTU on the path.

regards,

A.

Antonio_1_2 · ‎10-21-2011

Hello Nagendra,

Thank you very much for the explanation.

Regards,

A

Kishore Chennupati · ‎10-21-2011

Antonio,

From what I can see, this is what is happening in your case.

1. When R1 sends an UPDATE packet sized for an MTU of 9000 bytes, it enters the part of your network where the MTU is 4000 bytes. At that point the packet should be fragmented and reassembled at the other end. It looks like fragmentation is not happening, so the UPDATE packet is dropped in the middle and never reaches the other side (R2). Once R1 sends the UPDATE, the clock starts ticking toward the hold timer, which is 3 minutes. The other end never receives the UPDATE, and the session is torn down.

2. A good test for this would be to send some pings from R1 to R2 with varying packet sizes to see whether fragmentation is actually happening. Send, say, a packet with a size of 5000 bytes to R2 and see if it gets there. If it doesn't, you have a fragmentation problem.

3. Path MTU discovery would be good, but as far as I can see it doesn't work with routing protocols. I mean, with PMTUD enabled, any device in the middle might send "packet too big" messages back to R1 when it sends UPDATE packets sized for a 9000-byte MTU, but BGP will ignore them and keep sending 9000-byte packets.

4. Once you changed the MTU to 4000 bytes, the packet travelled smoothly to R2 and the session was stable. But the same problem will recur if R2 sends a large UPDATE to R1. Hence it's a good idea to have a uniform MTU to avoid BGP flapping issues. This is true for EIGRP as well. The only protocol that can escape such a scenario is OSPF, where you can configure the protocol to ignore the MTU.
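
The ping test in point 2 could be automated along these lines. This is only a sketch: `probe` is a hypothetical callable that you would have to implement yourself (for example by wrapping a ping of the given size and returning True if a reply comes back), and the size bounds are assumptions. Here it is exercised against a simulated path that passes anything up to 4000 bytes.

```python
def largest_passing_size(probe, low=576, high=9000):
    """Binary-search the largest packet size for which probe(size) is True.

    `probe` is a hypothetical callable supplied by the caller; it should
    return True if a packet of that size reaches the far end.
    Returns None if even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if probe(mid):
            low = mid        # mid got through, try larger sizes
        else:
            high = mid - 1   # mid was dropped, try smaller sizes
    return low


# Simulated path: anything above 4000 bytes is silently dropped.
def simulated_probe(size):
    return size <= 4000


print(largest_passing_size(simulated_probe))
```

If the result is well below the interface MTU on R1 or R2, large packets are being dropped somewhere in between.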

HTH,

Regards

Kishore

Please rate if helpful