Load Sharing with the Loopback Address as a BGP Neighbor - BGP session down

muhammad shahrir osman · ‎08-07-2011

Hello,

I have a scenario where one end is a Cisco router and the other end is a Juniper router, with eBGP multihoming configured using loopback IPs. On each router, two interfaces are configured, and there is a static route via each interface pointing to the eBGP peer's loopback IP. The problem is that if, for example, the 1st interface/link goes down, BGP also goes down even though the 2nd interface/link is up. With this kind of multihoming, BGP should stay up as long as at least one physical link is up. Has anyone experienced this sort of problem before? I need some help. Thanks in advance!

Regards,

arel

gerald.suiza · ‎08-09-2011

this is what that filter does:

By default, unicast RPF uses strict mode. Unicast RPF loose mode is similar to unicast RPF strict mode and has the same configuration restrictions. The only check in loose mode is whether the packet has a source address with a corresponding prefix in the routing table; loose mode does not check whether the interface expects to receive a packet with a specific source address prefix. If a corresponding prefix is not found, unicast RPF loose mode does not accept the packet. As in strict mode, loose mode counts the failed packet and optionally forwards it to a fail filter, which either accepts, rejects, logs, samples, or polices the packet.

ideally that filter should not affect traffic...
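The strict/loose distinction quoted above can be sketched in a few lines of Python. This is only an illustration of the check as described, not how any router implements it: the `ROUTES` table, the second prefix, and the function names are all made up for the example, and a real strict-mode check may accept a packet on any of several equal-cost interfaces.

```python
import ipaddress

# Hypothetical routing table: prefix -> interface used to reach that prefix.
ROUTES = {
    ipaddress.ip_network("2.2.2.2/32"): "POS7/1",
    ipaddress.ip_network("198.51.100.0/24"): "POS7/2",
}

def best_route(addr):
    """Longest-prefix match; returns (network, interface) or None."""
    ip = ipaddress.ip_address(addr)
    matches = [(net, ifc) for net, ifc in ROUTES.items() if ip in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)

def urpf_accepts(src, in_interface, mode="strict"):
    """Return True if a packet with source `src` passes the uRPF check."""
    route = best_route(src)
    if route is None:
        return False  # no prefix for the source: dropped in both modes
    if mode == "loose":
        return True   # loose: any matching prefix is enough
    # strict: the packet must arrive on the interface the route points to
    return route[1] == in_interface

print(urpf_accepts("2.2.2.2", "POS7/2", mode="strict"))
print(urpf_accepts("2.2.2.2", "POS7/2", mode="loose"))
```

In this sketch a packet from 2.2.2.2 arriving on the "wrong" interface fails strict mode but passes loose mode, which is why loose mode is the usual choice when the same peer is reachable over more than one link.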

cadet alain · ‎08-09-2011

Hi,

Yes indeed the uRPF shouldn't be the problem as the ping was working but I was wondering about the CFLOW filter ?

Regards.

Alain.

Don't forget to rate helpful posts.

gerald.suiza · ‎08-09-2011

as far as i can remember cflow is similar to netflow and also should not affect traffic..

muhammad shahrir osman · ‎08-12-2011

Hi,

Something crossed my mind. Both links have a static route pointing to the loopback IP. If one link is down and the other is up, BGP stays down but I can still ping the neighbor's loopback IP. Is that because of the static route? Even though BGP is down, one static route is still active and is used to reach the neighbor, which is why I can ping the neighbor while BGP is down. Am I correct?

P.S. Erm... this is actually a live production network, and my customer cannot accommodate the request to run debugs. Sigh.

Regards,

arel

cadet alain · ‎08-13-2011

Hi,

of course the ping is working because there is a route to 2.2.2.2 but how did you do the ping?

You have to source it from 1.1.1.1 to test end-to-end reachability, is this what you did?

If the customer doesn't want debugs then, if your platform supports it, we can try capturing the packets on the POS interface while doing ping 2.2.2.2 source Lo0 repeat 10 and then telnet 2.2.2.2 179 /source-interface Lo0.

To capture packets you can use RITE:http://www.cisco.com/en/US/docs/ios/12_4t/12_4t11/ht_rawip.html

or EPC:http://www.cisco.com/en/US/docs/ios/netmgmt/configuration/guide/nm_packet_capture_ps6441_TSD_Products_Configuration_Guide_Chapter.html

Then post your capture files here along with the sh run | i ip route, sh tcp brief all, sh ip int br | exc una and sh ip route 2.2.2.2 outputs from before and after the BGP failure, except for the first command, which only needs to be run once.

Regards.

Alain.

Don't forget to rate helpful posts.

muhammad shahrir osman · ‎08-15-2011

Hi,

Below is a snippet of the BGP output captured while one link was down and BGP was not established. Do you see anything abnormal? From the output, is the TCP session closed?

Thank you

#sh ip bgp nei 2.2.2.2
BGP neighbor is 2.2.2.2, remote AS XXX, external link
Description: "XXXXXX"
BGP version 4, remote router ID 0.0.0.0
BGP state = Idle
Last read 00:00:09, last write 00:00:09, hold time is 180, keepalive interval is 60 seconds
...................................
....................................
                               Outbound    Inbound
Local Policy Denied Prefixes:  --------    -------
Total:                                0          0
Number of NLRIs in the update sent: max 0, min 0
Address tracking is enabled, the RIB does have a route to 2.2.2.2
Connections established 11; dropped 11
Last reset 00:00:13, due to BGP Notification sent, hold time expired
External BGP neighbor may be up to 2 hops away.
No active TCP connection

#sh tcp brief all | inc 179
xxxxx 1.1.1.1.179 2.2.2.2.62249 SYNRCVD
xxxxx 1.1.1.1.179 2.2.2.2.51573 FINWAIT1
xxxxx *.179 2.2.2.2.* LISTEN

#sh ip route 2.2.2.2
Routing entry for 2.2.2.2/32
Known via "static", distance 1, metric 0
Routing Descriptor Blocks:
* x.x.x.33
Route metric is 0, traffic share count is 1
x.x.x.21
Route metric is 0, traffic share count is 1

#ping
Protocol [ip]:
Target IP address: 2.2.2.2

Repeat count [5]: 100
Datagram size [100]: 1500
Timeout in seconds [2]:
Extended commands [n]: y
Source address or interface: 1.1.1.1

Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 100, 1500-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (100/100), round-trip min/avg/max = 64/77/264 ms

Regards,

arel

cadet alain · ‎08-15-2011

That's really strange: the link is down and, since it is PPP, the line protocol should be down, so the route recursing to this interface should disappear from the routing table, yet it is still there. Normally you would be doing CEF switching, so if the load balancing is per src-dst-port, the ICMP echoes may be taking the interface that is still up while the TCP traffic to port 179 is taking the route through the down interface. This may be one possible explanation, but without packet captures or a debug I don't know how we can verify it.

If this is the cause then we'll have to investigate why, but I suspect PPP is the culprit.
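To make the per-flow idea above concrete, here is a rough Python sketch of how a flow hash could send ICMP and TCP between the same two loopbacks over different next hops. It assumes the hash input includes the protocol and L4 ports, which is a configuration option and not a given, and it uses SHA-256 purely for illustration; the real CEF hash is different. The function names are hypothetical.

```python
import hashlib

def pick_path(paths, src, dst, proto, sport=0, dport=0):
    """Pick one next hop for a flow by hashing its identifying fields.

    Every packet of the same flow hashes to the same next hop, so a flow
    never moves between links while the path list stays the same.
    """
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]

def is_delivered(next_hop, live_next_hops):
    """A flow hashed onto a next hop behind a dead link is black-holed."""
    return next_hop in live_next_hops

# Both static next hops are still installed, but only x.x.x.21 is reachable.
paths = ["x.x.x.21", "x.x.x.33"]
live = {"x.x.x.21"}

flows = {
    "icmp echo": ("1.1.1.1", "2.2.2.2", "icmp", 0, 0),
    "bgp tcp": ("1.1.1.1", "2.2.2.2", "tcp", 62249, 179),
}
for name, flow in flows.items():
    hop = pick_path(paths, *flow)
    print(name, "->", hop, "delivered" if is_delivered(hop, live) else "dropped")
```

The point is only that as long as a dead next hop stays in the path list, whichever flows happen to hash onto it are lost while other flows between the same addresses keep working.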

Regards.

Alain.

Don't forget to rate helpful posts.

muhammad shahrir osman · ‎08-15-2011

Hi Alain,

Thank you very much for your help. I'll try to investigate this problem further. Thanks!

BTW, I'm just sharing the "show interface" output captured during the test.

BGP activity 429275/38504 prefixes, 16523152/14843023 paths, scan interval 60 secs

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
2.2.2.2 4 XXX 5493203 106276 0 0 0 00:00:54 Active

#sh int po7/1
POS7/1 is up, line protocol is up
Hardware is Packet over SONET
Description: "XXXXXX"
Internet address is x.x.x.22/30
MTU 4470 bytes, BW 622000 Kbit, DLY 100 usec, rely 255/255, load 11/255
Encapsulation PPP, crc 32, loopback not set
Keepalive set (10 sec)
Scramble disabled
LCP Open
Listen: CDPCP
Open: IPCP
Last input 00:00:00, output 00:00:00, output hang never
Last clearing of "show interface" counters 14:52:14
Queueing strategy: fifo
Output queue 0/40, 0 drops; input queue 0/75, 0 drops
        Available Bandwidth 598962 kilobits/sec
5 minute input rate 189038000 bits/sec, 26256 packets/sec
5 minute output rate 28902000 bits/sec, 13571 packets/sec
     2702023452 packets input, 2235784276601 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
              0 parity
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     1122995631 packets output, 332388913211 bytes, 0 underruns
     0 output errors, 0 applique, 0 interface resets
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions

#sh int po7/2
POS7/2 is administratively down, line protocol is down
Hardware is Packet over SONET
Description: "XXXXXXXXX"
Internet address is x.x.x.34/30
MTU 4470 bytes, BW 622000 Kbit, DLY 100 usec, rely 255/255, load 7/255
Encapsulation PPP, crc 32, loopback not set
Keepalive set (10 sec)
Scramble disabled
LCP Closed
Closed: IPCP, CDPCP
Last input 00:02:45, output 00:02:44, output hang never
Last clearing of "show interface" counters 14:52:14
Queueing strategy: fifo
Output queue 0/40, 0 drops; input queue 0/75, 0 drops
        Available Bandwidth 598962 kilobits/sec
5 minute input rate 140945000 bits/sec, 19663 packets/sec
5 minute output rate 19201000 bits/sec, 8556 packets/sec
     2650157360 packets input, 2209077098852 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
              0 parity
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     1120030742 packets output, 331973731804 bytes, 0 underruns
     0 output errors, 0 applique, 2 interface resets
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions

regards,

arel

cadet alain · ‎08-15-2011

Hi,

so the line protocol is down but the route stays in the routing table marked with an asterisk so it's gonna be used for some traffic.

What happens if the other side shuts down the link instead?
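One general way a static route can outlive its interface is when it points at a next-hop address only and that address is still covered by some other route. The Python sketch below illustrates that rule; it is not a diagnosis of this case. The addresses come from the documentation range 192.0.2.0/24, and the covering /24 is an assumption added for the example, since nothing in the outputs posted so far shows such a route.

```python
import ipaddress

def resolvable(next_hop, rib_prefixes):
    """A next hop is usable if any prefix in the RIB covers it."""
    ip = ipaddress.ip_address(next_hop)
    return any(ip in ipaddress.ip_network(p) for p in rib_prefixes)

def installed_next_hops(static_next_hops, rib_prefixes):
    """Keep only the static next hops that still resolve through the RIB."""
    return [nh for nh in static_next_hops if resolvable(nh, rib_prefixes)]

# Hypothetical next hops standing in for x.x.x.21 and x.x.x.33.
next_hops = ["192.0.2.21", "192.0.2.33"]

both_links_up = ["192.0.2.20/30", "192.0.2.32/30"]
one_link_down = ["192.0.2.20/30"]                    # connected /30 withdrawn
with_covering = ["192.0.2.20/30", "192.0.2.0/24"]    # assumed covering route

print(installed_next_hops(next_hops, both_links_up))
print(installed_next_hops(next_hops, one_link_down))
print(installed_next_hops(next_hops, with_covering))
```

In the third case the next hop behind the down link still resolves, so the static path stays installed. Checking how the next hop of the down link resolves while the link is shut would show whether something like this is happening here.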

Regards.

Alain.

Don't forget to rate helpful posts.

muhammad shahrir osman · ‎08-15-2011

Hi,

We have never tried shutting down on the Juniper side, only on the Cisco side. Is there any difference if we shut the other side instead?

Another thing that I don't understand: during the test, when I traced to the Juniper lo0, I could see IP 8.8.8.8 (not the real IP, I changed it) in the trace output. My customer said that IP is connected to their core. Thanks

#traceroute
Protocol [ip]:
Target IP address: 2.2.2.2
Source address: 1.1.1.1
Numeric display [n]:
Timeout in seconds [3]:
Probe count [3]:
Minimum Time to Live [1]:
Maximum Time to Live [30]:
Port Number [33434]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Type escape sequence to abort.
Tracing the route to 2.2.2.2
  1 2.2.2.2 [AS XXX] 52 msec
    8.8.8.8 0 msec
    2.2.2.2 [AS XXX] 52 msec

Regards,

arel

cadet alain · ‎08-15-2011

Hi,

No, it should be the same, but as it is behaving quite strangely I'd like to see whether the result is the same when the link is shut from the other side.

8.8.8.8 is a public DNS server from Google. Why is it appearing in the traceroute?

Regards.

Alain.

Don't forget to rate helpful posts.

gerald.suiza · ‎08-15-2011

the interface output is strange:

#sh int po7/2
POS7/2 is administratively down, line protocol is down
Hardware is Packet over SONET
Description: "XXXXXXXXX"
Internet address is x.x.x.34/30
MTU 4470 bytes, BW 622000 Kbit, DLY 100 usec, rely 255/255, load 7/255
Encapsulation PPP, crc 32, loopback not set
Keepalive set (10 sec)
Scramble disabled
LCP Closed
Closed: IPCP, CDPCP
Last input 00:02:45, output 00:02:44, output hang never
Last clearing of "show interface" counters 14:52:14
Queueing strategy: fifo
Output queue 0/40, 0 drops; input queue 0/75, 0 drops
        Available Bandwidth 598962 kilobits/sec
5 minute input rate 140945000 bits/sec, 19663 packets/sec
5 minute output rate 19201000 bits/sec, 8556 packets/sec
     2650157360 packets input, 2209077098852 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
              0 parity
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     1120030742 packets output, 331973731804 bytes, 0 underruns
     0 output errors, 0 applique, 2 interface resets
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions

you are admin down yet there is traffic flowing...

muhammad shahrir osman · ‎08-15-2011

Hi Gerald,

Yup.. it's strange that there is still traffic.

cadet alain · ‎08-15-2011

Hi,

clear the counters before shutting the interface.

Regards.

Alain.

Don't forget to rate helpful posts.

muhammad shahrir osman · ‎08-15-2011

Hi Alain,

Sorry for confusing you. Actually 8.8.8.8 is not the real IP that appeared in the traceroute. All the results that I posted here are correct and real, but before posting I changed or censored (x.x.x.x) the IP addresses. I forgot that 8.8.8.8 is Google's public DNS. Sorry.

BGP Down

#traceroute
Protocol [ip]:
Target IP address: 2.2.2.2
Source address: 1.1.1.1
Numeric display [n]:
Timeout in seconds [3]:
Probe count [3]:
Minimum Time to Live [1]:
Maximum Time to Live [30]:
Port Number [33434]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Type escape sequence to abort.
Tracing the route to 2.2.2.2
  1 2.2.2.2 [AS XXX] 52 msec
    X.X.X.8 0 msec
    2.2.2.2 [AS XXX] 52 msec

BGP UP

Protocol [ip]:
Target IP address: 2.2.2.2
Source address: 1.1.1.1
Numeric display [n]:
Timeout in seconds [3]:
Probe count [3]:
Minimum Time to Live [1]:
Maximum Time to Live [30]:
Port Number [33434]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Type escape sequence to abort.
Tracing the route to 2.2.2.2
  1 2.2.2.2 [AS XXX] 52 msec
