Solved: Beginner BGP Preference question

cco@leferguson.com · ‎09-12-2024

I'm less than a novice at BGP, and helping a client. They have two sites in separate counties that have both redundant equipment and paths between them. Each site connects, separately, to an AT&T state run network that is semi-private (meaning it has both private and public addresses).

The current vendor's firewall (some weird non-branded thing) connects to the AT&T network, and advertises BGP into AT&T from each end. Each end advertises its own and the opposite end's internal (yes, private) addresses. I'll make up some addresses:

Site 1: 
Local VLAN addresses 10.1.0.0/16
Advertised addresses: 10.1.0.0/16 and 10.2.0.0/16

Site 2: 
Local VLAN addresses: 10.2.0.0/16
Advertised addresses: 10.1.0.0/16 and 10.2.0.0/16

The result of this is that traffic between each site and the VOIP connections AT&T is providing goes like this:

Site 1 originated path: 
10.1.x.y to AT&T goes out Site 1's firewall 
AT&T returns the connection to Site 1's firewall, to 10.1.x.y
Everything's happy and works.

Site 2 originated path:
10.2.x.y to AT&T goes out Site 2's firewall 
AT&T returns the connection to Site 1's firewall addressed to 10.2.x.y
Site 1's router forwards over internal connection to other end, and thence to 10.2.x.y

This means the return path for a site 2 originating session is asymmetric, returning through site 1. This is working for many things but not some. Note that NAT is not involved, but the firewalls are apparently not happy with a TCP session beginning on one and ending on another, as you would expect. (What is not clear after literally days of discussion is why it was designed this way by the vendor).

Here's the thing - they say they can't change their firewall to do this "right" (in any of the several possibilities), and are suggesting that the AT&T network connection move to our Cisco router, and we advertise BGP in a way that will work.

This implies I need to advertise (for example) 10.1.0.0 from Site 1, with a lower preference for 10.2.0.0, and the reverse from the other side. Except my head is hurting from reading about BGP and inbound route preference.

I THINK what I want to do is specify MED with a route map, and I see some indication of how in this document.

But I am really a bit gun shy as to whether this is a proper fix (notably because if it is, it would seem they could do it in their firewall now (to which we have no access, by the way)).

I should also note this is a critical system that while highly redundant, shifting the BGP advertisements could take down all portions at once, so I cannot experiment practically (though I have asked to set up a separate pair of test connections for that purpose).

To the extent it matters, EIGRP is running in the internal network between and among sites 1 and 2, and we have complete control over the transport portions of that network, just not the VOIP systems and that firewall at each end, which another vendor maintains (the vendor that has more or less thrown up their hands and given up).

Bottom line -- is MED the correct path (pun intended) I should be pursuing? If not, can you give me a pointer to continue my research?

Linwood


paul driver · ‎09-12-2024

Hello
Having dual sites, each with its own WAN egress and also sharing, say, a dark fibre LES connection, is a valid design. It's a good form of WAN redundancy; in fact we have various clients doing this at present.

Can you clarify a few things:

Does each site run its own BGP ASN?
Does each site run any internal IGP or iBGP?
Are the WAN connections running iBGP/eBGP?
What routing process is the S2S connection running?
Are you performing any mutual redistribution?
What routes are you receiving from the WAN - full/partial/default?

Can you share a topology diagram please


Kind Regards
Paul

cco@leferguson.com · ‎09-12-2024

@paul driver wrote:
Hello
Having dual sites, each with its own WAN egress and also sharing, say, a dark fibre LES connection, is a valid design. It's a good form of WAN redundancy; in fact we have various clients doing this at present.

Can you clarify a few things:
Does each site run its own BGP ASN?
Does each site run any internal IGP or iBGP?
Are the WAN connections running iBGP/eBGP?
What routing process is the S2S connection running?
Are you performing any mutual redistribution?
What routes are you receiving from the WAN - full/partial/default?
Can you share a topology diagram please

The short answer is that I do not have access to the firewall where this information exists. This was purchased as a supposedly turn-key solution, to which we were to plug in an internal connection between sites. The vendor does not permit us to access their equipment. Our Ciscos simply treat those firewalls as a default gateway -- the internal Ciscos are not exchanging any routing protocols with the vendor's firewall. The firewall treats the Cisco as the next hop for all internal addresses (for either site), all statically routed.

We are running EIGRP in between the sites to track connectivity over our diverse internal paths, and the cisco routers are in HSRP pairs. We could change routing protocols between sites as needed.

I am not sure if the vendor's firewall is receiving routes from AT&T; I believe they have static routes in each firewall to specific services there. At best they probably have a "track" type check, but they can't pass routing protocols across the internal network. We also know, from our experiments, that we can fail one AT&T path and calls complete to both ends still, but believe that is at the application level above IP. This is a critical site for life safety and we can only do limited testing.

I could make up a diagram (I cannot share a real one), but it is literally a triangle -- Site 1, Site 2 and AT&T connected to each other. We control the links between Site 1 and Site 2 and those work fine, fully redundant and pass traffic appropriately for all addresses. The path from each site to AT&T is under AT&T's control on their end, and this firewall vendor's on our end.

**IF** we redo the connection to cut these firewalls out, clearly AT&T is going to have to work out how we peer with them, AS numbers, routes, etc. But the core of the question before we want to start down this path is, if we do that, are we going to be able to properly give a preference so that Site 1's return traffic goes to Site 1, and same for Site 2, but each will fail over to the other (for return traffic -- outbound we can handle).

We are reluctant to head down this path, and cut out part of a turn-key solution, especially since it is live and the original system long dismantled. Except, well, it's not working and they can't seem to fix it. Us taking over BGP was their suggestion, not ours. So trying to understand feasibility. I would assume AT&T would be cooperative in setting this up, I just don't know enough about AT&T to know if it works that way.

Here's a diagram but not sure it will help. We have control and visibility only to the cisco routers (really 2 HSRP pairs) and the yellow links.

Giuseppe Larosa · ‎09-13-2024

Hello cco@leferguson.com ,

Looking at your network diagram, the fact that private IP addresses are used without NAT on the firewalls makes me think that AT&T is providing an MPLS L3 VPN service to your customer's two sites.

Both sites are currently advertising both prefixes 10.1.0.0/16 and 10.2.0.0/16 either with eBGP or by using static routes on the corresponding PE nodes.

>> Here's the thing - they say they can't change their firewall to do this "right" (in any of the several possibilities), and are suggesting that the AT&T network connection move to our Cisco router, and we advertise BGP in a way that will work.

if AT&T is providing a L3 VPN MPLS service the change can be successful only by working with their tech support.

Whether MED can be used depends on how many AT&T BGP ASNs are involved, since MED is only honoured by the directly neighbouring AS and is not passed on beyond it.

You could try selective AS path prepending: prepend your own AS number to the prefix that you would like to make less attractive. On site 1 that is 10.2.0.0/16; on site 2 it is 10.1.0.0/16.

Again, if this is an MPLS L3 VPN service you need to work with AT&T tech staff for this to be effective.

Hint: the PE node can strip private AS number (remove private AS ) or it can change each occurrence of them with its own AS number ( as-override).

This is why even selective AS path prepending may not work as intended without cooperation of the ISP.
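
To make the intended effect of selective prepending concrete, here is a minimal Python sketch of a heavily simplified route choice. It is only an illustration under stated assumptions: it models just two tie-breakers (shorter AS path, then lower MED when both routes come from the same neighbouring AS), whereas real BGP best-path selection has more steps. The AS number 65001 and the names `Route`, `prefer`, `best_path` and `advertise` are hypothetical, and the sketch does not model any rewriting of the AS path by the PE.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

CUSTOMER_AS = 65001  # hypothetical private AS number, not from the thread


@dataclass(frozen=True)
class Route:
    prefix: str
    site: str                 # site that advertised the route
    as_path: Tuple[int, ...]  # AS path as the provider receives it
    med: int = 0


def prefer(a: Route, b: Route) -> Route:
    """Choose between two routes for the same prefix (simplified subset of BGP)."""
    # Shorter AS path wins.
    if len(a.as_path) != len(b.as_path):
        return a if len(a.as_path) < len(b.as_path) else b
    # By default MED is compared only between routes from the same neighbouring AS.
    if a.as_path and b.as_path and a.as_path[0] == b.as_path[0] and a.med != b.med:
        return a if a.med < b.med else b
    # Still tied: keep the first route (real BGP applies further tie-breakers).
    return a


def best_path(routes: List[Route], prefix: str) -> Optional[Route]:
    best: Optional[Route] = None
    for route in routes:
        if route.prefix == prefix:
            best = route if best is None else prefer(best, route)
    return best


def advertise(site: str, local_prefix: str, remote_prefix: str, prepend: int = 2) -> List[Route]:
    """Advertise the local prefix plainly and the other site's prefix with extra prepends."""
    return [
        Route(local_prefix, site, (CUSTOMER_AS,)),
        Route(remote_prefix, site, (CUSTOMER_AS,) * (1 + prepend)),
    ]


routes = (advertise("site1", "10.1.0.0/16", "10.2.0.0/16")
          + advertise("site2", "10.2.0.0/16", "10.1.0.0/16"))

print(best_path(routes, "10.1.0.0/16").site)
print(best_path(routes, "10.2.0.0/16").site)

# Simulate site 2 going away: only site 1's advertisements remain.
site2_down = [r for r in routes if r.site != "site2"]
print(best_path(site2_down, "10.2.0.0/16").site)
```

In this toy model each prefix returns via its own site while both sites are up, and 10.2.0.0/16 falls back to site 1's prepended route when site 2 disappears. Whether a real provider edge behaves this way depends on how it treats the customer AS in the path, which is the caveat above.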

So I would suggest to open a ticket with the provider to get more info about the offered service.

Hope to help

Giuseppe

cco@leferguson.com · ‎09-13-2024

Yeah... they are trying to arrange a meeting with AT&T as well as the equipment vendor. At present we have only heard from the equipment vendor. We did a Teams session to capture packet traces to prove to the vendor that the asymmetrically routed SYN/ACK packet was not making it from his firewall to our Ciscos. That's when I was able to learn a bit about their setup, by watching screens fly by, and I talked him into showing me a few more, more carefully. I never saw even a brand on the firewall (other than that company name), so I think it's something they have wrapped up and rebranded. The tech is only allowed in the firewall GUI, no CLI. He did show me the GUI BGP advertisement page (if the AS was there it went by too fast). Both sites had identical advertisements: just a list of networks, no parameters, route maps, etc.

AT&T is delivering this via Ethernet. It's quite possible they have a VPN-related device on site, but in my limited viewing of the firewall there were no signs of MPLS or any VPN. Their route table, NAT table and such (which kind of flew by) all looked like a simple point-to-point connection with a non-private IP address outside, and some specific address groups routed to that port (it also had a regular internet port which was its default gateway). If I had to guess, their "firewall" is OpenWRT or DD-WRT or something wrapped in their own GUI.

I spent a long time not believing NAT was not involved, but the outside captures were clearly routed to our LAN addresses, and clearly asymmetric (departing with a source address on site 2, with the ack arriving on site 1 to that same private destination).

I agree that getting AT&T involved is needed. My fear is that we are being pushed to take over this connection with the idea we can do it differently/better than their firewall, and trying to understand if there is a basic limitation of BGP or if it is their bad configuration. And reluctant because while I am pretty good with internal routing protocols, BGP and I are not acquainted. The last thing I want is spend a lot of time (and potentially down time) and end up with the same restriction but now our own fault.

Thanks for the observations. I'll return if I get more clarity.

MHM Cisco World · ‎09-13-2024

It is hard without NAT.

But you can make all traffic for both sites pass via one FW only (no load balancing).

The interconnect link is then used to direct traffic from one site to the FW of the other site.

MHM

cco@leferguson.com · ‎09-13-2024

I am sorry, I do not quite understand what you are saying. Just use one side all the time except in failover? That's probably possible (if we take over from their firewall). Not exactly desirable.

Question: Let's assume for the moment that there are different AS numbers for each site, and we took over and advertised (only) the desired path from each. Is there something like a "track" statement that would let me change advertisements on the fly, so if site 2 went down, I could add an advertised route to site 1?

There's a propagation time delay, but we are going to have that anyway with routes with some kind of preference, as the AT&T side has to see the absence of connection to the peer at (for example) site 2 and so route to site 1 instead. Would those times be similar?

MHM Cisco World · ‎09-13-2024

Take Site A and Site B.

Site A uses FW-A to forward traffic.

Site B also uses FW-A to forward traffic (after it passes over the interconnect).

Site B advertises the prefixes with lower preference (using MED or AS path prepending).

This makes traffic ingress and egress via FW-A.

If FW-A fails, then Site B and Site A will both use FW-B.
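
As a rough illustration of the active/standby idea, here is a minimal Python sketch. It assumes both sites peer from the same customer AS so that the provider actually compares MED, and that all other attributes are equal; the MED values and the names `MED_BY_SITE` and `ingress_site` are hypothetical.

```python
from typing import Dict, Optional, Set

PREFIXES = ["10.1.0.0/16", "10.2.0.0/16"]

# MED each site attaches to every prefix it advertises; lower is preferred.
# The values are hypothetical and chosen only to make Site A the primary.
MED_BY_SITE: Dict[str, int] = {"SiteA": 50, "SiteB": 200}


def ingress_site(prefix: str, sites_up: Set[str]) -> Optional[str]:
    """Return the site the provider would send traffic for `prefix` to."""
    if prefix not in PREFIXES:
        return None
    # Only sites whose BGP session is up still have routes in the provider's table.
    candidates = [site for site in MED_BY_SITE if site in sites_up]
    if not candidates:
        return None
    return min(candidates, key=lambda site: MED_BY_SITE[site])


for prefix in PREFIXES:
    print(prefix, ingress_site(prefix, {"SiteA", "SiteB"}), ingress_site(prefix, {"SiteB"}))
```

With both sites up, every prefix enters via Site A; once Site A's session is gone, the same prefixes enter via Site B.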

If you have questions, feel free to ask.

MHM

cco@leferguson.com · ‎09-13-2024

Got it. I think we could do that, but it's not the most desired setup, we want each county site to pass its own traffic if possible, and fail over only if it fails. While we have diverse paths between the sites, they are not high capacity or low latency (though we plan to change that with new microwave links). So this round-about path is functional but not desirable.

MHM Cisco World · ‎09-13-2024

>> So this round-about path is functional but not desirable.

Unfortunately Yes

MHM

cco@leferguson.com · ‎09-13-2024

Well, mostly functional. Most of the traffic appears to be routed by the firewall, but when there's a SYN from Site 2 whose SYN/ACK response arrives at Site 1, the Site 1 firewall drops it, so certain functions are not working. That is how we actually found out about the round-about routing. We were TOLD that each site handled its own traffic and never dug into it. When this came up and we looked, we found the between-site path carrying return traffic to Site 2 for everything. Most of which is working, but not some.

Moving from the firewalls to the Cisco, even with the same BGP routing, would "fix" the dropped packets as we wouldn't be doing stateful inspection, but we really would like to have the normal case where each site's traffic returns to that site. And they don't know how to turn off the inspection in the firewalls.
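
For anyone following along, the drop can be pictured with a minimal Python sketch of generic stateful inspection. This is an assumption about how the vendor's firewall behaves, inferred from the captures, not something we have seen in its configuration; the class `StatefulFirewall`, the host 10.2.5.10, the port numbers and the AT&T-side address 192.0.2.50 are all made up for illustration.

```python
from typing import Set, Tuple

# (source IP, source port, destination IP, destination port)
Flow = Tuple[str, int, str, int]


class StatefulFirewall:
    def __init__(self, name: str) -> None:
        self.name = name
        self.seen_syns: Set[Flow] = set()  # connections this firewall saw being opened

    def outbound_syn(self, src: str, sport: int, dst: str, dport: int) -> None:
        """Record a connection attempt that left through this firewall."""
        self.seen_syns.add((src, sport, dst, dport))

    def inbound_syn_ack(self, src: str, sport: int, dst: str, dport: int) -> bool:
        """Forward a SYN/ACK only if it answers a SYN this firewall recorded."""
        return (dst, dport, src, sport) in self.seen_syns


fw_site1 = StatefulFirewall("site1")
fw_site2 = StatefulFirewall("site2")

# A Site 2 host opens a session; the SYN leaves via Site 2's firewall.
fw_site2.outbound_syn("10.2.5.10", 40000, "192.0.2.50", 5060)

# The reply is routed back to Site 1, whose firewall never saw the SYN.
print(fw_site1.inbound_syn_ack("192.0.2.50", 5060, "10.2.5.10", 40000))
# Had it come back through Site 2, the state would have matched.
print(fw_site2.inbound_syn_ack("192.0.2.50", 5060, "10.2.5.10", 40000))
```

The state lives only on the firewall that saw the SYN, which is why a symmetric return path (or no stateful inspection at all) makes the problem go away.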

At present the preferred path between sites is a very slow fiber. It's quite reliable but is throttled at 10 Mbps for reasons lost in state politics. There's a secondary path over microwave but, for county politics reasons I do not understand, we can't make it primary for the inter-site traffic.

There's talk of a new microwave hop to become primary that would be fast and low latency, but it won't happen for a year or two; its money is tied up in... yes, politics.

I don't understand BGP, but I understand BGP a lot better than politics.

So the desirable thing is have each site handle its own traffic except in failure modes. Which is also easier than explaining why traffic goes around the circle as well.

MHM Cisco World · ‎09-13-2024

Friend, both sites advertise the same prefixes.

How does an external client know whether this IP is from Site A or Site B?

If we configure routing so there is only one way for inbound and outbound traffic, then there is no asymmetry.

It is not an issue of how you configure BGP.

MHM

cco@leferguson.com · ‎09-13-2024

I realize how it is configured now. My question is how it can be configured to work the way I want it to, so that there's a preference.

If this was OSPF or EIGRP it would figure out the shortest path (or the best by some other criterion). I realize BGP is different, but I was hoping there was a way that this could be configured so each side told AT&T to use its own path, but also offered (somehow) the alternate for failover. I apologize if I lack the correct terminology. If you are saying "BGP can't do that" -- OK. If you are saying I'm an idiot -- OK, noted. But I think my question is valid.

MHM Cisco World · ‎09-13-2024

Friend, don't say that about yourself; I am sometimes the king of idiots on topics I know little about.

So I will stay with you and try to help as much as I can.

I will draw the topology with some notes; let's check it together.

MHM

MHM Cisco World · ‎09-13-2024

What IGP do you run?
Please check the topology below.