I have built a Proof of Concept lab to test a simple PfR design for Inbound and Outbound load balancing but I am having mixed results. You can see a diagram of the network topology here:
The diagram details SITE A. SITE B is an exact mirror of SITE A apart from its addressing: 10.128.62.0/24 and 10.128.63.0/24 are the subnets that connect the iPerf PCs at SITE B. All routers are 3945Es running 15.2(4)M5.
Each site has two client subnets: 10.128.60.0/24 and 10.128.61.0/24 at SITE A, and 10.128.62.0/24 and 10.128.63.0/24 at SITE B. A PC on each subnet runs iPerf tests. These are TCP tests and are policed at 5Mb. One bidirectional test runs between 10.128.60.0 at SITE A and 10.128.62.0 at SITE B, and another between 10.128.61.0 at SITE A and 10.128.63.0 at SITE B. I was expecting that PfR would keep one transfer on one BR and move the other transfer to the other BR.
eBGP is configured between SITE A and the Cloud and also between SITE B and the Cloud. R1 and R2 at SITE A are also iBGP neighbors. Internally they run EIGRP as the IGP, and mutual redistribution is configured between EIGRP and BGP on both routers. R1 is configured as the primary link for inbound and outbound traffic: a better EIGRP seed metric ensures it is chosen outbound, and BGP prepending on R2 ensures R1 is also chosen inbound.
After running the setup I have encountered the following issues:
Links are not load balanced – this varies but what I am seeing right now is that R1 is sending roughly 10Mb outbound and receiving 5Mb inbound. The other 5Mb is received inbound on R2 but R2 has no outbound. At SITE B R3 is sending and receiving at 10Mb and R4 is carrying no traffic.
PfR is causing an outage where traffic is lost for a period of time – I can see an outage of almost a full minute during which iPerf reports 0 bytes per second. This happened once, after about a minute of successful transfers, and everything worked fine after the outage.
I suspect these issues are “user” error rather than PfR itself. I have included full configs for SITE A and SITE B and also a log capture from SITE A. If anyone could have a look through and see if I have done anything incorrectly or could give me some pointers I would appreciate it.
I had another look today and fixed a few problems with my configs. This is now working, although I intend to do further tests.
1) Missing "local interface" on one of the BRs
2) WAN interfaces had incorrect masks
3) Incorrectly configured "absolute" on R3 and R4 instead of "percent", which limited the interface to 100K instead of 10Mb
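To illustrate why item 3 had such a large effect, here is a minimal Python sketch of how the two forms of a utilization limit differ. It assumes the absolute form is read as a fixed rate in kbps and the percent form as a share of the link bandwidth. The helper name, the value of 100, and the 10 Mb/s link figure are illustrative assumptions, not values taken from the posted configs.

```python
def max_utilization_kbps(mode, value, link_kbps):
    """Return the utilization ceiling in kbps (hypothetical helper)."""
    if mode == "absolute":
        # The value is used as-is, in kbps, regardless of link speed.
        return value
    if mode == "percent":
        # The value is a percentage of the link bandwidth.
        return link_kbps * value / 100
    raise ValueError(f"unknown mode: {mode}")


LINK_KBPS = 10_000  # assumed 10 Mb/s WAN link

# Same numeric value, very different ceilings.
print(max_utilization_kbps("absolute", 100, LINK_KBPS))  # 100 kbps
print(max_utilization_kbps("percent", 100, LINK_KBPS))   # 10000.0 kbps
```

Under these assumptions, the same number entered with the wrong keyword caps the link at one hundredth of what was intended, which is consistent with the 100K versus 10Mb difference described above.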
I'm just now familiarizing myself with PfR, and I think it's a truly great solution. Unfortunately, my company makes heavy use of static routing. I'd like to set a path toward improvements and introduce dynamic routing in combination with PfR to help improve the overall performance of application-based routing, something that cannot be achieved through dynamic routing protocols alone.
Was this simple to install, and how does PfR behave on MPLS networks? We currently have a BGP handoff to the provider, where the packet is encapsulated onto MPLS. Is OSPF and/or EIGRP required internally for PfR to work properly?
It was interesting to see that the solution to your problem came down to simple yet very common mistakes (incorrect subnet masks etc.). Good catch, and helpful for others to keep in mind.