05-30-2025 09:21 AM
Hi Community Members,
I set up failover/load balancing on a Cisco router using dual ISPs and dual LAN segments, as shown below.
Edge router config:
track 1 ip sla 1 reachability
!
track 2 ip sla 2 reachability
!
interface FastEthernet0/0
ip address 10.0.137.54 255.255.255.0
ip nat outside
duplex full
!
interface FastEthernet1/0
ip address 172.16.0.2 255.255.255.252
ip nat outside
duplex full
!
interface Ethernet2/0.10
description FIBER-CONNECTION
encapsulation dot1Q 10
ip address 192.168.10.1 255.255.255.0
ip nat inside
ip ospf 1 area 0
!
interface Ethernet2/0.20
description MICROWAVE-CONNECTION
encapsulation dot1Q 20
ip address 192.168.20.1 255.255.255.0
ip nat inside
ip ospf 1 area 0
!
ip nat inside source route-map FIBER-LINK interface FastEthernet0/0 overload
ip nat inside source route-map MICROWAVE-LINK interface FastEthernet1/0 overload
!
ip route 0.0.0.0 0.0.0.0 10.0.137.1 name FIBER-LINK track 1
ip route 0.0.0.0 0.0.0.0 172.16.0.1 name MICROWAVE-LINK track 2
!
ip sla 1
icmp-echo 4.2.2.2 source-ip 10.0.137.54
owner FIBER-LINK
ip sla schedule 1 life forever start-time now
ip sla 2
icmp-echo 172.16.0.1 source-ip 172.16.0.2
owner FIBER-LINK
ip sla schedule 2 life forever start-time now
access-list 101 permit ip 192.168.20.0 0.0.0.255 any
access-list 102 permit ip 192.168.20.0 0.0.0.255 any
!
route-map MICROWAVE-LINK permit 10
match ip address 102
set ip next-hop verify-availability 172.16.0.1 1 track 2
!
route-map FIBER-LINK permit 10
match ip address 101
set ip next-hop verify-availability 10.0.137.1 1 track 1
!
!
!
event manager applet CLEARIPNAT1
event syslog pattern "%TRACKING-5-STATE: 2 ip sla 2 reachability Up->Down"
action 1.0 cli command "enable"
action 2.0 cli command "route-map MICROWAVE-LINK permit 10"
action 3.0 cli command "set ip next-hop verify-availability 10.0.137.1 1 track 1"
action 4.0 cli command "clear ip nat translation *"
event manager applet CLEARIPNAT2
event syslog pattern "%TRACKING-5-STATE: 2 ip sla 2 reachability Down->Up"
action 1.0 cli command "enable"
action 2.0 cli command "route-map MICROWAVE-LINK permit 10"
action 3.0 cli command "no set ip next-hop verify-availability 10.0.137.1 1 track 1"
!
end
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Testing !!!!!!!!!!!!!
When one ISP is down:
EDGE_ROUTER#ping 4.2.2.2 source 192.168.20.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 192.168.20.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 204/210/216 ms
EDGE_ROUTER#trace 4.2.2.2 source 192.168.20.1
Type escape sequence to abort.
Tracing the route to b.resolvers.level3.net (4.2.2.2)
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.137.1 12 msec 4 msec 8 msec
2 10.10.35.1 12 msec 4 msec 12 msec
3 10.10.30.254 12 msec 4 msec 8 msec
EDGE_ROUTER#trace 4.2.2.2 source 192.168.10.1
Type escape sequence to abort.
Tracing the route to b.resolvers.level3.net (4.2.2.2)
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.137.1 12 msec 60 msec 12 msec
2 10.10.35.1 8 msec 8 msec 12 msec
3 10.10.30.254 8 msec 12 msec 8 msec
%%%%%%%%%%%%% WHEN BOTH ISP ARE ALL UP AND WORKING %%%%%%%%
EDGE_ROUTER#ping 4.2.2.2 source 192.168.20.1
Packet sent with a source address of 192.168.20.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 200/202/212 ms
EDGE_ROUTER#trace 4.2.2.2 source 192.168.20.1
Type escape sequence to abort.
Tracing the route to b.resolvers.level3.net (4.2.2.2)
VRF info: (vrf in name/id, vrf out name/id)
1 10.0.137.1 24 msec * 8 msec
2 10.0.137.1 24 msec
10.10.35.1 12 msec
10.0.137.1 24 msec
3 10.10.10.254 8 msec
10.10.35.1 24 msec
10.10.10.254 8 msec
4 10.10.10.254 20 msec
10.46.6.185 12 msec
10.10.10.254 24 msec
5 200.100.65.74 8 msec
10.46.6.185 24 msec
200.100.65.74 8 msec
6 200.100.65.74 20 msec
200.100.65.83 8 msec
200.100.65.74 20 msec
7 200.100.65.18 12 msec
200.100.65.83 24 msec
200.100.65.18 8 msec
8 200.100.65.18 28 msec
200.100.65.22 12 msec
200.100.65.18 24 msec
Note: the trace pattern above is the same from the other LAN segment.
05-30-2025 03:13 PM - edited 05-30-2025 11:05 PM
Edit:
To summarise most issues I think you are having
1. For this point, I don't think your traceroutes are taking the expected routes, because Policy Based Routing (PBR) does not apply to traffic generated locally by the router. This is comparable to ACLs applied to an interface in the outbound direction with the 'access-group' command: traffic generated by the router itself is not processed and is allowed through, regardless of whether a match would have occurred.
So what happens when generating traceroutes/pings from the router itself under different circumstances?
Since both static routes have an AD of 1, both are inserted into the RIB and the CEF table. With an ECMP (equal-cost) route to 4.2.2.2, traffic sourced from the Fiber link interface IP should always egress out the Fiber link. If there is a single better route, it will choose that instead. However, with an ECMP route to 4.2.2.2 (two default static routes), when the Fiber link is lost, the router should attempt to send the traffic out of the Microwave link, even if it is sourced from the Fiber link interface IP. As another example, when you source traceroutes from the internal LAN-side interfaces such as e2/0.x, the CEF table uses hash buckets to assign the same flows to the same next hop. So traceroutes from the same source to the same destination will always leave through whichever internet link was picked, and once a link is picked the flow should stay on it. Since CEF is left to make the decision, the 'Fiber LAN' could end up going over the Microwave link, and so on.
If you want traceroutes sourced from e2/0.20 to go over the Microwave link and those from e2/0.10 to traverse the Fiber link, you will need 'local PBR' applied globally to the router (it only affects traffic initiated by the router).
Edit:
2. Because your access-lists 101 and 102 are the same (192.168.20.0/24), the only traffic that can be NAT'd is traffic sourced from 192.168.20.0/24. Clients in the 192.168.10.0/24 subnet should therefore have no connectivity, as their traffic cannot be translated. They may have connectivity because this is a lab, but a real-world ISP would usually filter this out and use BCP 38 to block IP source spoofing. I'm not sure how much you are simulating?
3. I think the PBR needs to be applied under the LAN-side interfaces. The 'Fiber link' route-map needs to be attached to e2/0.10, and the 'Microwave link' route-map needs to be attached to e2/0.20, with 'ip policy route-map <route-map_name>'.
4. The result of the router only following the RIB and CEF table is that if you have an upstream failure somewhere other than the directly connected link, the failure may be detected, resolved, detected, resolved, and so on in a loop. For example, if 4.2.2.2 becomes unreachable over the Fiber link because of an upstream failure (the directly connected cable between the edge router and the ISP router is still OK), the pings will start to fail and the track object will go down. Since the static default route is monitoring the track object, the static route will be removed from the RIB. If there is no other route, the probe can never reach 4.2.2.2 again, because the route is removed and is only added back when more pings succeed. If instead the pings can reach 4.2.2.2 through the Microwave link as the next best route, the track object will come back up and the original static default route over the Fiber link returns to the RIB. However, the upstream failure may still not have been fixed, so the track goes down again and traffic reverts to the Microwave link. This would probably happen every 60 seconds, since that is the default polling period for IP SLA and the frequency is not set in the config? In either case, it is bad. This is where a /32 route to 4.2.2.2 would work. But to isolate the route to traffic with a source IP of the Fiber link interface only, I think local PBR needs to be used. Again, this probably doesn't happen in the lab if it's not simulating proper filtering.
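As a rough, untested sketch of points 2 to 4 (not a complete configuration), assuming the ACL numbers and route-map names from the original post, the changes might look something like this:

```
! Point 2: one ACL per LAN subnet (ACL 101 originally matched 192.168.20.0/24)
no access-list 101
access-list 101 permit ip 192.168.10.0 0.0.0.255 any
!
! Point 3: apply PBR where the LAN traffic enters the router
interface Ethernet2/0.10
 ip policy route-map FIBER-LINK
!
interface Ethernet2/0.20
 ip policy route-map MICROWAVE-LINK
!
! Point 4: host route so the IP SLA 1 target is only reached via the Fiber next hop
ip route 4.2.2.2 255.255.255.255 10.0.137.1
```

The host route is the simplest way to stop the probe from succeeding via the Microwave link; it does not restrict the match to the Fiber interface source IP, which is where local PBR would come in.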
So there are quite a few things missing from my perspective, but hopefully those are some pointers in the right direction.
05-31-2025 12:52 AM
I don't see PBR being used in your original post, which is one of the points the AI reply made to you.
MHM
06-02-2025 12:36 PM
Hi Royalty,
Thanks for the response. However, after the changes the pattern still persists.
Traces to 4.2.2.2 seem to oscillate between the links:
EDGE_ROUTER(config)#do trace 4.2.2.2 source 192.168.20.1
Type escape sequence to abort.
Tracing the route to b.resolvers.level3.net (4.2.2.2)
VRF info: (vrf in name/id, vrf out name/id)
1 *
172.16.0.1 16 msec *
2 10.0.137.1 24 msec * 24 msec
3 *
10.10.101.1 24 msec *
4 10.10.101.254 20 msec * 28 msec
5 *
10.74.6.185 20 msec *
6 112.100.65.74 28 msec * 20 msec
7 *
EDGE_ROUTER(config)#
EDGE_ROUTER(config)#
EDGE_ROUTER(config)#do trace 4.2.2.2 source 192.168.10.1
Type escape sequence to abort.
Tracing the route to b.resolvers.level3.net (4.2.2.2)
VRF info: (vrf in name/id, vrf out name/id)
1 *
10.0.137.1 4 msec *
2 10.10.101.1 0 msec
10.0.137.1 20 msec
10.10.101.1 8 msec
3 10.10.101.1 20 msec
10.10.101.254 28 msec
10.10.101.1 16 msec
4 10.74.6.185 8 msec
10.10.101.254 20 msec
10.74.6.185 8 msec
5 10.74.6.185 20 msec
112.100.65.74 12 msec
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
EDGE_ROUTER#sh run | sec route-map
ip policy route-map FIBER_LINK-LAN
ip policy route-map MICROWAVE-LAN
ip nat inside source route-map FIBER_LINK-LAN interface FastEthernet0/0 overload
ip nat inside source route-map MICROWAVE-LAN interface FastEthernet1/0 overload
route-map MICROWAVE-LAN permit 10
match ip address 102
set ip next-hop verify-availability 172.16.0.1 1 track 2
set ip next-hop verify-availability 10.0.137.1 2 track 1
route-map FIBER_LINK-LAN permit 10
match ip address 101
set ip next-hop verify-availability 10.0.137.1 1 track 1
set ip next-hop verify-availability 172.16.0.1 2 track 2
route-map MICROWAVE-LINK permit 10
match ip address 106
match interface Ethernet2/0.20
route-map FIBER-LINK permit 10
match ip address 105
match interface Ethernet2/0.10
EDGE_ROUTER#
EDGE_ROUTER#
EDGE_ROUTER#sh run | sec access
access-list 101 permit ip 192.168.10.0 0.0.0.255 any
access-list 102 permit ip 192.168.20.0 0.0.0.255 any
access-list 105 permit ip 192.168.10.0 0.0.0.255 any
access-list 106 permit ip 192.168.20.0 0.0.0.255 any
EDGE_ROUTER#
EDGE_ROUTER#
EDGE_ROUTER#
EDGE_ROUTER#sh run | sec ip route
ip route 0.0.0.0 0.0.0.0 10.0.137.1 name FIBER-LINK track 1
ip route 0.0.0.0 0.0.0.0 172.16.0.1 name MICROWAVE-RADIO track 2
EDGE_ROUTER#sh run int Ethernet2/0.10
Building configuration...
Current configuration : 194 bytes
!
interface Ethernet2/0.10
description FIBER-CONNECTION
encapsulation dot1Q 10
ip address 192.168.10.1 255.255.255.0
ip nat inside
ip policy route-map FIBER_LINK-LAN
ip ospf 1 area 0
end
EDGE_ROUTER#
EDGE_ROUTER#
EDGE_ROUTER#
EDGE_ROUTER#sh run int Ethernet2/0.20
Building configuration...
Current configuration : 197 bytes
!
interface Ethernet2/0.20
description MICROWAVE-CONNECTION
encapsulation dot1Q 20
ip address 192.168.20.1 255.255.255.0
ip nat inside
ip policy route-map MICROWAVE-LAN
ip ospf 1 area 0
end
06-01-2025 03:37 PM
track 1 ip sla 1 reachability
!
track 2 ip sla 2 reachability
!
interface FastEthernet0/0
ip address 10.0.137.54 255.255.255.0
ip nat outside
duplex full
!
interface FastEthernet1/0
ip address 172.16.0.2 255.255.255.252
ip nat outside
duplex full
!
interface Ethernet2/0.10
description FIBER-CONNECTION
encapsulation dot1Q 10
ip address 192.168.10.1 255.255.255.0
ip nat inside
ip policy route-map FIBER_LINK-LAN
ip ospf 1 area 0
!
interface Ethernet2/0.20
description MICROWAVE-CONNECTION
encapsulation dot1Q 20
ip address 192.168.20.1 255.255.255.0
ip nat inside
ip policy route-map MICROWAVE-LAN
ip ospf 1 area 0
!
ip nat inside source route-map FIBER-LINK interface FastEthernet0/0 overload
ip nat inside source route-map MICROWAVE-LINK interface FastEthernet1/0 overload
!
ip sla 1
icmp-echo 4.2.2.2 source-ip 10.0.137.54
owner FIBER-LINK
ip sla schedule 1 life forever start-time now
ip sla 2
icmp-echo 172.16.0.1 source-ip 172.16.0.2
owner FIBER-LINK
ip sla schedule 2 life forever start-time now
access-list 101 permit ip 192.168.10.0 0.0.0.255 any
access-list 102 permit ip 192.168.20.0 0.0.0.255 any
!
route-map MICROWAVE-LAN permit 10
match ip address 102
set ip next-hop verify-availability 172.16.0.1 1 track 2
set ip next-hop verify-availability 10.0.137.1 2 track 1
!
route-map FIBER_LINK-LAN permit 10
match ip address 101
set ip next-hop verify-availability 10.0.137.1 1 track 1
set ip next-hop verify-availability 172.16.0.1 2 track 2
!
route-map MICROWAVE-LINK permit 10
match ip address 102
match interface Ethernet2/0.20
!
route-map FIBER-LINK permit 10
match ip address 101
match interface Ethernet2/0.10
06-02-2025 06:12 PM - edited 06-02-2025 06:22 PM
Hi!
Thanks for the response. However, after the changes the pattern still persists.
Traces to 4.2.2.2 seem to oscillate between the links:
I mentioned in my previous post that traffic generated by the router is not influenced by PBR. You need to apply Local PBR for this to work. To speed things along for you, I will post a sample configuration below which re-uses the existing route-maps applied to the LAN interfaces (they need re-creating under a new name). Please note this does not reflect best practice or a full production configuration:
route-map ROUTER-POLICY permit 10
match ip address 102
set ip next-hop verify-availability 172.16.0.1 1 track 2
set ip next-hop verify-availability 10.0.137.1 2 track 1
!
route-map ROUTER-POLICY permit 20
match ip address 101
set ip next-hop verify-availability 10.0.137.1 1 track 1
set ip next-hop verify-availability 172.16.0.1 2 track 2
!
ip local policy route-map ROUTER-POLICY
As for the rest of the configuration, I was hoping for and am glad to see the 'match interface' command attempted. The only problem is that you have got the logic slightly the wrong way around. You need to apply the 'match interface' command to the egress interfaces. Remember that NAT is applied, in this case, at the point of egress, i.e. after the routing decision has already been made. Below is an example of the changes you need to make, hopefully this will make sense:
ip access-list standard NAT-ACL
10 permit 192.168.10.0 0.0.0.255
20 permit 192.168.20.0 0.0.0.255
!
route-map FIBER-LINK permit 10
match ip address NAT-ACL
match interface FastEthernet0/0
!
route-map MICROWAVE-LINK permit 10
match ip address NAT-ACL
match interface FastEthernet1/0
In the above, we are informing the router to NAT traffic regardless of the source subnet, but to apply the correct translation address based on the egress interface chosen for the flow, which is decided by PBR. This is how we can achieve the failover.
You also need to make sure that the FastEthernet0/0 interface ALWAYS uses the route out of its own interface to reach 4.2.2.2. Else, you will get the behaviour I have described in my first post where track object 1 will go down and stay down indefinitely. Either that, or you get stuck in a situation where the track object 1 will constantly flap between up and down and cause major problems with intermittent connectivity. Again, not best practice, and it could be done in Local PBR, but as an example:
ip route 4.2.2.2 255.255.255.255 10.0.137.1
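As a hedged sketch of the Local PBR alternative mentioned above (untested; the ACL name SLA-FIBER-PROBE and sequence number 5 are hypothetical), the probe could be pinned to the Fiber next hop ahead of the other ROUTER-POLICY entries:

```
! Hypothetical ACL: matches only the IP SLA 1 probe (sourced from Fa0/0) to 4.2.2.2
ip access-list extended SLA-FIBER-PROBE
 permit icmp host 10.0.137.54 host 4.2.2.2
!
! Lower sequence number so it is evaluated before entries 10 and 20
route-map ROUTER-POLICY permit 5
 match ip address SLA-FIBER-PROBE
 set ip next-hop 10.0.137.1
!
ip local policy route-map ROUTER-POLICY
```

Because this entry does not depend on a track object, the probe should keep leaving via the Fiber link while track 1 is down, as long as the directly connected Fiber segment itself is up, which is what allows it to detect recovery.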
Finally, back to EEM. You could use this to clear the ip nat translation. In theory, with the configuration I have provided, you should be able to suffer upstream failures on the Fiber link, and a direct / next-hop failure on the Microwave link, and still have complete failover:
event manager applet CLEAR_FIBER_NAT_TRANS
event track 1 state any
action 1.0 cli command "enable"
action 1.1 cli command "clear ip nat translation *"
!
event manager applet CLEAR_MICROWAVE_NAT_TRANS
event track 2 state any
action 1.0 cli command "enable"
action 1.1 cli command "clear ip nat translation *"
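To check each piece after applying the above, the following standard IOS show commands may help. The host address 192.168.20.10 is only a hypothetical LAN client, and the output will vary by release:

```
! Track objects and probe results
show track brief
show ip sla statistics
! Route-map hit counters, and where PBR is applied
show route-map
show ip policy
show ip local policy
! Active translations, and the link CEF picks for one flow
show ip nat translations
show ip cef exact-route 192.168.20.10 4.2.2.2
```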
Please let me know how you get on!
06-02-2025 10:56 PM - edited 06-02-2025 11:08 PM
Hello @2D-Technology Services @Royalty
I believe you are on the correct path with your configuration; however, it does look like a couple of things need tweaking.
@Royalty is correct on the route-maps (NAT and PBR), but for NAT use an additional ACL that includes both LAN subnets, and for the PBR route-maps I would also suggest including the actual next hops for failover. You also need to policy-route traffic traversing the router, NOT traffic originating from it.
I also suggest including a Boolean AND statement and binding it to your IP SLA tracking and primary default static route.
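One possible reading of the Boolean AND suggestion, as an untested sketch: IP SLA 3, track 3 and track 10 are hypothetical additions that are not in the posted configuration, and the primary default route is re-pointed from track 1 to the combined object:

```
! Hypothetical extra probe: the Fiber next hop itself
ip sla 3
 icmp-echo 10.0.137.1 source-ip 10.0.137.54
ip sla schedule 3 life forever start-time now
!
track 3 ip sla 3 reachability
!
! Up only when both the upstream target (track 1) and the next hop (track 3) answer
track 10 list boolean and
 object 1
 object 3
!
no ip route 0.0.0.0 0.0.0.0 10.0.137.1 name FIBER-LINK track 1
ip route 0.0.0.0 0.0.0.0 10.0.137.1 name FIBER-LINK track 10
```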
As for your IP SLA 1 probe, you could change it to point at the upstream connected interface IP. Otherwise you may blackhole half of your traffic when a failure occurs on ISP1, because the probe will not be able to recover while the default route isn't available. Alternatively, as suggested, use local PBR, but make sure that local policy route is used ONLY in one direction; I would suggest an additional route-map that nulls out any potential to route-leak via ISP 2.
Lastly, is OSPF running on that router, or is that historical, given that you show default static routing?
Please see attached file for a possible solution to your OP.
(Apologies if you see multiple copies of this post, I'm having an issue with my account duplicating them!)