05-11-2022 08:30 AM - edited 05-12-2022 06:07 AM
Hi everyone,
I'm having a problem with my VPN Setup and the Routing.
Site A consists of two redundant Cisco ISR 4300 routers (v 17.03.05) in Active/Standby mode, and Site B consists of a single Cisco ISR 1100 (v 16.12.04).
About the setup:
Site A:
The Cisco ISRs each have two interfaces in public subnets. GigabitEthernet0/0/0.900 is the "outside" subnet: the transit subnet to the datacenter ISP, over which we advertise our own public subnet via BGP. That public subnet sits on GigabitEthernet0/0/1 ("inside").
Active ISR 4300:
interface GigabitEthernet0/0/0.900
 description ISP BGP
 encapsulation dot1Q 900
 ip address 7.7.7.2 255.255.255.254
 crypto map IKEv2
!
interface GigabitEthernet0/0/1
 description TEST-FW1
 ip address 3.3.3.4 255.255.255.0
 standby version 2
 standby 1 ip 3.3.3.3
 standby 1 priority 110
 standby 1 preempt delay minimum 60
 standby 1 name HA-WAN
 negotiation auto
Standby ISR 4300:
interface GigabitEthernet0/0/0.900
 description ISP BGP
 encapsulation dot1Q 900
 ip address 6.6.6.2 255.255.255.254
 crypto map IKEv2
!
interface GigabitEthernet0/0/1
 description TEST-FW2
 ip address 3.3.3.5 255.255.255.0
 standby version 2
 standby 1 ip 3.3.3.3
 standby 1 name HA-WAN
 negotiation auto
The IPsec profile and config for Site A are identical on both routers:
crypto ikev2 proposal IKEv2
 encryption aes-cbc-256
 prf sha256 sha512
 integrity sha256
 group 14
!
crypto ikev2 policy 20
 proposal IKEv2
crypto ikev2 profile US1
 description US1 ISR VPN
 match identity remote any
 authentication remote pre-share key XXXXXXXXXXXXXXXXXXXXXXX
 authentication local pre-share key XXXXXXXXXXXXXXXXXXXXXX
crypto ipsec transform-set IKEv2 esp-aes 256 esp-sha256-hmac
 mode tunnel
crypto dynamic-map ISR-Dynamic 10
 set security-association lifetime seconds 86400
 set security-association replay window-size 128
 set transform-set IKEv2
 set ikev2-profile US1
 match address 105
 reverse-route
!
crypto map IKEv2 10 ipsec-isakmp dynamic ISR-Dynamic
The B-Side consists of one Cisco ISR 1100 with a dynamic public IP. It runs on either DSL or cellular:
VPN Config:
crypto ikev2 proposal IKEv2_DH14
 encryption aes-cbc-256
 prf sha256 sha512
 integrity sha256
 group 14
!
crypto ikev2 policy 2
 proposal IKEv2_DH14
!
crypto ikev2 keyring US1-Key
 peer TEST-ISR1
  address 7.7.7.2
  pre-shared-key local XXXXXXXXXXXXXXXXXXX
  pre-shared-key remote XXXXXXXXXXXXXXXX
 !
 peer TEST-ISR2
  address 6.6.6.2
  pre-shared-key local XXXXXXXXXXXXXXXXXXXXXXXXXX
  pre-shared-key remote XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 !
crypto ikev2 profile US1
 description US1
 match identity remote address 7.7.7.2 255.255.255.255
 match identity remote address 6.6.6.2 255.255.255.255
 identity local key-id XXXXXXXXXXXXXXXXX
 authentication remote pre-share
 authentication local pre-share
 keyring local US1-Key
!
crypto ipsec transform-set IKEv2 esp-aes 256 esp-sha256-hmac
 mode tunnel
!
crypto map outside_map 10 ipsec-isakmp
 set peer 7.7.7.2 default
 set peer 6.6.6.2
 set security-association lifetime seconds 86400
 set security-association replay window-size 128
 set transform-set IKEv2
 set ikev2-profile US1
 match address 101
This is the setup we got working. It looks like this:
So the VPN tunnel on Site A terminates on the "outside" transit subnet to the ISP. But it is performing really badly.
The B-Side is on private DSL lines or cellular, so the link drops from time to time. When such a dropout happens, the B-Side quite often builds a tunnel to both the default peer and the secondary peer.
It then stays connected to both tunnels, which causes the traffic flow to stop.
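One way to confirm the double-tunnel state on the B-Side is to list the IKEv2 and IPsec sessions right after a dropout. This is only a sketch of the check; the output format depends on platform and release, so none is shown here.

```
! Run on the Site B ISR 1100 after a dropout
show crypto ikev2 sa
show crypto session
show crypto ipsec sa peer 7.7.7.2
show crypto ipsec sa peer 6.6.6.2
```

If both 7.7.7.2 and 6.6.6.2 show active SAs at the same time, the site is in the stuck state described above.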
I tried to create an EEM script for this, which tracks internal traffic (ping):
event manager applet RESET-VPN-US1_Track60
 event syslog pattern "%TRACK-6-STATE: 60 list boolean or Up -> Down"
 action 001 cli command "enable"
 action 002 cli command "clear crypto session remote 7.7.7.2"
 action 003 cli command "clear crypto session remote 6.6.6.2"
 action 099 syslog msg "US1 VPN tunnel cleared due to Track60 Recovery"
But still too many sites get stuck with two tunnels and no traffic flow.
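The applet reacts to track object 60, whose definition is not included in the post. A minimal sketch of what such a tracker might look like, assuming two IP SLA ICMP probes toward hosts behind Site A combined with a boolean OR. The probe targets (10.0.0.1, 10.0.0.2), the source interface and the object numbers 61 and 62 are hypothetical placeholders, not taken from the original config.

```
! Hypothetical probes; targets and source interface are placeholders
ip sla 61
 icmp-echo 10.0.0.1 source-interface GigabitEthernet0/0/1
 frequency 10
ip sla schedule 61 life forever start-time now
ip sla 62
 icmp-echo 10.0.0.2 source-interface GigabitEthernet0/0/1
 frequency 10
ip sla schedule 62 life forever start-time now
!
track 61 ip sla 61 reachability
track 62 ip sla 62 reachability
!
! Track 60 stays Up while at least one probe answers
track 60 list boolean or
 object 61
 object 62
```

With a list like this, track 60 only goes Down (and the applet only fires) when every probe fails, which avoids clearing the sessions because a single internal host is unreachable.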
The optimal solution would be really easy: attach the crypto map to the GigabitEthernet0/0/1 interface on the HSRP IP. That way we would not have to add two IPs to the crypto map on the B-Side and could just connect to 3.3.3.3.
So:
interface GigabitEthernet0/0/1
 description TEST-FW1
 ip address 3.3.3.4 255.255.255.0
 standby version 2
 standby 1 ip 3.3.3.3
 standby 1 priority 110
 standby 1 preempt delay minimum 60
 standby 1 name HA-WAN
 negotiation auto
 crypto-map redundancy HA-WAN
Then it would look like this:
But this does not work. The tunnel is established successfully and the B-Side has the route to the A-Side. But the A-Side cannot return traffic, because it answers through interface GigabitEthernet0/0/0.900, even though RRI is turned on.
We tried to solve this problem with Cisco TAC, but they said it is not supported and could not provide a solution.
What we have also tried as a workaround:
Added the "set security-association idletime 60" command, so that after 60 seconds the B-Side tries to connect to the default peer. It works: after roughly 60-90 seconds it is connected to the default peer. But it is still connected to the standby peer as well, and stays there for multiple hours until the SA expires or another ISP drop kicks it. So again no traffic flow.
We have thought about using an FQDN for both routers and putting a Route53 health check in front to fail over between them. But the AWS health checkers can only check TCP and their public IPs change, and we don't want to expose a responding port to the Internet or constantly maintain ACLs.
OSPF or other routing protocols are not an option right now, because we would have to add interfaces for each tunnel. We are talking 50+ devices, so maintenance would be ugly even though we have Prime.
Could someone help me? I can't be the only one running such a setup, and I can't believe there is no working design for this.
Thank you very much
05-16-2022 10:35 AM
Hello,
sorry for my late reply. I will lab this up...thanks for the sanitized configs...
05-16-2022 11:07 AM
Hello,
I am just thinking: why don't you let your EEM script shut down the primary tunnel interface (shut) in a failover situation? That way, the 'redundant' tunnel could never be established...
05-13-2022 07:32 AM
If you want to prefer one endpoint over the other (Primary/Backup) then you could try prioritizing the peers on the ISR1100, something like the following ??
05-16-2022 07:38 AM
I posted the whole config in my first post. You can see that we are already prioritizing the peers, but it does not work correctly.
05-13-2022 08:59 AM
Topology: Site-B, Site-A-R1, Site-A-R2
Site-B establishes IPsec to Site-A-R1.
Site-B loses ISP1 and shifts to ISP2.
Here is the trick:
IPsec is re-established when the interface that has the crypto map configured under it is shut down.
Since ISP1 is shut down,
Site-B re-establishes IPsec toward Site-A-R1, and here is the issue:
Site-A-R1 still has an active IPsec phase 1, so it will not respond.
Site-B now sees no response from Site-A-R1, so it establishes IPsec with Site-A-R2, and here traffic drops.
Solution
On Site-B:
Configure the crypto map under each ISP interface.
Configure a loopback and use it as the IPsec source and as the IPsec ID.
This keeps IPsec up even if the ISP shifts.
Also configure IPsec keepalives on both sites.
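The suggestion above could be sketched roughly as follows on the Site B router. This is an untested illustration, not a verified fix: the interface names Dialer1 and Cellular0/2/0, the loopback address 192.0.2.1 and the DPD timers are hypothetical, and it assumes the loopback address is reachable from Site A, which the thread does not confirm for a site with a dynamic public IP.

```
! Hypothetical loopback used as the stable IPsec source
interface Loopback0
 ip address 192.0.2.1 255.255.255.255
!
! Source the tunnel from the loopback instead of the ISP-facing interface
crypto map outside_map local-address Loopback0
!
! Same crypto map under each ISP interface (names are placeholders)
interface Dialer1
 crypto map outside_map
!
interface Cellular0/2/0
 crypto map outside_map
!
! IKEv2 keepalives (DPD): probe every 10 s, retry every 3 s
crypto ikev2 profile US1
 dpd 10 3 periodic
```

The Site A routers would need a matching dpd line in their own IKEv2 profile so that a stale session toward a B-Side that has moved is torn down instead of lingering until the SA lifetime expires.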