01-22-2018 08:07 PM - edited 03-12-2019 04:56 AM
I have a scenario where, when the primary ISP (the ISP with the tracked route) fails and the backup ISP takes over, the remote L2L tunnels begin flapping between the primary ISP and the backup ISP. This only occurs on failover to the backup ISP. If I fail back to the primary ISP, the tunnel flapping stops.
crypto map VPN_Salas interface Qwest_Backup
crypto map VPN_Salas interface Cox_Primary
crypto isakmp identity address
crypto ikev1 enable Qwest_Backup
crypto ikev1 enable Cox_Primary <<<< If I remove this entry the Backup ISP tunnels establish and the flapping stops.
This must be code related, because we upgraded from a 5505 to a 5506 and started seeing these issues. I'm running 9.2. What's confusing is that I thought that when a tracked route goes down and the route is pulled from the routing table, all associations to that interface name are removed from service until restoration. So I don't know why the ASA thinks the "Cox_Primary" connection is still in service to initiate tunnel requests.
01-22-2018 10:58 PM
Hello,
Can you please share your SLA monitoring configuration? It is possible that SLA monitoring is flapping at the time and causing the links to switch between the two ISPs.
Also, syslogs from this period would help confirm whether SLA monitoring is working as expected.
Regards,
AJ
01-23-2018 10:40 AM - edited 01-23-2018 10:41 AM
When the failover is active, I have physically shut down the port for the Cox_Primary interface, so the routing table looks like this, with no entries for Cox_Primary. This is what has surprised me about VPN tunnels being attempted through the Cox_Primary interface:
Gateway of last resort is [qwest_gateway_per_ip] to network 0.0.0.0
S* 0.0.0.0 0.0.0.0 [254/0] via [qwest_gateway_per_ip], Qwest_Backup
C [qwest_ip_subnet] 255.255.255.248 is directly connected, Qwest_Backup
L [qwest_cer_ip] 255.255.255.255 is directly connected, Qwest_Backup
C 169.254.1.0 255.255.255.252 is directly connected, nlp_int_tap
L 169.254.1.1 255.255.255.255 is directly connected, nlp_int_tap
C 192.168.1.0 255.255.255.0 is directly connected, LANInterface
L 192.168.1.1 255.255.255.255 is directly connected, LANInterface
route Cox_Primary 0.0.0.0 0.0.0.0 [cox_gateway_per_ip] 1 track 1
route Qwest_Backup 0.0.0.0 0.0.0.0 [qwest_gateway_per_ip] 254
sla monitor 100
type echo protocol ipIcmpEcho 8.8.8.8 interface Cox_Primary
num-packets 3
frequency 10
sla monitor schedule 100 life forever start-time now
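The route above references "track 1", but the track object itself is not shown in the post. For comparison, a typical ASA tracked-route setup ties the pieces together roughly as sketched below. The "track 1 rtr 100 reachability" line is an assumption about what the full config contains, and the bracketed gateways are the poster's own placeholders.

```
sla monitor 100
 type echo protocol ipIcmpEcho 8.8.8.8 interface Cox_Primary
 num-packets 3
 frequency 10
sla monitor schedule 100 life forever start-time now
! Assumed line, not shown in the post: binds track 1 to SLA operation 100
track 1 rtr 100 reachability
route Cox_Primary 0.0.0.0 0.0.0.0 [cox_gateway_per_ip] 1 track 1
route Qwest_Backup 0.0.0.0 0.0.0.0 [qwest_gateway_per_ip] 254
```

During a failover, "show track 1" and "show sla monitor operational-state 100" should report the tracked object as down and the probe as timing out, and "show route" should list only the Qwest_Backup default route.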
01-23-2018 05:43 PM
Hi,
What is the configuration on the other end? If the other end has two crypto map statements/configs with 'Cox_Primary' as the higher priority (lower number), it will try to initiate the tunnel to the primary IP. If that is the case, as a test, remove the related crypto map config when you admin down the primary ISP interface.
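One possible reading of that test, applied on the ASA that fails over, is sketched below. It assumes the crypto map and interface names from the first post (VPN_Salas, Cox_Primary). The first post reports that removing only the "crypto ikev1 enable Cox_Primary" line already stops the flapping, so the second "no" line may not be needed.

```
! Test only, while the primary interface is admin down:
! detach IKEv1 and the crypto map from Cox_Primary
no crypto ikev1 enable Cox_Primary
no crypto map VPN_Salas interface Cox_Primary
! Put both lines back before failing back to the primary ISP
crypto map VPN_Salas interface Cox_Primary
crypto ikev1 enable Cox_Primary
```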
hth
MS
01-23-2018 08:19 PM - edited 01-23-2018 08:24 PM
Here is one of the other sites. There are 5 other sites which have L2L tunnels to this flapping site.
crypto ipsec ikev1 transform-set ESP-3DES-SHA esp-3des esp-sha-hmac
crypto ipsec security-association pmtu-aging infinite
crypto map VPN 1 match address Central_North_VPN_Traffic
crypto map VPN 1 set peer [Cox_Primary_IP] [Qwest_Backup_IP]
crypto map VPN 1 set ikev1 transform-set ESP-3DES-SHA
crypto map VPN interface primary
crypto map VPN interface secondary
crypto ca trustpool policy
crypto isakmp identity address
crypto ikev1 enable primary
crypto ikev1 enable secondary
crypto ikev1 am-disable
crypto ikev1 policy 1
authentication pre-share
encryption 3des
hash sha
group 2
lifetime 28800
!
tunnel-group [Cox_Primary_IP] type ipsec-l2l
tunnel-group [Cox_Primary_IP] ipsec-attributes
ikev1 pre-shared-key *****
!
tunnel-group [Qwest_Backup_IP] type ipsec-l2l
tunnel-group [Qwest_Backup_IP] ipsec-attributes
ikev1 pre-shared-key *****
"If that is the case, as test remove related crypto map config when you admin down primary ISP interface."
This is a good idea, and you can see how I have the config set up. How would a tunnel request to the [Cox_Primary_IP] ever get serviced if that IP isn't reachable, since I've admin'd down the interface this IP belongs to? What I'm finding is the remote sites actually receive a teardown
Here's the ip sla stats during failover tonight
Entry number: 100
Modification time: 16:16:36.033 ARIZONA Sun Jan 21 2018
Number of Octets Used by this Entry: 2056
Number of operations attempted: 16731
Number of operations skipped: 0
Current seconds left in Life: Forever
Operational state of entry: Active
Last time this entry was reset: Never
Connection loss occurred: FALSE
Timeout occurred: TRUE
Over thresholds occurred: FALSE
Latest RTT (milliseconds): NoConnection/Busy/Timeout
Latest operation start time: 14:44:46.035 ARIZONA Tue Jan 23 2018
Latest operation return code: Timeout
RTT Values:
RTTAvg: 0 RTTMin: 0 RTTMax: 0
NumOfRTT: 0 RTTSum: 0 RTTSum2: 0
Here is the data showing that the [Cox_Primary_IP] is still initiating a request. Somehow traffic is being generated from this IP even though the interface is down. As you can see, the site with the VPN config above is receiving a teardown request from this [Cox_Primary_IP].
Phase 2 completes to the Qwest_Backup_IP:
Jan 23 19:48:30 [IKEv1]Group = Qwest_Backup_IP, IP = Qwest_Backup_IP, PHASE 2 COMPLETED (msgid=efe4e93d)
We then see a new Phase 1 begin for [Cox_Primary_IP], followed by MM_WAIT_MSG2 retries, which indicate an unreachable peer:
Jan 23 19:48:58 [IKEv1]Group = Qwest_Backup_IP, IP = Qwest_Backup_IP, Connection terminated for peer Qwest_Backup_IP. Reason: Peer Terminate Remote Proxy 192.168.1.0, Local Proxy 192.168.2.0
Jan 23 19:48:58 [IKEv1 DEBUG]Group = Qwest_Backup_IP, IP = Qwest_Backup_IP, Active unit receives a delete event for remote peer Qwest_Backup_IP
Jan 23 19:48:58 [IKEv1 DEBUG]Group = Qwest_Backup_IP , IP = Qwest_Backup_IP , IKE Deleting SA: Remote Proxy 192.168.1.0, Local Proxy 192.168.2.0
Jan 23 19:49:06 [IKEv1]IP = Cox_Primary_IP , IKE Initiator: New Phase 1, Intf inside, IKE Peer Cox_Primary_IP local Proxy Address 192.168.2.0, remote Proxy Address 192.168.1.0, Crypto map (VPN)
Jan 23 19:49:38 [IKEv1 DEBUG]IP = Cox_Primary_IP , IKE MM Initiator FSM error history (struct &0x00002aaac232a9f0) <state>, <event>: MM_DONE, EV_ERROR-->MM_WAIT_MSG2, EV_RETRY-->MM_WAIT_MSG2, EV_TIMEOUT-->MM_WAIT_MSG2, NullEvent-->MM_SND_MSG1, EV_SND_MSG-->MM_SND_MSG1, EV_START_TMR-->MM_SND_MSG1, EV_RESEND_MSG-->MM_WAIT_MSG2, EV_RETRY
This cycle of completing PHASE 2 to the Qwest_Backup_IP and tearing it down keeps continuing until the Cox_Primary_IP is restored.
01-23-2018 08:46 PM
very similar to this flapping site
crypto ipsec ikev1 transform-set ESP-3DES-SHA esp-3des esp-sha-hmac
crypto ipsec security-association pmtu-aging infinite
crypto map VPN 1 match address Central_North_VPN_Traffic
crypto map VPN 1 set peer Cox_Primary_IP Qwest_Backup_IP
crypto map VPN 1 set ikev1 transform-set ESP-3DES-SHA
crypto map VPN interface primary
crypto map VPN interface secondary
crypto ca trustpool policy
crypto isakmp identity address
crypto ikev1 enable primary
crypto ikev1 enable secondary
crypto ikev1 am-disable
crypto ikev1 policy 1
authentication pre-share
encryption 3des
hash sha
group 2
lifetime 28800
What's strange is that Phase 2 completes to the Qwest_Backup_IP and then there's a teardown request from the Cox_Primary_IP.
Jan 23 19:48:30 [IKEv1]Group = Qwest_Backup_IP, IP = Qwest_Backup_IP, PHASE 2 COMPLETED (msgid=efe4e93d)
Jan 23 19:48:58 [IKEv1]Group = Qwest_Backup_IP, IP = Qwest_Backup_IP, Session is being torn down. Reason: User Requested
Jan 23 19:48:58 [IKEv1]Ignoring msg to mark SA with dsID 1073152 dead because SA deleted
Jan 23 19:49:06 [IKEv1 DEBUG]Pitcher: received a key acquire message, spi 0x0
Jan 23 19:49:06 [IKEv1]IP = Cox_Primary_IP, IKE Initiator: New Phase 1, Intf inside, IKE Peer Cox_Primary_IP local Proxy Address 192.168.2.0, remote Proxy Address 192.168.1.0, Crypto map (VPN)
This cycle of Phase 2 completions and teardowns continues until I restore service to the Cox_Primary_IP.
01-25-2018 03:08 AM
Hi,
Make sure no Phase 1/Phase 2 session got stuck when you bring down the primary. Use the 'clear crypto' commands to clear any such connections (it is hard to see that happening, but based on your issue, make sure that's not the case).
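One way to check for a stuck session is sketched below with standard ASA IKEv1 commands, assuming it is run on a remote site while the primary is down. [Cox_Primary_IP] is the thread's placeholder for the real peer address, not a literal value.

```
! List current Phase 1 SAs and look for entries still tied to the primary peer
show crypto isakmp sa
! Check Phase 2 SAs for that peer
show crypto ipsec sa peer [Cox_Primary_IP]
! Clear anything left over for that peer
clear crypto ikev1 sa [Cox_Primary_IP]
clear crypto ipsec sa peer [Cox_Primary_IP]
```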
Test by enabling 'isakmp keepalive threshold retry' under tunnel-group commands.
hth
MS
01-25-2018 08:01 PM
The other tunnels do clear out. I continuously run this command and see them all clear and begin establishing on the Qwest_Backup_IP. I believe this is occurring because somehow the Cox_Primary_IP is sending traffic. Could the Qwest_Backup_IP tunnels be encapsulating traffic with the Cox_Primary_IP as the inner IP? This is a long stretch; I'm just out of ideas. The default tunnel group has this keepalive enabled. Is that enough?
sh cryp isa sa
tunnel-group DefaultRAGroup ipsec-attributes
isakmp keepalive threshold 10 retry 2
01-28-2018 07:27 AM
Hi,
Try using this under the primary tunnel group:
isakmp keepalive threshold 10 retry 2
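As a sketch, on a remote site that would mean placing the keepalive under the L2L tunnel groups for the peer addresses rather than under DefaultRAGroup, which as far as I know only covers remote-access sessions. The bracketed peer names are the placeholders used earlier in this thread.

```
tunnel-group [Cox_Primary_IP] ipsec-attributes
 isakmp keepalive threshold 10 retry 2
tunnel-group [Qwest_Backup_IP] ipsec-attributes
 isakmp keepalive threshold 10 retry 2
```

Keepalives may already be active with these values by default, in which case the lines will not appear in the normal running config. "show running-config all tunnel-group" displays the effective settings, defaults included.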
Apart from the VPN behaviour, how is internet access via the secondary provider? Do you notice any discrepancy? You noticed the problem only after upgrading the ASA to a 5506 (if I remember the model correctly). If so, did you check for any bugs in the code you are running?
Thx
MS
01-29-2018 07:51 PM
OK, should this be at all sites, including this problem site, or just the problem site?
isakmp keepalive threshold 10 retry 2
"Apart from the VPN behaviour, how is internet access via the secondary provider? Do you notice any discrepancy? You noticed the problem only after upgrading the ASA to a 5506 (if I remember the model correctly). If so, did you check for any bugs in the code you are running?"
The internet is working great via the second provider, no issues. Yes, it started just after upgrading to the 5506. I have the latest code on:
Release 9.6.2 Interim