Duplicate IKEV2 SAs

john-serink
Level 1

Hi All;

 

I've been having issues with GPRS connections in India from Digi WR21 routers to an ISR4431 running Cisco IOS XE Software, Version 16.08.01. As such, I've been sleeping with RFC5996 under my pillow and have noticed a few anomalies. See the following:

CCrouter# sh crypto session br
Status: A- Active, U - Up, D - Down, I - Idle, S - Standby, N - Negotiating
K - No IKE
ivrf = (none)
Peer I/F Username Group/Phase1_id Uptime Status
171.76.163.159 Gi0/0/0 CORS9 00:44:42 UA
171.76.154.37 Gi0/0/0 CORS8 00:14:57 UA
106.215.187.163 Gi0/0/0 CORS11 00:44:42 UA
171.76.141.249 Gi0/0/0 CORS10 00:44:42 UA
171.76.141.249 Gi0/0/0 CORS10 00:42:45 UA
106.215.194.74 Gi0/0/0 CORS13 00:44:42 UA
106.215.230.163 Gi0/0/0 CORS12 00:42:46 UA
106.215.230.163 Gi0/0/0 CORS12 00:44:43 UA
171.76.136.182 Gi0/0/0 CORS15 00:44:42 UA
171.76.163.122 Gi0/0/0 CORS14 00:42:42 UA
171.76.163.122 Gi0/0/0 CORS14 00:44:42 UA
106.215.169.120 Gi0/0/0 CORS1 00:40:45 UA
106.215.240.161 Gi0/0/0 CORS3 00:44:39 UA
106.215.245.132 Gi0/0/0 CORS2 00:21:17 UA
220.255.242.218 Gi0/0/0 jserinki7 01:37:19 UA
220.255.242.218 Gi0/0/0 jserinki7 01:37:19 UA
106.215.205.163 Gi0/0/0 CORS5 00:44:41 UA
106.215.138.9 Gi0/0/0 CORS4 00:44:29 UA
106.215.135.118 Gi0/0/0 CORS6 00:42:44 UA
106.215.135.118 Gi0/0/0 CORS6 00:44:41 UA
106.215.150.146 Gi0/0/0 CORS25 00:44:42 UA
106.215.179.177 Gi0/0/0 CORS24 00:39:58 UA
106.215.141.113 Gi0/0/0 CORS27 00:44:39 UA
106.215.248.156 Gi0/0/0 CORS26 00:44:41 UA
27.63.34.46 Gi0/0/0 CORS29 00:44:41 UA
106.215.179.226 Gi0/0/0 CORS28 00:44:40 UA
171.76.168.193 Gi0/0/0 CORS31 01:48:42 UA
106.215.162.7 Gi0/0/0 CORS17 00:44:42 UA
171.76.178.113 Gi0/0/0 CORS16 00:42:45 UA
171.76.178.113 Gi0/0/0 CORS16 00:44:42 UA
171.76.139.121 Gi0/0/0 CORS19 00:44:42 UA
106.215.229.19 Gi0/0/0 CORS18 00:30:06 UA
171.76.147.54 Gi0/0/0 CORS21 00:42:43 UA
171.76.147.54 Gi0/0/0 CORS21 00:44:41 UA
27.63.46.230 Gi0/0/0 CORS20 00:44:42 UA
132.154.27.208 Gi0/0/0 CORS23 00:44:41 UA
27.63.62.191 Gi0/0/0 CORS22 00:44:42 UA
27.63.62.191 Gi0/0/0 CORS22 00:42:44 UA
27.63.41.13 Gi0/0/0 CORS32 00:08:39 UA
171.76.171.85 Gi0/0/0 CORS34 00:08:38 UA

 

Notice the entries for Phase 1 IDs CORS10, CORS12, CORS14, CORS21 and CORS22 (among others): they are duplicates. And they are not recent; each pair has nearly the same age, with one entry slightly older than the other.
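
For anyone who wants to spot these pairs without eyeballing the list, here is a minimal Python sketch that counts Phase 1 IDs in a pasted `sh crypto session br` dump. It assumes every data row has exactly the five whitespace-separated fields shown above (peer, interface, Phase 1 ID, uptime, status); the function name `duplicate_ids` is made up for illustration, and rows with a different layout are simply skipped.

```python
import re
from collections import Counter

# One data row as it appears in the paste above, e.g.
# "171.76.141.249 Gi0/0/0 CORS10 00:44:42 UA"
# Assumption: five whitespace-separated fields per row.
ROW = re.compile(
    r"^(\d{1,3}(?:\.\d{1,3}){3})\s+(\S+)\s+(\S+)\s+(\d{2}:\d{2}:\d{2})\s+(\S+)$"
)

def duplicate_ids(output):
    """Return the sorted Phase 1 IDs that appear on more than one row."""
    counts = Counter()
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and legend lines do not match and are ignored
            counts[match.group(3)] += 1
    return sorted(name for name, n in counts.items() if n > 1)

SAMPLE = """\
Peer I/F Username Group/Phase1_id Uptime Status
171.76.154.37 Gi0/0/0 CORS8 00:14:57 UA
171.76.141.249 Gi0/0/0 CORS10 00:44:42 UA
171.76.141.249 Gi0/0/0 CORS10 00:42:45 UA
"""

print(duplicate_ids(SAMPLE))
```

Pasting the full output into `SAMPLE` should list every ID that shows up twice, not only the five named above.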

 

According to my understanding of the RFC, this is only supposed to happen just before the SA expires....

To quote the RFC:

"To rekey a Child SA within an existing IKE SA, create a new, equivalent SA, and when the new one is established, delete the old one."

Why is the 4431 NOT deleting the old one?

Interface: GigabitEthernet0/0/0
Profile: SOIprofile
Session status: UP-ACTIVE
Peer: 171.76.141.249 port 4609
Session ID: 42925
IKEv2 SA: local 103.205.244.106/4500 remote 171.76.141.249/4609 Active
Session ID: 47085
IKEv2 SA: local 103.205.244.106/4500 remote 171.76.141.249/4609 Active
IPSEC FLOW: permit ip host 1.1.1.10 host 2.2.2.10
Active SAs: 2, origin: dynamic crypto map

 

Why are there 2 active SAs?

Should not the older one be removed?

Or am I not understanding this correctly?

 

Cheers,

john

 

 

11 Replies

I would say yes, those old sessions should be removed right after the new ones have been established; otherwise we would call them stale sessions. What is the remote peer vendor/model?

Hi Aref:

 

The other end is a DigiTransport WR21.

But it also does it on IKEv1 with Libreswan on Linux from my laptop.

 

I just checked a 1900 I have running in the office on IOS 15.2.3, which is running against a bunch of initiators (all Digis), all on IKEv1, and there is not a single duplicated SA. That has been running fine for over 10 years.

 

Here is some data hot off the press from the 4431:

Interface: GigabitEthernet0/0/0
Profile: SOIprofile
Session status: UP-ACTIVE
Peer: 171.76.160.227 port 1025
Session ID: 58329
IKEv2 SA: local 103.205.244.106/4500 remote 171.76.160.227/1025 Active
Session ID: 58350
IKEv2 SA: local 103.205.244.106/4500 remote 171.76.160.227/1025 Active
IPSEC FLOW: permit ip host 1.1.1.10 host 2.2.2.22
Active SAs: 2, origin: dynamic crypto map

 

And here are the SAs on the Digi's end:

CCrouter 103.205.244.106 100.68.35.6 0x69 1 SHA256 AES(256) 11409 105
CCrouter 103.205.244.106 100.68.35.6 0x68 4 SHA256 AES(256) 7871 104

 

I have been seeing several odd behaviors which I have been blaming on the GPRS systems in India, my config, the Digis (because it could "never" be Cisco), the color of socks I'm wearing.....but I'm starting to think it's the IOS version.

CCrouter#sh ver
Cisco IOS XE Software, Version 16.08.01

 

I'm bugging the Cisco dealer to get me an updated IOS as the router was only delivered in February but he is giving me issues over it.

 

Will keep pushing him.

Will update you lot once I get a newer IOS.

Cheers,

John

In the meantime, I would try to disable DPDs for those tunnels and see if that makes any difference.

Disable DPD?

 

Why not? You think I should do it from both ends?

 

Cheers,

John

OK, I have disabled DPD on the ISR4431 and in a handful of WR21s in the field. Will observe over the next couple of days.

 

Cheers,

john

Fingers crossed!

OK, the problems improved but were not eliminated, either in the group with DPD disabled in both directions or in the group with it disabled in just one direction.

Dealer finally got me the IOS, just waiting for the hash key so I can check it.

 

Cheers,

john

Hi Everyone:

I am now running:

Cisco IOS XE Software, Version 16.12.04
Cisco IOS Software [Gibraltar], ISR Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 16.12.4, RELEASE SOFTWARE (fc5)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2020 by Cisco Systems, Inc.
Compiled Thu 09-Jul-20 21:44 by mcpre

 

Now the duplicate SA issue occurs much less often:

CCrouter#sh crypt session br
Status: A- Active, U - Up, D - Down, I - Idle, S - Standby, N - Negotiating
K - No IKE
ivrf = (none)
Peer I/F Username Group/Phase1_id Uptime Status
171.76.170.55 Gi0/0/0 CORS9 03:25:52 UA
171.76.146.65 Gi0/0/0 CORS8 00:07:06 UA
106.215.249.156 Gi0/0/0 CORS11 02:19:53 UA
171.76.204.85 Gi0/0/0 CORS10 00:44:28 UA
106.215.248.16 Gi0/0/0 CORS13 03:11:06 UA
106.215.165.209 Gi0/0/0 CORS12 02:30:33 UA
171.76.159.122 Gi0/0/0 CORS15 03:28:13 UA
171.76.158.89 Gi0/0/0 CORS14 03:04:27 UA
106.215.239.43 Gi0/0/0 CORS1 03:52:10 UA
106.215.169.142 Gi0/0/0 CORS3 03:37:43 UA
220.255.242.218 Gi0/0/0 jserinki7 04:45:23 UA
106.215.164.186 Gi0/0/0 CORS2 03:28:30 UA
106.215.145.67 Gi0/0/0 CORS5 03:28:51 UA
106.215.236.67 Gi0/0/0 CORS4 02:19:32 UA
106.215.173.81 Gi0/0/0 CORS7 03:28:35 UA
59.97.82.94 Gi0/0/0 CORS6 01:04:00 UA
106.215.250.107 Gi0/0/0 CORS25 03:36:44 UA
106.215.235.44 Gi0/0/0 CORS24 03:31:44 UA
106.215.178.32 Gi0/0/0 CORS27 03:41:23 UA
106.215.240.65 Gi0/0/0 CORS26 02:20:01 UA
27.63.33.183 Gi0/0/0 CORS29 02:46:11 UA
106.215.153.166 Gi0/0/0 CORS28 03:28:31 UA
171.76.176.5 Gi0/0/0 CORS31 00:10:03 UA
171.76.171.222 Gi0/0/0 CORS30 03:28:38 UA
106.215.251.143 Gi0/0/0 CORS17 01:35:25 UA
27.63.37.120 Gi0/0/0 CORS16 03:28:40 UA
171.76.149.59 Gi0/0/0 CORS19 03:28:12 UA
106.215.203.9 Gi0/0/0 CORS18 00:53:59 UA
27.63.33.81 Gi0/0/0 CORS21 03:27:50 UA
171.76.173.136 Gi0/0/0 CORS20 03:28:40 UA
171.76.151.244 Gi0/0/0 CORS22 00:04:49 UA
171.76.140.66 Gi0/0/0 CORS41 00:17:55 UA
171.76.175.136 Gi0/0/0 CORS40 02:51:17 UA
171.76.160.110 Gi0/0/0 CORS43 00:17:32 UA
171.76.150.213 Gi0/0/0 CORS42 03:25:03 UA
27.63.61.168 Gi0/0/0 CORS45 01:55:51 UA
171.76.185.98 Gi0/0/0 CORS44 01:15:58 UA
171.76.204.100 Gi0/0/0 CORS47 01:56:45 UA
171.76.165.241 Gi0/0/0 CORS46 03:28:37 UA
171.76.171.151 Gi0/0/0 CORS33 01:44:54 UA
171.76.135.8 Gi0/0/0 CORS32 01:41:36 UA
171.76.128.224 Gi0/0/0 CORS35 02:18:51 UA
171.76.170.216 Gi0/0/0 CORS34 01:22:33 UA
171.76.204.40 Gi0/0/0 CORS37 03:28:45 UA
27.63.45.164 Gi0/0/0 CORS36 01:40:49 UA
171.76.204.158 Gi0/0/0 CORS39 00:36:19 UA
106.215.143.153 Gi0/0/0 CORS56 03:04:44 UA
106.215.149.231 Gi0/0/0 CORS49 03:12:31 UA
106.215.230.99 Gi0/0/0 CORS48 03:11:43 UA
171.76.154.16 Gi0/0/0 CORS51 00:11:52 UA
106.215.219.211 Gi0/0/0 CORS50 00:37:40 UA
106.215.241.133 Gi0/0/0 CORS53 03:14:45 UA
27.63.38.40 Gi0/0/0 CORS52 03:28:14 UA
106.215.253.34 Gi0/0/0 CORS54 00:11:56 UA

 

In fact, I usually only see it on my own machine, which is jserinki7 above, as it's the only one on IKEv1; the rest are on IKEv2.

 

But the circular SA problem that I originally had is still there. Originally, the IKEv2 SA and IPSec SA would come up, the Digi would go silent, and the SA would eventually time out as the Digi started to negotiate a new one.

The behavior has changed since the IOS upgrade:

Now the Cisco accepts the initial request to create a tunnel, sends the message back to the Digi and states it's waiting for authentication....which never comes. It eventually times out, and in the meantime the Digi sends another request to bring up a tunnel. So the Digi never got the message that the initial proposal was accepted......

This goes round and round. The only solution is to reboot the Digi by SMS, which either fixes it or results in the same behavior. We keep rebooting until it works.

 

This happens to some sites more than others....it appears that the telco is doing something odd with the NAT gateway on the cell sites.

I have a test Digi here in Singapore and I have connected to the System in India PERFECTLY over 100 times.....I can't get the Digi in Singapore to show the same symptoms the units in India are presenting.

 

It's really odd.

 

I will get some debug dumps for you lot tomorrow.

 

Cheers,

john

I would try to disable NAT-T if you haven't done that already. I have seen similar scenarios where the rekey negotiation was stuck between Cisco and some other vendors, and it was fixed by disabling NAT-T.

Hi Aref:

 

OK, interesting idea. In fact, the 4431 is directly on the internet, so it has no requirement for NAT-T on its end, but the remote sites are all behind telco/ISP NAT gateways, so obviously they require it.

 

How can I turn off NAT-T on the Cisco end but still have the Cisco accept NAT-T functionality from the Digis?

 

The other workaround I have been deploying is a small Python ping routine I've put on the Digis. It pings the internal interface of the 4431, which is only reachable if the IPSec tunnels are up. After bootup the script waits for 120 seconds so that the links can come up and IPSec can be negotiated. If the tunnels are up, the pings are successful and the script does nothing and goes to sleep for 30 seconds. If 4 sequential pings fail (about 45 seconds apart, as the ping timeout is ~10-15 seconds), the Digi router reboots. This has been working, but of course there will be 5-minute data gaps. Still, it beats having to SMS the routers individually to command a reboot.
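
Roughly, the watchdog logic described above could look like the following. This is only a sketch under assumptions, not the script actually running on the Digis: it is written for a current Python 3 (the interpreter on the WR21 may differ), `ping_once` assumes a Linux-style `ping` binary with `-c` and `-W` flags, and the reboot action is passed in as a callable because the real reboot command on the Digi is not shown in this thread.

```python
import subprocess
import time

def ping_once(host, timeout_s=10):
    """Return True if a single ICMP echo to host succeeds.

    Assumption: a Linux-style ping binary (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def watchdog(ping, reboot, sleep=time.sleep,
             startup_delay=120, interval=30, max_failures=4):
    """Call reboot() once max_failures consecutive pings have failed.

    ping   -- callable returning True when the tunnel-only address answers
    reboot -- callable that restarts the router (hypothetical, supplied by caller)
    sleep  -- injectable so the loop can be exercised without real waiting
    """
    sleep(startup_delay)          # let the link and the IPsec tunnel come up
    failures = 0
    while True:
        if ping():
            failures = 0          # tunnel is up, reset the counter
        else:
            failures += 1
            if failures >= max_failures:
                reboot()
                return
        sleep(interval)           # 30 s pause between probes

# Example wiring (not executed here); the address is a placeholder:
# watchdog(lambda: ping_once("192.0.2.1"), reboot=my_reboot_function)
```

Passing `ping`, `reboot` and `sleep` in as arguments keeps the counting logic separate from the platform-specific parts, so the failure threshold can be checked on a laptop before the script goes anywhere near a field unit.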

 

Your suggested solution sounds intriguing so I'd like to try it.

 

Cheers,

John

Hi John. If the remote devices are behind a NAT device, then we need to keep NAT-T enabled, otherwise it won't work at all. I was going through this thread again, and I've noticed this from the output you posted previously:

Interface: GigabitEthernet0/0/0
Profile: SOIprofile
Session status: UP-ACTIVE
Peer: 171.76.141.249 port 4609
Session ID: 42925
IKEv2 SA: local 103.205.244.106/4500 remote 171.76.141.249/4609 Active
Session ID: 47085
IKEv2 SA: local 103.205.244.106/4500 remote 171.76.141.249/4609 Active
IPSEC FLOW: permit ip host 1.1.1.10 host 2.2.2.10
Active SAs: 2, origin: dynamic crypto map

Not sure why it shows port 4609 instead of port 4500, any idea? I'm also not sure whether this is the case with all the other tunnels. As far as I know, when NAT-T is in use, both the source and the destination port should be 4500/udp.
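
To check whether the other tunnels also show a remote port other than 4500, a small Python sketch like this could be run over session output pasted in the format above. It assumes the `IKEv2 SA: local <ip>/<port> remote <ip>/<port> <state>` line layout exactly as it appears in this thread; the function name `remote_ports` is made up for illustration.

```python
import re

# Matches lines such as:
# "IKEv2 SA: local 103.205.244.106/4500 remote 171.76.141.249/4609 Active"
SA_LINE = re.compile(
    r"IKEv2 SA: local ([\d.]+)/(\d+) remote ([\d.]+)/(\d+) (\S+)"
)

def remote_ports(session_output):
    """Map each remote peer IP to the set of remote UDP ports seen for it."""
    ports = {}
    for match in SA_LINE.finditer(session_output):
        peer_ip = match.group(3)
        ports.setdefault(peer_ip, set()).add(int(match.group(4)))
    return ports

SAMPLE = """\
Peer: 171.76.141.249 port 4609
Session ID: 42925
IKEv2 SA: local 103.205.244.106/4500 remote 171.76.141.249/4609 Active
Session ID: 47085
IKEv2 SA: local 103.205.244.106/4500 remote 171.76.141.249/4609 Active
"""

for peer, seen in remote_ports(SAMPLE).items():
    print(peer, sorted(seen))
```

Any peer whose set contains something other than 4500 is showing the same pattern as the session quoted above.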