Phase3-DMVPN SA goes down between two spokes after only a few seconds.

QuinnBowman
Level 1

Hi all,

 

Out of the blue yesterday, two of the spokes in my Phase 3 DMVPN stopped being able to communicate with each other. All other spokes can still communicate with each other AND with the two spokes having the problem. The hub can still reach both spokes as well.

 

Let's call the spokes Tom and Jerry

 

Tom and Jerry learn about each other through NHRP and share routes using iBGP, as do the rest of the Spokes.

After a bunch of troubleshooting various possibilities, I've narrowed down the symptoms a bit:

If there is no NHRP peer session between Tom and Jerry and I send some packets (ping) from Tom through the tunnel, the session establishes and ~15 pings go through before they stop. From then on, nothing passes between the two over the tunnel until I clear the DMVPN peer session. Then the same pattern repeats: pings work for 15 packets or so, then stop until I manually clear the session or wait the ~15 minutes for the session to drop on its own.

 

#debug cryp ikev2 error
#debug cryp ipsec error
#terminal mon

Shows the SA go down when I clear the dmvpn peer session

Tom#clear dmv sess peer Jerry

%IKEV2-5-SA_DOWN: SA DOWN

And then comes up after I start a ping

But without fail, a few seconds in, the SA goes down again, with a few errors I can't make sense of.

Tom#ping Jerry repeat 1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to Jerry, timeout is 2 seconds:
!
Success rate is 100 percent (1/1), round-trip min/avg/max = 24/24/24 ms
Oct 23 17:18:51: %IKEV2-5-RECV_CONNECTION_REQUEST: Received a IKE_INIT_SA request
Oct 23 17:18:51.267: IKEv2-ERROR:Failed to retrieve Certificate Issuer list
Oct 23 17:18:51.275: IKEv2-ERROR:Failed to retrieve Certificate Issuer list
Oct 23 17:18:51.279: insert of map into mapdb AVL failed, map + ace pair already exists on the mapdb
Oct 23 17:18:51: %IKEV2-5-OSAL_INITIATE_TUNNEL: Received request to establish an IPsec tunnel; local traffic selector = Address Range: Tom-Tom Protocol: 47 Port Range: 0-65535 ; remote traffic selector = Address Range: Jerry-Jerry 7 Protocol: 47 Port Range: 0-65535
Oct 23 17:18:51.451: IPSEC(ipsec_get_crypto_session_id):
Invalid Payload Id
Oct 23 17:18:51.451: IKEv2-ERROR:Error constructing config reply
Oct 23 17:18:51: %IKEV2-5-SA_UP: SA UP
Oct 23 17:18:51.451: IPSEC(ipsec_get_crypto_session_id):
Invalid Payload Id
Oct 23 17:18:51.639: IPSEC(ipsec_get_crypto_session_id):
Invalid Payload Id
Oct 23 17:18:51: %IKEV2-5-SA_UP: SA UP
Oct 23 17:18:51.639: IPSEC(ipsec_get_crypto_session_id):
Invalid Payload Id
Oct 23 17:18:51.651: IPSEC: sa null
Oct 23 17:18:51.651: IPSEC(send_delete_notify_kmi): not sending KEY_ENGINE_DELETE_SAS
Oct 23 17:18:51: %IKEV2-5-SA_DOWN: SA DOWN
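
To make the order of events in the log above easier to read, here is a minimal Python sketch that pulls out just the %IKEV2 syslog mnemonics. The sample lines are copied from the output above (the OSAL_INITIATE_TUNNEL line is shortened); the function name ikev2_events is made up for illustration.

```python
import re

# Syslog lines copied from the debug output above (one line shortened).
LOG = """\
Oct 23 17:18:51: %IKEV2-5-RECV_CONNECTION_REQUEST: Received a IKE_INIT_SA request
Oct 23 17:18:51: %IKEV2-5-OSAL_INITIATE_TUNNEL: Received request to establish an IPsec tunnel
Oct 23 17:18:51: %IKEV2-5-SA_UP: SA UP
Oct 23 17:18:51: %IKEV2-5-SA_UP: SA UP
Oct 23 17:18:51: %IKEV2-5-SA_DOWN: SA DOWN
"""

# Matches the %FACILITY-SEVERITY-MNEMONIC: tag of a syslog line.
EVENT_RE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+):"
)


def ikev2_events(text):
    """Return the IKEV2 syslog mnemonics in the order they appear."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.search(line)
        if match and match.group("facility") == "IKEV2":
            events.append(match.group("mnemonic"))
    return events


events = ikev2_events(LOG)
print(events)
print("SA_UP:", events.count("SA_UP"), "SA_DOWN:", events.count("SA_DOWN"))
```

On this sample it reports two SA_UP events followed by one SA_DOWN, all within the same second. It only summarizes what the log already says and does not explain why the SA goes down.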

After all this, though, the two still seem to be fully peered, and they stay that way for ~15 minutes before the session drops without much warning.

Tom#show dmvpn
Interface: Tunnel1, IPv4 NHRP Details
Type:Spoke, NHRP Peers:6,
 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 Jerry      Jerry-T    UP 00:05:54   DT1

 

Tom#show ip nhrp
Jerry-T/32 via Jerry-T
   Tunnel1 created 00:07:04, expire 01:52:55
   Type: dynamic, Flags: router used nhop rib
   NBMA address: Jerry

AND the SA is still present..

Tom#show crypto ipsec sa
interface: Tunnel1
    Crypto map tag: Tunnel1-head-0, local addr Tom

   protected vrf: (none)
   local  ident (addr/mask/prot/port): (Tom/255.255.255.255/47/0)
   remote ident (addr/mask/prot/port): (Jerry/255.255.255.255/47/0)
   current_peer Jerry port 500
     PERMIT, flags={origin_is_acl,}
    #pkts encaps: 46, #pkts encrypt: 46, #pkts digest: 46
    #pkts decaps: 7, #pkts decrypt: 7, #pkts verify: 7
    #pkts compressed: 0, #pkts decompressed: 0
    #pkts not compressed: 0, #pkts compr. failed: 0
    #pkts not decompressed: 0, #pkts decompress failed: 0
    #send errors 0, #recv errors 0

     local crypto endpt.: Tom, remote crypto endpt.: Jerry
     plaintext mtu 1458, path mtu 1500, ip mtu 1500, ip mtu idb GigabitEthernet0/1
     current outbound spi: 0x3792E6EC(932374252)
     PFS (Y/N): N, DH group: none

     inbound esp sas:
      spi: 0x4E27AEBD(1311223485)
        transform: esp-aes esp-sha-hmac ,
        in use settings ={Transport, }
        conn id: 991, flow_id: Onboard VPN:991, sibling_flags 80000000, crypto map: Tunnel1-head-0
        sa timing: remaining key lifetime (k/sec): (4364220/3054)
        IV size: 16 bytes
        replay detection support: Y
        Status: ACTIVE(ACTIVE)

     inbound ah sas:
     inbound pcp sas:

     outbound esp sas:
      spi: 0x3792E6EC(932374252)
        transform: esp-aes esp-sha-hmac ,
        in use settings ={Transport, }
        conn id: 992, flow_id: Onboard VPN:992, sibling_flags 80000000, crypto map: Tunnel1-head-0
        sa timing: remaining key lifetime (k/sec): (4364215/3054)
        IV size: 16 bytes
        replay detection support: Y
        Status: ACTIVE(ACTIVE)

     outbound ah sas:
     outbound pcp sas:
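
One thing that stands out in the output above is the gap between the encaps and decaps counters. A minimal Python sketch, assuming the counter lines are pasted in as text, extracts them and prints the difference. The function name sa_counters is hypothetical, and encaps/decaps are not expected to match one-to-one (NHRP and BGP traffic use the same SA), so treat the number as a rough indicator only.

```python
import re

# Counter lines copied from the `show crypto ipsec sa` output above.
SA_OUTPUT = """\
    #pkts encaps: 46, #pkts encrypt: 46, #pkts digest: 46
    #pkts decaps: 7, #pkts decrypt: 7, #pkts verify: 7
    #send errors 0, #recv errors 0
"""


def sa_counters(text):
    """Collect '#name: value' and '#name value' pairs into a dict."""
    counters = {}
    for name, value in re.findall(r"#([a-z][a-z .]*?):?\s+(\d+)", text):
        counters[name.strip()] = int(value)
    return counters


counters = sa_counters(SA_OUTPUT)
encaps = counters["pkts encaps"]
decaps = counters["pkts decaps"]
print(f"encaps={encaps} decaps={decaps} difference={encaps - decaps}")
```

With the values above, Tom has encapsulated 46 packets towards Jerry but decapsulated only 7 from it. Comparing against the same counters taken on Jerry would show whether Tom's packets arrive there and in which direction traffic is being lost.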

Pings between the public IP addresses of Tom and Jerry hold rock solid the whole time.

This is ONLY happening between Tom and Jerry and I'm worried about the rate of hair loss I'm currently experiencing....help.

--
Sure, understanding today's complex world of the future is a little like having bees live in your head. But, there they are.

Hello,

 

on both spokes, try and set the security association lifetime to the highest value possible, and/or disable volume-based rekeying:

 

crypto ipsec security-association lifetime seconds 2592000
crypto ipsec security-association lifetime kilobytes disable

No change.


Hello,

 

is there a difference in the configuration, the IOS, the router model, or the type of Internet connection between the 'working' and the 'non-working' spokes ?

Configuration is the same between the two (And the rest of the spokes)

 

Tom

crypto ikev2 proposal ikev2-prop
 encryption aes-cbc-256 aes-cbc-192
 integrity sha256 sha384 sha1
 group 14 24 5
!
crypto ikev2 policy ikev2-pol1
 match fvrf any
 proposal ikev2-prop
!
crypto ikev2 keyring ikev2-keyring1
 peer ikev2-peer1
  address 0.0.0.0 0.0.0.0
  pre-shared-key **********
 !
!
!
crypto ikev2 profile ikev2-profile
 match fvrf any
 match identity remote address 0.0.0.0
 authentication remote pre-share
 authentication local pre-share
 keyring local ikev2-keyring1
 dpd 30 10 periodic
!
crypto ikev2 nat keepalive 45
crypto ikev2 diagnose error 25
!
crypto logging ikev2
!
!
crypto ipsec transform-set ipsec-trans1 esp-aes 256 esp-sha256-hmac
 mode tunnel
!
!
crypto ipsec profile ipsec-prof1
 set ikev2-profile ikev2-profile

Jerry

crypto ikev2 proposal ikev2-prop
 encryption aes-cbc-256 aes-cbc-192
 integrity sha256 sha384 sha1
 group 14 24 5
!
crypto ikev2 policy ikev2-pol1
 match fvrf any
 proposal ikev2-prop
!
crypto ikev2 keyring ikev2-keyring1
 peer ikev2-peer1
  address 0.0.0.0 0.0.0.0
  pre-shared-key **********
 !
!
!
crypto ikev2 profile ikev2-profile
 match fvrf any
 match identity remote address 0.0.0.0
 authentication local pre-share
 authentication remote pre-share
 keyring local ikev2-keyring1
 dpd 30 10 periodic
!
crypto ikev2 nat keepalive 45
crypto ikev2 diagnose error 25
!
!
crypto logging ikev2
crypto isakmp invalid-spi-recovery
!
!
crypto ipsec transform-set ipsec-trans1 esp-aes 256 esp-sha256-hmac
 mode tunnel
crypto ipsec df-bit clear
!
!
crypto ipsec profile ipsec-prof1
 set ikev2-profile ikev2-profile
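
For anyone comparing the two snippets by eye, a minimal Python sketch can list the lines that appear on only one side. The configs are the ones posted above with the "!" separators left out. Indentation is stripped before comparing, so it ignores ordering and which block a line sits under; the helper name config_lines is made up for illustration.

```python
TOM = """\
crypto ikev2 proposal ikev2-prop
 encryption aes-cbc-256 aes-cbc-192
 integrity sha256 sha384 sha1
 group 14 24 5
crypto ikev2 policy ikev2-pol1
 match fvrf any
 proposal ikev2-prop
crypto ikev2 keyring ikev2-keyring1
 peer ikev2-peer1
  address 0.0.0.0 0.0.0.0
  pre-shared-key **********
crypto ikev2 profile ikev2-profile
 match fvrf any
 match identity remote address 0.0.0.0
 authentication remote pre-share
 authentication local pre-share
 keyring local ikev2-keyring1
 dpd 30 10 periodic
crypto ikev2 nat keepalive 45
crypto ikev2 diagnose error 25
crypto logging ikev2
crypto ipsec transform-set ipsec-trans1 esp-aes 256 esp-sha256-hmac
 mode tunnel
crypto ipsec profile ipsec-prof1
 set ikev2-profile ikev2-profile
"""

# Jerry as posted: the same lines plus two extra global commands.
JERRY = TOM.replace(
    "crypto logging ikev2\n",
    "crypto logging ikev2\ncrypto isakmp invalid-spi-recovery\n",
).replace(
    " mode tunnel\n",
    " mode tunnel\ncrypto ipsec df-bit clear\n",
)


def config_lines(text):
    """Return the set of config lines, ignoring '!' separators and indentation."""
    stripped = (line.strip() for line in text.splitlines())
    return {line for line in stripped if line and line != "!"}


only_tom = sorted(config_lines(TOM) - config_lines(JERRY))
only_jerry = sorted(config_lines(JERRY) - config_lines(TOM))
print("Only on Tom:  ", only_tom)
print("Only on Jerry:", only_jerry)
```

On the posted snippets this shows nothing unique to Tom and two lines unique to Jerry: crypto ipsec df-bit clear and crypto isakmp invalid-spi-recovery. Whether either one matters for this problem is an open question; the sketch only points out that the two configs are not strictly identical.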


Unfortunately, YES. Dissimilar hardware, AND different Carriers between most of the spokes.

Tom:

Cisco CISCO3925-CHASSIS (revision 1.0) with C3900-SPE100/K9
Cisco IOS Software, C3900 Software (C3900-UNIVERSALK9-M), Version 15.7(3)M2, RELEASE SOFTWARE (fc2)

Jerry:

Cisco CISCO2911/K9 (revision 1.0)
Cisco IOS Software, C2900 Software (C2900-UNIVERSALK9-M), Version 15.5(3)M6a, RELEASE SOFTWARE (fc2)

 

BUT it gets weirder because next door to Tom is Spike.

Spike:

Cisco ISR4321

Cisco IOS Software [Denali], ISR Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 16.3.8, RELEASE SOFTWARE (fc3)
Spike is just a few miles down the road from Tom and uses the same ISP as Tom, but has nothing like Tom's problem when it comes to communicating with Jerry.


Hello,

 

the recommended release for both the 2911 and the 3925 is 15.7.3M7 MD. You might want to try upgrading the IOS to that release...

If only.

The client hasn't held a support subscription for these devices, so I have no access to firmware (that I know of) to try an upgrade.


Hello,

 

I haven't seen your full configuration, but on the WAN interfaces, make sure you have:

 

ip tcp adjust-mss 1360
mtu 1400

 

configured, and check if that makes a difference.
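
The two values suggested above are related by simple arithmetic: the TCP MSS is the MTU minus the IPv4 and TCP headers. A minimal Python sketch, assuming 20-byte IPv4 and 20-byte TCP headers with no options (the function name mss_for_mtu is made up for illustration):

```python
IP_HEADER = 20   # bytes, IPv4 header without options
TCP_HEADER = 20  # bytes, TCP header without options


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IP_HEADER - TCP_HEADER


print(mss_for_mtu(1400))  # 1360
```

So 1360 is simply the MSS that matches a 1400-byte MTU; if a different MTU is chosen, the MSS value would need to move with it.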

Good news is I found an image of 15.7(3)M1 lying around. It's not exactly current, but better than nothing, I guess.

Bad news is, no change.

 

Applied the suggested configurations above. No change.

 

Let me know if you'd like to see the full config of Tom/Jerry


georgehewittuk1
Level 1

This does seem like an IKEv2 issue, possibly a bug. I presume this is a production environment? It would be interesting to break it and try different authentication, etc. Does the issue persist after a reload?

Production Environment, yes.

But we do have some flexibility with getting this fixed as we've provided a 'good enough' alternative. 

And yes, issue persists after reload on both sides.

Interesting thought to try some different encryption algorithms. I've heard of some buggy results coming out of esp-sha256-hmac before.

Any suggestions?
