Re: DMVPN Single Cloud Dual Hub - Failover from one hub to another

chira.cipri@gmail.com · ‎11-10-2020

Hello community,

I have a question regarding a DMVPN Phase 3 Single Cloud (that is, one tunnel interface per router) configuration with redundant hubs. Specifically, how do you achieve failover from the primary hub to the secondary hub?

In my scenario, routers vIOS1 and vIOS5 are hubs, and vIOS2 and vIOS3 are spokes. OSPF runs on the tunnel interfaces, and the spokes have priority 0 to make sure they never become DR. Spoke-to-spoke traffic works as expected for DMVPN Phase 3. As mentioned, there is only a single DMVPN cloud, so there is one tunnel per router.

However, after I clear the dynamic tunnels on the spokes and shut down the tunnel interface on vIOS1 (to simulate a failure of the hub), the spokes don't fail over immediately to the other hub, vIOS5. They eventually do fail over, but only once one-third of the NHRP holdtime has elapsed and the spokes receive no replies to their NHRP Registration Requests. With the default timers, that can mean up to 2400 seconds.

Is there any other way to trigger the switchover from one hub to the other?
I have searched this forum and the Cisco documentation but could not find an answer.

Version on all routers is the following. As you may have guessed, this is a GNS3 lab topology.

R5#show version
Cisco IOS Software, IOSv Software (VIOS-ADVENTERPRISEK9-M), Version 15.5(3)M, RELEASE SOFTWARE (fc1)

Topology and router configuration can be found below.

### R1 - Hub ###

R1#show run int tu123
Building configuration...

Current configuration : 351 bytes
!
interface Tunnel123
 ip address 10.10.123.1 255.255.255.248
 no ip redirects
 ip nhrp map multicast dynamic
 ip nhrp network-id 123
 ip nhrp nhs 10.10.123.5 nbma 150.10.45.5 multicast
 ip nhrp redirect
 ip ospf network broadcast
 ip ospf 1 area 0
 tunnel source GigabitEthernet0/0.100
 tunnel mode gre multipoint
 tunnel key 123
end

### R2 - Spoke ###

R2#show run int tu123
Building configuration...

Current configuration : 482 bytes
!
interface Tunnel123
 ip address 10.10.123.2 255.255.255.248
 no ip redirects
 ip nhrp map 10.10.123.1 150.10.14.1
 ip nhrp map multicast 150.10.14.1
 ip nhrp map 10.10.123.5 150.10.45.5
 ip nhrp map multicast 150.10.45.5
 ip nhrp network-id 123
 ip nhrp nhs 10.10.123.1
 ip nhrp nhs 10.10.123.5
 ip nhrp shortcut
 ip ospf network broadcast
 ip ospf priority 0
 ip ospf 1 area 0
 tunnel source GigabitEthernet0/0.100
 tunnel mode gre multipoint
 tunnel key 123
end

### R3 - Spoke ###

R3#show run int tu123
Building configuration...

Current configuration : 392 bytes
!
interface Tunnel123
 ip address 10.10.123.3 255.255.255.248
 no ip redirects
 ip nhrp network-id 123
 ip nhrp nhs 10.10.123.1 nbma 150.10.14.1 multicast
 ip nhrp nhs 10.10.123.5 nbma 150.10.45.5 multicast
 ip nhrp shortcut
 ip ospf network broadcast
 ip ospf priority 0
 ip ospf 1 area 0
 tunnel source GigabitEthernet0/0.100
 tunnel mode gre multipoint
 tunnel key 123
end

### R5 - Hub ###

R5#show run int tu123
Building configuration...

Current configuration : 351 bytes
!
interface Tunnel123
 ip address 10.10.123.5 255.255.255.248
 no ip redirects
 ip nhrp map multicast dynamic
 ip nhrp network-id 123
 ip nhrp nhs 10.10.123.1 nbma 150.10.14.1 multicast
 ip nhrp redirect
 ip ospf network broadcast
 ip ospf 1 area 0
 keepalive 5 3
 tunnel source GigabitEthernet0/0.100
 tunnel mode gre multipoint
 tunnel key 123
end

Georg Pauwen · ‎11-10-2020

Hello,

You could set 'ip nhrp registration timeout' to a low value (1 second is the lowest). That said, you would typically configure two tunnels on each spoke, one to each hub.
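
As a rough sketch of that suggestion on one of the spokes (R3 here), assuming the tunnel interface from the posted configuration. The 5-second value is only an illustration, not a recommendation:

```
interface Tunnel123
 ! Send NHRP Registration Requests every 5 seconds
 ! instead of every one-third of the NHRP holdtime.
 ip nhrp registration timeout 5
```

A lower value shortens the time needed to notice a dead NHS, at the cost of more registration traffic toward the hubs.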

chira.cipri@gmail.com · ‎11-10-2020

Yes, I considered that option as well, and so far it seems to be the only one, considering that GRE keepalives are not supported on multipoint GRE interfaces.

I would be interested to know, however, whether there are any other options for the single cloud dual hub design.

balaji.bandi · ‎11-10-2020

How are you testing failover, with a full path failure? I would suggest tuning OSPF. Can you post the full configuration of all devices so we can take a look?

Or refer to the documents below for tuning:

https://www.cisco.com/c/en/us/support/docs/security-vpn/ipsec-negotiation-ike-protocols/41940-dmvpn.html

https://community.cisco.com/t5/networking-documents/design-single-dmvpn-with-dual-hubs-as-a-redundant-path-over/ta-p/3109950

BB

chira.cipri@gmail.com · ‎11-10-2020

Hello,

The question I'm asking is straightforward. The primary hub fails for some reason or another. How do the spokes know to forward to the secondary hub once the primary has failed? OSPF metrics don't really have much impact in this specific scenario.

I have noticed that even after the tunnel interface on the primary hub is shut down and the OSPF adjacency on the spokes is torn down, the spokes still send their NHRP Resolution Requests towards that hub, even though they don't receive any reply.

Only after the NHRP Registration Requests are sent and no answer is received do they declare the hub down and switch to the secondary.
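
One way to watch this behaviour on a spoke while the hub tunnel is shut down is sketched below. These are standard IOS show/debug commands, but the output format varies by release, so no sample output is reproduced here:

```
! On the spoke (R2 or R3): check which NHSs are currently considered up or down
show ip nhrp nhs detail
! Check NHRP peer and tunnel state for the DMVPN interface
show dmvpn detail
! Watch Registration Requests/Replies as they are sent and received (lab use only)
debug nhrp packet
```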

MHM Cisco World · ‎11-10-2020

ip nhrp nhs nhs-address priority nhs-priority cluster cluster-number

Instead, you can use an NHS cluster and configure both hubs in it, each with its own priority.

That is better.

chira.cipri@gmail.com · ‎11-10-2020

I'm sorry, but your answer has nothing to do with my question.

I'm not asking how to select which hub is primary and which is secondary; there are many ways to do that.

What I'm asking is: once the primary hub fails, how can I make the spokes switch over to the secondary hub, other than by changing the NHRP holdtime?

MHM Cisco World · ‎11-10-2020

We configure a cluster containing both hubs, and we also configure the fallback timer.

The fallback timer makes the spoke change to the other hub when it fails to send and receive NHRP messages to/from the current hub.

This is what I think.

ip nhrp nhs fallback fallback-time

chira.cipri@gmail.com · ‎11-11-2020

I have already tried this with no effect.

So far the only thing that will make a spoke switch from one hub to another in my scenario is the expiration of NHRP Holdtime, after which the spoke sends several Registration Requests to the hub, and only after it does not receive any Registration Replies does it declare the NHS unreachable and switch to another NHS.

However, this is already mentioned in the documentation:

https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/ipaddr_nhrp/configuration/15-mt/nhrp-15-mt-book/config-nhrp.html#GUID-EBB42CA1-87D6-4CE5-8167-382A1FF4568D

NHRP registrations are sent from NHCs to their configured NHSs every one-third of the NHRP holdtime (configured by the ip nhrp holdtime value command), unless the ip nhrp registration timeout value command is configured, in which case registrations are sent out according to the configured timeout value. If an NHRP registration reply is not received for an NHRP registration request, the NHRP registration request is retransmitted at timeouts of 1, 2, 4, 8, 16, and 32 seconds, then the sequence starts over again at 1.

The NHS is declared down if an NHRP registration reply is not received after three retransmissions (7 seconds), and NHRP resolution packets will no longer be sent to or by way of that NHS. NHRP registrations will continue to be sent at 1-, 2-, 4-, 8-, 16-, and 32-second intervals, probing the NHS until an NHRP registration reply is received.
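
Putting the quoted numbers together gives a rough worst-case detection time. This is a back-of-the-envelope estimate, assuming the IOS default holdtime of 7200 seconds (consistent with the 2400-second figure mentioned earlier in the thread) and a hub that fails just after a successful registration. The 5-second registration timeout in the second case is an arbitrary example value:

```
Default timers:
  registration interval = holdtime / 3 = 7200 / 3 = 2400 s
  retransmissions       = 1 + 2 + 4              =    7 s
  worst-case detection  ~ 2400 + 7               = 2407 s

With 'ip nhrp registration timeout 5':
  registration interval                          =    5 s
  retransmissions       = 1 + 2 + 4              =    7 s
  worst-case detection  ~ 5 + 7                  =   12 s
```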

Giuseppe Larosa · ‎11-12-2020

Hello chira.cipri@gmail.com ,

I would suggest the following three options:

a) Two hubs in a single cloud using DMVPN Phase 2. In this case NHRP redirect is no longer used, and you may get better convergence (to be tested).

b) Keep DMVPN Phase 3 but use two hubs and two different DMVPN clouds (I'm afraid this does not solve your issue).

c) Explore whether Dead Peer Detection is applicable in a DMVPN context. I remember a thread about this. If it is, it could be the best solution.

Hope this helps

Giuseppe

MHM Cisco World · ‎11-14-2020

Hi friends,

For DMVPN with a single cloud and dual hubs, there are two choices:
1- use ONE tunnel with a dual NHS configuration
2- use TWO tunnels, each toward one NHS

Your case is "1".
When we configure dual NHSs (hubs) on a single tunnel, the spoke sends NHRP requests to both!
This is not what we want.
We want to select one NHS as primary and, when it fails, shift to the other NHS, which acts as the backup.

As I mentioned before, this is done with the NHS priority and cluster options:

ip nhrp nhs nhs1-address priority 10 cluster cluster-number <- primary NHS1
ip nhrp nhs nhs2-address priority 20 cluster cluster-number <- backup NHS2

Note: the priority range is 0-255, and 0 is the highest.

We must also configure
ip nhrp nhs cluster cluster-number max-connections 1 <- allows one connection, meaning the spoke is connected to only one NHS.

What happens now:
when NHS1, the primary, goes down, the spoke starts sending registrations to connect to the backup,
and when NHS1 returns, the spoke reconnects to it and disconnects from NHS2.

Note: we can make the fallback faster with
ip nhrp nhs fallback fallback-time

Note: we can reduce the time the spoke takes to detect that an NHS is down by reducing the holdtime and/or the registration timeout.

This gives us what we want: the spoke is connected to only ONE NHS at a time.

I hope this solves your issue.
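
Applied to the addressing used in this thread, that could look like the following on spoke R3. This is a sketch only: the priority values, cluster number and fallback time are arbitrary illustrations, and it has not been verified on the 15.5(3)M image from the original post:

```
interface Tunnel123
 ! Primary NHS (R1): a lower value means a higher priority
 ip nhrp nhs 10.10.123.1 nbma 150.10.14.1 multicast priority 10 cluster 1
 ! Backup NHS (R5)
 ip nhrp nhs 10.10.123.5 nbma 150.10.45.5 multicast priority 20 cluster 1
 ! Keep only one active NHS in cluster 1 at a time
 ip nhrp nhs cluster 1 max-connections 1
 ! Wait 30 seconds after the primary recovers before falling back to it
 ip nhrp nhs fallback 30
```

Note that this only controls which NHS the spoke prefers. How quickly a dead NHS is detected still depends on the registration timers.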

paul driver · ‎11-17-2022

Hello
Apply the recovery on the NHCs

Example:
int tun x
ip nhrp holdtime 10 <-- time (secs) the NHS treats the spoke's registration as valid
ip nhrp registration timeout 3 <-- interval (secs) between NHRP registration requests; if not configured, 1/3 of the holdtime
ip nhrp nhs x.x.x.1 priority 1 cluster 1 <-- primary NHS in cluster 1
ip nhrp nhs x.x.x.5 priority 254 cluster 1 <-- backup NHS in cluster 1
ip nhrp nhs cluster 1 max-connections 1 <-- allow the NHC only a single NHS peer at a time
ip nhrp nhs fallback 25 <-- time (secs) to wait before falling back to the recovered higher-priority NHS
if-state nhrp <-- ties the tunnel interface state to NHS reachability (down when no NHS replies)

Kind Regards
Paul