07-21-2023 07:57 AM - last edited on 08-17-2023 11:54 AM by Translator
Hello.
Our spoke ISP failed for 4 minutes. Once it returned, our C2921 DMVPN spoke remained in the NHRP state (no production traffic flowed from this spoke to the hub). This was remediated by shutting and then no-shutting the spoke tunnel interface. (It was fortunate that I was physically at the branch spoke to fix this.)
QUESTIONS:
1. Why did this spoke not return to "UP" status when the ISP link became healthy?
2. What can be done so that this symptom does not re-occur?
Thank you.
07-21-2023 02:10 PM - last edited on 08-17-2023 12:13 PM by Translator
Hello,
It's better not to "simplify" scenarios during troubleshooting like this because details like these may change the whole story completely.
Initially, assuming that all you did was flap the Tunnel interface without any further config changes, I suspected that the NHRP registration interval had collided in an unfortunate way with the ISP outage. Since it is 200 seconds by default, if a registration fails, it can take up to 200 seconds for the router to register with the hub again. That could have explained it if the registration attempt fell into the 4-minute ISP outage (and if Gi1/0 hadn't gone down, which I only learned it had when you shared the logs). Very importantly, if Tunnel2 had come up just by flapping it while keeping Gi1/0 as the source interface, it would have confirmed that the internet connectivity through Gi1/0 worked after the ISP came back.
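To show where that registration timer lives, here is a minimal sketch of how it could be shortened on the spoke tunnel. The values 300 and 60 are illustrative assumptions, not taken from this router's configuration; Tunnel2 is the tunnel discussed in this thread.

```
! Spoke side. Timer values are illustrative assumptions only.
interface Tunnel2
 ! How long the hub may cache this spoke's NHRP mapping, in seconds
 ip nhrp holdtime 300
 ! How often the spoke re-sends its NHRP registration to the hub, in seconds
 ip nhrp registration timeout 60
```

A shorter registration timeout only narrows the window in which a lost registration goes unnoticed; it does not help if the transport path to the hub is still broken.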
However, you changed the source interface and only then flapped the Tunnel interface. This means that we cannot assume anything about the apparently restored connectivity through the ISP on Gi1/0. For what it is worth, just because Gi1/0 came back up does not mean that the internet was actually reachable through it.
So the fact that Tunnel2 became operable after you changed the source interface opens a whole set of questions about how the connectivity was restored, if at all, through Gi1/0. It is not possible to say with certainty whether the problem was NHRP or the connectivity through Gi1/0. I suspect for now that when Gi1/0 came up, it still did not provide connectivity through that ISP to the internet. Why that would be the case is something I can't say without seeing the full show logging and the full show running-config, because there are too many unknowns and we cannot afford to assume.
All depends now on whether it is possible to share the following full outputs (no line may be removed, only sensitive data replaced with safe placeholders):
- show logging
- show running-config
- show ip interface brief
- show ip protocols
- show ip route
- show ip route vrf *
- show ip arp
The reason I am asking for this information is that I need to understand the current runtime state of this router, whether it appears to have at least local connectivity to the ISP, and how routing is set up on it. Changing the source interface on Tun2 would have changed the source IP but not the outgoing interface itself: the outgoing interface is still determined by the routing table based on the destination IP address of the packet, not by the tunnel source command.
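One way to check the point about the outgoing interface is to compare the configured tunnel source with the path the router actually selects toward the hub. This is only a sketch: 198.51.100.1 is a placeholder for the hub's public (NBMA) address, which is not given in this thread.

```
! Which interface and IP does the tunnel use as its source?
show interfaces Tunnel2 | include Tunnel source
! Which route and egress interface are chosen for packets to the hub?
! 198.51.100.1 is a placeholder for the hub's public address.
show ip route 198.51.100.1
show ip cef 198.51.100.1
```

If the egress interface reported by the last two commands differs from the tunnel source interface, the encrypted GRE packets leave through one ISP while carrying a source address that belongs to the other.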
If those outputs cannot be shared, I'm afraid this is as far as we can get.
Best regards,
Peter
07-21-2023 08:21 AM - last edited on 08-17-2023 11:58 AM by Translator
Hello,
May I ask a few questions to clarify the issue?
1. What exactly was down after the ISP recovered? Was the Tunnel interface "down, line protocol down" or "up, line protocol down"? Or was it some NHRP state that was unresolved or empty? Let's be very precise regarding this.
2. Was it purely the shut / no shut on the Tunnel interface that resolved this issue?
3. Would it be possible for you to share the show logging output from the spoke C2921, including a few lines before, all lines during, and a few lines after the event with the ISP?
4. Would it be possible for you to share a sanitized but still full configuration of the Tunnel interface on the spoke C2921?
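For reference, the interface state and the NHRP/IKE state asked about in question 1 are reported by different commands. A sketch of what could be collected on the spoke while the problem is present, assuming IKEv1 as the CRYPTO-4-IKMP log messages suggest:

```
! Interface and line protocol state of the tunnel
show interfaces Tunnel2 | include line protocol
! Per-peer DMVPN state (for example UP, NHRP, or IKE)
show dmvpn
! Whether the hub (NHS) is answering registration requests
show ip nhrp nhs detail
! Whether an IKE security association to the hub exists
show crypto isakmp sa
```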
Thank you!
Best regards,
Peter
07-21-2023 09:33 AM - last edited on 08-17-2023 11:58 AM by Translator
(obfuscated)
!! Discussed tunnel is tunnel2 !!
Jul 21 13:56:02.768: %CRYPTO-4-IKMP_BAD_MESSAGE: IKE message from 18.179.50.34 failed its sanity check or is malformed
*Jul 21 13:58:02.728: %CRYPTO-4-IKMP_BAD_MESSAGE: IKE message from 18.179.50.34 failed its sanity check or is malformed
!! 18.179.50.34 is the public IP address of a spoke, probably connected to this spoke via spoke to spoke Tunnel2. !!
!! below-- ISP fails... !!
*Jul 21 13:58:10.000: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0, changed state to down
*Jul 21 13:58:11.000: %LINK-3-UPDOWN: Interface GigabitEthernet1/0, changed state to down
*Jul 21 13:58:11.408: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel3, changed state to down
*Jul 21 13:58:11.408: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel2, changed state to down
*Jul 21 13:58:11.408: %DUAL-5-NBRCHANGE: EIGRP-IPv4 1: Neighbor 192.168.3.1 (Tunnel3) is down: interface down
*Jul 21 13:58:11.412: %DUAL-5-NBRCHANGE: EIGRP-IPv4 1: Neighbor 192.168.12.1 (Tunnel2) is down: interface down
*Jul 21 14:02:14.000: %LINK-3-UPDOWN: Interface GigabitEthernet1/0, changed state to up
*Jul 21 14:02:15.000: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet1/0, changed state to up
*Jul 21 14:02:21.648: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel3, changed state to up
*Jul 21 14:02:21.648: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel2, changed state to up
!! Tunnel is stuck in NHRP state. !!
!! below is troubleshooting commenced. int tu20 is #shut !!
*Jul 21 14:19:52.316: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel2, changed state to down
*Jul 21 14:19:52.316: %LINK-5-CHANGED: Interface Tunnel2, changed state to administratively down
!! I didn't include this in the discussion because I didn't want to complicate it, but the fact is that at this moment I switched the tunnel source interface from g0/1 to the backup ISP at g0/4, then executed #no shut on tu2 !!
*Jul 21 14:20:17.896: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel2, changed state to up
*Jul 21 14:20:17.896: %LINK-3-UPDOWN: Interface Tunnel2, changed state to up
*Jul 21 14:20:21.648: %DUAL-5-NBRCHANGE: EIGRP-IPv4 1: Neighbor 192.168.12.1 (Tunnel2) is up: new adjacency
*Jul 21 14:23:05.644: %CRYPTO-4-IKMP_BAD_MESSAGE: IKE message from 18.179.50.34 failed its sanity check or is malformed
*Jul 21 14:24:05.696: %CRYPTO-4-IKMP_BAD_MESSAGE: IKE message from 18.179.50.34 failed its sanity check or is malformed
---
(obfuscated)
#sh int tu2
Tunnel2 is up, line protocol is up
Hardware is Tunnel
Description: Spoke1
Internet address is 192.168.12.26/24
MTU 17912 bytes, BW 250000 Kbit/sec, DLY 20000 usec,
reliability 255/255, txload 2/255, rxload 1/255
Encapsulation TUNNEL, loopback not set
Keepalive not set
Tunnel linestate evaluation up
Tunnel source 1.2.3.4 (GigabitEthernet0/4)
Tunnel Subblocks:
src-track:
Tunnel2 source tracking subblock associated with GigabitEthernet0/4
Set of tunnels with source GigabitEthernet0/4, 1 member (includes iterators), on interface <OK>
Tunnel protocol/transport multi-GRE/IP
Key 0x14, sequencing disabled
Checksumming of packets disabled
Tunnel TTL 255, Fast tunneling enabled
Tunnel transport MTU 1472 bytes
Tunnel transmit bandwidth 8000 (kbps)
Tunnel receive bandwidth 8000 (kbps)
Tunnel protection via IPSec (profile "myprofile")
Last input 00:00:01, output never, output hang never
Last clearing of "show interface" counters 29w3d
Input queue: 0/75/96/0 (size/max/drops/flushes); Total output drops: 7527297
Queueing strategy: fifo
Output queue: 0/0 (size/max)
5 minute input rate 1350000 bits/sec, 408 packets/sec
5 minute output rate 2741000 bits/sec, 453 packets/sec
473313393 packets input, 1659536957 bytes, 0 no buffer
Received 0 broadcasts (0 IP multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
1155234215 packets output, 1291688580 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
0 unknown protocol drops
0 output buffer failures, 0 output buffers swapped out
07-21-2023 12:32 PM - last edited on 08-17-2023 12:05 PM by Translator
First, since you changed the tunnel source, try clearing NHRP on the hub, then run clear crypto isakmp and clear crypto ipsec sa on both the hub and the spoke.
If that does not work, share the spoke config.
07-21-2023 01:38 PM
Thank you MHM.
The situation is now healthy. The reason for the post is that my boss does not want this to happen again, so I am trying to understand why it happened. I cannot execute any new configs right now.
07-22-2023 06:13 AM - last edited on 08-17-2023 12:16 PM by Translator
This is the healthy UP status.
When you change the ISP without ip nhrp registration no-unique configured under the tunnel, the hub receives an NHRP registration with the same spoke tunnel IP but a different tunnel source IP. The hub refuses that registration, and on the spoke the tunnel shows the NHRP state.
Solution:
ip nhrp registration no-unique
NOTE: The command above is only needed if the spoke gets its IP from the ISP dynamically (DHCP or PPP). In your case, since you changed the ISP IP only once, it is not needed; just clear NHRP on the hub as I mentioned above.
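As a sketch of where that command would go (the IOS keyword is no-unique), assuming the spoke tunnel is Tunnel2 as in this thread:

```
! Spoke side. Tells the hub not to treat this spoke's tunnel-IP to
! public-IP mapping as unique, so a later registration from a
! different public IP can replace it.
interface Tunnel2
 ip nhrp registration no-unique
```

The existing mappings can be inspected on the hub with show ip nhrp before and after the change.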
07-24-2023 07:46 AM
Note: In my lab I changed the ISP IP, and you can see the status change from UP to NHRP.
07-24-2023 07:55 AM
If you go to the ISP router, shut its interface for four minutes, and then no-shut it,
what is the resulting tunnel status on the spoke DMVPN device?
07-24-2023 07:59 AM
As I mentioned, if you get your IP via DHCP from the ISP and the IP changes, this can happen if you do not configure non-unique registration.
07-24-2023 08:09 AM
The IPs are static. The ISP routing device sits in a cage; we cannot log into it, but we know its inside public IP address, and our DMVPN router has a public IP address in the same subnet.
As described, when the ISP came back online 4 minutes later, the tunnel was stuck in the NHRP state.
07-24-2023 08:16 AM
Is the IP of the tunnel source static?
07-24-2023 09:12 AM
yes, static public.
07-24-2023 01:40 PM - last edited on 08-17-2023 12:18 PM by Translator
If the spoke has a static IP, have you configured BGP flap prevention or tracking with ip sla?
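For context, a minimal sketch of what ip sla tracking of the ISP path could look like on the spoke. The probe target 203.0.113.1 is a placeholder for some address reachable only through that ISP, the object numbers are arbitrary, and GigabitEthernet1/0 is taken from the logs in this thread; none of this is from the poster's configuration.

```
! Probe an address beyond the ISP hand-off every 10 seconds.
! 203.0.113.1 is a placeholder, not an address from this thread.
ip sla 10
 icmp-echo 203.0.113.1 source-interface GigabitEthernet1/0
 frequency 10
ip sla schedule 10 life forever start-time now
!
! Track object that goes down when the probe stops getting replies
track 10 ip sla 10 reachability
```

The state can then be checked with show track 10 and show ip sla statistics 10. Unlike the interface state, the track object reflects whether traffic actually gets through the ISP, which is the distinction between "link up" and "internet reachable".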
07-25-2023 05:29 AM
The ISP routing device in the cage is not administered by us. It has a public IP address on its inside interface, which is connected to an L2 switch that connects to our network.