03-03-2019 04:47 PM - edited 03-16-2021 07:02 PM
This article describes control connection problems that might arise between SD-WAN routers (cEdge / vEdge) and vSmart controllers, and between SD-WAN routers (cEdge / vEdge) and vManage NMSs, when you are bringing up these devices. All the troubleshooting steps below apply to both vEdge and cEdge routers, but all the captures are taken from a vEdge router.
To check the status of the control connections of all SD-WAN routers, in the vManage Dashboard, view the Control Status pane. Click any row to display a table with device details.
To check the status of a single vEdge router's control connections, in vManage NMS, select Monitor ► Network, locate the desired vEdge router, and click its hostname. In the left pane, click Control Connections.
To display active control connections from the CLI, issue the show control connections [vEdge] or show sdwan control connections [cEdge] command. If a control connection is not listed in the command output, that connection is not operational.
If the vManage NMS screens or the command output indicates connection problems between a vEdge router and a vSmart controller or vManage NMS, see the sections below to troubleshoot the problem.
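Before you start troubleshooting, confirm that the SD-WAN router in question is configured properly. This includes a valid certificate, the required system parameters, and a clock that matches the rest of the overlay network, each of which is described below.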
If the portion of the vEdge router's configuration that establishes control connections is correct such that control connections are up, the show control local-properties command output looks similar to this example, in Releases 16.3 and later:
vEdge# show control local-properties
personality                   vedge
sp-organization-name          Viptela, Inc.
organization-name             Viptela, Inc.
certificate-status            Installed
root-ca-chain-status          Installed
certificate-validity          Valid
certificate-not-valid-before  Sep 06 22:39:01 2016 GMT
certificate-not-valid-after   Sep 06 22:39:01 2017 GMT
dns-name                      trainingvbond.viptela.com
site-id                       10
domain-id                     1
protocol                      dtls
tls-port                      0
system-ip                     172.1.10.1
chassis-num/unique-id         66cb2a8b-2eeb-479b-83d0-0682b64d8190
serial-num                    12345718
vsmart-list-version           0
keygen-interval               1:00:00:00
retry-interval                0:00:00:17
no-activity-exp-interval      0:00:00:12
dns-cache-ttl                 0:00:02:00
port-hopped                   TRUE
time-since-last-port-hop      20:16:24:43
number-vbond-peers            0
number-active-wan-interfaces  1

NAT TYPE: E -- indicates End-point independent mapping
          A -- indicates Address-port dependent mapping
          N -- indicates Not learned
          Note: Requires minimum two vbonds to learn the NAT type

          PUBLIC         PUBLIC PRIVATE      PRIVATE                 PRIVATE                             MAX   RESTRICT/        LAST       SPI TIME   NAT  VM
INTERFACE IPv4           PORT   IPv4         IPv6                    PORT    VS/VM COLOR           STATE CNTRL CONTROL/  LR/LB  CONNECTION REMAINING  TYPE CON
                                                                                                               STUN                                        PRF
--------------------------------------------------------------------------------------------------------------------------------------------------------------
ge0/4     73.241.233.20  12386  192.168.0.20 2601:647:4380:ca75::c2  12386   2/1   public-internet up    2     no/yes/no No/Yes 0:10:34:16 0:03:03:26 E    5
If the control connections are down, debug the vEdge router's configuration as discussed in the sections below.
A valid certificate must be installed on the vEdge router.
To view the status of the router's certificate, in vManage NMS, select the Configuration ► Certificates screen, and select the vEdge List tab. In the table, the Validate column should be green, to indicate that the certificate is valid, and the State column should show a green icon. From the CLI, use the show control local-properties command.
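The CLI check can be scripted against saved command output. The following is a minimal sketch, not an official tool: it assumes the `show control local-properties` text has already been collected into a string, and the function name `certificate_problems` and the wording of the messages are hypothetical. It looks only at the certificate fields shown in the example output earlier in this article.

```python
import re
from datetime import datetime, timezone

# Matches timestamps such as "Sep 06 22:39:01 2016 GMT" (format taken from the sample output).
DATE_PATTERN = r"(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})\s+GMT"


def _parse_gmt(text):
    # Collapse repeated whitespace, then parse the timestamp as UTC.
    normalized = " ".join(text.split())
    parsed = datetime.strptime(normalized, "%b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)


def certificate_problems(output, now):
    """Return a list of certificate-related findings (empty if none)."""
    problems = []
    status = re.search(r"\bcertificate-status\s+(\S+)", output)
    if status is None or status.group(1) != "Installed":
        problems.append("certificate-status is not Installed")
    validity = re.search(r"\bcertificate-validity\s+(\S+)", output)
    if validity is None or validity.group(1) != "Valid":
        problems.append("certificate-validity is not Valid")
    start = re.search(r"certificate-not-valid-before\s+" + DATE_PATTERN, output)
    end = re.search(r"certificate-not-valid-after\s+" + DATE_PATTERN, output)
    if start and end:
        if not (_parse_gmt(start.group(1)) <= now <= _parse_gmt(end.group(1))):
            problems.append("device time is outside the certificate validity window")
    return problems


# Fields copied from the example output in this article.
SAMPLE = """\
certificate-status Installed
certificate-validity Valid
certificate-not-valid-before Sep 06 22:39:01 2016 GMT
certificate-not-valid-after Sep 06 22:39:01 2017 GMT
"""

print(certificate_problems(SAMPLE, datetime(2017, 1, 1, tzinfo=timezone.utc)))
```

Pass the time reported by `show clock` as `now` rather than the workstation time, because the check is only meaningful against the router's own clock.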
The vEdge router configuration must include a system IP address, a site ID, an organization name, and a vBond orchestrator IP address or DNS name.
To view the device's running configuration in vManage NMS:
To view the device's running configuration from the CLI, use the show running-config [vEdge] or show sdwan running-config [cEdge] command.
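As a rough illustration of the requirement above, the sketch below scans saved running-config text for the four required items. It assumes they appear as the keywords `system-ip`, `site-id`, `organization-name`, and `vbond` at the start of a configuration line; the sample configuration and the function name `missing_system_parameters` are hypothetical.

```python
# Keywords assumed to introduce the required settings in the running configuration.
REQUIRED_KEYWORDS = ("system-ip", "site-id", "organization-name", "vbond")


def missing_system_parameters(running_config):
    """Return the required keywords that have no value in the configuration text."""
    present = set()
    for line in running_config.splitlines():
        words = line.split()
        # A setting counts only if the keyword is followed by a value.
        if len(words) >= 2 and words[0] in REQUIRED_KEYWORDS:
            present.add(words[0])
    return [keyword for keyword in REQUIRED_KEYWORDS if keyword not in present]


# Hypothetical configuration fragment with the vBond address left out.
SAMPLE_CONFIG = """\
system
 host-name vEdge
 system-ip 172.1.10.1
 site-id 10
 organization-name "Viptela, Inc."
"""

print(missing_system_parameters(SAMPLE_CONFIG))  # ['vbond']
```

The check only confirms that a value is present. It does not verify that the organization name or vBond address matches the rest of the overlay.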
The vEdge router's clock must have the same time configured as other devices in the overlay network.
To view the system time on the device, use the CLI command show clock.
Control connections might not come up if the overlay network has routing issues. If this is the case, the State column in the show control connections command output has a value of "connect", which indicates that a connection attempt is in progress. If the control connection is up, the State column shows a value of "up".
                                                   PEER               PEER                              CONTROLLER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                            GROUP
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR STATE   UPTIME ID
------------------------------------------------------------------------------------------------------------------
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  gold        connect        0
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  silver      connect        0
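To pick out the connections that are not yet up from a saved capture, something like the following sketch can help. It assumes whitespace-separated rows in the column order shown above, and it knows only the state values that appear in this article's captures. The function names are hypothetical.

```python
PEER_TYPES = {"vbond", "vsmart", "vmanage"}
# States that appear in this article's captures; other values may exist.
KNOWN_STATES = {"up", "connect", "challenge", "tear_down"}


def parse_control_connections(output):
    """Return one dict per peer row found in the command output."""
    rows = []
    for line in output.splitlines():
        tokens = line.split()
        # Data rows start with a peer type; header and separator lines do not.
        if len(tokens) < 11 or tokens[0] not in PEER_TYPES:
            continue
        # STATE follows LOCAL COLOR (index 9). Some captures include a PROXY
        # column before STATE, so check the next two tokens.
        state = next((t for t in tokens[10:12] if t in KNOWN_STATES), None)
        rows.append({
            "peer_type": tokens[0],
            "public_ip": tokens[7],
            "color": tokens[9],
            "state": state,
        })
    return rows


def connections_not_up(output):
    """Return the rows whose state is anything other than 'up'."""
    return [row for row in parse_control_connections(output) if row["state"] != "up"]


# First two rows are from the capture above; the third row (with a PROXY
# column) is from a capture posted in the comments.
SAMPLE = """\
vbond dtls - 0 0 1.3.25.25 12346 1.3.25.25 12346 gold connect 0
vbond dtls - 0 0 1.3.25.25 12346 1.3.25.25 12346 silver connect 0
vsmart dtls 1.1.1.5 4294946753 1 10.0.5.13 12346 52.64.3.185 12346 silver No up 0:02:29:23 0
"""

for row in connections_not_up(SAMPLE):
    print(row["peer_type"], row["color"], row["state"])
```

A controller that is missing from the output entirely is not reported by this sketch, so compare the parsed peer types against the controllers you expect to see.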
Check for routing issues as discussed in the sections below.
The route table (RIB) might not contain a valid route to the correct next hop.
To verify the entries in the route table in vManage NMS:
To verify the entries in the route table from the CLI, use the command show ip routes vpn 0 or show ip routes vpn 0 prefix/length.
The TLOC IP address might be leaked between upstream ISPs.
To verify connectivity, ping the default gateway. Check for correct distance values and protocols for the IP prefix.
If control connections are down, issue the show control connections-history [vEdge] or show sdwan control connection-history [cEdge] command from the CLI to display information about control plane connection attempts initiated by the router. The Local Error and Remote Error columns in the output report any errors that occurred during the connection attempts. The following errors relate to configuration and to establishing control tunnels:
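When reviewing many history rows, a small lookup can translate the error codes. The sketch below is an illustration only: the descriptions are copied from the "Legend for Errors" that the command prints (quoted in a reader comment further down), limited to the codes this article discusses, and the function names are hypothetical. It handles both the separate LOCAL ERROR / REMOTE ERROR columns and the combined LOCAL/REMOTE form.

```python
# Descriptions copied from the command's own "Legend for Errors".
ERROR_LEGEND = {
    "BIDNTVRFD": "Peer Board ID Cert not verified",
    "CRTREJSER": "Challenge response rejected by peer",
    "CRTVERFL": "Fail to verify Peer Certificate",
    "CTORGNMMIS": "Certificate Org name mismatch",
    "DCONFAIL": "DTLS connection failure",
    "DISCVBD": "Disconnect vBond after register reply",
    "DISTLOC": "TLOC Disabled",
    "LISFD": "Listener Socket FD Error",
    "NOVMCFG": "No cfg in vmanage for device",
    "RDSIGFBD": "Read Signature from Board ID failed",
    "RXTRDWN": "Received Teardown",
    "SERNTPRES": "Serial Number not present",
    "SYSIPCHNG": "System-IP changed",
    "TXCHTOBD": "Failed to send challenge to BoardID",
    "VB_TMO": "Peer vBond Timed out",
    "VECRTREV": "vEdge Certification revoked",
    "VM_TMO": "Peer vManage Timed out",
    "VP_TMO": "Peer vEdge Timed out",
    "VS_TMO": "Peer vSmart Timed out",
    "VSCRTREV": "vSmart Certificate revoked",
}


def extract_errors(row):
    """Return (local_error, remote_error) from one history row, or None."""
    codes = []
    for token in row.split():
        parts = token.split("/")
        # Accept "CODE", or "LOCAL/REMOTE" where every part is a known code.
        if all(part in ERROR_LEGEND or part == "NOERR" for part in parts):
            codes.extend(parts)
    if len(codes) < 2:
        return None
    return codes[0], codes[1]


def explain(row):
    """Return human-readable descriptions for the errors in one row."""
    errors = extract_errors(row)
    if errors is None:
        return []
    return [
        f"{side} error {code}: {ERROR_LEGEND[code]}"
        for side, code in zip(("local", "remote"), errors)
        if code != "NOERR"
    ]


ROW = ("vmanage tls 172.1.0.18 200 0 1.4.28.28 12346 1.3.25.25 12346 "
       "lte tear_down RXTRDWN/NOVMCFG 78 2016-10-06T08:10:06+0000")
for message in explain(ROW):
    print(message)
```

Rows whose codes are not in `ERROR_LEGEND` return `None` from `extract_errors`, so extend the dictionary from the full legend before relying on it.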
Problem Statement
A device's serial number is missing from the vSmart controllers.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the values BIDNTVRFD, CRTREJSER, and SERNTPRES indicate a missing serial number. BIDNTVRFD indicates a missing serial number for vBond orchestrators. CRTREJSER indicates a missing serial number for vEdge routers and vSmart controllers. SERNTPRES on a vBond orchestrator indicates a serial number mismatch between vSmart controllers.
                                                   PEER               PEER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                                         REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR STATE     LOCAL/REMOTE      COUNT  DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------
vbond   dtls     -          0    0      1.4.30.30  12346   1.4.30.30  12346  mpls        tear_down CRTREJSER  NOERR  161    2016-10-14T11:19:39-0700
vbond   dtls     -          0    0      1.4.30.30  12346   1.4.30.30  12346  silver      tear_down CRTREJSER  NOERR  161    2016-10-14T11:19:38-0700
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  mpls        tear_down CRTREJSER  NOERR  160    2016-10-14T11:19:22-0700
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  silver      tear_down CRTREJSER  NOERR  160    2016-10-14T11:19:22-0700
Resolve the Problem
Send the device's serial number to the controllers:
Problem Statement
The organization name is not identical among all devices in the overlay network.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the value CTORGNMMIS indicates an organization name mismatch in the overlay network.
                                                   PEER               PEER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                       LOCAL      REMOTE REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR STATE     ERROR      ERROR  COUNT  DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  mpls        tear_down CTORGNMMIS NOERR  19     2016-10-06T00:39:37+0000
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  gold        tear_down CTORGNMMIS NOERR  28     2016-10-06T10:39:20-0000
Resolve the Problem
To configure the correct organization name on every device in the overlay network, use the organization-name command.
Problem Statement
Verification of the vEdge router's certificate failed.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the value CRTVERFL indicates certificate verification failure.
                                                   PEER               PEER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                       LOCAL      REMOTE REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR STATE     ERROR      ERROR  COUNT  DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  mpls        tear_down CRTVERFL   NOERR  142    2016-10-03T00:39:37+0000
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  gold        tear_down CRTVERFL   NOERR  213    2016-10-03T10:39:20-0000
Resolve the Problem
Problem Statement
The vEdge router does not establish DTLS connections to controllers in the overlay network.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the value DCONFAIL indicates DTLS connection failure.
                                                   PEER               PEER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                       LOCAL      REMOTE REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR STATE     ERROR      ERROR  COUNT  DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  mpls        connect   DCONFAIL   NOERR  1      2016-09-22T10:49:04-0700
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  gold        connect   DCONFAIL   NOERR  1      2016-09-22T10:49:03-0700
Resolve the Problem
Problem Statement
The vEdge router experiences transient control connection errors.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the value DISCVBD indicates that the vEdge router's connection to the vBond orchestrator has been taken down. This is normal behavior. The value SYSIPCHNG indicates a change in the vEdge router's system IP address.
                                                   PEER               PEER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                                         REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR STATE     LOCAL/REMOTE      COUNT  DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------
vbond   dtls     -          0    0      1.3.25.25  12346   1.3.25.25  12346  lte         tear_down DISCVBD/NOERR     0      2016-09-27T19:06:16+0000
vmanage tls      172.1.0.18 200  0      1.4.28.28  12346   1.3.25.25  12346  lte         tear_down SYSIPCHNG/NOERR   0      2016-09-27T19:05:30+0000
vsmart  tls      172.1.0.19 103  1      1.3.26.26  12346   1.3.25.25  12346  lte         tear_down SYSIPCHNG/NOERR   0      2016-09-27T19:05:30+0000
vsmart  tls      172.1.0.16 104  1      1.3.29.29  12346   1.3.25.25  12346  lte         tear_down SYSIPCHNG/NOERR   0      2016-09-27T19:05:30+0000
vbond   dtls     -          0    0      1.4.30.30  12346   1.3.25.25  12346  lte         tear_down DISCVBD/NOERR     0      2016-09-27T17:56:30+0000
Resolve the Problem
These issues are part of normal operation of the overlay network. They have no impact on production traffic, and they resolve by themselves, with no action required.
Problem Statement
A TLOC, or transport location, is disabled on the vEdge router. A TLOC identifies the physical interface where a vEdge router connects to the WAN transport network or to a NAT gateway.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the value DISTLOC indicates that a TLOC is disabled.
                                                   PEER               PEER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                       LOCAL      REMOTE REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR STATE     ERROR      ERROR  COUNT  DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------
vmanage dtls     172.1.0.18 1001 0      1.4.28.28  12346   1.4.28.28  12346  gold        tear_down DISTLOC    NOERR  0      2016-09-25T18:00:41-0700
vsmart  dtls     172.1.0.19 1013 1      1.3.29.29  12346   1.3.29.29  12346  gold        tear_down DISTLOC    NOERR  0      2016-09-25T18:00:41-0700
vsmart  dtls     172.1.0.16 1011 1      1.3.26.26  12346   1.3.26.26  12346  gold        tear_down DISTLOC    NOERR  0      2016-09-25T18:00:41-0700
Resolve the Problem
A TLOC might be disabled if:
Use the show running-config command to check the configuration. Reconfigure any attributes necessary.
Problem Statement
Socket error messages occur.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the value LISFD indicates a socket error.
                                                   PEER               PEER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                       LOCAL      REMOTE REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR STATE     ERROR      ERROR  COUNT  DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------
vsmart  dtls     1.1.1.8    7    0      1.4.28.28  12447   1.4.28.28  12447  gold        up        LISFD      NOERR  45     2016-07-26T23:53:32-0700
vmanage dtls     1.1.1.7    7    1      1.3.29.29  12647   1.3.29.29  12647  gold        up        LISFD      NOERR  70     2016-07-26T23:53:32-0700
vmanage dtls     1.1.1.7    7    1      1.3.26.26  12747   1.3.26.26  12867  gold        up        LISFD      NOERR  69     2016-07-26T23:53:32-0700
Resolve the Problem
Socket errors might occur if:
Problem Statement
The vEdge router's configuration template was not attached to the router during the bringup of the router.
Identify the Problem
Issue the show control connections-history command. In the Remote Error column of the output, the value NOVMCFG indicates that the vEdge router's template was not attached during bringup.
PEER PEER
PEER PEER PEER SITE DOMAIN PEER PRIVATE PEER PUBLIC REPEAT
TYPE PROTOCOL SYSTEM IP ID ID PRIVATE IP PORT PUBLIC IP PORT LOCAL COLOR STATE LOCAL/REMOTE COUNT DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
vmanage tls 172.1.0.18 200 0 1.4.28.28 12346 1.3.25.25 12346 lte tear_down RXTRDWN/NOVMCFG 78 2016-10-06T08:10:06+0000
Resolve the Problem
During bringup from ZTP, if no template is attached to the device in vManage, you see the message “No Config. in vManage for device”. Make sure that a template is assigned in vManage to the device in question.
Problem Statement
In an unstable network, when connections are frequently going down and coming up, the Trusted Board ID chip on a hardware vEdge router might not initialize.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the values RDSIGFBD and TXCHTOBD indicate a problem with initializing the Trusted Board ID chip. RDSIGFBD indicates that the network failed to read the signature from board ID. TXCHTOBD indicates that the network failed to send a challenge to board ID.
                                                   PEER               PEER
PEER    PEER     PEER       SITE DOMAIN PEER       PRIVATE PEER       PUBLIC                          LOCAL      REMOTE REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP PORT    PUBLIC IP  PORT   LOCAL COLOR    STATE     ERROR      ERROR  COUNT  DOWNTIME
------------------------------------------------------------------------------------------------------------------------------------------------------
vbond   dtls                0    0      172.16.1.1 12346   172.16.1.1 12346  publicinternet challenge TXCHTOBD   NOERR  0      20160701T15:47:40+000
vbond   dtls                0    0      172.16.1.1 12346   172.16.1.1 12346  publicinternet challenge TXCHTOBD   NOERR  0      20160701T15:47:40+0000
Resolve the Problem
Sometimes, because of locking issues, sending the challenge to the board ID fails. When that happens, the software resets the board ID and tries again. This should not happen often, but when it does, it delays the bringup of control connections. This should be fixed in newer versions.
Problem Statement
A peer timeout occurs if the vEdge router loses reachability to a controller in the overlay network.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the values VB_TMO, VM_TMO, VP_TMO, and VS_TMO indicate a peer timeout error for vBond orchestrator, vManage NMS, peer vEdge routers, and vSmart controllers, respectively.
PEER PEER
PEER PEER PEER SITE DOMAIN PEER PRIVATE PEER PUBLIC REPEAT
TYPE PROTOCOL SYSTEM IP ID ID PRIVATE IP PORT PUBLIC IP PORT LOCAL COLOR STATE LOCAL/REMOTE COUNT DOWNTIME
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
vmanage tls 172.1.0.18 200 0 1.4.28.28 12346 1.3.25.25 12346 default tear_down VM_TMO/NOERR 0 2016-10-01T10:54:20-0700
Issue the show control connections-history detail command to check the hello counters. A discrepancy between the transmit (Tx) and receive (Rx) hello packet counters indicates packet loss between the vEdge router and the controller.
Tx Statistics-
--------------
  hello                 1467659
  connects              0
  registers             0
  register-replies      0
  challenge             0
  challenge-response    1
  challenge-ack         0
  teardown              1
  teardown-all          0
  vmanage-to-peer       0
  register-to-vmanage   0
Rx Statistics-
--------------
  hello                 1467279
  connects              0
  registers             0
  register-replies      0
  challenge             1
  challenge-response    0
  challenge-ack         1
  teardown              0
  vmanage-to-peer       0
  register-to-vmanage   0
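The comparison of the two hello counters can be automated with a few lines. This is a minimal sketch that assumes the detail output has been saved as text and contains the "Tx Statistics" and "Rx Statistics" sections in that form; the function name `hello_counters` is hypothetical.

```python
import re


def hello_counters(detail_output):
    """Return (tx_hello, rx_hello) from 'connections-history detail' text."""
    # The first "hello" counter after each section heading is the one we want.
    tx = re.search(r"Tx Statistics.*?\bhello\s+(\d+)", detail_output, re.S)
    rx = re.search(r"Rx Statistics.*?\bhello\s+(\d+)", detail_output, re.S)
    if tx is None or rx is None:
        raise ValueError("hello counters not found in the supplied text")
    return int(tx.group(1)), int(rx.group(1))


# Counter values taken from the capture above.
SAMPLE = """\
Tx Statistics-
--------------
hello 1467659
connects 0
Rx Statistics-
--------------
hello 1467279
connects 0
"""

tx, rx = hello_counters(SAMPLE)
print(f"tx={tx} rx={rx} gap={tx - rx}")  # prints tx=1467659 rx=1467279 gap=380
```

For the capture above, the router sent 380 more hello packets than it received. Treat the gap as a hint rather than a measurement, and compare it across several samples to see whether it keeps growing.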
Resolve the Problem
The problem can also be caused by the underlay, where devices might be rate-limiting the TLS/DTLS packets. It has been observed that if these packets are rate-limited to below 1 Mbps, control connections might not form and you see VM_TMO errors. Check the underlay for any bandwidth or throughput issues.
Problem Statement
The certificate for the vEdge router or vSmart controller has been revoked.
Identify the Problem
Issue the show control connections-history command. In the Local Error column of the output, the value VECRTREV indicates a revoked certificate on a vEdge router. The value VSCRTREV indicates a revoked certificate on a remote vSmart controller. If a certificate is revoked on a local vSmart controller, the value VSCRTREV displays in the Remote Error column.
PEER PEER
PEER PEER PEER SITE DOMAIN PEER PRIVATE PEER PUBLIC LOCAL REMOTE REPEAT
INSTANCE TYPE PROTOCOL SYSTEM IP ID ID PRIVATE IP PORT PUBLIC IP PORT REMOTE COLOR STATE ERROR ERROR COUNT DOWNTIME
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
0 vsmart dtls 172.1.0.16 1011 1 1.3.26.26 12346 1.3.26.26 12346 default tear_down VSCRTREV NOERR 0 2016-10-14T11:59:13-0700
Resolve the Problem
A certificate verification failure occurs when the certificate cannot be verified against the installed root certificate:
1) Check the time. It should at least be within the vBond orchestrator's certificate validity range.
show clock
2) The failure can also be caused by root certificate corruption on the vEdge router.
Open a Cisco / Viptela Support case to resolve the issue.
Thank you for this excellent post.
I have another problem not mentioned in the post.
I built a test lab on ESXi and I am using my own CA (tinyCA) to sign CSRs and issue certificates. I was able to onboard vManage, vBond and vSmart with signed certificates from tinyCA. Next I performed the initial config on the vEdge and installed the root cert from my CA. It was successfully authorized by vBond, however in vManage it shows with "Certificate installation failed" status. I am aware that there is an option "WAN Edge Cloud Certificate Authorization", which is currently set to Automated.
My question: Is the Automated option supported when using a private CA? Can you describe what goes on in vManage when the Automated option is enabled, or please point me to an article.
Note that I was able to onboard vEdge when switching "WAN Edge Cloud Certificate Authorization" to Manual and signing the CSR on my CA but I want it to happen automatically.
Rudi
Hi Rudi -
Thanks for the comment on the post.
On your question:
Your understanding is correct. The automated option is not supported for a private (enterprise) CA. For a private (enterprise) CA, manual intervention is involved: you need the CSR signed by your CA server, and then you install the signed PEM on vManage.
- SV
I have another connection issue but this time I am trying to join vEdgeCloud 19.1.0 to controllers running 18.4.1. According to compatibility matrix that should not be an issue.
I've done the following that worked on 18.4.1 vedges:
I went onto vManage and sent the serial number to the controllers manually multiple times, and the process succeeded, but the vEdge was still stuck. I've also made sure the time on all devices is synced. Moreover, I deleted all certs (root and vEdge's) and reinstalled them, with no luck getting any further.
I went on and did the debug on vedge:
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: create_ssl_conn_to_peer[5721]: %VDAEMON_DBG_MISC-3: SSL : Connecting to peer from ge0_0 to 10.0.0.3:12346
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: create_ssl_conn_to_peer[5727]: %VDAEMON_DBG_MISC-3: SSL_connect ERR_WANT (server 10.0.0.3:12346) ... retrying
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vdaemon_verify_callback[409]: %VDAEMON_DBG_MISC-1: Verify failed: self signed certificate! No need to panic!!
local7.debug: Jul 30 09:28:24 vEdge30 last message repeated 2 times
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: ssl_connect_timer_cb[408]: %VDAEMON_DBG_MISC-3: SSL_connect succeeded from ge0_0 to 10.0.0.3:12346
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vbond_parse_msg[2146]: %VDAEMON_DBG_MISC-3: Received a Challenge request from the Server
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vbond_parse_challenge[6480]: %VDAEMON_DBG_MISC-3: Parsing CHALLENGE ..
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vbond_proc_challenge[4298]: %VDAEMON_DBG_MISC-3: Received CHALLENGE on ge0_0..
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vdaemon_dtls_verify_vbond_cert[798]: %VDAEMON_DBG_MISC-3: Certificate validation Successful
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vdaemon_dtls_verify_vbond_cert[812]: %VDAEMON_DBG_MISC-3: O-name vIPtela Inc, OU name SRC Internal (in cfg SRC Internal)
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vdaemon_mark_n_sweep_vmanage_serial_numbers[4227]: %VDAEMON_DBG_MISC-3: vManage serial file DB does not exist yet. Add all.
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vdaemon_send_challenge_ack[4624]: %VDAEMON_DBG_MISC-3: Send Challenge ACK ... (board_id_present No)
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vbond_parse_msg[2166]: %VDAEMON_DBG_MISC-3: Received a TEAR DOWN from the peer
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vbond_parse_tear_down[6912]: %VDAEMON_DBG_MISC-3: Parsing TEAR DOWN..
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vbond_proc_msg[5801]: %VDAEMON_DBG_MISC-3: Received a TEAR DOWN (dev type 4) (teardown Just this peer) on ge0_0
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vdaemon_find_next_active_wan_intf[1465]: %VDAEMON_DBG_MISC-1: Next wan interface to connect to vmanage = none
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vdaemon_cleanup_peer_frag[1777]: %VDAEMON_DBG_MISC-3: Cleanup Fragments ..
local7.debug: Jul 30 09:28:24 vEdge30 VDAEMON[880]: vdaemon_ftm_send_ctrl_tun[242]: %VDAEMON_DBG_MISC-3: Local-TLOC 10.0.0.30:12346 Remote-TLOC 10.0.0.3:12346 msg-type Delete
I found some clues (red marked) but I am not sure what exactly those mean. vBond is supposed to authenticate vEdge via valid-vedge list and tell the vEdge the ip of valid vManage right? So I went onto the vBond and did this:
As you can see there are valid vSmarts and vEdges (including the one stuck - serial number 12) but the vManage list is empty. Is that expected? Note that I have other vEdges joined but the fact that the list is empty is a bit strange.
Why is the vEdge not sending its board_id (is this the chassis ID?)? I assume that because of this, vBond doesn't know who is trying to authenticate and tears down the connection.
Any hint on what I should do next?
Hi -
- Starting with 17.1, vManage signs the vEdge Cloud certificates.
- In the vManage GUI, add the serial numbers generated from PnP.
- Send them to the controllers, then generate the bootstrap configuration for the vEdge Cloud in vManage.
- On the vEdge, make sure the controllers are reachable.
- Then execute this command on the vEdge Cloud:
request vedge-cloud activate chassis-number xxxxx token xxxxx
- The vEdge Cloud should then be onboarded onto the overlay.
Note: vManage signs the certificate for vEdge Cloud devices. Based on your note above, step 5 is not needed, assuming you are generating the CSR on your vEdge Cloud and signing it with your enterprise CA.
I'm in the same situation as mocnikr above:
"At this point the vedge went back to vBond and is stuck there with REMOTE ERROR: BIDNTVRFD"
My vEDGE is stuck at *CRTVERFL*
NFVIS-FCH2035V0ZN-vEDGE# show control connections-history
Legend for Errors
ACSRREJ    - Challenge rejected by peer.              NOVMCFG   - No cfg in vmanage for device.
BDSGVERFL  - Board ID Signature Verify Failure.       NOZTPEN   - No/Bad chassis-number entry in ZTP.
BIDNTPR    - Board ID not Initialized.                OPERDOWN  - Interface went oper down.
BIDNTVRFD  - Peer Board ID Cert not verified.         ORPTMO    - Server's peer timed out.
BIDSIG     - Board ID signing failure.                RMGSPR    - Remove Global saved peer.
CERTEXPRD  - Certificate Expired                      RXTRDWN   - Received Teardown.
CRTREJSER  - Challenge response rejected by peer.     RDSIGFBD  - Read Signature from Board ID failed.
CRTVERFL   - Fail to verify Peer Certificate.         SERNTPRES - Serial Number not present.
CTORGNMMIS - Certificate Org name mismatch.           SSLNFAIL  - Failure to create new SSL context.
DCONFAIL   - DTLS connection failure.                 STNMODETD - Teardown extra vBond in STUN server mode.
DEVALC     - Device memory Alloc failures.            SYSIPCHNG - System-IP changed.
DHSTMO     - DTLS HandShake Timeout.                  SYSPRCH   - System property changed
DISCVBD    - Disconnect vBond after register reply.   TMRALC    - Timer Object Memory Failure.
DISTLOC    - TLOC Disabled.                           TUNALC    - Tunnel Object Memory Failure.
DUPCLHELO  - Recd a Dup Client Hello, Reset Gl Peer.  TXCHTOBD  - Failed to send challenge to BoardID.
DUPSER     - Duplicate Serial Number.                 UNMSGBDRG - Unknown Message type or Bad Register msg.
DUPSYSIPDEL- Duplicate System IP.                     UNAUTHEL  - Recd Hello from Unauthenticated peer.
HAFAIL     - SSL Handshake failure.                   VBDEST    - vDaemon process terminated.
IP_TOS     - Socket Options failure.                  VECRTREV  - vEdge Certification revoked.
LISFD      - Listener Socket FD Error.                VSCRTREV  - vSmart Certificate revoked.
MGRTBLCKD  - Migration blocked. Wait for local TMO.   VB_TMO    - Peer vBond Timed out.
MEMALCFL   - Memory Allocation Failure.               VM_TMO    - Peer vManage Timed out.
NOACTVB    - No Active vBond found to connect.        VP_TMO    - Peer vEdge Timed out.
NOERR      - No Error.                                VS_TMO    - Peer vSmart Timed out.
NOSLPRCRT  - Unable to get peer's certificate.        XTVMTRDN  - Teardown extra vManage.
NEWVBNOVMNG- New vBond with no vMng connections.      XTVSTRDN  - Teardown extra vSmart.
NTPRVMINT  - Not preferred interface to vManage.      STENTRY   - Delete same tloc stale entry.
EMBARGOFAIL - Embargo check failed

                                                       PEER                   PEER
PEER    PEER     PEER       SITE DOMAIN PEER           PRIVATE PEER           PUBLIC                       LOCAL      REMOTE REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP     PORT    PUBLIC IP      PORT   LOCAL COLOR STATE     ERROR      ERROR  COUNT  DOWNTIME
------------------------------------------------------------------------------------------------------------------------------------------------------------
vbond   dtls     0.0.0.0    0    0      10.201.146.159 12346   10.201.146.159 12346  default     tear_down CRTVERFL   NOERR  39     2019-09-24T16:39:51+0000
vbond   dtls     0.0.0.0    0    0      10.201.146.159 12346   10.201.146.159 12346  default     tear_down SYSIPCHNG  NOERR  0      2019-09-24T16:27:34+0000
NFVIS-FCH2035V0ZN-vEDGE#
What else can I do as remediation or other troubleshooting commands?
I had a similar issue to mocnikr and garrettc134.
vEdge Cloud did not connect to fabric with REMOTE ERROR: BIDNTVRFD
In my case it was bug CSCvp75927, and the provided workaround helped me.
Thank you for the clarification on using a private/enterprise CA. For onboarding vEdge/cEdge nodes, are the following steps correct? For my lab, I use vManage as the enterprise CA.
--> Install enterprise root cert(pem) on the vEdge/cEdge
--> generate CSR from vEdge/cEdge cli and have it signed by enterprise CA
--> import the signed vEdge/cEdge.crt into vManage in the GUI
Thank you for the reply in advance.
I have a C1111-4PLTEEA router running 16.10.4 code. The controllers are using Cisco signed certificates and for the SDWAN routers I have Onbox certificate option checked.
I prepared templates in vManage and assigned them to the router. Then I used the initial config and usb to provision the initial config.
The router boots up and successfully grabs the config from the usb. I can ping the internet and also the public IP of my onPrem vBond.
When I check control connections on the router I see this familiar error:
PEER    PEER     PEER       SITE DOMAIN PEER          PRIVATE PEER          PUBLIC                       LOCAL      REMOTE REPEAT
TYPE    PROTOCOL SYSTEM IP  ID   ID     PRIVATE IP    PORT    PUBLIC IP     PORT   LOCAL COLOR STATE     ERROR      ERROR  COUNT  DOWNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------------
vbond   dtls     -          0    0      193.XX.XX.101 12346   193.XX.XX.101 12346  lte         tear_down CRTVERFL   NOERR  88     2020-05-07T09:47:24+0200
vbond   dtls     -          0    0      193.XX.XX.101 12346   193.XX.XX.101 12346  lte         connect   DCONFAIL   NOERR  24     2020-05-07T09:38:38+0200
vbond   dtls     -          0    0      193.XX.XX.101 12346   193.XX.XX.101 12346  lte         connect   DCONFAIL   NOERR  0      2020-05-07T09:13:47+0200
I've checked the org-name and other parameters and they look fine. Furthermore, I've checked the SUDI number and the serial number that the vBond has, and they match the router. How can I troubleshoot this error more deeply? I did not find any debug commands on the box.
Note: I have all controllers behind a firewall with private IPs and 1:1 static mappings with public IP for each controller.
Any ideas?
Rudi
Hello all,
My problem is here
HN_THD_WAN_2#sh sdwan control connection-history | in silver
vmanage dtls 1.1.1.6 4294946755 0 10.0.2.229 12946 3.1.66.91 12946 silver tear_down DISTLOC NOERR 0 2020-05-20T16:00:03+0700
vsmart dtls 1.1.1.4 4294946754 1 10.0.2.116 12346 3.0.25.255 12346 silver tear_down DISTLOC NOERR 0 2020-05-20T16:00:03+0700
vsmart dtls 1.1.1.5 4294946753 1 10.0.5.13 12346 52.64.3.185 12346 silver tear_down DISTLOC NOERR 0 2020-05-20T16:00:03+0700
vbond dtls 0.0.0.0 0 0 52.64.213.149 12346 52.64.213.149 12346 silver tear_down DISCVBD NOERR 1 2020-05-19T15:11:49+0700
vbond dtls 0.0.0.0 0 0 13.251.153.180 12346 13.251.153.180 12346 silver tear_down DISCVBD NOERR 1 2020-05-19T15:11:44+0700
vsmart dtls 1.1.1.4 4294946754 1 10.0.2.116 12346 3.0.25.255 12346 silver up RXTRDWN VP_TMO 0 2020-05-19T15:10:49+0700
HN_THD_WAN_2#sh sdwan control connections
PEER PEER CONTROLLER
PEER PEER PEER SITE DOMAIN PEER PRIV PEER PUB GROUP
TYPE PROT SYSTEM IP ID ID PRIVATE IP PORT PUBLIC IP PORT LOCAL COLOR PROXY STATE UPTIME ID
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
vsmart dtls 1.1.1.5 4294946753 1 10.0.5.13 12346 52.64.3.185 12346 silver No up 0:02:29:23 0
vsmart dtls 1.1.1.4 4294946754 1 10.0.2.116 12346 3.0.25.255 12346 silver No up 0:02:29:27 0
vbond dtls 0.0.0.0 0 0 13.251.153.180 12346 13.251.153.180 12346 silver - up 0:02:29:35 0
Please let me know how to resolve it!