Unable to repeat ZTP

Martin Kyrc · ‎05-27-2019

Hello,

I'm working in a dCloud lab. ZTP works the first time (new lab), but after I decommission the vEdge device, a second ZTP attempt fails. I'm not sure how to reset the device to its default settings (I can't reset the whole device, because I lost IP connectivity to it).

I have this message in the log:

Event Name : device-template-attached-during-ztp
Event Details : host-name=vManage; uuid=ddd801b2-8cbe-4394-abd1-3b71e39886e3; peer-type=vedge

Event Name : vbond-reject-vedge-connection
Event Details : host-name=vBond-2; uuid=ddd801b2-8cbe-4394-abd1-3b71e39886e3; organization-name=Cisco Sy1 - 19968; sp-organization-name=Cisco Sy1 - 19968; reason=ERR_BID_NOT_VERIFIED

I can't find a description of the error message "ERR_BID_NOT_VERIFIED". I think it's related to device validation against the vBond, but I'm not sure what else I can verify.

My troubleshooting steps:

vedge# show control connections-history
<cut>
CRTREJSER  - Challenge response rejected by peer.   RXTRDWN   - Received Teardown.
CRTVERFL   - Fail to verify Peer Certificate.       RDSIGFBD  - Read Signature from Board ID failed.
CTORGNMMIS - Certificate Org name mismatch.         SSLNFAIL  - Failure to create new SSL context.
DCONFAIL   - DTLS connection failure.               SERNTPRES - Serial Number not present.
NOERR      - No Error.                              VS_TMO    - Peer vSmart Timed out.
<cut>
                                                       PEER                  PEER
PEER   PEER      PEER       SITE  DOMAIN  PEER         PRIVATE  PEER         PUBLIC                               LOCAL       REMOTE     REPEAT
TYPE   PROTOCOL  SYSTEM IP  ID    ID      PRIVATE IP   PORT     PUBLIC IP    PORT    LOCAL COLOR  STATE           ERROR       ERROR      COUNT   DOWNTIME
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
vbond  dtls      0.0.0.0    0     0       198.18.1.11  12346    198.18.1.11  12346   default      tear_down       CTORGNMMIS  NOERR      0       2019-05-27T11:32:27+0000
vbond  dtls      0.0.0.0    0     0       198.18.1.13  12346    198.18.1.13  12346   default      challenge_resp  RXTRDWN     SERNTPRES  12      2019-05-27T11:32:15+0000
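
The error codes in the two rows above can be cross-referenced against the legend mechanically. The following is only a minimal Python sketch, not a Cisco tool: `LEGEND` contains just the codes visible in the (cut) legend, and `parse_history_row` and `explain` are hypothetical helpers that assume whitespace-separated columns in the order given by the header.

```python
# Hypothetical helpers, not part of any Cisco tooling.
# LEGEND holds only the codes visible in the (cut) legend of the output.
LEGEND = {
    "CRTREJSER": "Challenge response rejected by peer.",
    "CRTVERFL": "Fail to verify Peer Certificate.",
    "CTORGNMMIS": "Certificate Org name mismatch.",
    "DCONFAIL": "DTLS connection failure.",
    "NOERR": "No Error.",
    "RXTRDWN": "Received Teardown.",
    "RDSIGFBD": "Read Signature from Board ID failed.",
    "SSLNFAIL": "Failure to create new SSL context.",
    "SERNTPRES": "Serial Number not present.",
    "VS_TMO": "Peer vSmart Timed out.",
}

# Column order assumed from the header of "show control connections-history".
FIELDS = [
    "peer_type", "protocol", "system_ip", "site_id", "domain_id",
    "private_ip", "private_port", "public_ip", "public_port",
    "local_color", "state", "local_error", "remote_error",
    "repeat_count", "downtime",
]


def parse_history_row(row):
    """Split one data row into named fields (assumes no spaces inside a field)."""
    values = row.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


def explain(row):
    """Return a one-line summary with both error codes expanded."""
    r = parse_history_row(row)
    local = LEGEND.get(r["local_error"], "not in the legend excerpt")
    remote = LEGEND.get(r["remote_error"], "not in the legend excerpt")
    return (f"{r['peer_type']} {r['public_ip']}: "
            f"local={r['local_error']} ({local}) "
            f"remote={r['remote_error']} ({remote})")


# The two data rows from the output in this post.
ROWS = [
    "vbond dtls 0.0.0.0 0 0 198.18.1.11 12346 198.18.1.11 12346 default "
    "tear_down CTORGNMMIS NOERR 0 2019-05-27T11:32:27+0000",
    "vbond dtls 0.0.0.0 0 0 198.18.1.13 12346 198.18.1.13 12346 default "
    "challenge_resp RXTRDWN SERNTPRES 12 2019-05-27T11:32:15+0000",
]

for row in ROWS:
    print(explain(row))
```

Read this way, the connection to 198.18.1.11 fails locally with an organization name mismatch, while the peer at 198.18.1.13 tears the session down and reports that the serial number is not present.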

It looks like a certificate organization name mismatch, but for which certificate? A manual attempt with "vedge-cloud activate" also fails.

Does anybody have experience with ZTP in a lab environment?

martin

Danny De Ridder · ‎05-27-2019

Hello,

to get more details about what exactly is wrong with the certificate, you can enable debugs on the vEdge.

To catch this type of error, use "debug vdaemon misc high".

The data will then be captured in the log named /var/log/tmplog/vdebug.

You can tail the log like this :

show log /var/log/tmplog/vdebug tail -f

To stop the tail, press <CTRL>C.

This should print more information about the error event you are seeing.

Regards,

Danny.

Martin Kyrc · ‎05-29-2019

Hi Danny,

thank you for pointing me to the log file where I can read the "vdaemon" troubleshooting records:

vedge# debug vdaemon misc high

and then

vedge# vshell 
vedge:~$ less/tail/grep/... /var/log/tmplog/vdebug

In the log file I found these messages related to the unsuccessful ZTP attempt (with my comments):

!-- ge0/0 is my Internet interface (dhcp with connection to "ztp server", in this case vbond.cisco.com/<lab-ip-address>)
local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_find_next_active_wan_intf[1422]: %VDAEMON_DBG_MISC-1: Next wan interface to connect to vmanage = ge0_0
local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_set_confd_ztp_status[6293]: %VDAEMON_DBG_MISC-1: Setting ztp-status to 0
local7.debug: May 27 13:01:13 vedge stray: setsockopt: Bad file descriptor
local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_cfg_confd_params_threaded_set[5747]: %VDAEMON_DBG_CONFD-1: Setting ztp status to 0
local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_cfg_confd_params_threaded_set[5769]: %VDAEMON_DBG_CONFD-1: ztp status = 0
local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_cfg_confd_params_threaded_set[6159]: %VDAEMON_DBG_MISC-1: Applying configuration after 0 retries to 0..
local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_dtls_verify_vbond_cert[1014]: %VDAEMON_DBG_MISC-1: No SP Org name set .. bailing out
local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vbond_proc_challenge[3445]: %VDAEMON_DBG_MISC-1: Unable to verify v Server's certificate .. bailing out
local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_find_next_active_wan_intf[1422]: %VDAEMON_DBG_MISC-1: Next wan interface to connect to vmanage = none
local7.info: May 27 13:01:13 vedge VDAEMON[655]: %Viptela-vedge-vdaemon-6-INFO-1400002: Notification: 5/27/2019 13:1:13 control-connection-auth-fail severity-level:major host-name:"vedge" system-ip::: personality:vedge peer-type:vbond peer-system-ip::: local-system-ip:0.0.0.0 local-color:default reason:"ERR_CERT_ORG_NAME_MISMCH"
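
When the debug log is long, it can help to filter it for the certificate-related lines and the notification reason. The following is a minimal Python sketch under the assumption that every line follows the format seen in the excerpt above; `summarize` and both regular expressions are hypothetical and would need adjusting for other message types. It is meant to run on a copy of the log wherever Python is available.

```python
import re

# Matches debug lines of the form seen in the excerpt:
#   ... VDAEMON[655]: <function>[<line>]: %<TAG>: <message>
DEBUG_RE = re.compile(
    r"VDAEMON\[\d+\]: (?P<func>\w+)\[(?P<line>\d+)\]: %(?P<tag>[\w-]+): (?P<msg>.*)"
)
# Matches the reason field of a notification line, e.g. reason:"ERR_..."
REASON_RE = re.compile(r'reason:"(?P<reason>[A-Z_]+)"')


def summarize(lines):
    """Collect certificate-related debug messages and notification reasons."""
    cert_msgs, reasons = [], []
    for line in lines:
        m = DEBUG_RE.search(line)
        if m and ("cert" in m["func"] or "certificate" in m["msg"].lower()):
            cert_msgs.append((m["func"], m["msg"]))
        r = REASON_RE.search(line)
        if r:
            reasons.append(r["reason"])
    return cert_msgs, reasons


# Four lines copied from the log excerpt in this post.
SAMPLE = [
    "local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_set_confd_ztp_status[6293]: "
    "%VDAEMON_DBG_MISC-1: Setting ztp-status to 0",
    "local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vdaemon_dtls_verify_vbond_cert[1014]: "
    "%VDAEMON_DBG_MISC-1: No SP Org name set .. bailing out",
    "local7.debug: May 27 13:01:13 vedge VDAEMON[655]: vbond_proc_challenge[3445]: "
    "%VDAEMON_DBG_MISC-1: Unable to verify v Server's certificate .. bailing out",
    'local7.info: May 27 13:01:13 vedge VDAEMON[655]: %Viptela-vedge-vdaemon-6-INFO-1400002: '
    'Notification: 5/27/2019 13:1:13 control-connection-auth-fail severity-level:major '
    'host-name:"vedge" system-ip::: personality:vedge peer-type:vbond peer-system-ip::: '
    'local-system-ip:0.0.0.0 local-color:default reason:"ERR_CERT_ORG_NAME_MISMCH"',
]

cert_msgs, reasons = summarize(SAMPLE)
for func, msg in cert_msgs:
    print(f"{func}: {msg}")
print("reasons:", reasons)
```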

In other words (correct me if I'm wrong):

The vBond certificate is not verified. Right?
I'm not sure that the reason is "missing org name in cert", but the last row in the log seems to confirm it: "reason:ERR_CERT_ORG_NAME_MISMCH"

I tried to verify the server's certificate (in this lab: ztp.viptela.com or vbond.cisco.com), but without success:

vedge:~$ openssl s_client -showcerts -connect vbond.cisco.com:12346
connect: Connection timed out
connect:errno=110

It looks like a filtered port (firewall/ACL?):

vedge:~$ nmap vbond.cisco.com -Pn -p 12346
Nmap scan report for vbond.cisco.com (198.18.1.11)
Host is up.
Other addresses for vbond.cisco.com (not scanned): 198.18.1.21
PORT      STATE    SERVICE
12346/tcp filtered netbus

I checked the configuration and no specific port is defined, which means port 12346 (or 12347 for NAT) is used. Correct?

Are my troubleshooting steps correct? Did I forget to verify something? So far I'm no closer to solving this issue. I have no remote access to the ZTP and vBond servers in this scenario (because it is a dCloud lab), so I'm not able to verify the server's certificate.

Any ideas?

martin

Danny De Ridder · ‎05-29-2019

Hello,

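for the nmap test, I think you need to use UDP, not TCP: the control connection to the vBond uses DTLS (see the "dtls" entries in your connection history), which runs over UDP.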

EDGE:~# nmap 192.168.0.231 -sU -Pn -p 12346 --system-dns

Starting Nmap 6.47 ( http://nmap.org ) at 2019-05-29 14:57 CEST
Nmap scan report for 192.168.0.231
Host is up.
PORT      STATE         SERVICE
12346/udp open|filtered unknown

Nmap done: 1 IP address (1 host up) scanned in 2.80 seconds
EDGE:~#
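
The "open|filtered" state in the scan above reflects that UDP has no handshake, so silence from the target is ambiguous. The following Python sketch illustrates the same classification under stated assumptions: `udp_probe` is a hypothetical helper, the one-byte payload is arbitrary and is not a valid DTLS handshake (so a real vBond would most likely stay silent), and the "closed" result relies on the operating system surfacing the ICMP port-unreachable error on a connected UDP socket, as Linux does.

```python
import socket


def udp_probe(host, port, timeout=2.0):
    """Send one datagram and classify the result roughly the way a UDP scan does.

    'open'          - some reply came back
    'closed'        - the OS reported an ICMP port-unreachable error
    'open|filtered' - silence: the port may be open, or a firewall dropped the probe
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        # connect() on a UDP socket sends nothing; it lets the OS deliver
        # ICMP errors for this destination back to this socket.
        s.connect((host, port))
        try:
            s.send(b"\x00")  # arbitrary payload, NOT a valid DTLS handshake
            s.recv(1024)
            return "open"
        except socket.timeout:
            return "open|filtered"
        except ConnectionRefusedError:
            return "closed"


# Example call (address taken from the scan above):
#   udp_probe("192.168.0.231", 12346)
```

Only a "closed" result is conclusive here; "open|filtered" does not prove that the port is reachable.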

Do you have both org-name and sp-org-name set?

Both are settings under system.

The error seems to come from the sp-org-name missing on the vEdge.

Danny.

Minnesotakid · ‎08-28-2019

Martin,

I had this error today as well. The step that I believe resolved it was sending my WAN Edge List of certificates to the controllers. I could be wrong though.

Secondly, make sure the org name you specified is spelled exactly right, including upper and lower case.
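
If the match is indeed exact, a difference in letter case or a stray space is enough to cause a mismatch. The following Python sketch only illustrates that idea; `org_name_diff` is a hypothetical helper that compares two strings and does not query any device, and the sample value is the organization name from the vBond event earlier in the thread.

```python
def org_name_diff(configured, expected):
    """Return a short reason why two organization names differ, or None if identical."""
    if configured == expected:
        return None
    if configured.strip() == expected.strip():
        return "leading or trailing whitespace differs"
    if configured.lower() == expected.lower():
        return "letter case differs"
    # Collapse runs of whitespace and ignore case for the last check.
    if " ".join(configured.split()).lower() == " ".join(expected.split()).lower():
        return "whitespace and/or case differs"
    return "names differ"


EXPECTED = "Cisco Sy1 - 19968"  # value from the vBond event earlier in the thread

CANDIDATES = [
    "Cisco Sy1 - 19968",    # identical
    "cisco sy1 - 19968",    # case differs
    "Cisco Sy1 - 19968 ",   # trailing space
    "Cisco Sy1 -  19968",   # double space in the middle
]

for candidate in CANDIDATES:
    print(repr(candidate), "->", org_name_diff(candidate, EXPECTED))
```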

Hope this helps!

Phil