Nexus 1000v and VCSA 6.0 connection issues

sparky256DSL
Level 1

I have several environments that are running vCenter appliance 6.0 (3634794), ESXi hosts running 6.0 (3620759), and Nexus 1000v running 5.2(1)SV3(2.1) with the Essential license.

 

The problem I'm having is that, randomly throughout the day, the Nexus loses its connection to the vCenter Server Appliance. It reconnects a few seconds later, which triggers the DVS to reload in vCenter and causes the Nexus host online/offline alert definitions to be deleted and re-created in vCenter. The ESXi hosts then all raise the "host-10 online" alert, because as soon as the alert definition is created it is evaluated, and the VEMs are indeed still online.

 

This results in the Nexus event log showing this:

2018 Apr 23 01:28:24 dvs01-xxxx-xxx vms[2723]: %VMS-3-CONNECTION_ERROR: Unable to communicate with vCenter Server/ESX. Disconnecting..
2018 Apr 23 01:29:25 dvs01-xxxx-xxx vms[2723]: %VMS-5-CONN_DISCONNECT: Connection 'vcentername.local' disconnected from the vCenter Server.
2018 Apr 23 01:32:50 dvs01-xxxx-xxx vms[2723]: %VMS-5-CONN_CONNECT: Connection 'vcentername.local' connected to the vCenter Server.
2018 Apr 23 01:32:53 dvs01-xxxx-xxx msp[2720]: %MSP-5-DOMAIN_CFG_SYNC_DONE: Domain config successfully pushed to the management server.
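
For anyone who wants to measure how long each drop lasts from entries like these, here is a minimal Python sketch. It assumes the syslog format shown above, English month abbreviations, and that an outage starts at `%VMS-3-CONNECTION_ERROR` and ends at the next `%VMS-5-CONN_CONNECT`. The function names `parse_events` and `outage_windows` are made up for illustration and are not part of any Cisco tooling.

```python
import re
from datetime import datetime

# Matches lines in the format quoted above, e.g.
# "2018 Apr 23 01:28:24 dvs01-xxxx-xxx vms[2723]: %VMS-3-CONNECTION_ERROR: ..."
LINE_RE = re.compile(
    r"^(\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) \S+ \w+\[\d+\]: %(\w+-\d-\w+):"
)

def parse_events(lines):
    """Return (timestamp, mnemonic) tuples for every line that matches."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            # split()/join collapses any padding before single-digit days
            stamp = " ".join(m.group(1).split())
            events.append((datetime.strptime(stamp, "%Y %b %d %H:%M:%S"), m.group(2)))
    return events

def outage_windows(events):
    """Pair each CONNECTION_ERROR with the next CONN_CONNECT (assumed outage bounds)."""
    windows, start = [], None
    for ts, tag in events:
        if tag == "VMS-3-CONNECTION_ERROR" and start is None:
            start = ts
        elif tag == "VMS-5-CONN_CONNECT" and start is not None:
            windows.append((start, ts, (ts - start).total_seconds()))
            start = None
    return windows

SAMPLE = [
    "2018 Apr 23 01:28:24 dvs01-xxxx-xxx vms[2723]: %VMS-3-CONNECTION_ERROR: Unable to communicate with vCenter Server/ESX. Disconnecting..",
    "2018 Apr 23 01:29:25 dvs01-xxxx-xxx vms[2723]: %VMS-5-CONN_DISCONNECT: Connection 'vcentername.local' disconnected from the vCenter Server.",
    "2018 Apr 23 01:32:50 dvs01-xxxx-xxx vms[2723]: %VMS-5-CONN_CONNECT: Connection 'vcentername.local' connected to the vCenter Server.",
]

for start, end, seconds in outage_windows(parse_events(SAMPLE)):
    print(f"{start} -> {end}: {seconds:.0f} s")
```

Run against a longer export of the VSM log, the resulting windows can be lined up against the vpxd timestamps, keeping in mind that vpxd logs in UTC while the VSM log may be in local time.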

 

The vCenter vpxd log shows several of these around the time the error occurs:

2018-04-23T18:17:11.087Z error vpxd[7FCE2E3A7700] [Originator@6876 sub=SSL SoapAdapter.HTTPService] Failed to read request; stream: <SSL(<io_obj p:0x00007fce287d9320, h:-1, <TCP '0.0.0.0:0'>, <TCP '0.0.0.0:0'>>)>, error: N7Vmacore16TimeoutExceptionE(Operation timed out)

 

And this around the same time:

2018-04-23T18:17:21.680Z error vpxd[7FCE2ECB9700] [Originator@6876 sub=SSL SoapAdapter.HTTPService] accept failure N7Vmacore3Ssl12SSLExceptionE(SSL Exception: error:140000DB:SSL routines:SSL routines:short read) on stream (null)
2018-04-23T18:17:21.680Z error vpxd[7FCE2ECB9700] [Originator@6876 sub=SSL SoapAdapter.HTTPService] stream is NULL - no read scheduled

 

As part of the troubleshooting process, in a 2-host environment, I moved the vCenter to the same host that housed the active VSM to see if that would rule out the physical network (the vCenter and VSM are on the same VLAN/subnet). This didn't help. I also ran a continuous ping from a jump server to the vCenter, the mgmt vmk on the ESXi host, the primary VSM mgmt IP, an SRM server that is running on the same ESXi host as the VSM and vCenter, and a VM that is not on the same ESXi host. The only drops were seen from the jump server to the vCenter, which led me to suspect it was a vCenter issue, so I opened a case with VMware.
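
A timestamped reachability log makes a test like this easier to compare against the VSM and vpxd logs. The sketch below is not the ping test described above: it swaps ICMP for a plain TCP connect (no root privileges needed), and the names `tcp_probe` and `monitor`, the example hostnames, and the ports are all assumptions for illustration.

```python
import socket
import time
from datetime import datetime, timezone

def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False

def monitor(targets, interval=1.0, rounds=None):
    """Probe every (host, port) target each round and record failures with a UTC timestamp.

    Runs forever when `rounds` is None; otherwise stops after that many rounds.
    """
    failures = []
    done = 0
    while rounds is None or done < rounds:
        for host, port in targets:
            if not tcp_probe(host, port):
                stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
                failures.append((stamp, host, port))
                print(f"{stamp} FAIL {host}:{port}")
        time.sleep(interval)
        done += 1
    return failures

# Hypothetical usage: hostnames and ports below are placeholders, not values from this thread.
# monitor([("vcentername.local", 443), ("vsm-mgmt.example", 22)], interval=1.0)
```

Because failures are stamped in UTC, they can be matched directly against vpxd entries; a failed TCP connect only shows the port was unreachable from the probing machine, not where along the path the drop happened.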

 

VMware said that the errors above mean nothing and to ignore them, but it seems odd that we see them repeat several times at the same time as (or right before) the issue happens. Googling has returned little on what these errors mean, and my case with VMware is not going anywhere fast. I'm hoping to begin some troubleshooting from the Nexus side to determine whether that is where the issue is.

 

Any advice on where to begin troubleshooting from the N1Kv side?
