Created by: James Maudlin on 07-02-2012 06:11:26 PM We are stuck with a WebEx Social installation issue
During the installation process we are getting a "host not reachable" message for Analytics Store, Notifier and WebEx Social Web. Our infra team has confirmed that things look fine at the network level and we don't have any firewalls in place. We will need the Cisco team to work jointly with us to sort this issue out.
Had partner try the following:
The first place I'd look is DNS and then the network (i.e. ping the FQDN of the Director node from each of the unreachable nodes). Then I'd check the puppet SSL certificates.
Check the /mnt/auto/diagnostics/logs/common/messages log on the Director node for hints on what to do next.
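The DNS and ping checks suggested above could be scripted roughly as follows. This is only a sketch: the helper names `resolves` and `pingable` are made up for illustration, and it assumes a Linux node where `getent` and `ping -c` are available. Run it on each unreachable node and pass the Director FQDN as an argument.

```shell
#!/bin/sh
# resolves NAME: succeed if NAME resolves through the system resolver
# (hosts file and DNS, in the order configured on the node).
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# pingable NAME: succeed if NAME answers two ICMP echo requests.
pingable() {
    ping -c 2 "$1" >/dev/null 2>&1
}

# Check every FQDN given on the command line; with no arguments nothing runs.
for fqdn in "$@"; do
    if ! resolves "$fqdn"; then
        echo "$fqdn: does not resolve (check DNS and /etc/hosts)"
    elif ! pingable "$fqdn"; then
        echo "$fqdn: resolves but does not answer ping"
    else
        echo "$fqdn: OK"
    fi
done
```

A name that pings fine but still fails in puppet with "getaddrinfo: Name or service not known" would suggest that puppet is looking up a different name than the one tested here.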
Partner Response:
1. Currently 3 VMs are showing as Unreachable Host - Analytics Store, Notifier and WebEx Social Web. And yes, we are able to ping DNS and FQDN of Director node from these 3 VMs.
2. Regarding Puppet certificates: We checked on Analytics Store and got following error:
[root@infy-WebEx Social-analytics ~]# service puppet debug
err: Could not request certificate: getaddrinfo: Name or service not known
Exiting; failed to retrieve certificate and waitforcert is disabled
3. On Director, certificates for these 3 VMs are not listed as follows:
[root@infy-WebEx Social-director ~]# puppetca --list --all
+ infy-WebEx Social-director.ad.infosys.com (14:38:FB:37:C8:C3:91:58:9E:76:0F:E5:44:74:06:02)
+ infy-WebEx Social-istore.ad.infosys.com (42:3D:CC:7B:3E:B4:2A:83:24:FB:7F:3E:3F:25:4D:5E)
+ infy-WebEx Social-jsonstore.ad.infosys.com (3F:28:4D:7B2:331:E9:BB:FE:F0:8F:98:0B:E9:9D)
+ infy-WebEx Social-rdbms.ad.infosys.com (15:34:5B:0F:FE:A8:83D:EA:86:8C:24:7D:F5:13:94)
+ infy-WebEx Social-searchslave.ad.infosys.com (AB:05:B6:35:EC:78:87:AE:25:ED:F0:31:44:77:8E:B1)
+ infy-WebEx Social-searchstore.ad.infosys.com (4A:F7:18:14:75:36:23:7CF:A98:AD:67:9D:5C:9D)
+ localhost.localdomain (FA:CBD:66:ED:F1:3A:06:70:40:6A:61:08:1E:85:89)
4. When we run command to clean certificate of Analytics:
[root@infy-WebEx Social-director auto]# puppetca --clean infy-WebEx Social-analytics.ad.infosys.com
notice: Revoked certificate with serial # Inventory of signed certificates
# SERIAL NOT_BEFORE NOT_AFTER SUBJECT
0x0001 2012-01-14T17:43:44GMT 2017-01-12T17:43:44GMT /CN=Puppet CA: localhost.localdomain
0x0002 2012-01-14T17:43:44GMT 2017-01-12T17:43:44GMT /CN=localhost.localdomain
0x0003 2012-01-14T17:53:14GMT 2017-01-12T17:53:14GMT /CN=infy-WebEx Social-director.ad.infosys.com
0x0004 2012-01-16T14:35:57GMT 2017-01-14T14:35:57GMT /CN=infy-WebEx Social-rdbms.ad.infosys.com
0x0005 2012-01-16T23:09:25GMT 2017-01-14T23:09:25GMT /CN=infy-WebEx Social-jsonstore.ad.infosys.com
0x0006 2012-01-16T23:35:09GMT 2017-01-14T23:35:09GMT /CN=infy-WebEx Social-cache.ad.infosys.com
0x0007 2012-01-16T23:55:26GMT 2017-01-14T23:55:26GMT /CN=infy-WebEx Social-dmqueue.ad.infosys.com
0x0008 2012-01-17T00:03:18GMT 2017-01-15T00:03:18GMT /CN=infy-WebEx Social-notifier.ad.infosys.com
0x0009 2012-01-17T00:43:47GMT 2017-01-15T00:43:47GMT /CN=infy-WebEx Social-istore.ad.infosys.com
0x000a 2012-01-17T17:46:25GMT 2017-01-15T17:46:25GMT /CN=infy-WebEx Social-searchslave.ad.infosys.com
0x000b 2012-01-18T17:15:08GMT 2017-01-16T17:15:08GMT /CN=infy-WebEx Social-searchstore.ad.infosys.com
0x000c 2012-01-22T17:32:37GMT 2017-01-20T17:32:37GMT /CN=infy-WebEx Social-istore.ad.infosys.com
0x000d 2012-01-22T18:07:30GMT 2017-01-20T18:07:30GMT /CN=infy-WebEx Social-WebEx Socialweb.ad.infosys.com
0x000e 2012-01-23T22:42:24GMT 2017-01-21T22:42:24GMT /CN=infy-WebEx Social-WebEx Socialweb.ad.infosys.com
0x000f 2012-01-23T22:59:08GMT 2017-01-21T22:59:08GMT /CN=infy-WebEx Social-notifier.ad.infosys.com
err: Could not call revoke: Cannot convert into OpenSSL::BN
5. Logs mentioned by you are not listed, showing following error:
[root@infy-WebEx Social-director ~]# cd /mnt/auto/diagnostics/logs/common/
-bash: cd: /mnt/auto/diagnostics/logs/common/: No such file or directory
[root@infy-WebEx Social-director auto]# ls
[root@infy-WebEx Social-director auto]# pwd
/mnt/auto
Note: We followed steps mentioned in attached mail to resolve these certificate related error but didn't help. Please advise on it.
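To spot which nodes have no signed certificate on the Director, the `puppetca --list --all` listing can be compared against the expected node names. A minimal sketch, assuming the `+ <fqdn> (<fingerprint>)` line format shown in the output above; the function name `missing_certs` and the example.com host names are hypothetical.

```shell
#!/bin/sh
# missing_certs NAME...: read a `puppetca --list --all` listing on stdin and
# print every NAME that has no signed ("+") entry in it.
missing_certs() {
    listing=$(cat)
    for node in "$@"; do
        # -F treats the pattern as a fixed string, so dots in the FQDN are literal.
        if ! printf '%s\n' "$listing" | grep -q -F -- "+ ${node} ("; then
            printf '%s\n' "$node"
        fi
    done
}

# Example (on the Director node, with placeholder names):
#   puppetca --list --all | missing_certs notifier.example.com analytics.example.com
```

Any name it prints is a node whose agent has not obtained a signed certificate, which matches the three unreachable roles in this thread.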
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: Haydeh Dejbakhsh on 07-02-2012 06:53:11 PM Hi, the output of the command [root@infy-WebEx Social-director ~]# puppetca --list --all indicates that the Director cannot communicate with Notifier, Analytics Store, and WebEx Social Web.
If DNS has been checked and all FQDNs are correct, then to make sure there is no misspelled hostname in the Director GUI, I suggest a quick test:
1. Shut down the Notifier VM.
2. Remove the Notifier VM from disk.
3. On the Director Topology page, "Delete" the Notifier node/role.
4. On the Director Topology page, add the Notifier node (make sure the hostname matches what is defined in DNS).
5. Deploy the Notifier VM.
6. Power up the Notifier VM and run "Setup"; make sure all the information (hostname, IP, domain, etc.) is correct.
7. After the Notifier comes up, click "Refresh" next to Notifier on the Director Topology page and see what it reports: is it now running, or still unreachable?
Also, on the Director node issue: puppetca --list --all
Let us know if the Notifier is still missing from the list
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: James Maudlin on 07-02-2012 09:39:56 PM Yes, we are able to ping FQDN of these 3 VMs from Director and cross checked Topology page as well. FYKI, we have tried steps related to Notifier in past as well in same way as you mentioned but didn't help much. Nevertheless, we can try this again. Please confirm if we need to delete WebEx Social web VM prior to removing Notifier VM.
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: Haydeh Dejbakhsh on 07-02-2012 11:31:28 PM For now, please just power down the WebEx Social Web VM for this test and keep it down until all other nodes are deployed and working.
If this is a fresh install, the WebEx Social Web VM should be powered up only after all other nodes are deployed and operational.
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: James Maudlin on 08-02-2012 10:07:24 AM We have tried the steps suggested by James. Notifier is still showing unreachable host. But in director we are getting the certificate of notifier also now.
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: Kalin Sheytanov on 08-02-2012 10:20:50 AM Hello James,
I presume that all your DNS / naming / networking settings are correct.
Please run "service puppet debug" on the Notifier node and send us the output.
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: James Maudlin on 08-02-2012 06:07:52 PM 1. For Notifier, it's giving the following output for service puppet debug:
[root@infy-WebEx Social-notifier ~]# service puppet debug
notice: Ignoring --listen on onetime run
err: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to parse template roles/openfire/of_property_update.sql: Could not find value for 'openfirexmppadminpass' at /opt/cisco/software/puppet/manifests/classes/openfire.pp:96 on node infy-WebEx Social-notifier.ad.infosys.com
notice: Using cached catalog
err: Could not retrieve catalog; skipping run
2. Other 2 nodes: Analytics Store and WebEx Social Web are still showing unreachable host as we have not made any change to them.
3. As we observed in Director and other reachable VMs like RDBMS Store and Index Store, Domain under /etc/idmapd.conf is showing as Infosys.com though we have mentioned Default DNS Domain as ad.infosys.com under setup for each VM. Please advise if it is fine or do we need to change it.
4. We are getting loads of debug statements on console of Director node when accessing through vSphere client. You can have a look on these over webex.
5. As we checked, we couldn't find any error related to the NFS server. When we set the Mount point under Deployment > Settings, it doesn't show any message like success or failure. Is this expected behavior? It would be great if you could also have a look at our NFS server settings and verify them.
6. On Director node, still /mnt/auto is empty and there is no directory inside it.
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: Haydeh Dejbakhsh on 08-02-2012 07:57:11 PM Hi,
Based on item #6, NFS appears not to be working, and as stated in the install guide, the installation will fail if NFS is not working.
Issue the following commands on the NFS server and provide the output.
[root@WebEx Social-NFS ~]# exportfs -v
[root@WebEx Social-NFS ~]# service rpcidmapd status
[root@WebEx Social-NFS ~]# showmount -e
[root@WebEx Social-NFS ~]# service nfs status
[root@WebEx Social-NFS ~]# rpcinfo -p
Depending on the output of the above commands, we will give instructions on how to debug the issue.
Could you also provide the screen capture of the NFS config from the Director GUI setting?
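The five NFS diagnostics requested above can be captured in one pass so the output is easy to attach to a reply. A rough sketch; the `collect` helper and the output file name are assumptions made for illustration, not part of the product.

```shell
#!/bin/sh
# collect CMD...: run each command string in turn, print a header before its
# output, and keep going even when a command fails or is not installed.
collect() {
    for cmd in "$@"; do
        printf '=== %s ===\n' "$cmd"
        sh -c "$cmd" 2>&1 || printf '(exit status %d)\n' "$?"
    done
}

# Example (as root on the NFS server; nfs-diag.txt is a placeholder name):
#   collect 'exportfs -v' 'service rpcidmapd status' 'showmount -e' \
#           'service nfs status' 'rpcinfo -p' > nfs-diag.txt
```

Note that `rpcinfo -p` and `showmount -e` may take a while to return when the portmapper cannot be contacted, so the run can appear to hang on those two steps.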
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: James Maudlin on 09-02-2012 01:40:18 PM Please find below the output of the commands taken from NFS server. Also Please find below the screenshot of deployment settings page of director. Just FYI. We are able to ping NFS VM from director and vice versa.
1. [root@nfs-1 ~]# exportfs -v
/export/WebEx Social <world>(rw,wdelay,root_squash,no_subtree_check,fsid=0,anonuid=99,anongid=99)
2. [root@nfs-1 ~]# service rpcidmapd status
rpc.idmapd (pid 2439) is running...
3. [root@nfs-1 ~]# service nfs status
rpc.mountd (pid 3181) is running...
nfsd (pid 3178 3177 3176 3175 3174 3173 3172 3171) is running...
4. [root@nfs-1 ~]# rpcinfo -p
rpcinfo: can't contact portmapper: RPC: Remote system error - Connection timed out
5. [root@nfs-1 ~]# showmount -e
mount clntudp_create: RPC: Port mapper failure - RPC: Timed out
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: Haydeh Dejbakhsh on 09-02-2012 08:48:26 PM Check the /etc/idmapd.conf file on the NFS server and edit it, if needed, so that it has the same domain name as /etc/idmapd.conf on the Director node.
If the file on the NFS node needs to be corrected, save it and exit.
Then activate the change:
service rpcidmapd restart
Is the firewall running or stopped on the NFS server?
Issue the following again and send us the output:
service rpcidmapd status
rpcinfo -p
showmount -e
Thanks,
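The idmapd domain comparison described above can be sketched as a small script. It assumes each file has an uncommented `Domain = <name>` line; the helper names and the copied file name `director-idmapd.conf` are hypothetical, and the comparison is an exact string match.

```shell
#!/bin/sh
# idmap_domain FILE: print the value of the first uncommented "Domain = ..." line.
idmap_domain() {
    sed -n 's/^[[:space:]]*Domain[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" | head -n 1
}

# same_idmap_domain FILE_A FILE_B: succeed only if both files name the same domain.
same_idmap_domain() {
    [ "$(idmap_domain "$1")" = "$(idmap_domain "$2")" ]
}

# Example (after copying the Director's file to the NFS server under a placeholder name):
#   same_idmap_domain /etc/idmapd.conf ./director-idmapd.conf || echo "Domain mismatch"
```

If the two values differ, edit the NFS server's file and restart rpcidmapd as described in the reply.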
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: James Maudlin on 09-02-2012 11:11:27 PM Having them do what you asked, Haydeh.
In the meantime, received the following:
Now we are getting logs captured under /mnt/auto/diagnostics/logs/common/messages. PFB the logs we are getting for Notifier:
Feb 9 18:18:03 infy-WebEx Social-director puppet-master[2889]: Failed to parse template roles/openfire/of_property_update.sql: Could not find value for 'openfirexmppadminpass' at /opt/cisco/software/puppet/manifests/classes/openfire.pp:96 on node infy-WebEx Social-notifier.ad.infosys.com
Feb 9 18:20:01 infy-WebEx Social-rdbms kernel: SCSI device sdb: 209715200 512-byte hdwr sectors (107374 M
Feb 9 18:19:05 infy-WebEx Social-director puppet-master[2889]: Failed to parse template roles/openfire/of_property_update.sql: Could not find value for 'openfirexmppadminpass' at /opt/cisco/software/puppet/manifests/classes/openfire.pp:96 on node infy-WebEx Social-notifier.ad.infosys.com
Please advise on it.
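Since the same puppet-master error repeats in the messages log, it can help to reduce the log to the distinct variable names that the templates could not find. A minimal sketch, assuming the "Could not find value for '<name>'" wording seen in the lines above; the function name `missing_puppet_vars` is made up.

```shell
#!/bin/sh
# missing_puppet_vars: read log lines on stdin and print each distinct variable
# name reported as "Could not find value for '<name>'".
missing_puppet_vars() {
    sed -n "s/.*Could not find value for '\([^']*\)'.*/\1/p" | sort -u
}

# Example (on the Director node):
#   missing_puppet_vars < /mnt/auto/diagnostics/logs/common/messages
```

For the log excerpt in this post the only name reported would be openfirexmppadminpass, which points at a value the Notifier (Openfire) role expects but the Director has not supplied.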
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: James Maudlin on 09-02-2012 11:31:09 PM From partner: As we had communicated earlier as well: As we observed in Director and other reachable VMs like RDBMS Store and Index Store, Domain under /etc/idmapd.conf is showing as Infosys.com though we have mentioned Default DNS Domain as ad.infosys.com under setup for each VM. Please advise if it is fine or do we need to change Domain in Director node's file.
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: Haydeh Dejbakhsh on 10-02-2012 01:15:21 AM The NFS server's Domain should be made the same as what it is in the Director's /etc/idmapd.conf.
/heidi
Subject: RE: host not reachable message for Analytics Store, Notifier and WebEx Social Web. Replied by: James Maudlin on 13-02-2012 12:12:54 PM FYKI, we changed Domain to Infosys.com in /etc/idmapd.conf file on NFS (as it is in Director) and restarted rpcidmapd but it didn't help much. PFB the result of commands post this change:
"/etc/idmapd.conf" 13L, 177C written
[root@nfs-1 ~]# service rpcidmapd restart
Stopping RPC idmapd: [ OK ]
Starting RPC idmapd: [ OK ]
[root@nfs-1 ~]# service rpcidmapd status
rpc.idmapd (pid 3280) is running...
[root@nfs-1 ~]# rpcinfo -p
rpcinfo: can't contact portmapper: RPC: Remote system error - Connection timed out
[root@nfs-1 ~]# showmount -e
mount clntudp_create: RPC: Port mapper failure - RPC: Timed out
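Because `rpcinfo -p` still cannot contact the portmapper, which normally listens on port 111, one possible next check (not something suggested in the thread itself) is whether anything is listening on that port on the NFS server. A sketch under the assumption that `netstat -ltn` is available and prints its usual columns; the helper name `port_listening` is hypothetical.

```shell
#!/bin/sh
# port_listening PORT: read `netstat -ltn` output on stdin and succeed if some
# socket in state LISTEN is bound to PORT.
port_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Example (as root on the NFS server):
#   netstat -ltn | port_listening 111 || echo "nothing is listening on port 111"
```

If nothing is listening, the portmapper service is probably not running; if something is listening and the timeout persists, local packet filtering would be the next thing to look at, which ties back to the firewall question asked earlier.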