Tail-f HCC HaException: wrong response for HA:BESLAVE

equinix
Level 1

Hello,

 

I'm trying to set up a local HA lab which consists of two VMs, each running an NSO instance (system install).

Using the tailf-hcc package and the manual, I attempted to set up the basic example of HA - one master and one slave with failover-master=true.

The two VMs can ping each other.

 

However, when I activate HA (ha commands activate), nothing happens. The master NSO shows the correct state when doing 'show ncs-state ha', with the exception that it doesn't show any connected-slaves.

The slave remains on "ha mode none" and its ncs-java-vm.log shows the following exception:

com.tailf.ha.HaException: wrong response for HA:BESLAVE responded with: {error,25}

    at com.tailf.ha.Ha.checkResponse(Ha.java:297)

    at com.tailf.ha.Ha.beSlave(Ha.java:153)

    at com.tailf.ns.tailfHcc.Cluster.updateLocalNode(Cluster.java:1090)

    at com.tailf.ns.tailfHcc.Cluster.update(Cluster.java:1229)

    at com.tailf.ns.tailfHcc.TcmApp$PeriodicTask.run(TcmApp.java:493)

    at java.util.TimerThread.mainLoop(Timer.java:555)

    at java.util.TimerThread.run(Timer.java:505)

 

The slave's ncs.log also has the following error:

Failed to connect to master: host is unreachable

 

Has anyone had this issue?

 

Best regards,

Gabriel

Accepted Solution

lmanor
Cisco Employee

Gabriel,

 

Since you can ping from device to device, it is likely a firewall (or OpenStack security group) issue that is blocking the TCP port required for HA messaging.

 

Can you check whether the master is listening on port 4570?

 

$ netstat -anp | grep tcp
...
tcp        0      0 0.0.0.0:4570            0.0.0.0:*               LISTEN      -
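
To avoid scanning the full netstat listing by eye, the check above could be wrapped in a small helper. This is only a sketch: `is_listening` is a hypothetical name (not part of NSO or tailf-hcc), and it assumes netstat-style output where the local address ends in `:<port>` and the state column reads LISTEN.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads netstat-style output on stdin and returns 0
# if some line shows the given TCP port in the LISTEN state.
is_listening() {
    local port="$1"
    # First grep: an address ending in ":<port>" followed by whitespace,
    # so that 4570 does not also match 45700 or 14570.
    # Second grep: keep only sockets reported as LISTEN.
    grep -E ":${port}[[:space:]]" | grep -q "LISTEN"
}

# Usage on the master:
#   netstat -ant | is_listening 4570 && echo "listening" || echo "not listening"
```

If this reports "not listening", the problem is on the NSO side rather than in the firewall.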
 

Also, an easy way to check whether a port is blocked is to use telnet:

telnet <ipaddr> <portnumber>

If it connects, the port is open (here with port 4570 enabled on 192.168.56.103):

 

$ telnet 192.168.56.103 4570

Trying 192.168.56.103...

Connected to 192.168.56.103.

Escape character is '^]'.

(unprintable binary data sent by the remote host)

^]

telnet>

 

If the port is not open, you get a connection error like this (here with port 4570 disabled on 192.168.56.103):

 

[lmanor@CentOS64-2 logs]$ telnet 192.168.56.103 4570

Trying 192.168.56.103...

telnet: connect to address 192.168.56.103: No route to host
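
If telnet is not installed on the VMs, the same reachability test can be sketched with bash alone. The function name `check_port` and the 3-second default timeout are assumptions made for illustration; the sketch relies on bash's `/dev/tcp` pseudo-device and the coreutils `timeout` command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: try to open a TCP connection to host:port.
# Returns 0 if the connection succeeds, non-zero if it is refused,
# unreachable, or does not complete within the timeout.
check_port() {
    local host="$1" port="$2" secs="${3:-3}"
    # The child bash receives host as $0 and port as $1; opening
    # /dev/tcp/<host>/<port> performs the TCP connect.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage from the slave, with the master's address from the example above:
#   if check_port 192.168.56.103 4570; then
#       echo "port 4570 reachable"
#   else
#       echo "port 4570 closed or filtered"
#   fi
```

Like telnet, this only shows that a TCP connection can be established; it does not speak the HA protocol.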

 

-Larry

Thanks Larry! That was exactly it, it's running smoothly now. 

 

I just opened the port on the master NSO via:

 

$ sudo firewall-cmd --zone=public --add-port=4570/tcp --permanent
$ sudo firewall-cmd --reload
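
To confirm that the rule survived the reload, the zone's open ports can be listed with `firewall-cmd --zone=public --list-ports` and checked for `4570/tcp`. A minimal sketch of that check follows; `port_in_list` is a hypothetical helper name, and it only matches exact entries, so a port that is opened through a range such as `4000-5000/tcp` would not be detected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if a port/protocol entry (e.g. "4570/tcp")
# appears in a space-separated list such as the one printed by
# `firewall-cmd --zone=public --list-ports`.
port_in_list() {
    local wanted="$1" list="$2" entry
    # Word-split the list on whitespace and compare each entry exactly.
    for entry in $list; do
        [ "$entry" = "$wanted" ] && return 0
    done
    return 1
}

# Usage on the master:
#   port_in_list 4570/tcp "$(sudo firewall-cmd --zone=public --list-ports)" \
#       && echo "4570/tcp is open" || echo "4570/tcp is not open"
```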