MSE HA broke apart

Hello board,

I installed two MSE virtual appliances in HA mode, using one interface (eth0) on each MSE.

Version: 7.2.110.0

After initial installation, I added the MSEs to NCS and everything worked perfectly.

Now for the HA case: the link to the second MSE was down for 5 minutes. Then I repaired the link, and both MSEs can ping each other again.

But HA has not been restored, AND the NCS can no longer reach the MSEs ("unavailable").

Here are some outputs:

Primary MSE:

[root@MSE01-PRIMARY ~]# gethainfo

Health Monitor is running. Retrieving HA related information

----------------------------------------------------

Base high availability configuration for this server

----------------------------------------------------

Server role: Primary

Health Monitor IP Address: *******

Virtual IP Address: *******

Version: 7.2.110.0

UDI: AIR-MSE-VA-K9:V01: *******

Number of paired peers: 1

----------------------------

Peer configuration#: 1

----------------------------

Health Monitor IP Address  *******

Virtual IP Address:  *******

Version: 7.2.110.0

UDI:  *******

Failover type: Automatic

Failback type: Manual

Failover wait time (seconds): 10

Instance database name: mseos3s

Instance database port: 1624

Dataguard configuration name: dg_mse3

Primary database alias: mseop3s

Direct connect used: No

Heartbeat status: Down

Current state: SETUP_FAILED

[root@MSE01-PRIMARY ~]# getserverinfo

Health Monitor is running

MSE services are down

MSE services are down?!

[root@MSE01-PRIMARY ~]# /etc/init.d/msed status

STATUS:

Health Monitor is running

MSE services are down

Ok ... then I'll start the MSE services ...

[root@MSE01-PRIMARY ~]# /etc/init.d/msed start

Starting MSE Platform

MSE Platform is already running

What?!?!

Ok .... next try:

[root@MSE01-PRIMARY ~]# /etc/init.d/msed restart

Stopping MSE Platform

Flushing firewall rules:                                   [  OK  ]

Setting chains to policy ACCEPT: nat filter                [  OK  ]

Unloading iptables modules:                                [  OK  ]

Starting MSE Platform

Flushing firewall rules:                                   [  OK  ]

Setting chains to policy ACCEPT: filter                    [  OK  ]

Unloading iptables modules:                                [  OK  ]

Starting Health Monitor, Waiting to check the status.

Health Monitor successfully started

Starting Admin process...

Started Admin process.

So far so good .....

[root@MSE01-PRIMARY ~]# /etc/init.d/msed status

STATUS:

Health Monitor is running

MSE services are down

aaaargh .... :-)

What is happening here? I cannot start the primary MSE ... and the secondary did not take over (I guess):

[root@MSE01-SECONDARY ~]# gethainfo

Health Monitor is running. Retrieving HA related information

----------------------------------------------------

Base high availability configuration for this server

----------------------------------------------------

Server role: Secondary

Health Monitor IP Address: *******

Virtual IP Address: Not Applicable for a secondary

Version: 7.2.110.0

UDI: AIR-MSE-VA-K9:V01:*******

Number of paired peers: 1

----------------------------

Peer configuration#: 1

----------------------------

Health Monitor IP Address *******

Virtual IP Address: *******

Version: 7.2.110.0

UDI: AIR-MSE-VA-K9:V01:*******

Failover type: Automatic

Failback type: Manual

Failover wait time (seconds): 10

Instance database name: mseos3

Instance database port: 1524

Dataguard configuration name: dg_mse3

Primary database alias: mseop3s

Direct connect used: No

Heartbeat status: Down

Current state: FAILOVER_ACTIVE
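
To compare the two gethainfo dumps without reading them line by line, something like the following could be used. This is only a rough sketch: the function names parse_gethainfo and ha_summary are made up for illustration, the sample strings are shortened excerpts of the outputs above, and the parser simply assumes every interesting line has the form "key: value".

```python
def parse_gethainfo(text):
    """Split gethainfo output into the local ('base') and 'peer' sections.

    Assumption: every relevant line looks like 'key: value'. Lines without
    a colon (banners, separator rows) are ignored.
    """
    sections = {"base": {}, "peer": {}}
    current = "base"
    for raw in text.splitlines():
        line = raw.strip()
        # Everything after the 'Peer configuration#' row belongs to the peer.
        if line.startswith("Peer configuration"):
            current = "peer"
            continue
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        sections[current][key.strip()] = value.strip()
    return sections


def ha_summary(text):
    """Return only the fields that matter for the HA state."""
    info = parse_gethainfo(text)
    return {
        "role": info["base"].get("Server role"),
        "heartbeat": info["peer"].get("Heartbeat status"),
        "state": info["peer"].get("Current state"),
    }


# Shortened excerpts of the two outputs shown in this post.
PRIMARY_SAMPLE = """\
Server role: Primary
Number of paired peers: 1
Peer configuration#: 1
Failover type: Automatic
Heartbeat status: Down
Current state: SETUP_FAILED
"""

SECONDARY_SAMPLE = """\
Server role: Secondary
Number of paired peers: 1
Peer configuration#: 1
Failover type: Automatic
Heartbeat status: Down
Current state: FAILOVER_ACTIVE
"""

if __name__ == "__main__":
    for sample in (PRIMARY_SAMPLE, SECONDARY_SAMPLE):
        print(ha_summary(sample))
```

Run against the full output of both nodes, this makes the mismatch obvious: both sides report the heartbeat as down, but they disagree on the current state.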

So - the virtual IP is pingable - and the secondary MSE owns it (I guess) ... But the virtual IP is not reachable by the NCS (status "unreachable").

From the NCS CLI I can ping the virtual IP. NCS and MSE are in the same layer-2 segment and on the same switch - no firewall.
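
A successful ping only shows that ICMP gets through; it says nothing about whether the MSE service behind the virtual IP accepts connections. A plain TCP connect test is one way to check that. The snippet below is a generic sketch, not a Cisco tool: tcp_reachable is a made-up helper name, 192.0.2.10 is a placeholder for the (masked) virtual IP, and port 443 is my assumption for the port NCS uses to talk to the MSE.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder address and assumed port; replace with the real values.
    print(tcp_reachable("192.0.2.10", 443))
```

If the ping works but this returns False, the IP is up while the service on it is not listening, which would match "MSE services are down".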

What could be the reason for this? Is there any way to recover the MSE without running the installation procedure again?

Thanks!

1 REPLY

I haven't opened a TAC case yet ... but I assume that I ran into bug CSCtz44158.

According to the bug description, it should be fixed in the version I'm using (7.2.110.0) - but it really looks like this bug.