
Fault tolerance sharing infrastructure with backup master?

Marc Clasby
Level 1

For those that use fault tolerance in production (and non-prod)

Has anyone set up the FT master and backup master on the same server infrastructure?

Typically we have kept these on separate machines, but there seems to be little or no benefit besides high availability.

  1. All you are buying is failover from primary to backup
  2. If you have 2 data centers... you would typically put the backup and FT servers in the same data center anyway
  3. They run different services, so they should not conflict
  4. You could create a dependency on the FT master service being up before bringing up the backup master service
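Point 4 could be sketched roughly like this: a small wrapper that waits until the FT service accepts TCP connections before the backup master is started. This is only an illustration. The host, port, and timing values are placeholders, not Tidal defaults, and how the backup master is actually started is left out.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout.

    host and port are placeholders for wherever the FT service listens;
    they are assumptions, not values taken from the Tidal documentation.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A start script would call `wait_for_port(...)` and only bring up the backup master service if it returns True. On Windows, a native service dependency would do the same job at the service-manager level.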

With our 6.1x install, our architecture/infrastructure teams are looking to reduce the overall infrastructure footprint, since we are adding a heavy CM component.


8 Replies

The good thing about separate servers for the masters is that you incur no downtime during server patching or a hardware component failure. But if you are not bound by that requirement and your servers are robust enough, why not? Though I wonder why you would need the FT architecture in that case. Why not just have one master with no FT and save money on the license?

I do agree about the footprint. We now have 10 servers in total for TES 6.1 (DEV and PRD), and that excludes the Transporter/Win Agent and database cluster servers. That is a lot of overhead when it comes to monitoring, downtime, DR, etc. They are VMs, but it is still the same work as with physical servers.

Prakash Hemchand
Cisco Employee

We use FT with masters in two separate datacenters, all running Linux on separate servers.

Our FT server is very small compared to our master servers, 1 CPU x 2 GB RAM on a VM.

Imagine a scenario where you're running on the secondary master and then that 'server infrastructure' goes down. You won't be able to fail back over to the primary master without manual intervention.

(Note, I don't work for or represent the Cisco Tidal BU).

I can see the point with separating FT from Backup Master... but where do you keep FT safe?

If you lose your primary datacenter, you have HA...

Everything works by failing over to the backup master, since the FT master and backup master are both located in the secondary datacenter.

If you lose your secondary datacenter, your primary keeps working (no need for HA). The FT master is gone, so you couldn't use it to fail over, but the backup master is gone too, so there is no need for FT in this scenario.

I don't see a maintenance case where we would need to fail over or fail back and couldn't control the services on the server.

I don't see a case where we are in a failed-over state (lost our primary master or datacenter), then lose the backup master service / FT master service, but have recovered the primary master... Odds are we would have much bigger problems or a significant disaster.
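To make that reasoning easier to check, here is a rough sketch that enumerates host failures for two layouts. It rests on a simplifying assumption of mine, not on anything from Cisco: scheduling is available if the primary is up, or if the backup master and FT master are both up. The host names are made up.

```python
from itertools import product


def scheduling_available(up, layout):
    """up maps host -> bool (is the host running); layout maps role -> host."""
    primary = up[layout["primary"]]
    backup = up[layout["backup"]]
    ft = up[layout["ft"]]
    # Assumption: the primary runs fine without FT, while the backup
    # can only take over when the FT master is also reachable.
    return primary or (backup and ft)


def outage_scenarios(layout):
    """Return each combination of down hosts that leaves no working master."""
    hosts = sorted(set(layout.values()))
    outages = []
    for states in product([True, False], repeat=len(hosts)):
        up = dict(zip(hosts, states))
        if not scheduling_available(up, layout):
            outages.append(tuple(h for h in hosts if not up[h]))
    return outages


# Hypothetical host names: primary in data center 1, backup and FT in data center 2.
separate = {"primary": "dc1-a", "backup": "dc2-a", "ft": "dc2-b"}
shared = {"primary": "dc1-a", "backup": "dc2-a", "ft": "dc2-a"}

if __name__ == "__main__":
    print("separate hosts:", outage_scenarios(separate))
    print("shared host:   ", outage_scenarios(shared))
```

Under this static model, every outage in both layouts requires the primary host to be down, so putting FT and backup on one box adds no new outage combination. It does not capture sequences of events, such as being failed over to the backup and then losing that box, which is where manual intervention comes in.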

I know Cisco does not recommend running FT on the same machine as the backup, but I am more curious as to why.

(I am asking Cisco for more detail)

Ah, sorry about that, must be my post-turkey brain state. For some reason I read it as you wanting all three (Fault Monitor, master and backup master) on the same server, when you really just wanted FM and BM together, hence my confused response.

But yeah, as you have stated, and knowing what I know now about BM's dependency on FM (but not necessarily PM's), I don't see why you can't have FM and BM on the same server. I am assuming that you will actually incur downtime unless you have time to fail over to the master first (if BM was the active one) and set FT=OFF before taking the FM/BM server out of commission, and you won't have that time during a DR scenario.

I imagine that as a good practice, fault tolerance requires the components to be on separate machines.

Good that you're checking with Cisco support, as there may also be a port conflict if sharing a box for the FT and BM.

Ports shouldn't be an issue; they all communicate on different ports that you can control. I did discuss this briefly with Cisco, and I will go with a separate machine...
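One way to check for a conflict before sharing a box might look like the sketch below: compare the configured ports for duplicates and test whether each one is free on the host. The port map in the usage note is a made-up example, not the real Tidal port assignments.

```python
import socket
from collections import Counter


def duplicate_ports(port_map):
    """Return the sorted port numbers assigned to more than one component."""
    counts = Counter(port_map.values())
    return sorted(port for port, n in counts.items() if n > 1)


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing on this machine is already bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
            return True
        except OSError:
            # Typically "address already in use".
            return False
```

For example, `duplicate_ports({"ft": 7000, "backup": 7001, "agent": 7000})` returns `[7000]` (hypothetical port numbers), flagging that two components were given the same port.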

From the Fault Tolerance Guide

Prerequisites for Installation

  • There must be at least three machines for a fault tolerance setup.
  • All three machines must be in the same domain.

So just to put together a highly available Tidal environment you need a minimum of 7 machines...

  1. Primary - Data Center 1
  2. Backup - Data Center 2
  3. FT Master - Data Center 2
  4. CM - Data Center 1
  5. CM - Data Center 2
  6. Agent - Data Center 1
  7. Agent - Data Center 2

Our environment is all VMware, and we will be building out on Windows Server 2012. I am also likely going to install an agent on all of the masters (primary, backup, FT) but leave them turned off, and only turn them on when we need a quick increase in capacity. For example, if Agent 1 and Agent 2 are running close to their practical resource limits, I would turn on the backup master's agent to relieve the pressure until infrastructure could build additional agent(s).

I'm under the impression that the FT infrastructure works only within one datacenter, to fail over to the backup in case we have any maintenance (patching etc.) on the primary master. As I see you are distributing primary and backup across 2 different datacenters, I'm wondering if we can extend one FT infrastructure between Production and DR environments. Has anyone tested this setup?

@jpforums2 - we have actually already been using two masters in two datacenters in 5.3, given that our datacenters are in the same city and have a robust pipe between them. At least in 5.3 we don't see any difference during the times when our database is in datacenter A and the active master is in datacenter B. We are hoping it will be the same in 6.1, just that it's more complicated now that you have to add the TES schemas for the client managers to the mix and work out how to accomplish DR for those databases across data centers as well.

I can see Marc's initial point though: if technically it is not an issue (no port conflict etc.) to place FM and BM together, then in my mind it should be OK as long as they are aware of, and have tested, the steps they need to take when the FM/BM server becomes unavailable. Having so many servers increases overhead and points of failure (I know FT architecture is meant to spread the point of failure - heh).

Having said that, I myself have a separate FM server in my 6.1 environment, but I was planning on piggy backing the analytics tool on there if we ever get it.
