We have a single 10Gb fiber connection between two datacenters. All production servers are in our main datacenter. Recently we moved production application server workloads to the remote datacenter. We perform failover testing from time to time in the remote datacenter, which requires shutting down the 10Gb interface between the sites on our Nexus. Now that production workloads are running in the remote datacenter, we cannot shut down the 10Gb interface. How can we perform testing without affecting the production workloads running in the remote datacenter? Currently all servers in the remote datacenter are on a single VLAN with EIGRP routing between sites. I am looking for input on how to allow the production application traffic from the remote datacenter to reach our main site while still being able to perform our failover testing in the remote datacenter. Any suggestions would be appreciated.
We perform failover testing from time to time in the remote datacenter which requires us shutting down the 10Gb interface between the sites on our Nexus
What exactly were you testing before moving some production servers to the secondary DC? You say failover, but failover from what to what? It cannot be failover between the DCs, because you were shutting the DC interconnect down.
Perhaps you could clarify?
The discussion subject may have been a bit misleading. I will edit and change to more appropriate subject.
We are replicating our production virtual infrastructure to the remote datacenter. We bring up these servers for testing, so we can't have the same server names existing in both sites.
If they have the same server names, those presumably resolve to the same IPs, so you are saying that one of the servers in the backup DC could accidentally be used instead of the one that should be the production server.
Is that right?
If so, when you bring up the servers, do they need to communicate with anything in the production environment or not?
During testing the servers that are brought online do not need to communicate with anything in production. However, the server workloads running in the remote datacenter do need to be able to communicate with the main datacenter.
These are separate servers though, yes?
Do you need just one VLAN, i.e. do you need L2 adjacency between the DCs on the interconnect?
The easiest thing would then be to have the servers you want to bring up in their own VLAN(s) and use ACLs to restrict traffic to and from them.
A further step down the line would be to use VRFs (switch dependent), so even if there was a mistake in the ACLs they would not have any routes to any production servers and vice versa.
And of course there is always a firewall, although this would need careful placement to ensure it only stopped traffic to and from those servers and not anything else.
Edit - regarding the firewall, you definitely wouldn't want it facing the 10Gb interconnect, i.e. in the direct path from one DC to the other, as even if you allowed everything through, it could have a serious impact on the production traffic.
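As a rough NX-OS sketch of the VLAN-plus-ACL idea above (the VLAN ID, SVI address, and subnets here are hypothetical, not taken from your environment):

! hypothetical VLAN for the test copies of the servers
vlan 300
  name DR-TEST
! ACL denying the test subnet any access to the production range
ip access-list BLOCK-PROD
  deny ip 192.168.30.0/24 10.0.0.0/8
  permit ip any any
! SVI for the test VLAN with the ACL applied inbound
interface Vlan300
  ip address 192.168.30.1/24
  ip access-group BLOCK-PROD in
  no shutdown

Applying it inbound on the test VLAN's own SVI means nothing in the production path is touched.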
I have some additional questions on top of the ones Jon has already asked:
1. Do these remote servers have different IPs in the remote DC? Are they in the same IP range or a different IP range altogether?
2. If these are virtual workloads, are these servers kept shut down and brought up only for testing, or are they active at the remote site at all times?
3. When you say application failover testing, what exactly do you mean by that? Server failover, application failover, database failover, or the whole application instance? There might be dependencies for testing a specific application; that is what I am trying to understand.
4. If you have the same server active in the remote DC, do you have users connected to it? How does your storage traffic get synced at the back end for active-active application traffic?
I might have more questions on this, but I will write them as they come to mind. We need to understand exactly what you want to achieve so that we can look at some newer technologies to help better.
Haven't seen you around for ages. Mind you, I haven't been around for a while until recently.
Hope everything's good with you.
How have you been?
I am very well, mate. I know, it's been ages, and I have just started picking things up a bit more on the support forum. It's just work as always, mate, keeping me busy.
Let's connect on email (email@example.com). I believe I still have your email address.
In response to your questions:
1. The servers have different IPs when they are brought up. However, since we have DNS replicating between the datacenters, there will be name-resolution conflicts.
2. The virtual servers are replicating to the remote datacenter and are shut down at all times.
3. The virtual servers are brought online for testing while the 10Gb link is shut down.
4. We have storage traffic that is replicating at all times. The LUNs are failed over to the DR site for testing. Any changes on the DR side are deleted after testing is completed.
I was thinking the same as far as creating a separate VLAN for the production workload servers running in the remote site and restricting via ACL. The VRF option seems achievable, but would that complicate things a bit? Would you mind sharing some insight on how this would be accomplished?
Happy to help, but I think it may be better to wait on Amit as he has more experience than me with DC interconnects/VMware etc.
As a general guideline you could have a VRF for the servers. You don't need to use just one VLAN; you can have multiple VLANs and they can all be in the same VRF. The only routes those servers could use would be the ones for their actual VLANs.
So they couldn't route anywhere else, nor could anything route to them. In fact they are not visible to anything outside of the VRF.
It would provide complete separation on the same physical infrastructure at L3; the separation at L2 is the VLAN. It's not full MPLS VPNs, it is more VRF-Lite, which is just not as full featured.
So you could have the server VLANs all allocated into one VRF; they could route to each other but not anywhere else. It does depend to an extent on the switch interconnects you have, i.e. if some of the servers were on one switch and some on another, and that switch was shared by production servers, then an L3 uplink between the switches would be a problem, i.e. it would need to be an L2 trunk.
But it sounds like you only have one VLAN anyway, so I doubt you are using L3 uplinks between switches.
VRFs can be a useful solution, but VLANs with ACLs might be enough, and if you need to start allowing access to shared resources it can get more complex with VRFs because you need to leak routes, and you may well end up having to use ACLs anyway.
So it's an option, but I think Amit will probably have some more ideas, and probably better than mine.
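To make the VRF-Lite idea above concrete, a minimal NX-OS sketch might look like the following (the VRF name, VLAN and addressing are hypothetical):

! hypothetical VRF holding only the test-server VLANs
vrf context DR-TEST
! move the test VLAN's SVI into the VRF
interface Vlan300
  vrf member DR-TEST
  ip address 192.168.30.1/24
  no shutdown

Once the SVI is a member of the VRF it disappears from the global routing table, so nothing outside the VRF can route to those servers, which is the separation described above.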
Sorry for the late response; I had to step out. What I was thinking is to use private VLANs for this scenario.
We can use private VLANs in the DC, which will restrict communication within the VLAN. This is easier than using VACLs or IP ACLs.
If it's a virtual environment, we can also use a more advanced solution to carry the private VLANs back to the main DC using OTV, if possible, or VXLAN if the Nexus 1000v is being used.
I back up Jon's idea of using a different VLAN and then using an ACL or VRF to stop the inter-VLAN communication. This can be a good start.
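For reference, a minimal private-VLAN sketch on NX-OS (the VLAN IDs and host port here are hypothetical):

feature private-vlan
! primary VLAN with an isolated secondary
vlan 100
  private-vlan primary
  private-vlan association 101
vlan 101
  private-vlan isolated
! hypothetical host port for one of the test servers
interface Ethernet1/10
  switchport mode private-vlan host
  switchport private-vlan host-association 100 101

Hosts in the isolated secondary VLAN can only talk to promiscuous ports, so the test servers cannot reach each other or anything else in the primary VLAN.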
Appreciate the responses to my questions. Just for starters, I have decided to go down the ACL path to permit access only to the subnet in the remote datacenter hosting the production workload while we are performing a test using the same server names. The PVLAN and VRF options do make sense, but I would need to do some route leaking to provide web access to the VLAN in that particular VRF. So my question is where to place the ACLs, and do I need both an ingress and an egress access-group on the interface to block traffic both ways? Here is the ACL I came up with. Let me know if this makes sense.
ip access-list dr-acl-in
remark Permit Ingress ATL subnet 10.x.x.x/22 to DevQa 11.x.x.x/23 and Mgmt 11.x.x.x/24
permit ip 10.x.x.x/16 11.x.x.x/23
permit ip 10.x.x.x/16 11.x.x.x/24
ip access-list dr-acl-out
remark Permit Egress DevQa 11.x.x.x/23 and Mgmt 11.x.x.x/24 to ATL subnet 10.x.x.x/22
permit ip 11.x.x.x/23 10.x.x.x/16
permit ip 11.x.x.x/24 10.x.x.x/16
ip access-group dr-acl-out out
ip access-group dr-acl-in in
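Judging by the names, dr-acl-in and dr-acl-out look intended for the routed interface facing the interconnect on the remote-DC Nexus, where traffic from the main site arrives "in" and traffic toward it leaves "out". A hypothetical application (the interface name is assumed, not from the thread):

! hypothetical routed interconnect interface on the remote Nexus
interface Ethernet1/1
  ip access-group dr-acl-in in
  ip access-group dr-acl-out out

Note the implicit deny at the end of each list: with both directions applied, only the permitted subnets can cross the link, which is presumably the point during a test. "in" and "out" are always relative to the interface they are applied on, so the directions would need rechecking if the ACLs were moved to an SVI instead.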