Hopefully my gurus out there can assist with this problem. All help is greatly appreciated.
The Network Layout:
We have a site in India where we have a primary and secondary router with a Riverbed configured inline to optimize the traffic. Local traffic routes to an HSRP address shared between the 2 routers with preempt and tracking configured. The primary and secondary routers are using diverse carriers and have BGP established with each carrier. A majority of the customers on site use thin clients and connect to RDS servers in the U.S.A.
Traffic flows through the primary router until the carrier drops the circuit or the BGP session flaps. As designed, all traffic then re-routes from the primary router to the secondary router. After the re-route, our thin client users report that their desktop sessions freeze and that they have to re-establish them. Once they re-establish their sessions over the secondary path, no problems exist.
From a routing perspective, is there a way to minimize the impact to the RDS clients so they don't notice the failover? I'm not a server guy, but is there a way to configure RDS so it's aware of this routing change?
Thanks in advance!
Based on what you have told us so far I would guess that the issue is caused by the network address translation being performed on outbound traffic. A thin client establishes a session with its traffic going through the primary router and is translated to some Public IP. Then some event causes the traffic to switch to the secondary router. I am guessing that the secondary router translates the traffic to a different public IP and the change in the source IP causes the session to the server to hang.
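To illustrate that guess, here is a minimal Python sketch of a server that tracks sessions by the source/destination address and port 4-tuple. Everything in it is an assumption for illustration: the addresses come from documentation ranges, the names (FlowKey, Server, translate) are hypothetical, and it is a toy model rather than how any real router or RDS server is implemented.

```python
from typing import NamedTuple, Optional


class FlowKey(NamedTuple):
    """The 4-tuple that identifies a TCP session at the server."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Server:
    """Toy server that remembers established sessions by their 4-tuple."""

    def __init__(self) -> None:
        self.sessions = {}  # maps FlowKey -> user name

    def establish(self, key: FlowKey, user: str) -> None:
        self.sessions[key] = user

    def lookup(self, key: FlowKey) -> Optional[str]:
        # Returns None when the packet matches no known session.
        return self.sessions.get(key)


def translate(key: FlowKey, public_ip: str) -> FlowKey:
    """Source NAT: rewrite only the source address (port kept for simplicity)."""
    return key._replace(src_ip=public_ip)


# Hypothetical addresses, chosen from documentation ranges.
client = FlowKey("10.1.1.50", 51000, "198.51.100.10", 3389)
PRIMARY_PUBLIC_IP = "203.0.113.1"      # assumed translation on the primary router
SECONDARY_PUBLIC_IP = "203.0.113.129"  # assumed translation on the secondary router

server = Server()
# The session is established while traffic flows through the primary router.
server.establish(translate(client, PRIMARY_PUBLIC_IP), "thin-client-user")

via_primary = server.lookup(translate(client, PRIMARY_PUBLIC_IP))
via_secondary = server.lookup(translate(client, SECONDARY_PUBLIC_IP))

print(via_primary)    # session found
print(via_secondary)  # no matching session after failover
```

In this model the same client flow arrives with a different source address after failover, so the server finds no matching session. If both routers translated to the same address, or did no translation at all, the lookup would still succeed.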
Hey Rick thanks for chiming in. Sorry I didn't paint a clearer picture. The provided carrier circuits are MPLS and not internet. Would the situation fall under the same details you noted?
When you mentioned diverse carriers and BGP I assumed it was connections to the Internet which would use NAT. Thanks for clarifying that the connection is MPLS. Are the routers doing any address translation? Perhaps you could post partial configs from the routers?
Thanks for confirming that the routers are not doing address translation. So we have eliminated one possible explanation for the symptoms. Is there any stateful examination of traffic (firewall etc) as traffic goes through your network to the provider or as it goes through the provider network?
No firewalls are inline with the traffic flows. Our India teams log in to their thin clients, which connect to our RDS server environment in the U.S. This environment sits behind an F5 where multiple RDS pool members exist. We were asked by the server team why the sessions drop when BGP fails over to the backup path and if it can be prevented. Standard desktop connections, wireless, etc. didn't notice the failover.
Thanks for the additional information. It is good to know that there are no firewalls filtering this traffic. It is especially interesting to know that desktops and wireless do not experience any impact when a failover event takes place. This makes me think that there is something about the routing for the pools. How does the F5 process these?