Optimal Gateway Selection (OGS) is a feature that determines which gateway has the lowest round-trip time (RTT) and connects to that gateway. With OGS, administrators can minimize latency for Internet traffic without user intervention: AnyConnect identifies and selects the secure gateway that is best for connection or reconnection. OGS begins upon first connection or upon a reconnection at least four hours after the previous disconnection. More information can be found in the Administrator's guide; this document explains how to troubleshoot issues with OGS.
How does OGS work?
A simple “ping” request will not work here as many ASA firewalls are configured to block ICMP packets in order to prevent discovery. Instead, the client will send three HTTP/443 requests to each headend that appears in a merge of all profiles. In order to ensure a (re)connection doesn't take too long, OGS by default selects the previous gateway if it doesn't receive any ping results within 7 seconds. (Look for "OGS ping results" in the log)
OGS determines user location based on network information such as DNS suffix and DNS server IP. The RTT results, along with this location, are stored in the OGS cache.
OGS location entries are cached for 14 days.
Enhancement CSCtk66531 was filed to make these settings user-configurable.
OGS is not run again from this location until 14 days after the location entry was first cached. During this time it uses the cached entry and the RTT values determined for that location. This means that when the user starts the AnyConnect client, it doesn't perform OGS again; it just uses the optimal gateway order in the cache for that location. In the DART logs the following message is seen:
******************************************
Date : 10/04/2013
Time : 14:00:44
Type : Information
Source : acvpnui
Description : Function: ClientIfcBase::startAHS
File: .\ClientIfcBase.cpp
Line: 2785
OGS was already performed, previous selection will be used.
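The 14-day reuse rule can be sketched as a simple age check. This is only an illustration under stated assumptions: the function name and the way the timestamp is passed in are hypothetical, and only the 14-day lifetime comes from this document.

```python
from datetime import datetime, timedelta

# Lifetime stated in this document, counted from when the entry was first cached.
OGS_CACHE_LIFETIME = timedelta(days=14)


def cache_entry_is_fresh(first_cached: datetime, now: datetime,
                         lifetime: timedelta = OGS_CACHE_LIFETIME) -> bool:
    """Return True while the cached location entry would still be reused.

    Hypothetical helper: the real client's storage format is not shown here.
    """
    return now - first_cached < lifetime


first_cached = datetime(2013, 10, 4, 14, 0, 44)
# Six days later the cached gateway order would still be used.
print(cache_entry_is_fresh(first_cached, datetime(2013, 10, 10, 9, 0, 0)))
# Fifteen days later OGS would probe the gateways again.
print(cache_entry_is_fresh(first_cached, datetime(2013, 10, 19, 9, 0, 0)))
```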
RTT is determined using TCP to the SSL port of the gateway the user is trying to connect to, as specified by the hostentry in the AnyConnect profile. Note: Unlike http-ping, which does a simple HTTP POST and then displays the RTT and the result, OGS computations are a bit more complicated. The client sends out three probes to each server and calculates the delay between the HTTP SYN that it sends out and the FIN/ACK for each of those probes. It then uses the lowest of the deltas to compare the servers and make its selection. So, even though http-pings are a fairly good indication of which server the client will choose, the results may not necessarily tally. More on this in the following sections.
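The comparison described above (lowest of the probe deltas per server, then lowest across servers, with a fallback to the previous gateway when no results arrive) can be sketched as follows. This is a minimal illustration, not the client's actual code: the function name, the data layout, and the sample RTT values are assumptions.

```python
from typing import Dict, List, Optional


def select_gateway(probe_rtts_ms: Dict[str, List[float]],
                   previous_gateway: Optional[str] = None) -> Optional[str]:
    """Pick the server whose best (lowest) probe RTT is the smallest.

    probe_rtts_ms maps each server to the RTTs of the probes that completed.
    A server with no completed probes has an empty list.
    """
    best = {server: min(rtts) for server, rtts in probe_rtts_ms.items() if rtts}
    if not best:
        # No results at all (for example, the 7-second limit expired):
        # stay with the previously selected gateway.
        return previous_gateway
    return min(best, key=best.get)


# Hypothetical probe results, three probes per server.
probes = {
    "gw1.cisco.com": [310.0, 302.0, 350.0],
    "gw2.cisco.com": [140.0, 132.0, 201.0],
}
print(select_gateway(probes, previous_gateway="gw1.cisco.com"))
```

Because only the single lowest delta per server is compared, one slow probe does not penalize a gateway, which is one reason an average-based tool can rank the servers differently.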
Currently, OGS only does the checks if the user is coming out of suspend and the threshold has been exceeded. OGS will not connect to a different ASA if the ASA the user is connected to crashes or becomes unavailable. OGS contacts only the primary servers in the profile to determine the optimal one.
Once the calculation is done, the results are stored in the preferences_global file. In Windows it can be found under "C:\Documents and Settings\All Users\ApplicationData\Cisco". In the past there have been issues with this data not being stored in the file; refer to bug CSCtj84626 for more details.
OGS caching works on a combination of DNS domain and individual DNS server IP addresses. It works as follows:
Location A has a DNS domain of locationa.com and 2 DNS server IPs 'ip1' and 'ip2'. Each domain/ip combination creates a cache key pointing to an OGS cache entry. For example:
locationa.com|ip1 -> ogscache1
locationa.com|ip2 -> ogscache1
If the client then connects to a physically different network, the same set of domain/IP combinations is built and checked against the cached list. If there are any matches at all, that OGS cache value is used and the client is still considered to be at location A.
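The key matching described above can be sketched as a dictionary lookup. The function names and the in-memory dictionary are assumptions made for illustration; the real cache lives in the preferences file and its format is not shown in this document.

```python
from typing import Dict, List, Optional


def location_keys(dns_domain: str, dns_servers: List[str]) -> List[str]:
    """Build one 'domain|ip' key per DNS server, as in the example above."""
    return [f"{dns_domain}|{ip}" for ip in dns_servers]


def find_cached_location(cache: Dict[str, str], dns_domain: str,
                         dns_servers: List[str]) -> Optional[str]:
    """Return the cache entry for the first matching key, or None."""
    for key in location_keys(dns_domain, dns_servers):
        if key in cache:
            return cache[key]
    return None


# Location A from the example: two keys pointing to the same cache entry.
cache = {key: "ogscache1" for key in location_keys("locationa.com", ["ip1", "ip2"])}

# A different network that shares one domain/IP combination still matches.
print(find_cached_location(cache, "locationa.com", ["ip2", "ip9"]))
# A network with no shared combination does not match.
print(find_cached_location(cache, "locationb.com", ["ip1"]))
```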
When using OGS, if the user loses connectivity to the gateway they are connected to, the AnyConnect client will then connect to the servers in the backup server list, *not* the next OGS host. The order of operations is as follows:
OGS contacts only the primary servers to determine the optimal one.
Once determined, the connection algorithm is:
Attempt to connect to the optimal server.
If that fails, try the optimal server’s backup server list.
If that fails, try each remaining server in the OGS selection list, ordered by its selection results.
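The order of operations above can be sketched as a function that builds the list of hosts to try. This is an illustration only: the function name and the backup host name are hypothetical, and the RTT values are the ones that appear in the DART log later in this document.

```python
from typing import Dict, List


def connection_order(rtt_ms: Dict[str, float],
                     backups: Dict[str, List[str]]) -> List[str]:
    """Return hosts in the order the connection attempts would be made.

    rtt_ms holds the OGS result for each primary server; backups maps a
    primary server to its backup server list.
    """
    ranked = sorted(rtt_ms, key=rtt_ms.get)
    if not ranked:
        return []
    optimal, remaining = ranked[0], ranked[1:]
    order = [optimal] + backups.get(optimal, []) + remaining
    # Drop duplicates while keeping the first occurrence of each host.
    return list(dict.fromkeys(order))


rtt_ms = {"gw1.cisco.com": 302, "gw2.cisco.com": 132,
          "gw3.cisco.com": 506, "gw4.cisco.com": 877}
# Hypothetical backup server configured for gw2.
backups = {"gw2.cisco.com": ["gw2-backup.cisco.com"]}
for host in connection_order(rtt_ms, backups):
    print(host)
```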
Note that when the admin configures the backup server list, the current profile editor only allows the admin to enter the FQDN for the backup server, but not the user-group as is possible for the primary server. Bug CSCud84778 has been filed to correct this. For now, enter the complete URL in the host address field for the backup server and it should work: "https://<ip-address>/usergroup"
Resume after Suspend: For OGS to run after resume, AnyConnect must have had a connection established when the machine was put to sleep. OGS after resume is only performed after the network environment test occurs, which is meant to confirm that network connectivity is available. This test includes a DNS connectivity sub-test. However, if the DNS server drops type A requests with an IP address in the query field, as opposed to replying with "name not found" (the more common case, always encountered during our testing), then CSCti20768, "DNS query of type A for IP address, should be PTR to avoid timeout", applies (http://cdets.cisco.com/apps/dumpcr?&content=summary&format=html&identifier=CSCti20768).
Typical User Example
The most common use case is a user at home: when OGS runs for the first time, it records the DNS settings and the ping results in the cache (the timeout defaults to 14 days). When the user returns home the next evening, OGS detects the same DNS settings, finds them in the cache, and skips the ping test. Later, when the user goes to a hotel or restaurant that offers Internet service, OGS detects different DNS settings, runs the ping tests, selects the best gateway, and records the results in the cache. The processing is identical when resuming from a suspended or hibernate state, assuming the OGS and AnyConnect resume settings allow for it.
Step 1. Clear the OGS Cache to force a re-evaluation
To clear the OGS cache in order to reevaluate the RTT of available gateways, simply delete the Global AnyConnect Preferences file from the PC.
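If the deletion needs to be scripted, a minimal sketch could look like the following. The function name is hypothetical, the path must be supplied by the caller because the exact file location depends on the installation, and keeping a .bak copy is a precaution added here rather than part of the documented procedure.

```python
import shutil
from pathlib import Path


def clear_ogs_cache(preferences_file: Path) -> bool:
    """Delete the global preferences file after saving a .bak copy.

    Returns True if a file was removed, False if there was nothing to remove.
    The caller supplies the path; it is not hard-coded here on purpose.
    """
    if not preferences_file.is_file():
        return False
    backup = preferences_file.with_name(preferences_file.name + ".bak")
    shutil.copy2(preferences_file, backup)  # keep a copy in case it is needed
    preferences_file.unlink()
    return True
```

Run it with the path to the global preferences file on the affected PC, typically with administrator rights, and then start a new connection attempt so that OGS probes the gateways again.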
Step 2. Capture the server probes during connection attempt
Tip: Since the capture is only for testing OGS, it's best to stop the capture as soon as the client selects a gateway. Avoid running through a complete connection attempt, as that can clutter the packet capture.
Usually they should be around the same time, but in case the captures are large, knowing the time stamp helps narrow down which packets were the http probes and which ones were the actual connection attempt.
Once the client has sent three probes to the server the following message is generated with the results for each of the probes: ******************************************
Date : 10/04/2013 Time : 14:31:37 Type : Information Source : acvpnui
Date : 10/04/2013
Time : 12:29:38
Type : Information
Source : vpnui
Description : Function: CHeadendSelection::logPingResults
File: .\AHS\HeadendSelection.cpp
Line: 589
*** OGS Selection Results ***
OGS performed for connection attempt. Last server: 'gw2.cisco.com'
Results obtained from OGS cache. No ping tests were performed.
Server Address    RTT (ms)
gw1.cisco.com     302
gw2.cisco.com     132  <============== As seen, 132 was the lowest delay of the three probes from the previous DART log
gw3.cisco.com     506
gw4.cisco.com     877
Selected 'gw2.cisco.com' as the optimal server.
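When many DART logs have to be checked, the selection results can be pulled out with a small parser. The sketch below assumes that each server appears on its own line as a host name followed by the RTT, which matches the values above but may not match every client version. The function name and the regular expressions are illustrative.

```python
import re
from typing import Dict, Optional, Tuple

# One result row: a host name followed by an integer RTT in milliseconds.
ROW = re.compile(r"^\s*([A-Za-z0-9.-]+)\s+(\d+)\s*$")
SELECTED = re.compile(r"Selected '([^']+)' as the optimal server")


def parse_ogs_results(description: str) -> Tuple[Dict[str, int], Optional[str]]:
    """Return ({server: rtt_ms}, selected_server) from a log description."""
    rtts: Dict[str, int] = {}
    selected: Optional[str] = None
    for line in description.splitlines():
        row = ROW.match(line)
        if row:
            rtts[row.group(1)] = int(row.group(2))
            continue
        chosen = SELECTED.search(line)
        if chosen:
            selected = chosen.group(1)
    return rtts, selected


sample = """\
*** OGS Selection Results ***
Server Address RTT (ms)
gw1.cisco.com 302
gw2.cisco.com 132
gw3.cisco.com 506
gw4.cisco.com 877
Selected 'gw2.cisco.com' as the optimal server.
"""
rtts, selected = parse_ogs_results(sample)
# The selected server should be the one with the lowest RTT.
print(selected, min(rtts, key=rtts.get) == selected)
```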
Step 4. Validate the OGS calculations done by the client.
Inspect the capture for the TCP/SSL probes used to calculate RTT. See how long the HTTPS request takes over a single TCP connection. Each probe request should use a different TCP connection. To do this, open the capture in Wireshark and repeat the following steps for each of the servers:
Using the "ip.addr" filter, isolate the packets sent to each of the servers into its own capture. This can be done by going to Edit and selecting "Mark all Displayed Packets", then going to File > Save As and selecting the "marked packets only" option:
In this new capture go to View > Time Display Format > Date and Time of Day:
Identify the first http syn packet in this capture that was sent when the OGS probe was sent according to the DART logs as identified in step 3.3.2.
Using the feature to colourize TCP conversations, identify each of the probes. This can be done by right-clicking on the HTTP SYN for the first probe and then selecting the colourize conversation option as shown below:
This needs to be repeated for the syn on the next probe and the next one. As shown here the first two probes have been depicted in differing colours. The advantage of colourising the TCP conversations is that in case there are some retransmissions or other such oddities, they can be easily spotted per probe.
Change the time display to use "starting of epoch" as shown below:
Ensure that milliseconds is selected as the level of precision, since that's what OGS uses.
Calculate the time difference between the HTTP SYN and the FIN/ACK as shown in the diagram of step 4. Repeat this process for each of the three probes and then compare the values to those shown in the DART logs in step 3.3.3.
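The arithmetic for this comparison can be sketched as follows. The timestamps are hypothetical values typed in from the capture's epoch time column at millisecond precision; the function names are illustrative, and truncating to whole milliseconds is an assumption about how the value is reported.

```python
from decimal import Decimal
from typing import List, Tuple


def probe_rtt_ms(syn_epoch: str, fin_ack_epoch: str) -> int:
    """Delay between the SYN and the FIN/ACK of one probe, in milliseconds.

    Timestamps are passed as strings and handled as Decimal so that values
    copied from the capture are not distorted by float rounding.
    """
    return int((Decimal(fin_ack_epoch) - Decimal(syn_epoch)) * 1000)


def lowest_probe_rtt_ms(probes: List[Tuple[str, str]]) -> int:
    """Lowest delta of all probes to one server, the value OGS compares."""
    return min(probe_rtt_ms(syn, fin_ack) for syn, fin_ack in probes)


# Hypothetical (SYN, FIN/ACK) epoch timestamps for three probes to one server.
gw2_probes = [
    ("1380897097.250", "1380897097.382"),
    ("1380897098.100", "1380897098.301"),
    ("1380897099.000", "1380897099.140"),
]
dart_log_rtt_ms = 132  # value reported for this server in the DART log

print(lowest_probe_rtt_ms(gw2_probes) == dart_log_rtt_ms)
```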
Once the capture is analysed for each server and the values are compared to those seen in the DART logs, if they all match up and it still seems like the wrong gateway is being selected, then it's due to one of two things:
1. Are there too many retransmissions from one particular headend, or any other such oddities seen in the probes? This could indicate an issue on the headend.
2. Fragmentation or large delays seen for one particular headend usually indicate problems with the ISP.
Q: Does OGS work with load balancing?
A: Yes. OGS will only be aware of the cluster master name and will use that for judging the nearest head-end.
Q: Does OGS work with proxy settings defined in the browser?
A: OGS doesn't support auto proxy or PAC files but does support a hard-coded proxy server. When auto proxy or a PAC file is configured, OGS operation does not occur. The relevant log message is: "OGS will not be performed because automatic proxy detection is configured"