
AnyConnect Optimal Gateway Selection Operation


 

Introduction

 

Optimal Gateway Selection (OGS) is a feature that determines which gateway has the lowest round-trip time (RTT) and connects to that gateway. Using OGS, administrators can minimize latency for Internet traffic without user intervention. With OGS, AnyConnect identifies and selects the secure gateway that is best for connection or reconnection. OGS begins upon first connection or upon a reconnection at least four hours after the previous disconnection. More information can be found in the Administrator's guide; this document explains how to troubleshoot issues with OGS.

 

How does OGS work?

 

A simple ICMP "ping" will not work here, because many ASA firewalls are configured to block ICMP packets in order to prevent discovery. Instead, the client sends three HTTP requests to port 443 on each headend that appears in a merge of all profiles. To ensure that a (re)connection doesn't take too long, OGS by default selects the previous gateway if it doesn't receive any ping results within 7 seconds. (Look for "OGS ping results" in the log.)

 

Note: The AnyConnect client should send a plain HTTP request to port 443, since any response is enough and it does not need to be a successful one. Unfortunately, the fix for proxy handling had the side effect of sending all requests as HTTPS. See CSCtg38672, "OGS should ping with HTTP requests": http://cdets.cisco.com/apps/dumpcr?&content=summary&format=html&identifier=CSCtg38672

 

  • OGS determines user location based on network information such as the DNS suffix and DNS server IP. The RTT results, along with this location, are stored in the OGS cache.
  • OGS location entries are cached for 14 days.

Enhancement CSCtk66531 was filed to make these settings user-configurable.

  • OGS is not run again from this location until 14 days after the location entry was first cached. During this time it uses the cached entry and the RTT values determined for that location. This means that when the user starts the AnyConnect client, it doesn't perform OGS again; it just uses the optimal gateway order in the cache for that location. The following message is seen in the DART logs:
    ******************************************
    Date : 10/04/2013
    Time : 14:00:44
    Type : Information
    Source : acvpnui

    Description : Function: ClientIfcBase::startAHS
    File: .\ClientIfcBase.cpp
    Line: 2785
    OGS was already performed, previous selection will be used.

    ******************************************
  • RTT is determined using TCP to the SSL port of the gateway the user is trying to connect to, as specified by the host entry in the AnyConnect profile.
    Note: Unlike http-ping, which does a simple HTTP POST and then displays the RTT and the result, OGS computations are a bit more complicated. The client sends out three probes for each server and calculates the delay between the SYN that it sends out for the HTTP probe and the FIN/ACK, for each of those probes. It then uses the lowest of the deltas to compare the servers and make its selection. So, even though http-pings are a fairly good indication of which server the client will choose, the results may not necessarily tally. More on this in the following sections.
  • Currently, OGS only does the checks if the user is coming out of suspend and the threshold has been exceeded. OGS will not connect to a different ASA if the ASA the user is connected to crashes or becomes unavailable. OGS contacts only the primary servers in the profile to determine the optimal one. 
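As a rough illustration of the selection rule described above (lowest of the three probe deltas per server, then the lowest server overall, falling back to the previous gateway when no results arrive), here is a minimal Python sketch. It is not AnyConnect's code: the function name `select_gateway` and the dictionary layout are assumptions made for the example, and the 7-second timeout is represented simply by a server having no results.

```python
from typing import Dict, List, Optional


def select_gateway(
    ping_results: Dict[str, List[int]],
    previous_gateway: Optional[str] = None,
) -> Optional[str]:
    """Pick the gateway whose best (lowest) probe RTT is the smallest.

    ping_results maps a gateway name to the RTTs (in ms) of its probes.
    Gateways without any result are ignored. If no gateway answered at
    all, fall back to the previously selected gateway.
    """
    best_rtt = {
        gateway: min(rtts)
        for gateway, rtts in ping_results.items()
        if rtts  # skip gateways whose probes all timed out
    }
    if not best_rtt:
        return previous_gateway
    return min(best_rtt, key=best_rtt.get)


if __name__ == "__main__":
    results = {
        "gw1.cisco.com": [302, 310, 305],  # hypothetical probe values
        "gw2.cisco.com": [219, 218, 132],  # values from the DART log in this article
    }
    print(select_gateway(results))
```

Because only the minimum of the three deltas counts, a single fast probe can win the comparison even if the other two were slow, which is one reason http-ping results may not tally with the OGS choice.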

 

OGS Cache

Once the calculation is done, the results are stored in the preferences_global file. In Windows it can be found under "C:\Documents and Settings\All Users\ApplicationData\Cisco". There have been issues in the past with this data not being stored in the file; refer to bug CSCtj84626 for more details.

Location Determination

OGS caching works on a combination of DNS domain and individual DNS server IP addresses. It works as follows:

  • Location A has a DNS domain of locationa.com and two DNS server IPs, 'ip1' and 'ip2'. Each domain/IP combination creates a cache key pointing to an OGS cache entry. For example:
    • locationa.com|ip1 -> ogscache1
    • locationa.com|ip2 -> ogscache1
  • If the client then connects to a physically different network, the same set of domain/IP combinations is built and checked against the cached list. If there are any matches at all, that OGS cache value is used and the client is still considered to be at location A.
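The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `domain|ip` key format follows the example in the list, but the entry layout (`cached_at`, `rtt_ms`), the function names, and the in-memory dictionary are hypothetical and do not reflect how AnyConnect stores its cache on disk.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

# From the article: OGS location entries are cached for 14 days.
CACHE_LIFETIME = timedelta(days=14)


def cache_keys(dns_domain: str, dns_servers: List[str]) -> List[str]:
    """Build one 'domain|ip' key per DNS server seen on the network."""
    return [f"{dns_domain}|{ip}" for ip in dns_servers]


def find_cached_location(
    cache: Dict[str, dict],
    dns_domain: str,
    dns_servers: List[str],
    now: datetime,
) -> Optional[dict]:
    """Return the cached entry if ANY domain/IP key matches and is still fresh."""
    for key in cache_keys(dns_domain, dns_servers):
        entry = cache.get(key)
        if entry is not None and now - entry["cached_at"] < CACHE_LIFETIME:
            return entry
    return None


if __name__ == "__main__":
    now = datetime(2013, 10, 4, 14, 0, 44)
    ogscache1 = {
        "cached_at": now - timedelta(days=1),
        "rtt_ms": {"gw2.cisco.com": 132},
    }
    cache = {key: ogscache1 for key in cache_keys("locationa.com", ["ip1", "ip2"])}
    # A network that shares only ip2 still counts as location A.
    print(find_cached_location(cache, "locationa.com", ["ip2", "ip3"], now) is ogscache1)
```

A single shared domain/IP pair is enough for a match, which is why two physically different networks with the same DNS settings are treated as one location.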

 

Failure Scenarios

 

  • When using OGS, if the user loses connectivity to the gateway they are connected to, the AnyConnect client connects to the servers in the backup server list, *not* the next OGS host. The order of operations is as follows:
    1. OGS contacts only the primary servers to determine the optimal one.
    2. Once determined, the connection algorithm is:
      1. Attempt to connect to the optimal server.
      2. If that fails, try the optimal server’s backup server list.
      3. If that fails, try each remaining server in the OGS selection list, ordered by its selection results.
    Note that when the admin configures the backup server list, the current profile editor only allows the admin to enter the FQDN for the backup server, but not the user group as is possible for the primary server.
    Bug CSCud84778 has been filed to correct this. For now, the complete URL needs to be entered in the host address field for the backup server, and it should work: "https://<ip-address>/usergroup"
     
  • Resume after Suspend: For OGS to run after resume, AnyConnect must have had a connection established when the machine was put to sleep. OGS after resume is only performed after the network environment test occurs, which is meant to confirm that network connectivity is available. This test includes a DNS connectivity sub-test. However, if the DNS server drops type A requests with an IP address in the query field, as opposed to replying with "name not found" (the more common case, always encountered during our testing), then CSCti20768, "DNS query of type A for IP address, should be PTR to avoid timeout", applies: http://cdets.cisco.com/apps/dumpcr?&content=summary&format=html&identifier=CSCti20768
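The connection order from the first failure scenario above (optimal server, then its backup list, then the remaining OGS servers by RTT) can be sketched as follows. This is an illustrative Python sketch, not the client's implementation; the function name, the dictionaries, and the backup host name are hypothetical, and dropping duplicates is an assumption added for the example.

```python
from typing import Dict, List


def connection_order(
    rtt_by_server: Dict[str, int],
    backups_by_server: Dict[str, List[str]],
) -> List[str]:
    """Return the servers in the order the connection attempts would be made.

    rtt_by_server holds the OGS result (best RTT in ms) per primary server.
    backups_by_server holds the backup server list of each primary server.
    """
    ranked = sorted(rtt_by_server, key=rtt_by_server.get)
    if not ranked:
        return []
    optimal, remaining = ranked[0], ranked[1:]
    order = [optimal]                              # 1. the optimal server
    order += backups_by_server.get(optimal, [])    # 2. its backup server list
    order += remaining                             # 3. remaining servers, by RTT
    # Assumption: a server listed twice is only tried once.
    return list(dict.fromkeys(order))


if __name__ == "__main__":
    rtts = {
        "gw1.cisco.com": 302,
        "gw2.cisco.com": 132,
        "gw3.cisco.com": 506,
        "gw4.cisco.com": 877,
    }
    backups = {"gw2.cisco.com": ["gw2-backup.example.com"]}  # hypothetical backup
    for server in connection_order(rtts, backups):
        print(server)
```

Note that the backup list of the optimal server is tried before the second-best OGS server, which matches the behaviour described in the list.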

 

 

Typical User Example

The most common use case is a user at home. When OGS runs for the first time, it records the DNS settings and the ping results in the cache (which times out after 14 days by default). When the user returns home the next evening, OGS detects the same DNS settings, finds them in the cache, and skips the ping test. Later, when the user goes to a hotel or restaurant that offers Internet service, OGS detects different DNS settings, runs the ping tests, selects the best gateway, and records the results in the cache. The processing is identical when resuming from a suspended or hibernated state, assuming the OGS and AnyConnect resume settings allow for it.


 

Troubleshooting OGS:

 

Step 1. Clear the OGS Cache to force a re-evaluation

To clear the OGS cache and force the RTT of the available gateways to be re-evaluated, simply delete the Global AnyConnect Preferences file from the PC.

Step 2. Capture the server probes during connection attempt

Tip: Since the capture is only for testing OGS, it's best to stop the capture as soon as the client selects a gateway. Running through a complete connection attempt can cloud the packet capture.

Step 3. Verify the gateway selected by OGS

To verify why OGS selected a particular gateway:

  1. Initiate a new connection.
  2. Run AnyConnect DART (Diagnostics & Reporting Tool): Launch AC > Click "Advanced" > Click "Diagnostics" > Click "Next" > Click "Next"
  3. Examine the DART results found in the newly created "DartBundle_XXXX_XXXX.zip" file on the desktop
    1. Go to "Cisco AnyConnect Secure Mobility Client" > "AnyConnect.txt"
    2. Note down the time the OGS probes started for a particular server from the following DART log:
      ******************************************

      Date : 10/04/2013
      Time : 14:21:27
      Type : Information
      Source : acvpnui

      Description : Function: CHeadendSelection::CSelectionThread::Run
      File: .\AHS\HeadendSelection.cpp
      Line: 928
      OGS starting thread named gw2.cisco.com

      ******************************************
      Usually they should be around the same time, but in case the captures are large, knowing the time stamp helps narrow down which packets were the http probes and which ones were the actual connection attempt.
    3. Once the client has sent three probes to the server the following message is generated with the results for each of the probes:
      ******************************************

      Date : 10/04/2013
      Time : 14:31:37
      Type : Information
      Source : acvpnui

      Description : Function: CHeadendSelection::CSelectionThread::logThreadPingResults
      File: .\AHS\HeadendSelection.cpp
      Line: 1137
      OGS ping results for gw2.cisco.com: (219 218 132 )

      ******************************************
      It's important to note these three values, as they will have to match up with the capture results.
    4. Look for the message containing "*** OGS Selection Results ***" to see the evaluated RTT and whether the most recent connection attempt was the result of a cached RTT or a new calculation.

      Example:
      ******************************************

      Date        : 10/04/2013
      Time        : 12:29:38
      Type        : Information
      Source      : vpnui

      Description : Function: CHeadendSelection::logPingResults
      File: .\AHS\HeadendSelection.cpp
      Line: 589
      *** OGS Selection Results ***
      OGS performed for connection attempt. Last server: 'gw2.cisco.com'

      Results obtained from OGS cache. No ping tests were performed.

      Server Address     RTT (ms)
      gw1.cisco.com     302
      gw2.cisco.com     132            <============== As seen, 132 was the lowest delay of the three probes from the previous DART log
      gw3.cisco.com     506
      gw4.cisco.com     877


      Selected 'gw2.cisco.com' as the optimal server.

      ******************************************
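When a DART bundle contains many headends, pulling the probe values out of AnyConnect.txt by hand gets tedious. The following Python sketch extracts them, assuming the log line format is exactly the one shown in step 3.3 above ("OGS ping results for <server>: (a b c )"); the function name and regular expression are written for this example and are not part of DART.

```python
import re
from typing import Dict, List

# Matches lines such as: OGS ping results for gw2.cisco.com: (219 218 132 )
PING_RESULTS = re.compile(r"OGS ping results for (\S+): \(([\d\s]*)\)")


def parse_ping_results(log_text: str) -> Dict[str, List[int]]:
    """Map each server name to the list of probe RTTs (ms) found in the log."""
    results: Dict[str, List[int]] = {}
    for match in PING_RESULTS.finditer(log_text):
        server, values = match.groups()
        results[server] = [int(value) for value in values.split()]
    return results


if __name__ == "__main__":
    sample = "OGS ping results for gw2.cisco.com: (219 218 132 )"
    probes = parse_ping_results(sample)
    print(probes)                        # {'gw2.cisco.com': [219, 218, 132]}
    print(min(probes["gw2.cisco.com"]))  # 132, the value shown in the selection results
```

If a server appears more than once in the log, this sketch keeps only the most recent set of results.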

 

Step 4. Validate the OGS calculations done by the client.

Inspect the capture for the TCP/SSL probes used to calculate RTT, and see how long the HTTPS request takes over a single TCP connection. Each probe request should use a different TCP connection. To do this, open the capture in Wireshark and repeat the following steps for each of the servers:

  1. Using the "ip.addr" filter, isolate the packets sent to each of the servers into its own capture. To do this, go to Edit and select "Mark all Displayed Packets", then go to File > Save As and select the "marked packets only" option.
  2. In this new capture, go to View > Time Display Format > Date and Time of Day.
  3. Identify the first SYN packet in this capture that was sent when the OGS probe was sent, according to the DART logs identified in step 3.3.2.
  4. Use the feature to colourize TCP conversations to identify each of the probes: right-click the SYN of the first probe and select the colourize conversation option.
    Repeat this for the SYN of the second probe and of the third probe, so that each probe has its own colour. The advantage of colourising the TCP conversations is that retransmissions or other such oddities can be easily spotted per probe.
  5. Change the time display format to "Seconds Since Epoch".
    Ensure that milliseconds is selected as the level of precision, since that's what OGS uses.
  6. Calculate the time difference between the SYN and the FIN/ACK of the probe. Repeat this process for each of the three probes, and then compare the values to those shown in the DART logs in step 3.3.3.
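The arithmetic in the last step can be scripted once the epoch timestamps have been read from the capture. The Python sketch below is an illustration only: the function names are hypothetical, the timestamps in the usage example are made up, and the 5 ms tolerance is an assumption, since the capture point and the client's own timer will rarely agree to the millisecond.

```python
from decimal import Decimal
from typing import List


def probe_rtt_ms(syn_epoch: str, fin_ack_epoch: str) -> int:
    """Delay in ms between the SYN and the FIN/ACK of one probe.

    Timestamps are passed as strings (as copied from the capture) and
    handled as Decimal to avoid floating-point rounding on epoch values.
    """
    delta_seconds = Decimal(fin_ack_epoch) - Decimal(syn_epoch)
    return int((delta_seconds * 1000).to_integral_value())


def matches_dart(capture_ms: List[int], dart_ms: List[int], tolerance_ms: int = 5) -> bool:
    """Compare capture-derived RTTs with the DART values, ignoring probe order."""
    if len(capture_ms) != len(dart_ms):
        return False
    pairs = zip(sorted(capture_ms), sorted(dart_ms))
    return all(abs(seen - logged) <= tolerance_ms for seen, logged in pairs)


if __name__ == "__main__":
    # Made-up timestamps for one probe: SYN and FIN/ACK 132 ms apart.
    print(probe_rtt_ms("1380896487.120", "1380896487.252"))
    # Made-up capture values compared with the DART values (219 218 132).
    print(matches_dart([133, 219, 217], [219, 218, 132]))
```

If the values differ by far more than a few milliseconds, it is worth checking that the right TCP conversations were selected before suspecting the client.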

 

Once the capture has been analysed for each server and the values have been compared to those seen in the DART logs, if they all match up and it still seems like the wrong gateway is being selected, check for one of two things:

1. Too many retransmissions from one particular headend, or any other such oddities seen in the probes. This could indicate an issue on the headend.

2. Fragmentation or large delays seen for one particular headend. These usually indicate problems with the ISP.

 

 

Q&A

 

Q: Does OGS work with load balancing?

A: Yes. OGS will only be aware of the cluster master name and will use that for judging the nearest head-end.


Q: Does OGS work with proxy settings defined in the browser?

A: OGS doesn't support auto proxy or PAC files, but it does support a hard-coded proxy server. When automatic proxy detection is configured, OGS does not run, and the relevant log message is: "OGS will not be performed because automatic proxy detection is configured"

Comments

Thanks for this great contribution!

Hello,

Can you please let me know how will the client reach if the optimized gateway is full?

Thanks,

Deepak

Beginner

very helpful information thanks.

Beginner

That is very good info about OGS, Thanks !

Do you know if enhancement request CSCtk66531 will also include an option for user to manually force OGS to run again ?

Cisco Employee

No, it does not. I've updated the bug so that more details are visible. They should be published to cisco.com shortly. I've also filed enhancement CSCum05373 to track your request.

Beginner

Of note, OGS doesn't currently work correctly with Always On VPN. This incompatibility is referenced in CSCuq37889. The behaviour is that the first gateway which responds is selected as the "optimal" gateway. No other gateways are even checked. The file referenced as storing the selected gateway ends up empty.

I'd class this as a fairly bad bug, but it is currently classified as an "enhancement" request.

 

Cisco Employee

Hi Chris,


The reason that bug is an enhancement request is because OGS was never designed to work with Always ON VPN. However, because customers are using it in tandem we have decided to redesign the two features so that they may work together. If you would like to see this integration implemented soon please let your Account Manager know so they can follow up with the Anyconnect Product Manager.


Regards,
Atri

Contributor

Hello,

I would like to know if OGS is still supported on anyconnect version 4.x?

I am trying to configure it using client 4.1 but I don't seem to have the right configuration. Is there a good guide that I can follow for the complete configuration?

Contributor

Can we have a single domain name (testanyconnect.com) so that it can resolve more than one IP?

We would like to have two or more ASAs across the US being capable of accepting Anyconnect users. The users should be able to connect to the closest ASA.

Is that possible? 

Beginner

Paul Gilbert Arias, to present a single FQDN but be routed to the closest VPN gateway requires a global load balancer of sorts, such as F5's Global Traffic Manager. This is basically an intelligent DNS server that uses GeoIP location to look up where the DNS query for your public FQDN is being sourced from and, depending on the GeoIP database lookup, determines which IP the load balancer gives to the client. You would be able to have west-based IPs served the IP address of your west VPN gateway and east-based IPs served the IP address of your east VPN gateway. There is nothing built into the ASA to do this exact setup on its own.

 

However, you could setup an AnyConnect Client Profile that has your VPN gateways predefined and use OGS so that the client can choose the closest VPN gateway based on the RTT values it gets from the probes. You'd want to be sure to disable the "User Controllable" option for OGS so that the users can't manually choose the gateway.

Cisco Employee

Many thanks for the complete 1 stop all you need to know about OGS / Kudos !!

Also Very much appreciate the great , on point , extremely useful comments left by @Atri Basu 

 

More details can be found in this rich article created by Atri :

https://www.cisco.com/c/en/us/support/docs/security/anyconnect-secure-mobility-client/116721-technote-ogs-00.html