Dual ISP & Using OnPlus to help diagnose VoIP/Possible Bad ISP issues...

davebainum
Level 1

Hi all,

So we have a somewhat interesting deployment scenario that I'd like some additional guidance/perspective on, along with a couple of specific OnPlus messages (and questions) further below.  There's also an interesting "twist" to this story (Dual ISP), which I'll get to near the end...

We have a client (a remote site we haven't physically visited before) who set up their own phone system, and is having VoIP quality issues.  Certainly, there are many, many, MANY potential causes of this - which we've run through before at other client locations that we helped set up from scratch, closer to our local area.  In this case, we're trying to help bring this client and location up to "best practices" in a somewhat disciplined manner.

They are encountering periodic dropped calls or voice quality issues.  We first assisted them in upgrading their switches and firewall, which were sorely needed.  This has helped quite a bit.  They are still encountering periodic problems, but not nearly as many as before.

In particular, periodically on OnPlus, they are encountering alarms such as the below:

2011-08-02 10:35

Warning

Monitor: WAN network performance OK on Network Performance on host PLG1000 (192.168.16.39) at 2011-08-02 10:35:14 -0400 - OK: Latency=167.447ms Jitter=58.863ms Loss=2.00%
2011-08-02 10:20

Critical

Monitor: WAN network performance CRITICAL on Network Performance on host PLG1000 (192.168.16.39) at 2011-08-02 10:20:08 -0400 - CRITICAL: Latency of 1127.262ms exceeds threshold of 400ms Jitter=58.362ms Loss of 22.00% exceeds threshold of 10%.
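For tracking these alarms over time, one option is to pull the numbers out of the message text. The following is only a rough sketch, assuming the alarms always follow the two wordings shown above ("Latency=..." and "Latency of ... exceeds threshold ..."); `parse_alarm` and the field names are made up for illustration and are not part of OnPlus.

```python
import re
from datetime import datetime

# Patterns match both wordings seen in the alarms above:
# "Latency=167.447ms" and "Latency of 1127.262ms exceeds threshold ...".
_METRICS = {
    "latency_ms": re.compile(r"Latency(?:=| of )([\d.]+)ms"),
    "jitter_ms": re.compile(r"Jitter(?:=| of )([\d.]+)ms"),
    "loss_pct": re.compile(r"Loss(?:=| of )([\d.]+)%"),
}
_STAMP = re.compile(r"at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})")


def parse_alarm(text):
    """Return the timestamp and metrics found in one alarm message.

    Any field that cannot be found is returned as None.
    """
    result = {}
    stamp = _STAMP.search(text)
    result["time"] = (
        datetime.strptime(stamp.group(1), "%Y-%m-%d %H:%M:%S %z") if stamp else None
    )
    for name, pattern in _METRICS.items():
        match = pattern.search(text)
        result[name] = float(match.group(1)) if match else None
    return result
```

Feeding each alarm through something like this would give a small table of time, latency, jitter and loss that can be lined up against other logs.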

I wanted to get some more insight into what this specific alarm means, and/or if there are more meaningful alarms that could be set up to help tackle or track the problem.  As I recall from an earlier posting, these types of alarms are hitting a test site or URL at Cisco, correct?  Is there any way to replace those test destinations, alarms, or URLs with other ones?

Curiously, when this alarm was encountered, there was very little traffic going through the firewall at that time.  This almost makes me wonder if the ISP itself may be "weak" or flaky.  It is a smaller, more rural ISP - not one of the "big guys" like Cox, Comcast, Verizon, etc.

And now - here's the interesting wrinkle.  The client recently added a second circuit, in the hope that it would provide some relief from the voice quality problems.  Unfortunately, they ordered that second circuit from the same provider as the first.  We had suggested that they specifically pick a DIFFERENT provider, not only for redundancy, but also because it would help rule out the possibility that the ISP itself just has really bad routes or otherwise poor internal traffic management.  Unfortunately, they decided to proceed their own way - so now they have two circuits with the same ISP...

Right now, my understanding is that OnPlus can only monitor a single WAN interface or a single path out to the Internet - is that correct?  It would seem to be the case, given that OnPlus only has a single IP address and gateway out of the network...

Their network consists of a Cisco and a Linksys switch, and a Sonicwall TZ210 firewall.  We have suggested that the firewall may possibly (still) be overloaded for the amount of traffic that they are sending in and out, which is not an insignificant amount.  However, the TZ210 is much better than what they had in place before, which was a fairly underpowered $99-$299 "thing" - I don't even remember what it was, possibly D-Link...

In the meantime, we've asked the client to maintain a central log sheet of the dropped calls or call quality problems, recording the dates and times that these occur - so that we can then correlate those with the firewall logs (in terms of when peak traffic may be occurring) and possibly with the OnPlus alerts as well.
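The correlation step could be sketched roughly as below, assuming the log sheet and the OnPlus alerts have both already been turned into lists of `datetime` values in the same time zone. The function name and the 5-minute default window are assumptions for illustration, not anything OnPlus provides.

```python
from datetime import datetime, timedelta


def correlate(incidents, alerts, window_minutes=5):
    """Pair each logged call problem with the alerts that fired near it.

    incidents, alerts: lists of datetime objects (same time zone).
    Returns a list of (incident, [alerts within the window]) tuples.
    """
    window = timedelta(minutes=window_minutes)
    matches = []
    for incident in incidents:
        nearby = [a for a in alerts if abs(a - incident) <= window]
        matches.append((incident, nearby))
    return matches


# Example: one call problem close to the 10:20 alert, one with no alert nearby.
calls = [datetime(2011, 8, 2, 10, 22), datetime(2011, 8, 2, 14, 0)]
alerts = [datetime(2011, 8, 2, 10, 20, 8)]
for when, hits in correlate(calls, alerts):
    print(when, "->", len(hits), "alert(s) nearby")
```

Call problems that never line up with an alert or a traffic peak would point away from the WAN link and toward something else (phones, switches, the VoIP provider).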

Thanks in advance for your advice...

-- Dave Bainum, PMP* (dbainum@ritetech.net)

RiteTech LLC / www.ritetech.net / Tel. +1 (703) 561-0607

[*PMP=PMI Certified Project Management Professional]

1 Reply

Michael Holloway
Cisco Employee

Hi Dave,

The current bandwidth monitor bounces packets (UDP port 14931) off of a responder at the portal (www.onplusbeta.com / www.cisco-onplus.com) which actively participates in performing the jitter calculations. It's an adequate solution, but it does have its shortcomings - any connectivity problem between the ON100 and the portal, anywhere over the internet, will simply be detected as a fault in the customer's bandwidth. The portals are well-connected and should serve as a stable test point on the internet, but this does make the data difficult to use to pinpoint a local connectivity problem. This traffic also does not (currently) set any QoS bits in the packets, so this test traffic could be throttled by the customer's router if there is higher-priority traffic happening on the network while the test is run (every 5 minutes).
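To give a feel for how latency, jitter and loss fall out of a batch of bounced packets, here is a simplified sketch. It is not the actual OnPlus calculation (the real monitor has the responder take part in the jitter math); it assumes you only have a list of round-trip times in milliseconds, with `None` standing in for a probe that never came back, and it treats jitter as the average change between consecutive replies. The function name is hypothetical.

```python
def summarize_probes(rtts_ms):
    """Summarize one batch of probe results.

    rtts_ms: round-trip times in milliseconds, None for a lost probe.
    Returns loss percentage, mean latency, and a simple jitter figure
    (mean absolute difference between consecutive received RTTs).
    """
    sent = len(rtts_ms)
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (sent - len(received)) / sent if sent else 0.0
    latency_ms = sum(received) / len(received) if received else None
    # Jitter needs at least two replies to compare.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter_ms = sum(diffs) / len(diffs) if diffs else None
    return {"loss_pct": loss_pct, "latency_ms": latency_ms, "jitter_ms": jitter_ms}
```

Because every probe crosses the whole path to the portal, a bad number here says only that something between the ON100 and the portal misbehaved, not where.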

Something you could currently do in order to help verify or rule out local connectivity issues is to do a traceroute to the customer's WAN IP and find their local ISP's gateway IP address(es), and then manually add these hosts to this customer's topology. Add a 'Host Performance (ICMP)' monitor for each of these 'devices', perhaps one for their first hop and one for their VoIP provider's gateway, and you should then start to be able to gauge when and where local connectivity problems might be coming from. Unfortunately, without an active responder participating in the fun, only latency and loss (not jitter) can be monitored, but this should be enough to detect voice 'drop-outs'. High jitter usually just results in garbled VoIP.
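The "when and where" reasoning can be sketched as follows, assuming you export the latest loss and latency figures from each ICMP monitor and list them nearest hop first. The function and host names are hypothetical, and the default limits are simply borrowed from the thresholds in the alarm quoted above (10% loss, 400ms latency).

```python
def first_degraded_hop(path_results, max_loss_pct=10.0, max_latency_ms=400.0):
    """Return the name of the nearest monitored hop that looks unhealthy.

    path_results: list of (name, loss_pct, latency_ms), ordered from the
    hop closest to the customer to the one farthest away.
    Returns None when every hop is within the limits.
    """
    for name, loss_pct, latency_ms in path_results:
        if loss_pct > max_loss_pct or latency_ms > max_latency_ms:
            return name
    return None


# Example: the ISP's first hop is fine, the VoIP provider's gateway is not.
results = [
    ("isp-first-hop", 0.0, 12.0),
    ("voip-gateway", 22.0, 1127.0),
]
print(first_degraded_hop(results))
```

If the first hop is already degraded, the problem is likely the local circuit or the ISP's edge; if only the far hop is, look further upstream. Keep in mind that some routers answer ICMP at low priority, so a single bad reading from an intermediate hop is a hint rather than proof.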

Looking forward, the engineering team here has done some preliminary research on IPSLA support for OnPlus service. IPSLA does a much better job of testing bandwidth when VoIP is involved because it can simulate voice traffic during the test. Many IOS routers support the ability to be an IPSLA responder, so the idea is that if we added IPSLA as a type of monitor to OnPlus, you could point it at an IPSLA responder present on the VoIP provider's network, or the ISP's (or both), and would be able to tell when customers are experiencing poor VoIP performance. If this sounds like something you would be interested in, please let us know! The OnPlus product owners here at Cisco raise and lower the priority of these types of features that engineers get to work on based directly on your feedback and what you see as useful to your business.
