Curious if anyone else has seen something similar to this, or has any thoughts on what the cause could be.
We have a small farm of WSA S670 boxes we maintain. Recently, they came under the limelight after dropped traffic and interface resets began happening almost uniformly across the entire farm. As such, we implemented a much greater degree of SNMP monitoring on these devices, and noticed an interesting trend.
The average CPU utilization tends to sit around 5-7% on every S670, memory usage hovers around 15-20%, and disk IOPS never peaks above 200. The number of socket connections each WSA makes averages around 8k-13k during business hours, well below the 40k maximum each is rated for. In other words, they're seemingly experiencing minimal load.
The Linux load average for them, however, always stays between 3 and 5. In a nutshell, load average counts the processes that are running or waiting to run, so a sustained value this high suggests some type of resource bottleneck on the WSAs that leaves processes queued instead of handed to the CPU. Unless a Linux box is under high resource utilization, I'd expect the load average to stay well under 1 per CPU core.
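For anyone who wants to sanity-check a load average like this, here is a minimal sketch that divides the 1-, 5-, and 15-minute load averages by the core count. It assumes a generic Unix-like host with Python available, not the WSA itself (which needs root-level access), and the names `load_per_core` and `describe` are made up for illustration.

```python
import os


def load_per_core(load_averages, cores):
    """Divide each load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tuple(value / cores for value in load_averages)


def describe(per_core_value):
    """Rough label: 1.0 or more per core means work is queuing for CPU."""
    return "saturated" if per_core_value >= 1.0 else "has headroom"


if __name__ == "__main__":
    # os.getloadavg() only exists on Unix-like systems.
    if hasattr(os, "getloadavg"):
        cores = os.cpu_count() or 1
        for window, value in zip(("1m", "5m", "15m"),
                                 load_per_core(os.getloadavg(), cores)):
            print(f"{window}: {value:.2f} per core ({describe(value)})")
```

A load average of 4 on a four-core box works out to 1.0 per core, while the same reading on a single core would mean roughly three processes waiting at any given time.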
After some extensive head-scratching and digging, we found that the /proc directory on every single WSA in our farm was at 100% utilization. While I don't claim to be an expert in Linux by any means, this immediately raised a red flag for me, since if this directory is completely full, a traditional Linux installation won't allow you to start new processes because it has no place to store the ID files or related content.
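As a way to cross-check a utilization figure like this, here is a rough sketch of how a "percent used" number can be derived from `os.statvfs` on a Unix-like host. It is an illustration only, not how the WSA or any SNMP agent computes the value, and the helper names `percent_used` and `mount_usage` are hypothetical. Some pseudo-filesystems report zero total blocks, so the sketch returns `None` in that case instead of dividing by zero.

```python
import os


def percent_used(total_blocks, free_blocks):
    """Return used space as a percentage, or None if the size is zero."""
    if total_blocks == 0:
        # No block count reported, so a percentage is not meaningful.
        return None
    return 100.0 * (total_blocks - free_blocks) / total_blocks


def mount_usage(path):
    """Percent used for the filesystem containing path (Unix only)."""
    stats = os.statvfs(path)
    return percent_used(stats.f_blocks, stats.f_bfree)


if __name__ == "__main__":
    for mount_point in ("/", "/proc"):
        if os.path.exists(mount_point):
            print(mount_point, mount_usage(mount_point))
```

Comparing the result for /proc against a disk-backed mount such as / can help show whether the 100% figure reflects real storage pressure or just how that filesystem reports its size.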
Has anyone else seen a situation like this with their WSAs, where a full /proc directory correlates with bizarre traffic drops and latency? (And yes, I've already opened a case with TAC; I'm looking for community feedback here.)
In a normal state, the /proc directory on a WSA should not be at 100%; usage of this directory should be minimal, and its contents should be cleared at shutdown.
If usage is at 100%, you need to identify which process is consuming that much. This most likely occurred during system boot, since /proc contains entries for all processes on the WSA and is created on the fly when the system boots, and that process is most likely having issues or is possibly corrupted.
A TAC case is definitely required so that an engineer can get root-level access to the WSA, check this from the backend, and escalate further if needed.