Curious if anyone else has seen something similar to this, or has any thoughts on what the cause could be.
We maintain a small farm of WSA S670 boxes. They recently came under scrutiny after dropped traffic and interface resets began happening almost uniformly across the entire farm. In response, we implemented much more extensive SNMP monitoring on these devices and noticed an interesting trend.
Average CPU utilization sits around 5-7% on every S670, memory usage hovers around 15-20%, and disk IOPS never peaks above 200. The number of socket connections each WSA makes averages around 8k-13k during business hours - well below the 40k maximum each is rated for. In other words, they're seemingly experiencing minimal load.
The Linux load average for them, however, always stays between 3 and 5. As I understand it, load average reflects the number of processes that are running or waiting to run, so a value this high alongside a nearly idle CPU suggests that processes are queuing on some resource bottleneck rather than being scheduled right away. Unless a Linux box is under heavy resource utilization, I would expect the load average to stay below the number of CPU cores (below 1 on a single-core system).
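For anyone wanting to sanity-check a load figure like this, here is a minimal sketch that normalizes the load average by core count on a generic Unix host. It is not WSA-specific: the function names `load_per_core` and `describe_load` are made up for illustration, and the 1.0-per-core threshold is a common rule of thumb rather than a hard limit.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Normalize a load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores


def describe_load(load_avg: float, cores: int) -> str:
    """Label a load average using the rough 1.0-per-core rule of thumb."""
    # Above 1.0 per core, more processes want to run than there are cores.
    return "saturated" if load_per_core(load_avg, cores) > 1.0 else "ok"


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    print(f"5-minute load {five_min:.2f} on {cores} core(s): "
          f"{describe_load(five_min, cores)}")
```

Read this way, a load of 4 would look saturated on a 2-core box but unremarkable on an 8-core one, so the core count of the appliance matters when interpreting the 3-5 range.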
After doing some extensive head-scratching and digging, we found that the /proc directory on every single WSA in our farm was at 100% utilization. While I don't claim to be an expert in Linux by any means, this immediately threw a red flag in my mind since if this directory is completely filled, a traditional Linux installation won't allow you to start new processes as it has no place to store the ID files or related content.
Has anyone else seen a situation like this with their WSAs and a full /proc directory correlating to some bizarre traffic drops and latency? (and yes, I've already opened a case with TAC - I'm looking for community feedback, here)
In a normal state, the /proc directory on a WSA should not be at 100%. Usage of this directory should be minimal, and its contents should be cleared at shutdown.
If usage is at 100%, you need to identify which process is consuming that much. This most likely occurred during system boot, since /proc contains entries for all processes on the WSA and is created on the fly when the system boots. That process is most likely having issues, or there may be corruption.
A TAC case is definitely required so that an engineer can get root-level access to the WSA, check this from the backend, and escalate further if needed.