Solved: Cisco Stealthwatch and Never trigger alarm when less than setting?

Meddane · ‎03-19-2023

I am reading the Cisco Stealthwatch Desktop Client User Guide, specifically the following section about variance-based alarms:

Never trigger alarm when less than: Also known as the minimum
threshold, this is a static value that indicates the lowest value to allow
for triggering an alarm. The alarm will not trigger when the observed
value falls below this setting. In other words, even if a host is greatly
over its expected value, if it is not more than the minimum indicated in
this dialog, then do not trigger an alarm.

For the option "Never trigger alarm when less than", per the definition, if the observed value is less than 100 M as shown below, the alarm is not triggered.

Does it mean that if the observed value is greater than 100 M as shown below an alarm is triggered?

Tolerence and Threshold.PNG

jamegill · ‎03-31-2023

Hi @Meddane ...

> Does it mean that if the observed value is greater than 100 M as shown below an alarm is triggered?

No. It means it will never trigger below 100 M and always trigger above 1 G "points".

Let's unpack that a little (actually, a lot) ... I'll go over how that applies to the configuration for the Data Exfiltration Category Alarm in your screenshot specifically, but the patterns here apply to many other alerts in Cisco Secure Network Analytics (SNA).

For completeness, here's your screenshot again:

Tolerence and Threshold.PNG

Within SNA Core Events there are two types of Events: Category and Security.

This Data Exfiltration configuration applies to a Category Event. Category Events are always made up of some number of individual Security Events. In the case of Data Exfiltration the number is one, and that one Security Event is Suspect Data Loss. If you were to look at a different Category Event (for example, Concern Index) you would find a larger group of Security Events.

The settings for the Suspect Data Loss Security Event are measured in bytes rather than abstracted to points; points are used only for Category Events because, again, those are conglomerates of Security Events. Here too we have the "Never trigger when ..." and "Always trigger when ..." thresholds we can set:

So a host subject to the settings shown here will Always trigger this event when it is seen to upload more than 5 T (!!) of payload bytes to outside hosts in a 24h period.

In the case where a behavioral event is enabled and would trigger, we can suppress the event when the measure of client payload bytes is under 10 M (again, for the 24h period). The behavioral event can be disabled by selecting the "Threshold Only" radio button, in which case the only thing that can trigger this event is the Always trigger value.

About those Behavioral and Threshold and Tolerance settings:

In both Category and Security Events the SNA system models (synonyms: baselines, measures, learns) the normal behavior over time for each host and Host Group in Inside Hosts (exception: Host Groups that have the "Enable baselining for hosts in this group" box unchecked). When a host has a busy day and exceeds the expected value for that setting, we can allow (tolerate) that behavior without triggering an event until the excess is too great. That's what the 1-100 tolerance does: it defines "how much deviation is too much."

However, if a host on the network has a modeled behavior that only transmits a few kilobytes each day, and it transmits two whole megabytes, the deviation may be huge, but the total volume is still in floppy-disk range, and we don't want to set the SOC team to work through a "Data Exfiltration" event just to find that's all there was. So the Never trigger alarm when less than value enables that suppression.
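The way the floor, the ceiling, and the behavioral check interact can be sketched in a few lines of Python. This is only an illustration of the logic described above, not how SNA is implemented: `EventSettings`, `should_trigger`, and `behavioral_limit` are hypothetical names, and `behavioral_limit` stands in for "expected value plus whatever excess the tolerance allows", since how the 1-100 tolerance is turned into a number is not covered here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EventSettings:
    never_below: float            # "Never trigger alarm when less than"
    always_above: float           # "Always trigger alarm when greater than"
    threshold_only: bool = False  # True models the "Threshold Only" radio button


def should_trigger(observed: float,
                   behavioral_limit: Optional[float],
                   settings: EventSettings) -> bool:
    """Hypothetical decision logic for one host over one 24h period."""
    # Hard ceiling: above this value the event fires regardless of the baseline.
    if observed > settings.always_above:
        return True
    # With the behavioral check disabled, only the ceiling can fire the event.
    if settings.threshold_only or behavioral_limit is None:
        return False
    # Hard floor: below this value even a huge deviation is suppressed.
    # (Treating a value exactly equal to the floor as eligible is an assumption.)
    if observed < settings.never_below:
        return False
    # In between, the learned baseline plus tolerance decides.
    return observed > behavioral_limit


# Values from the Suspect Data Loss example: floor 10 M, ceiling 5 T (bytes per 24h).
sdl = EventSettings(never_below=10e6, always_above=5e12)

print(should_trigger(2e6, 50e3, sdl))    # quiet host sends 2 MB: suppressed by the floor
print(should_trigger(50e6, 50e3, sdl))   # same host sends 50 MB: behavioral trigger
print(should_trigger(50e6, 200e6, sdl))  # busy host, 50 MB is within its tolerance
print(should_trigger(6e12, 200e6, sdl))  # above the ceiling: always triggers
```

The order of the checks is the point: the ceiling wins first, the floor only matters when the behavioral check is enabled, and everything in between is left to the baseline.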

I drew this picture to help explain:

Like I said, this pattern applies throughout Stealthwatch -- I mean Secure Network Analytics. It applies to bytes in Suspect Data Loss, ICMP packets in the ICMP Flood event, flows in the New Flows Served event, and so on. In the Category Events it always applies to points, which aren't something you can configure or really need to worry about, but the idea is that this point system assigns a relative importance to those component events. Because the Data Exfiltration example is boring with just one Security Event, let's look at this illustration of the components of the High Concern Index alarm (these are just examples for illustration, not actual values):
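In code form, the idea of component events adding up to a category total might look roughly like the sketch below. Everything in it is made up for illustration: the event names and point values are hypothetical, `category_total` and `classify` are not SNA functions, and the 100 M / 1 G thresholds are simply borrowed from the Data Exfiltration screenshot.

```python
from typing import Mapping

# Category thresholds in points, borrowed from the Data Exfiltration screenshot.
NEVER_BELOW = 100_000_000     # 100 M
ALWAYS_ABOVE = 1_000_000_000  # 1 G


def category_total(points_by_event: Mapping[str, int]) -> int:
    """Sum the points contributed by each component Security Event."""
    return sum(points_by_event.values())


def classify(total: int) -> str:
    """Say which threshold region a category total falls into."""
    if total > ALWAYS_ABOVE:
        return "always triggers"
    if total < NEVER_BELOW:
        return "never triggers"
    return "depends on baseline and tolerance"


# Hypothetical point contributions for one host on three different days.
quiet_day = {"security_event_a": 40_000_000, "security_event_b": 30_000_000}
busy_day = {"security_event_a": 150_000_000, "security_event_b": 100_000_000}
noisy_day = {
    "security_event_a": 400_000_000,
    "security_event_b": 300_000_000,
    "security_event_c": 500_000_000,
}

for name, day in [("quiet", quiet_day), ("busy", busy_day), ("noisy", noisy_day)]:
    print(name, category_total(day), classify(category_total(day)))
```

Note how no single event on the noisy day is above 1 G on its own; it is the sum that crosses the ceiling.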

Have you noticed I keep talking about Events and not saying Alarms?

Both Category Events and Security Events can be configured (via the drop-down on the right side) to be set to Off, On, On + Alarm, or Ignore. An event that is Off will not do anything. An event that is On will trigger and contribute points to its category, and that's it. An event set to On + Alarm will both contribute to the category event and create an alarm for the individual security event. Role policies also have the option to Ignore, which allows the behavior for that event to pass through to an underlying Role policy.

Policies are a lot simpler than they sound. They can be either Default or Role.

There are only two Default policies, one for Inside Hosts and one for Outside Hosts, and the events enabled in those two policies apply to all hosts in either of those two disjoint sets. The system installs with a number of Role policies, and you should define more as needed. Role policies override or "mask" the settings of events that are also present in a Default policy. This can get tricky where you have multiple overlapping Role policies, which is why the Ignore option exists, but the recommendation is to have every host in the network assigned to one functional group, have a Role policy for the functional group, and only apply Role policies to functional groups. And then for carrying context (like a tag), add additional groups outside the By Function branch of the host group tree.
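If it helps, here is a small Python sketch of how the four drop-down states and the Role-over-Default masking fit together. It is a mental model only, with hypothetical names (`State`, `resolve`, `effects`) and two assumptions: overlapping Role policies are checked in the order given, and an event that every Role policy ignores falls through to the Default policy.

```python
from enum import Enum
from typing import Dict, List


class State(Enum):
    OFF = "Off"
    ON = "On"
    ON_ALARM = "On + Alarm"
    IGNORE = "Ignore"  # only meaningful in Role policies


Policy = Dict[str, State]  # event name -> configured state


def resolve(event: str, role_policies: List[Policy], default_policy: Policy) -> State:
    """Return the effective state of an event for one host."""
    for policy in role_policies:  # assumed precedence: first in the list wins
        state = policy.get(event)
        # Ignore (or the event being absent) passes through to the next policy.
        if state is not None and state is not State.IGNORE:
            return state
    # No Role policy had an opinion, so the Default policy applies.
    return default_policy.get(event, State.OFF)


def effects(state: State) -> Dict[str, bool]:
    """What an event in the given state does when it triggers."""
    return {
        "contributes_points": state in (State.ON, State.ON_ALARM),
        "raises_alarm": state is State.ON_ALARM,
    }


inside_default = {"Suspect Data Loss": State.ON}
context_role = {"Suspect Data Loss": State.IGNORE}    # a tag-like group with no opinion
backup_role = {"Suspect Data Loss": State.ON_ALARM}   # functional group policy

effective = resolve("Suspect Data Loss", [context_role, backup_role], inside_default)
print(effective.value, effects(effective))
```

With one functional group per host, as recommended above, the list of Role policies that actually have an opinion about an event stays at length one and the ordering question never comes up.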

That's a lot to unpack, but now you know what all the switches and drop-downs mean in your screenshot, and you know how to use Category events to bring up behaviors that might otherwise be missed because the individual indicative behaviors were not malicious in and of themselves. Pretty neat, right? This tool was originally called StealthWatch for a reason. (;

--jg
