
Drain of unprocessed events from Connection Events - appears on my Sourcefire

pongsiri_chu
Level 1

Hi all, 

Have you ever gotten the error "Drain of unprocessed events from Connection Events" on your device? I looked on my FMC and it shows this error on my Sourcefire module. I tried to search for this error on the internet but didn't find any issue like it. Please help.

Note: I checked free disk space; both are fine.

35 Replies

We're running the latest version (6.2.2.1) and still get these issues. We notice them most with our Firepower 2110 Active/Failover pair.

I also have 6.2.2.1 and I am receiving the error. Has anyone fixed this?

Getting the error in 6.4.0.1 also.

 

It usually means your FMC is receiving more events than it can handle.

Reducing the logging or increasing the FMC resources are the two best fixes.

Increasing CPU?

If it's a VM, CPU and memory might help.

An FMCv has inherent limitations though and may not be able to keep up with too many events no matter how much CPU and RAM you allocate. For medium to larger installations, a hardware FMC is always advisable (or else logging connection events to an external SIEM like Splunk and not to the FMC).

It's a virtual FMC and should have enough of everything. It has 8 CPUs and 16 GB of memory.

But for me the problem started with the latest version.

If it's something other than resource limitation (the most common cause) it would be worth opening a TAC case and having them check. They can do some database queries and log checks to determine the root cause in your particular case.

Hello Community

 

I also have this issue. We have an existing TAC case and we have already applied their recommendation:

 

configure log-events-to-ramdisk disable

 

Our software version is 6.2.3.6.

 

Waiting for TAC's next action plan.

Did that recommendation work for the bug?

argrullo
Cisco Employee

Hello everyone. This is a common error. Please read below. If the information below does not solve your issue, please open a TAC SR for further investigation. The message is version independent; this is not a bug in itself.

One of the main reasons is oversubscription, i.e. the amount of logging being sent to the FMC. If you set up connection logging, we recommend doing it at "End of Connection" only, not at both beginning and end. The end-of-connection event also captures the beginning timestamp.

 

+ Why are we seeing this message?

When a silo is being drained on a consistent basis before the events can be sent to the Defense Center, you will get a Health Alert for Frequent Drain of X Events (X is whatever type of events is being drained). This is generally due to a communication error/issue between the sensor and the Defense Center which prevents the events from being sent. These events will be queued to be sent to the Defense Center once communication is re-established. This alert doesn't necessarily mean the events were unprocessed. Whether they were processed or not can be verified by checking the diskmanager.log file.

 

+ How is this determined?

Depending on the amount of disk space available on the system, a High Water Mark (HWM) and a Low Water Mark (LWM) will be assigned to each silo. The HWM and LWM for each of these categories can be found in the /var/log/diskmanager.log file. The output for a drain of events will show several related pieces of information. We will look at a conn_events drain as an example:

conn_events, 1430229196,0,0,0,0,0,0,7761331964,129355537,0,7

In this instance, the HWM is 7761331964 bytes and the LWM is 129355537 bytes. The last piece of this output is possibly the most important. It shows how many unprocessed files were deleted, 7 in this case. Seeing this can indicate that the Defense Center is oversubscribed and that connection event files are being deleted faster than the Defense Center can process them.
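
If you want to scan the whole log for silos that actually lost unprocessed files, a quick one-liner can help. This is only a sketch assuming the comma-separated layout of the sample line above (field 9 = HWM, field 10 = LWM, last field = unprocessed files deleted), not an official tool:

# awk -F',' '$NF > 0 { printf "%-15s HWM=%s  LWM=%s  unprocessed_deleted=%s\n", $1, $9, $10, $NF }' /var/log/diskmanager.log

Any line it prints corresponds to a drain where files were dropped before they could be processed.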

 

+ What events were drained?

Run the following command on the sensor CLI

> expert
$ sudo su
#   grep -v '0,0' /var/log/diskmanager.log |less

 

+ How many of the files were drained on this sensor?

Total Silo change:

 

+ How do I check the silo?

The diskmanager process manages the various file types using silos. This silo information can be seen from the CLISH by running the "show disk-manager" command.

Here is some sample output:
> show disk-manager
Silo                                    Used        Minimum     Maximum
Temporary Files                         142.472 MB  145.368 MB  581.474 MB
Action Queue Results                    0 KB        145.368 MB  581.474 MB
Connection Events                       306.195 MB  1.420 GB    8.518 GB
User Identity Events                    0 KB        145.368 MB  581.474 MB
UI Caches                               184 KB      436.105 MB  872.211 MB
Backups                                 0 KB        1.136 GB    2.839 GB
Updates                                 26 KB       1.704 GB    4.259 GB
Other Detection Engine                  0 KB        872.211 MB  1.704 GB
Performance Statistics                  3.017 MB    290.736 MB  3.407 GB
Other Events                            2.126 MB    581.474 MB  1.136 GB
IP Reputation & URL Filtering           0 KB        726.843 MB  1.420 GB
Archives & Cores & File Logs            1.315 GB    978.961 MB  5.678 GB
RNA Events                              0 KB        1.136 GB    4.543 GB
File Capture                            0 KB        2.839 GB    5.678 GB
IPS Events                              229 KB      3.407 GB    8.518 GB

As you can see, this command shows the Silo, how much space is used, and is another way to see the LWM and HWM.
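
If you save that output to a file, you can also pick out the silos already holding more than their Minimum value, which are the likeliest candidates to be drained when disk space is needed. The snippet below is only a rough sketch based on the column layout shown above; the file name disk-manager.txt is illustrative, and the unit handling assumes the KB/MB/GB values as displayed:

awk '
  # convert a value/unit pair (e.g. "306.195 MB") to bytes
  function bytes(v, u) { return u == "KB" ? v * 1024 : u == "MB" ? v * 1024^2 : v * 1024^3 }
  / [KMG]B *$/ {
    n = NF
    used = bytes($(n-5), $(n-4)); min = bytes($(n-3), $(n-2))
    silo = $1; for (i = 2; i <= n - 6; i++) silo = silo " " $i
    if (used > min) printf "%-35s Used %s %s exceeds Minimum %s %s\n", silo, $(n-5), $(n-4), $(n-3), $(n-2)
  }
' disk-manager.txt

In the sample above, only "Archives & Cores & File Logs" would be flagged (1.315 GB used against a 978.961 MB minimum).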

 

+ What should I do when files are being drained on the sensor?

  1. Check disk space.

# df -h

  2. Check tunnel status. There should be no errors in the output.

# sftunnel_status.pl |less

  3. Reset tunnel

# manage_procs.pl

****************  Configuration Utility  **************

1   Reconfigure Correlator
2   Reconfigure and flush Correlator
3   Restart Comm. channel
4   Update routes
5   Reset all routes
6   Validate Network
0   Exit

**************************************************************
Enter choice: 5
Enter choice: 4
Enter choice: 3
Enter choice: 0
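
Once you are back at the prompt, it is worth re-checking the management channel before closing out. A simple follow-up check (the grep pattern is just an illustration, not an official health test) is:

# sftunnel_status.pl | grep -iE 'error|fail'

If nothing is returned, the tunnel status output at least contains no obvious error or failure strings, and the health alert should clear as the queued events are sent to the FMC.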

Thank you for the explanation. Will that procedure affect the services?

It would not affect services since what is being restarted is part of the management channel, not the Detection Engine.


Seems my problem is solved.

We had a lot of policies where logging was enabled not only at the end of the connection but also at the beginning.

We also had too many file types selected in our file policies.
