
High CPU Utilization due to IPS Policy

Ditter
Level 4

Hi to all,

I am posting this to get your opinion.

Today our users behind the FTD experienced timeouts and high RTTs.

Digging a little, I noticed that CPU core 16 (and none of the other cores) was pinned at 100%.

After I disabled the IPS policy for outgoing traffic, the timeouts stopped and the RTTs returned to normal.

So I decided to keep IPS inspection only for incoming traffic.

How can I identify the offending host or hosts? Also, could this be caused by elephant flows passing through the firewall, or perhaps by a large backup from inside to the Internet?

Any views/opinions are most welcome.

Thanks 

Ditter.

16 Replies

What FTD hardware are you running, and what software version is installed on the FTD?

I have seen high latency caused by elephant flows; enabling elephant flow remediation, or bypassing the IPS for that traffic, solves the issue. Do you have elephant flow detection enabled? If so, you can search the "Analysis" logs for elephant flows and see which source IPs were causing them.

That said, seeing a single core at 100% is normal at times, and the core number will change from time to time. CPU becomes a problem when several or all cores are at 100%.

--
Please remember to select a correct answer and rate helpful posts

Hi Marius and @MHM Cisco World,

thank you for your responses.

I am running 7.2.8 on the FTD cluster:

> show version
---------------------[ ftd-1 ]----------------------
Model : Cisco Firepower 2140 Threat Defense (77) Version 7.2.8 (Build 25)
UUID : 5857ad62-0bf5-11ed-b5a5-a5352e00b8f4
LSP version : lsp-rel-20241030-1856
VDB version : 397

and I am running 7.4.2 on the FMC:

> show version
----------------------[ fmc ]-----------------------
Model : Secure Firewall Management Center for VMware (66) Version 7.4.2 (Build 172)
UUID : 0be5b5be-bc49-11ed-8b60-038ff8fad965
Rules update version : 2024-10-30-001-vrt
LSP version : lsp-rel-20241030-1856
VDB version : 397

What I noticed is that although I had activated elephant flow detection, I hadn't enabled the bypass from within the same menu.

However, IAB was active.

But I cannot find any elephant flows in the Analysis menu, even though I filter on the "Reason" field for Elephant Flows.

Please see the attached PNGs.

Thanks,

Ditter.


When I had a TAC case on this, they said that it was remediated in version 7.2.5. That said, it is quite possible that the issue was not actually solved, or that it was reintroduced.

The thing with the FTD2000 series is that although you can enable elephant flow detection, there is no remediation even if you enable it. For remediation you would need to replace the FTD2140 with an FTD1000, FTD3000 or FTD4100.

My suggestion would be to upgrade to the latest starred version, which is 7.4.2.1. This will no doubt be what TAC suggests should you open a case with them.


  show asp inspect-dp snort

Hi MHM,

on my primary FTD:

 

> show asp inspect-dp snort

SNORT Inspect Instance Status Info

Id Pid   Conns      Segs/Pkts  Status
-- ----- ---------- ---------- ----------
 0 32159      1.9 K          0 READY
 1 32162      1.9 K          0 READY
 2 32166        2 K          0 READY
 3 32183        2 K          0 READY
 4 32164      1.9 K          0 READY
 5 32165      1.9 K          0 READY
 6 32185        2 K          0 READY
 7 32138        2 K          0 READY
 8 32186        2 K          0 READY
 9 32156      1.9 K          0 READY
10 32140      1.9 K          0 READY
11 32158        2 K          1 READY
12 32187        2 K          0 READY
-- ----- ---------- ---------- ----------
Summary      25.4 K          1
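For anyone who wants to compare these numbers over time, here is a minimal Python sketch that parses pasted output of this command and reports the connection spread across Snort instances. It assumes the five-column layout shown above; the function names and the handling of an "M" suffix are my own assumptions, not anything from the FTD tooling. Note that an evenly balanced connection count does not rule out an elephant flow, since a single large flow is still just one connection on one instance.

```python
import re

# One data row: Id, Pid, Conns (optional K/M suffix), Segs/Pkts, Status.
# The "M" suffix is an assumption; only "K" appears in the output above.
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+([\d.]+)\s*([KM]?)\s+([\d.]+)\s*([KM]?)\s+(\S+)\s*$"
)
SCALE = {"": 1, "K": 1_000, "M": 1_000_000}


def to_count(number, unit):
    """Convert a value such as ('1.9', 'K') to an integer count."""
    return int(round(float(number) * SCALE[unit]))


def parse_instances(text):
    """Return one dict per Snort instance row found in the pasted output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, separator, and Summary lines do not match
        inst_id, pid, conns, c_unit, segs, s_unit, status = match.groups()
        rows.append({
            "id": int(inst_id),
            "pid": int(pid),
            "conns": to_count(conns, c_unit),
            "segs": to_count(segs, s_unit),
            "status": status,
        })
    return rows


def summarize(rows):
    """Summarize the connection spread and list instances that are not READY."""
    conns = [r["conns"] for r in rows]
    return {
        "instances": len(rows),
        "total_conns": sum(conns),
        "min_conns": min(conns, default=0),
        "max_conns": max(conns, default=0),
        "not_ready": [r["id"] for r in rows if r["status"] != "READY"],
    }
```

Calling summarize(parse_instances(pasted_text)) on a capture taken during the slowdown and on one taken during normal operation gives two small dicts that are easy to compare.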

 

ckleopa
Cisco Employee

It would be interesting to see what type of traffic you are currently inspecting under that IPS policy. Even though no Elephant Flow event was triggered, the sheer amount of inspected traffic could have something to do with it. What are the top applications hitting that IPS rule at the moment?

When the problem first occurred, I was inspecting all outgoing traffic (that is, traffic going to the Internet) from around 1000 PCs. After the problem occurred I stopped inspecting this traffic and the problem went away, so currently I do not inspect outgoing traffic, only incoming traffic to specific ports.

OK, so you don't see this issue anymore. One possible approach is to check whether your connection events from the time of the high CPU usage are still available, and to create a report of the top application protocols that hit that IPS rule. This may provide some hints about where most of the inspection time went.

In general, application protocols that are not encrypted get the most inspection, and based on those you can get some idea of what could have caused the high CPU usage.
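If the connection events for that time window can be exported to CSV, the report idea above can also be approximated offline. The following is only a sketch: the column names ("Application Protocol", "Initiator IP", "Initiator Bytes", "Responder Bytes") are assumptions about the export and may need adjusting to match the actual file, and the function name is hypothetical.

```python
import csv
import io
from collections import Counter


def top_by_bytes(csv_text, key_column, n=5,
                 byte_columns=("Initiator Bytes", "Responder Bytes")):
    """Rank values of key_column by total bytes across all matching rows.

    All column names are assumptions about the CSV export and should be
    changed to whatever headers the exported file really uses.
    """
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row.get(key_column) or "(unknown)"
        for col in byte_columns:
            # Tolerate thousands separators and empty cells.
            value = (row.get(col) or "").replace(",", "").strip()
            if value.isdigit():
                totals[key] += int(value)
    return totals.most_common(n)
```

Running it once with key_column="Application Protocol" and once with key_column="Initiator IP" gives a rough view of both the heaviest protocols and the heaviest hosts in the window.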

Hi Ckleopa, I narrowed the observation window to the high-CPU period and used the predefined searches to look for elephant flows, but I did not find anything. So I assume something else kept the CPU load at 100% (CPU core 16 in particular). However, I don't know how to investigate further the reason for the high CPU utilization, which affected all users with dropped packets and high RTTs. Thanks for your help.

When it happens again, check the Snort CPU usage.

To check Snort (not Lina) CPU health, use:

Root@firepower:/opt/cisco/csp/application# top

MHM

 

Currently, with no intense network traffic, the snort3 process holds the CPU load at about 48%, and as mentioned it inspects traffic only in the incoming direction and only on high TCP/UDP ports.

 

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28736 root 1 -19 17.4g 8.1g 3.0g S 48.7 12.9 326:50.34 snort3

 

Ditter
Level 4

Hi to all,

The problems continued, with the network highly unresponsive even though no IPS policy was active in any rule; in addition, I also enabled "No Rules Active" in the IPS policy.

The users are still complaining about very slow network response, even though I checked the CPU and it was not high at all.

I also received the following message (I am not sure whether it appeared before or after I activated "No Rules Active" in the IPS policy):

Module: Automatic Application Bypass Status
Description: [12132] Process '/ngfw/var/sf/detection_engines/f08edaa6-0bf5-11ed-9aa5-95282f00b8f4/snort3 --plugin-path /ngfw/var/sf/detection_engines/f08edaa6-0bf5-11ed-9aa5-95282f00b8f4/plugins:/ngfw/var/sf/lsp/active-so_rules --daq-dir /ngfw/usr/local/sf/lib/daq3 -M -Q -v -c /ngfw/var/sf/detection_engines/f08edaa6-0bf5-11ed-9aa5-95282f00b8f4/snort3.lua -l /ngfw/var/sf/detection_engines/f08edaa6-0bf5-11ed-9aa5-95282f00b8f4 --id-offset 1 --id-subdir --id-zero --run-prefix instance- --control-socket /ngfw/var/sf/detection_engines/f08edaa6-0bf5-11ed-9aa5-95282f00b8f4/snort3.sock --create-pidfile -s 1500 -z 13 ' bypassed.
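To make a message like this easier to read, here is a small Python sketch that pulls the Snort command line out of the Description text and lists a few of its options. As far as I know, in Snort 3 `-z` sets the maximum number of packet threads and `-s` the snaplen, which would line up with the 13 instances (Id 0 to 12) in the `show asp inspect-dp snort` output above; treat that reading, and the function name, as assumptions rather than anything confirmed by Cisco.

```python
import re
import shlex


def parse_bypass_message(description):
    """Extract the bypassed process command line from an AAB description.

    Returns None when the text does not look like a "Process '...' bypassed"
    message. Only a handful of short options are collected.
    """
    match = re.search(r"Process '(.*?)'\s*bypassed", description, re.S)
    if match is None:
        return None
    argv = shlex.split(match.group(1))
    if not argv:
        return None
    info = {"binary": argv[0], "options": {}}
    # Short options that take a value in the command line shown above.
    for flag in ("-z", "-s", "-c", "-l"):
        if flag in argv:
            idx = argv.index(flag)
            if idx + 1 < len(argv):
                info["options"][flag] = argv[idx + 1]
    return info
```

The `-c` value points at the snort3.lua of the detection engine that was bypassed, which identifies the engine the bypass applies to.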

The message is not very clear to me, as the IPS policy was not active in any rule. What I did to make things work again was to remove from the FTD a VLAN containing many users. This brought things back to normal and traffic started to flow again.

Looking at connection events and unified events, I could not find the offending host (or hosts).

How can I troubleshoot this situation and gain some insight, rather than just removing a VLAN simply because it contains many users? (Apparently my assumption was correct, but it was just an assumption.)

Thanks 

Ditter.

As you shared above, Snort was at the top of the top output,
so it is a Snort issue.
Try reducing the Snort level.

Screenshot (183).png

Thanks @MHM Cisco World and @ckleopa. Previously I had enabled the Balanced mode; following your suggestions I have now activated Connectivity Over Security (only 584 rules are active now). In addition, I upgraded the FTDs to version 7.4.2.1-30 and the FMC to version 7.6.0-113.

I will also try the Cisco Recommended Rules, but I noticed that it asks me on which IPv4/IPv6 networks I want the IPS rules active. I have already configured the appropriate IPv4/IPv6 networks in the rules where the IPS policy is applied, so I do not understand why the system asks which networks I want the IPS policy active on.
