11-02-2010 04:04 AM - edited 03-11-2019 12:03 PM
We have a pair of FWSM blades operating as a failover pair, each in a separate 6509E chassis (Sup 720).
CPU utilisation is becoming excessive, and I don't believe it should be. Using the technique described in https://supportforums.cisco.com/thread/2017737 to look at the top processes on the FWSM, the consistent top process is "fast_fixup", e.g.:
Runtime #1 | Runtime #2 | Diff | Process |
360273017 | 360351472 | 78455 | fast_fixup |
1174932076 | 1174938333 | 6257 | Dispatch Unit |
3210954835 | 3210958511 | 3676 | snp_timer_thread |
6993331 | 6995201 | 1870 | Logger |
386634686 | 386636531 | 1845 | OSPF Router |
Runtime #3 | Runtime #4 | Diff | Process |
361825694 | 361959758 | 134064 | fast_fixup |
1175032117 | 1175044607 | 12490 | Dispatch Unit |
3211013306 | 3211020798 | 7492 | snp_timer_thread |
386662410 | 386666305 | 3895 | OSPF |
7030900 | 7034264 | 3364 | Logger |
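For anyone repeating this, the figures above come from running "show processes" twice over a known interval and diffing the Runtime value for each process; the exact column layout varies by release, but the approach is roughly:
show processes   (capture #1 - note the Runtime per process)
 ... wait a fixed interval, e.g. 60 seconds ...
show processes   (capture #2)
Diff = Runtime from capture #2 minus Runtime from capture #1; the largest diffs identify the processes consuming the most CPU.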
I've switched off inspection engines which I don't believe are required, but to no avail. I'm left with:
policy-map global_policy
 class inspection_default
  inspect icmp
  inspect icmp error
  inspect dns
  inspect ftp
  inspect h323 h225
  inspect h323 ras
  inspect netbios
  inspect sqlnet
  inspect sunrpc
  inspect tftp
  inspect xdmcp
 class class_sip_tcp
  inspect sip
Can anyone offer any insight as to why fast_fixup should be so cpu intensive, or where else I should be looking to resolve this?
Thanks
11-02-2010 08:50 AM
Hi,
When you run the command "show service-policy" a few times, which counter do you see incrementing the most? Try disabling that particular fixup to see if it helps.
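Something along these lines should do it (exact output varies by platform and release):
show service-policy
 ... wait a minute or two ...
show service-policy
and then compare the per-inspect "packet" counters between the two runs.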
I would suggest opening up a TAC case for this as we can then troubleshoot better.
Regards,
Prapanch
11-02-2010 08:58 AM
Try removing DNS inspection and see if the CPU subsides.
-KS
11-02-2010 09:49 AM
"show service-policy" output over various time periods consistently shows DNS inspection to be processing the most packets.
e.g., during a ten-minute window DNS inspection processed over a million packets, compared to the next busiest inspection engine (netbios), which processed only 40k packets.
I'll need to schedule a window to switch off the DNS inspection, but it does look very much like this is the source of the problem.
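For the record, based on the policy-map shown earlier, the change itself should amount to no more than:
policy-map global_policy
 class inspection_default
  no inspect dns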
Thanks to both of you for your suggestions - I'll update this thread once I can confirm either way.
ab
11-09-2010 12:47 AM
Partial success: removing the DNS inspection from the FWSM configuration dramatically reduced the CPU utilisation from a peak of ~90% to around 25%, which was fantastic.
Unfortunately it also introduced a connectivity issue between client hosts routed on some FWSM interfaces and servers routed on other (higher security) FWSM interfaces.
Rolling back to having DNS inspection enabled resolved the connectivity issue, but obviously reintroduced the high CPU problem.
After a great deal of fiddling about I upgraded the FWSM software to 4.0(13) in a vain attempt to resolve the issue, but this has had no effect.
Until I can get to the root of why connectivity is broken by removing DNS inspection it will have to remain in place. Having reviewed the DNS inspection documentation again, I'm at a loss to explain the source of this fault.
ab
11-09-2010 05:45 AM
Increase the message length to 4096 and enable DNS inspection.
inspect dns maximum-length 4096
command ref here: http://www.cisco.com/en/US/docs/security/fwsm/fwsm40/command/reference/i2.html#wp1623043
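Applied to the global policy already in place, that should look something like the following (depending on the release you may need to remove the existing "inspect dns" line first):
policy-map global_policy
 class inspection_default
  no inspect dns
  inspect dns maximum-length 4096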
More details below:
ASAs/PIXen configured with the default DNS inspection policy permit DNS traffic only up to 512 bytes.
This default will drop DNSSEC packets, which leverage EDNS0 and are quite a bit larger than 512 bytes. If a hard number is to be used, we (Cisco Security Research & Operations) would generally recommend 4096, as no DNSSEC packet should be larger than approximately 3000 bytes.
In addition, if running ASA 8.2.2 or later (which includes the fix for enhancement defect CSCta35563), you can leverage the following commands/configuration. (These commands are not available on the FWSM; increasing the message length is the way to go there.)
policy-map type inspect dns preset_dns_map
 parameters
  message-length maximum client auto
  message-length maximum 512
This configuration essentially provides the most optimal solution, in that it behaves according to the RFC/spec. The 'message-length maximum client auto' line allows the ASA to look into the DNSSEC query packets and set the size accordingly so that the subsequent DNSSEC traffic can pass. Note that you use both the 'client auto' and the '512' commands in tandem: if a non-DNSSEC (EDNS0) packet is being processed, it is handled according to the standard, de facto message-length command.
Without the 'client auto' command DNS traffic will still pass, but if EDNS0 packets are larger than the configured DNS message length they will be dropped, just as any other DNS packet larger than the message length would be. The 'client auto' command resolves this by looking into the DNS query, which is where the message size is requested, and setting the message length "dynamically" to match that request.
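For comparison only (this applies to the ASA, not the FWSM), the complete picture is roughly the inspection policy-map above attached to the default inspection class, along these lines:
policy-map type inspect dns preset_dns_map
 parameters
  message-length maximum client auto
  message-length maximum 512
policy-map global_policy
 class inspection_default
  inspect dns preset_dns_map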
-KS
11-09-2010 06:34 AM
Thanks for the suggestion, but the connectivity problems only occur when "inspect dns" is disabled, which is more than a little confusing.
Just as an aside, the documentation at http://www.cisco.com/en/US/docs/security/fwsm/fwsm40/command/reference/i2.html#wp1623043 for "inspect dns" states:
"If you enter the inspect dns command without the maximum-length option, DNS packet size is not checked."
I had read this as meaning packet size was irrelevant unless maximum length was configured. Is this not the case?
Thanks
11-09-2010 07:10 AM
If the responses received are larger than what you have configured, you may run into high CPU issues. These dropped messages will be logged in the syslog. That is the reason Prapanch is asking if you saw anything in the syslogs.
So, increase the length to 4096, and if you still see the issue please open a TAC case. Maybe the DNS server IPs that are being used are not active; we will be able to find that out once we get on the box.
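As a quick check, the message-length drops are normally reported under syslog ID 410001 (worth confirming against the syslog reference for your release), so with buffered logging enabled something like this should show them:
show logging | include 410001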
-KS
11-09-2010 07:31 AM
If the responses received are larger than what you have configured, you may run into high CPU issues. These dropped messages will be logged in the syslog. That is the reason Prapanch is asking if you saw anything in the syslogs.
Ok, that makes sense, thanks for the clarification.
So, increase the length to 4096, and if you still see the issue please open a TAC case.
No change unfortunately. CPU is still as high whether maximum-length is configured as 4096 or not.
Maybe the DNS server IPs that are being used are not active; we will be able to find that out once we get on the box.
The large amount of DNS inspection taking place is almost certainly down to the fact that our organisation's web caches must traverse the FWSM in order to do name lookups. Not ideal, but that is the set-up we have.
Thanks for all the help
11-09-2010 06:38 AM
Hi,
With DNS inspection removed, did you notice any syslogs on the FWSM? If you have not tried that, I would suggest that you get some syslogs with inspection removed so that we can see why connectivity is breaking.
Also, regarding the high CPU, it might be better if you open a TAC case for deeper analysis, now that we know what exactly is causing the high CPU.
Regards,
Prapanch
11-09-2010 07:09 AM
praprama wrote:
With DNS inspection removed, did you notice any syslogs on the FWSM? If you have not tried that, I would suggest that you get some syslogs with inspection removed so that we can see why connectivity is breaking.
I haven't seen anything in the logs to indicate a problem; however, I have configured explicit logging for packets going from the client network to a sample server (on the server network interface). When DNS inspection is enabled, the ICMP packets used to verify connectivity from client to server are logged successfully. When DNS inspection is disabled, no packets are logged by the ACE.
My next step will be to enable the equivalent logging on the client FWSM interface ACL and re-test.
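For reference, the logging I added is just the "log" keyword on the relevant ACEs; with made-up names and addresses it looks something like:
access-list server_side_in extended permit icmp host 192.0.2.10 host 198.51.100.20 log
(The ACL name and addresses above are placeholders, not our real configuration.)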
Also, regarding the high CPU, it might be better if you open a TAC case for deeper analysis, now that we know what exactly is causing the high CPU.
I think a TAC case is probably the way forward with this problem.
Thanks