12-15-2022 07:17 AM
We just updated a couple of 5545s this morning. Everything appears to be fine with one minor exception: traffic on the Cluster Control Link (CCL) is now sitting at a steady 140Mbps and CPU is at a steady 30%. There is very little traffic on our network right now, and normally these numbers would be closer to 5Mbps on the CCL and less than 10% CPU.
Any thoughts on what to check to see what might be causing this?
12-15-2022 08:50 AM
Hi @t-andrews, what ASA version did you upgrade to? Have you checked for known bugs on that image?
12-15-2022 08:56 AM
Hi Rob...
We bumped up to 9.12(1)3
We saw this issue about a year ago and the problem just went away. I find it odd that it started up again right after the upgrade and reboot. The process also moved the ASA that was Master all this time to Slave. Not sure if that is part of the puzzle or not.
12-15-2022 09:05 AM
@t-andrews run show process cpu-usage sorted non-zero and check what process is using the CPU the most.
Check the CCL interface for errors with show interface <int>.
Ultimately it may lead to a TAC case, unless the output from above provides a clue to a specific resolved bug.
12-15-2022 09:13 AM
CPU:
- - 32.0% 32.2% 32.3% DATAPATH-0-1513
cluster interface:
Interface GigabitEthernet0/7 "cluster", is up, line protocol is up
Hardware is i82574L rev00, BW 1000 Mbps, DLY 10 usec
Auto-Duplex(Full-duplex), Auto-Speed(1000 Mbps)
Input flow control is unsupported, output flow control is off
Description: Clustering Interface
MAC address 0027.e3e4.1ee4, MTU 9198
IP address 192.168.231.2, subnet mask 255.255.255.252
1041133905 packets input, 192907058669 bytes, 0 no buffer
Received 59828 broadcasts, 0 runts, 0 giants
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
0 pause input, 0 resume input
0 L2 decode drops
2661252 packets output, 3391640993 bytes, 0 underruns
0 pause output, 0 resume output
0 output errors, 0 collisions, 1 interface resets
0 late collisions, 0 deferred
0 input reset drops, 0 output reset drops
input queue (blocks free curr/low): hardware (479/362)
output queue (blocks free curr/low): hardware (453/336)
Traffic Statistics for "cluster":
31413 packets input, 4561005 bytes
2661252 packets output, 3346755971 bytes
16 packets dropped
1 minute input rate 3 pkts/sec, 446 bytes/sec
1 minute output rate 240 pkts/sec, 306553 bytes/sec
1 minute drop rate, 0 pkts/sec
5 minute input rate 2 pkts/sec, 322 bytes/sec
5 minute output rate 239 pkts/sec, 308254 bytes/sec
5 minute drop rate, 0 pkts/sec
12-15-2022 01:20 PM
Hi @t-andrews,
Any specific reason to go for 9.12.1? Why haven't you upgraded to a newer 9.12 release (e.g. the latest, 9.12.4-55)?
All of your CPU usage is going to the DATAPATH process, which handles traffic processing. Looking at the output for your Gi0/7, I see about 2Mbps flowing, not 140. How many devices do you have in the cluster? What is their status? Could it be that not all nodes have joined the cluster, so there is no load sharing and a single device is processing all traffic? That could explain the increased CPU usage.
Kind regards,
Milos
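The roughly 2Mbps figure can be checked against the rate lines in the show interface output posted earlier. Here is a minimal Python sketch, assuming the rate lines keep the exact format shown in this thread; the function name rates_mbps is only illustrative.

```python
import re

# Rate lines copied from the "Traffic Statistics" section posted in this thread.
SAMPLE = """\
1 minute input rate 3 pkts/sec, 446 bytes/sec
1 minute output rate 240 pkts/sec, 306553 bytes/sec
5 minute input rate 2 pkts/sec, 322 bytes/sec
5 minute output rate 239 pkts/sec, 308254 bytes/sec
"""

# Assumption: every rate line follows this "<N> minute <dir> rate ..." layout.
RATE_RE = re.compile(
    r"(\d+) minute (input|output) rate (\d+) pkts/sec,\s+(\d+) bytes/sec"
)


def rates_mbps(text):
    """Return {(minutes, direction): Mbps} for each rate line found in text."""
    result = {}
    for minutes, direction, _pkts, byte_rate in RATE_RE.findall(text):
        # bytes/sec * 8 gives bits/sec; divide by 1e6 for megabits/sec.
        result[(int(minutes), direction)] = int(byte_rate) * 8 / 1_000_000
    return result


if __name__ == "__main__":
    for (minutes, direction), mbps in sorted(rates_mbps(SAMPLE).items()):
        print(f"{minutes} min {direction}: {mbps:.3f} Mbps")
```

With these numbers the 1 minute output rate works out to about 2.45Mbps, well below the 140 reported from the monitoring graph.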
12-16-2022 05:24 AM - edited 12-16-2022 06:14 AM
Thank you for the response, Milos. We went to 9.12.1 at the recommendation of the TAC earlier in the year. I originally bumped up to the latest version and ran into a configuration error that was preventing the data unit from joining the cluster. This version gets us https access and the latest ASDM. I attached what I am seeing for that interface in LibreNMS. I'm going to grab a packet capture shortly now that we are back to having physical access to the units.
Edit: There are only two ASAs in this cluster.
12-16-2022 06:22 AM
A quick capture of about 1000 packets shows that the vast majority of this traffic is broadcast UDP 49495 from each of the CCL interfaces on each ASA.
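One way to put a number on "vast majority" is to tally the captured packets by protocol, destination, and port. Here is a sketch, assuming the capture has already been exported to (protocol, destination, port) tuples by some other tool. The records below are made-up placeholders, not data from this capture, and top_flows is a hypothetical helper name.

```python
from collections import Counter


def top_flows(records, n=3):
    """Return the n most common (protocol, dst, port) keys with count and share."""
    counts = Counter(records)
    total = sum(counts.values())
    # most_common() returns an empty list for empty input, so no division by zero.
    return [(key, count, count / total) for key, count in counts.most_common(n)]


# Hypothetical placeholder records, only to show the shape of the input.
records = (
    [("udp", "255.255.255.255", 49495)] * 9
    + [("tcp", "10.0.0.1", 443)] * 1
)

if __name__ == "__main__":
    for key, count, share in top_flows(records):
        print(key, count, f"{share:.0%}")
```

Running the same tally on captures taken during busy and quiet periods would show whether the share of broadcast UDP 49495 tracks the overall load.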
12-16-2022 01:24 PM
Well... I'm officially baffled. After a full day of normal traffic (600Mbps 5 minute average) the traffic on the CCL remained high, but now that people are headed home and traffic is down to a trickle, the CCL traffic has backed all the way down to normal levels. CPU looks to be returning to below 10% as well. The way I interpret the attached graph: we are looking at the Master (control) ASA, and the traffic Out is primarily updates being sent to the Slave unit (NAT translations, etc.). Traffic In was whatever the heck the Slave unit was sending back to the Master, which is now next to nothing.