
Cisco PI - Syslog Streaming Performance

Good Evening,

Note: I don't belong to an organization that sells Cisco equipment, so my knowledge is limited to the documentation Cisco makes available. If this needs to be escalated directly to Cisco support (TAC), I can have our client submit it directly. I've already put in a ticket with our SIEM provider to review performance settings and options.

I'm hoping someone can provide guidance on an issue a client is encountering. They have a substantial product portfolio of various Cisco assets, but the problematic one is the Cisco Prime Infrastructure device. They've requested that we ingest the logs from that asset into a SIEM solution. Shortly after allowing the logs through, we were alerted to large swaths of UDP packets (10-20% day over day) being dropped from the appliance. After some digging, we came up with some base information.

General Problem (CMD: netstat -su)

MichaelRedbourne_1-1670018815925.png
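For anyone who wants to track the drop rate over time rather than reading a screenshot, here is a minimal sketch. It assumes a Linux collector where the same counters `netstat -su` reports are exposed in `/proc/net/snmp`. The function names are my own, and treating `InErrors` as "dropped" is a simplification (it also counts checksum failures).

```python
def udp_counters(snmp_text):
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp-style text."""
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Udp:")]  # does not match 'UdpLite:'
    if len(rows) < 2:
        raise ValueError("no Udp header/value pair found")
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def udp_drop_ratio(counters):
    """Fraction of inbound datagrams that were not delivered.

    Assumption: InErrors (which includes RcvbufErrors) is a fair proxy for
    'dropped'; InDatagrams only counts datagrams that reached a socket.
    """
    delivered = counters["InDatagrams"]
    dropped = counters["InErrors"]
    total = delivered + dropped
    return dropped / total if total else 0.0
```

On the collector this could be called as `udp_drop_ratio(udp_counters(open("/proc/net/snmp").read()))`. The counters are cumulative since boot, so comparing two samples taken some time apart gives a more useful rate than a single reading.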

Specific Bottleneck (CMD: cat /proc/net/udp)

MichaelRedbourne_2-1670018958992.png

Notes: This tells me there's a performance bottleneck on lo:25224. That port belongs to Microsoft's OMS Agent for Linux package. 
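To make the `/proc/net/udp` reading repeatable, the per-socket queue depth and drop counter can be pulled out with a short script. This is a rough sketch, not a definitive tool: it assumes the usual IPv4 column layout (hex `tx_queue:rx_queue` in the fifth column, decimal `drops` in the last) and a little-endian host for decoding the address. The function names are hypothetical.

```python
import socket
import struct


def parse_proc_net_udp(text):
    """Return one dict per socket from /proc/net/udp-style text."""
    sockets = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 13:
            continue  # skip blank or truncated lines
        ip_hex, port_hex = fields[1].split(":")
        _tx_hex, rx_hex = fields[4].split(":")
        sockets.append({
            # Assumption: little-endian host, so 0100007F decodes to 127.0.0.1
            "ip": socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16))),
            "port": int(port_hex, 16),
            "rx_queue": int(rx_hex, 16),   # bytes waiting in the receive buffer
            "drops": int(fields[12]),      # cumulative drops for this socket
        })
    return sockets


def worst_droppers(sockets, top=5):
    """Sockets sorted by drop count, highest first."""
    return sorted(sockets, key=lambda s: s["drops"], reverse=True)[:top]
```

Feeding it `open("/proc/net/udp").read()` should list the socket bound to 127.0.0.1:25224 first if that listener really is the bottleneck.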

UDP Checking (CMD: dropwatch -l kas)

MichaelRedbourne_3-1670019062539.png

This is only a general overview, but it's a good indication that the OMS Agent isn't clearing its queue fast enough. Dropwatch output varies substantially: sometimes it reports 10k dropped logs in a run, sometimes 500k. The drops happen at evenly spaced intervals, which suggested that one or more log sources were dumping a buffer of logs to the Syslog agent in one go instead of streaming them over in real time.

TCPDump - Identify the Asset (CMD: tcpdump -i ens192 port 514 -A > dump.pcap - wireshark analysis)

MichaelRedbourne_4-1670019296528.png

This is only a single dump, but it shows us averaging 1k-2k EPS without any performance issues. Then, around the 210-211s mark in the PCAP dump, something starts sending 17k-18k EPS in a short timespan. Right around that mark, the PCAP contains several thousand packets from a single source, all with the same type of Syslog message - "$TIMESTAMP ERROR [wirelessuser] [seqtaskexecutor-$PID] ERROR: Station entry NULL for ----<MAC>.\n".
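The EPS comparison above can be automated once the capture is reduced to (timestamp, source) pairs, for example by exporting those two columns from Wireshark. The sketch below is an illustration under that assumption. The function names and the "5x the median second" threshold are arbitrary choices of mine, and seconds with no events at all are ignored when computing the baseline.

```python
from collections import Counter
from statistics import median


def find_bursts(events, factor=5.0):
    """Flag seconds whose event count exceeds factor x the median second.

    events: iterable of (timestamp_in_seconds, source) pairs.
    Returns a list of (second, event_count, busiest_source) tuples.
    """
    per_second = Counter()
    per_second_source = Counter()
    for ts, src in events:
        sec = int(ts)  # bucket events into whole seconds
        per_second[sec] += 1
        per_second_source[(sec, src)] += 1
    if not per_second:
        return []

    baseline = median(per_second.values())
    bursts = []
    for sec in sorted(per_second):
        if per_second[sec] > factor * baseline:
            # Pick the source that contributed the most events in this second
            keys = [k for k in per_second_source if k[0] == sec]
            busiest = max(keys, key=per_second_source.get)[1]
            bursts.append((sec, per_second[sec], busiest))
    return bursts
```

Run against a capture like the one described, this should flag the seconds around the 210-211s mark and name the single source behind them.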

MichaelRedbourne_6-1670019698719.png

I reached out to the client, who informed me the asset was a Cisco Prime Infrastructure device. Given how regularly these logs show up, I'm assuming the asset stores the logs and then sends them in one go (at least for these types of logs; it's possible that some logs are streamed in real time). So, I have a couple of questions:

1. Is the default logging behaviour of Cisco PI to buffer some logs and then send them?

2. If it is, is there a way to change that default logging behaviour?

3. If it isn't, why is it doing this?

4. If there isn't [a way to change the logging behaviour], is it possible to stop logging specific messages?
