What is the best way to keep up with "RED" agents?

Steve Atwood
Level 1

TES 6.03.265

Hello all,

I'm fairly new to Tidal, and have a question mainly for those managing a hundred or more agents.

We have several hundred agents, and it is difficult for the operators to track and follow up on all of the "RED" agents.

In the GUI, one cannot sort by "red" or "green", and the other view of agents (the "master" view of connections) does not distinguish between disabled agents and broken (red) agents. It shows them all as "grey".

We REALLY need our operations team to worry about RED agents, not so much the disabled agents (which are grey in both views).

A job that runs at a regular interval and emails a list of any "RED" agents it finds would probably work just fine for us, if nothing better is available.

But I can't figure out the query. The closest I can get is to report on all inactive agents with "null" time zones, but I have seen agents that are green with no time zone (the master's agent) and red WITH a time zone (an AIX agent), so that is not 100% reliable.

I guess my question boils down to this:  how does that agent status dot turn "RED"?

Is that value extracted from a table, or does something trigger it to go red?

What does everyone else who manages hundreds of agents do to keep up with all the RED ones?

Thanks, - Steve

3 Replies

Tracy Donmoyer
Level 1

I believe the master service tracks the agent status in memory, so you cannot query a table to determine the agent status directly.  You could query the msglog table each hour looking for the message indicating the master lost the connection to the agent.
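As a rough sketch of that msglog idea, the Python below takes rows already pulled from the log and keeps only the agents whose most recent connection message is a "lost" one. The row layout (timestamp, agent name, message text) and the two message markers are assumptions made for illustration; the real msglog columns and message wording would need to be checked against your own database.

```python
from datetime import datetime

# Hypothetical markers: the actual msglog message text will differ.
LOST_MARKER = "connection lost"
RESTORED_MARKER = "connection restored"


def red_agents(rows):
    """Return a sorted list of agents whose latest connection message is 'lost'.

    rows: iterable of (timestamp, agent_name, message_text) tuples, for
    example the result of an hourly query against msglog (layout assumed).
    """
    latest = {}  # agent -> (timestamp, is_lost)
    for ts, agent, message in rows:
        text = message.lower()
        if LOST_MARKER in text:
            is_lost = True
        elif RESTORED_MARKER in text:
            is_lost = False
        else:
            continue  # not a connection message, ignore it
        # Keep only the newest connection message per agent.
        if agent not in latest or ts >= latest[agent][0]:
            latest[agent] = (ts, is_lost)
    return sorted(agent for agent, (_, lost) in latest.items() if lost)


# Made-up sample rows standing in for query results.
sample_rows = [
    (datetime(2024, 1, 1, 8, 0), "aix01", "Connection lost to agent aix01"),
    (datetime(2024, 1, 1, 8, 5), "win02", "Connection lost to agent win02"),
    (datetime(2024, 1, 1, 8, 7), "win02", "Connection restored to agent win02"),
    (datetime(2024, 1, 1, 8, 9), "lnx03", "Job 123 completed normally"),
]

print(red_agents(sample_rows))
```

Because the status itself is held in memory by the master, a list built this way is only as good as the log messages it is derived from.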

You can use a System Event to trigger an action (email, SNMP, Alert) when an agent goes "RED".  This is what we do.

The status indicators in the lower right corner of the client provide a quick status for "Alerts", "Master", "Connections" and "Data".

If the "Connections" background is "Yellow" you have at least one "RED" agent.  If the "Connections" background is "Green" all agents are "Green".

You can also double-click on these to quickly jump to the corresponding view in the Client.

Hope this helps.


Tracy

Thanks Tracy,

Guess I need to drill down a bit deeper.

My goal is to give the operators a way to produce a list of "RED" jobs, whether by running a "report" or a "report job", or by grouping them together in the client so they could copy and paste only the rows with RED dots. Anything that generates a list they could save as a txt file or attach to an email to another team would do. Sometimes we have a lot of broken agents to keep up with.

I've had mixed results with the alerts/events anyway, in terms of general monitoring. Sometimes a lot of agents go from green to red and back to green before the operator can even isolate the agent in question, which means a lot of false alarms. Tweaking the timeout settings didn't help much, and a value that is too big seems to hang up the agent processes, which is not a good thing.
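One way to cut down on those false alarms, sketched here in Python, would be to report only agents that have stayed red for some minimum time and to write the result as plain text the operators can save or email. The event tuples, the "red"/"green" status strings, and the 10-minute threshold are all hypothetical; how the events are actually collected (system events, log messages, something else) is left open.

```python
from datetime import datetime, timedelta


def stable_red_agents(events, now, min_down=timedelta(minutes=10)):
    """Return (agent, red_since) pairs for agents red for at least min_down.

    events: iterable of (timestamp, agent_name, status) tuples, where
    status is "red" or "green" (an assumed encoding).
    """
    red_since = {}
    for ts, agent, status in sorted(events):
        if status == "red":
            # Keep the start of the current outage, not later repeats.
            red_since.setdefault(agent, ts)
        else:
            # Agent recovered, so forget the outage.
            red_since.pop(agent, None)
    return sorted(
        (agent, since)
        for agent, since in red_since.items()
        if now - since >= min_down
    )


def format_report(rows):
    """Render one tab-separated line per agent, suitable for a txt file."""
    return "\n".join(
        f"{agent}\tred since {since:%Y-%m-%d %H:%M}" for agent, since in rows
    )


# Made-up events: win02 flaps, lnx03 has only just gone red.
now = datetime(2024, 1, 1, 9, 0)
events = [
    (datetime(2024, 1, 1, 8, 0), "aix01", "red"),
    (datetime(2024, 1, 1, 8, 30), "win02", "red"),
    (datetime(2024, 1, 1, 8, 31), "win02", "green"),
    (datetime(2024, 1, 1, 8, 40), "aix01", "red"),
    (datetime(2024, 1, 1, 8, 55), "lnx03", "red"),
]

print(format_report(stable_red_agents(events, now)))
```

With the sample data, only aix01 would be listed: win02 recovered within a minute, and lnx03 has not yet been red for the full threshold.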

Thanks again for the response!

-Steve

Oops, I meant "RED" agents, not "RED" jobs.