
OnPlus - Asymmetric Events and Useless Timestamps - DOWN, but no UP...

Kurt Schumacher
Beginner

Ongoing... DOWN events, but no UP events...

Every site is in some time zone. Sorry to say, we're not interested in knowing which time zone the cloud servers are located in. We expect either standardized (GMT) or LOCAL time, including time zone and DST, on every event. To conclude, time is still not handled according to ISO: back to R&D.

Event: OnPlus: Connecton status
Event Date/Time: 2011-05-30 18:31:56-05:00
Event Message: Site Comms down: 84.nn.nn.nn
Customer Name: KCS (SOHO)
Device ID: 00:50:43:nn:nn:nn
CONN, CLOSE or HEARTBEAT: HEARTBEAT
UP or DOWN: DOWN

Event: OnPlus: Connecton status
Event Date/Time: 2011-05-30 19:09:46-05:00
Event Message: Site Comms down: 84.nn.nn.nn
Customer Name: KCS (SOHO)
Device ID: 00:50:43:nn:nn:nn
CONN, CLOSE or HEARTBEAT: HEARTBEAT
UP or DOWN: DOWN


Smells to me like a bad (stateless) system design...

4 Replies

jamwyatt
Beginner

Hi Kurt,

While I can't comment on the timezone concerns, I can comment on the missing 'up' events. It turns out that they simply have a lower severity and don't show in the default event view, which shows warnings and above. Further, there are two types of 'down' events. The ones you see below are generated when we detect an unexpected loss of connection with the site (a cable-pull type of event). The second class is generated when we expected the loss of connection (e.g. after we trigger a reboot from the topology view) or when the operating system was still active when the heartbeat client was stopped (a software upgrade causing a reboot). In both of those cases we received a TCP packet from the site to close the socket, and the resulting events are also of a lower severity than the default view shows.

While those are the details of today's operation, I can note that we discussed the severity issue several times. It was finally decided to use this two-severity setting. While it is easy to change, the question is: should we? The final thinking was to leave the 'heartbeat' failure at a higher severity so that the user could trigger alerts on 'warnings' and avoid general warnings from normal 'up/down' events.
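To make the filtering described above concrete, here is a minimal sketch in Python. It is not OnPlus code: the `Severity` levels, the `Event` fields, and the `default_view` helper are all hypothetical names chosen for illustration, and the only behaviour taken from this thread is that the default view shows warnings and above while 'up' and expected 'down' events sit below that threshold.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity scale; the real OnPlus levels are not stated in the thread.
    INFO = 1
    WARNING = 2
    ERROR = 3


@dataclass
class Event:
    kind: str           # CONN, CLOSE or HEARTBEAT
    state: str          # UP or DOWN
    severity: Severity


def default_view(events, minimum=Severity.WARNING):
    """Return only the events at or above the minimum severity."""
    return [e for e in events if e.severity >= minimum]


events = [
    Event("HEARTBEAT", "DOWN", Severity.WARNING),  # unexpected loss of connection
    Event("CONN", "UP", Severity.INFO),            # site reconnected
    Event("CLOSE", "DOWN", Severity.INFO),         # expected loss, socket closed by the site
]

visible = default_view(events)                   # what a warnings-and-above view would list
everything = default_view(events, Severity.INFO)  # lowering the threshold reveals the rest
```

With these assumed severities, `visible` contains only the HEARTBEAT DOWN event, which matches the asymmetric listing in the original post, while `everything` also includes the UP and CLOSE events.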

Robert

Kurt,

Regarding the timestamps being formatted in the wrong timezone, you are absolutely correct in stating that we don't have it quite right yet. We're aware of the issue and I'm certain that it will eventually be addressed to your satisfaction. Gone are the days in the trial when we presented ambiguous timestamps with no timezone listed and English abbreviations for days and months (Fri Jun 2010). Internally, we store all timestamps in a way that lets us generate an ISO 8601 style timestamp with the correct offset for any timezone.
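As a rough sketch of what that storage approach allows (an illustration only, not the actual OnPlus implementation), the Python below parses one of the timestamps from the original post, normalizes it to UTC, and renders it again with a different offset. The `render` helper and the +02:00 site offset are assumptions made for the example; a real system would more likely use a named zone (for instance via Python's `zoneinfo` module) so that DST is applied automatically.

```python
from datetime import datetime, timedelta, timezone


def render(stored_utc, tz):
    """Format a stored UTC instant as an ISO 8601 style string in the given zone."""
    return stored_utc.astimezone(tz).isoformat(sep=" ")


# Timestamp exactly as shown in the first event of the original post.
reported = datetime.fromisoformat("2011-05-30 18:31:56-05:00")

# Normalize once to UTC for storage; the instant itself is unchanged.
stored_utc = reported.astimezone(timezone.utc)

# Hypothetical site offset (+02:00), used here only for illustration.
site_tz = timezone(timedelta(hours=2))

as_utc = render(stored_utc, timezone.utc)   # "2011-05-30 23:31:56+00:00"
as_site = render(stored_utc, site_tz)       # "2011-05-31 01:31:56+02:00"
```

All three strings name the same instant; only the offset used for presentation differs, which is why the display timezone can be corrected without touching the stored data.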

We've picked up additional development folks, and right now we're working down a prioritized list of issues for stability, hardening, and a few features left to be implemented before we can release. You've pointed out some of the bugs and missing features in other posts, and we're always appreciative of your unfiltered feedback. It helps our management folks reorder the priorities correctly.