[wellylug] hating on the Logwatch

Wed Mar 4 15:41:55 NZDT 2009

Spiro Harvey <spiro at starforge.net.nz> writes:
>> On reviewing this I see that, in error, I misstated myself: I intended
>> to say "...because you *DON'T* want to know...", reversing the sense
>> of my comment.  Darn.
>
> No, it worked out fine because my brain just registered that as
> sarcasm. :)

Oh, the irony. ;)

>> Anyway, was there some part of that explanation that wasn't
>> comprehensive enough, or was otherwise unclear?  That seems likely,
>> given the context and my error.
>
> I was curious about the "model" you didn't like, but it appears that
> you see "analysis" as incomplete monitoring, which is understandable.

Not quite: I see analysis as part of monitoring, and a critical part of
it, because the volume of events in any meaningful system is vastly too
large for any sane person to review effectively.[1]

So, I see analysis and monitoring as part of the same process: turning
raw data into useful information for a variety of purposes.

Oh, and I think I had the "ah-ha!" moment about the other model of
monitoring.  When I say "monitoring", do you think of "watching for
specific events and reacting to them"?

> For those people who *do* like knowing all the routine stuff their
> servers are doing (or not doing) and don't need that to be realtime,
> Logwatch makes a passable attempt at scratching that itch.

My issue, primarily, is that I believe that is the worst possible model
for monitoring (or analysis).  It gives the impression that monitoring
is happening without achieving that result.

> Of course, like most things, you can modify the pants off it to get
> what you *do* want out of it.

That is true.

>> In any case I could equally well state my objection to logwatch as a
>> log analysis tool: it focuses on the wrong area, working hard to
>> highlight routine and correct operation of the server, without
>> effective tools to identify or analyse exceptions.
>
> I use Logwatch on several of my servers, but it's by no means the only
> monitoring or analysis that gets done. It's just a quick way of me
> getting an overview before caffeine has kicked in. It has (usually by
> nature of omission) told me when stuff wasn't working right and led me
> to investigate.

*nod*  That parenthetical comment is the bit that really gets me: you
can only tell that something interesting happened because the routine
noise[2] has changed or something has been omitted?

That isn't an effective mechanism for alerting you to anything, even if
it sometimes works.

Actually, no, that isn't a fair statement, although it is a tempting way
to view the problem.  The real issue I have is this:

That is a mechanism for identifying problems that interacts extremely
poorly with human beings, and which is very prone to false-positive and
false-negative results.

I don't believe this is a good model to use, because those problems will
result in real issues being missed, where an "exceptions only" model
will highlight them.[3]
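
To make the "exceptions only" idea concrete, here is a minimal sketch in
Python.  It is an illustration rather than how any particular tool
works: the pattern list, the function name and the sample log lines are
all made up for the example.  Anything matching a known-routine pattern
is dropped, and whatever is left is what a human gets to read.

```python
import re

# Hypothetical patterns for events considered routine on this host.
# A real list would be tuned per system; these are illustrative only.
ROUTINE_PATTERNS = [
    re.compile(r"session (opened|closed) for user \w+"),
    re.compile(r"CRON\[\d+\]: \(\w+\) CMD "),
]


def exceptions_only(lines, routine=ROUTINE_PATTERNS):
    """Return only the lines that match none of the known-routine patterns."""
    return [
        line for line in lines
        if not any(pattern.search(line) for pattern in routine)
    ]


# Made-up sample log lines, not taken from a real system.
sample = [
    "Mar  4 06:25:01 host CRON[1234]: (root) CMD (run-parts /etc/cron.daily)",
    "Mar  4 06:30:12 host sshd[2345]: pam_unix(sshd:session): session opened for user alice",
    "Mar  4 06:31:40 host kernel: ata1: hard resetting link",
]

for line in exceptions_only(sample):
    print(line)
```

With the sample above only the kernel line is printed.  A false
positive is corrected by adding one more pattern to the routine list,
which is a far smaller job than training a person to spot a change in
a page of routine output.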

Regards,
        Daniel

Footnotes: 
[1]  This is the same problem that afflicts CCTV-style monitoring, as
     well as baggage X-ray solutions: people get bored, quickly, and stop
     seeing anything, even important things, if they get "a lot" of
     uninteresting events.  "A lot" is not a big number, in this case.

[2]  Things that don't matter, which you see every single day.

[3]  ...any "exceptions only" model is also prone to this, if allowed to
     have many false-positive hits.  Generally, though, it is easier to
     correct those than the "spot the change in routine data" problem.