[wellylug] hating on the Logwatch
Daniel Pittman
daniel at rimspace.net
Wed Mar 4 15:41:55 NZDT 2009
Spiro Harvey <spiro at starforge.net.nz> writes:
>> On reviewing this I see that, in error, I misstated myself: I intended
>> to say "...because you *DON'T* want to know...", reversing the sense
>> of my comment. Darn.
>
> No, it worked out fine because my brain just registered that as
> sarcasm. :)
Oh, the irony. ;)
>> Anyway, was there some part of that explanation that wasn't
>> comprehensive enough, or was otherwise unclear? That seems likely,
>> given the context and my error.
>
> I was curious about the "model" you didn't like, but it appears that
> you see "analysis" as incomplete monitoring, which is understandable.
Not quite: I see analysis as part of monitoring, and a critical part of
it, because the volume of events in any meaningful system is vastly too
large for any sane person to review effectively.[1]
So, I see analysis and monitoring as part of the same process: turning
raw data into useful information for a variety of purposes.
Oh, and I think I had the "ah-ha!" moment about the other model of
monitoring. When I say that, do you think "watching for specific events
and reacting to them?"
> For those people who *do* like knowing all the routine stuff their
> servers are doing (or not doing) and don't need that to be realtime,
> Logwatch makes a passable attempt at scratching that itch.
My issue, primarily, is that I believe that is the worst possible model
for monitoring (or analysis). It gives the impression that monitoring
is happening without achieving that result.
> Of course, like most things, you can modify the pants off it to get
> what you *do* want out of it.
That is true.
>> In any case I could equally well state my objection to logwatch as a
>> log analysis tool: it focuses on the wrong area, working hard to
>> highlight routine and correct operation of the server, without
>> effective tools to identify or analyse exceptions.
>
> I use Logwatch on several of my servers, but it's by no means the only
> monitoring or analysis that gets done. It's just a quick way of me
> getting an overview before caffeine has kicked in. It has (usually by
> nature of omission) told me when stuff wasn't working right and led me
> to investigate.
*nod* That parenthetical comment is the bit that really gets me: you
can only tell that something interesting happened because the routine
noise[2] has changed or something has been omitted?
That isn't an effective mechanism for alerting you to anything, even if
it sometimes works.
Actually, no, that isn't a fair statement, although it is a tempting way
to view the problem. The real issue I have is this:
That is a mechanism for identifying problems that interacts extremely
poorly with human beings, and which is very prone to false-positive and
false-negative results.
I don't believe this is a good model to use, because those problems will
result in real issues being missed, where an "exceptions only" model
will highlight them.[3]
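To make the "exceptions only" idea concrete, here is a minimal Python sketch, not a description of any existing tool. It assumes routine log lines can be described by regular expressions; the patterns, the sample lines, and the function name `exceptions_only` are all made up for illustration, and real ones would depend entirely on your own logs.

```python
import re

# Hypothetical patterns describing routine, uninteresting lines.
# These are assumptions for illustration only.
ROUTINE_PATTERNS = [
    re.compile(r"session (opened|closed) for user \w+"),
    re.compile(r"CRON\[\d+\]: \(\w+\) CMD"),
]


def exceptions_only(lines, routine=ROUTINE_PATTERNS):
    """Return only the lines that match none of the routine patterns."""
    return [
        line for line in lines
        if not any(pattern.search(line) for pattern in routine)
    ]


# Invented sample input: three routine lines and one that is not.
sample = [
    "sshd[812]: pam_unix(sshd:session): session opened for user alice",
    "CRON[1401]: (root) CMD (run-parts /etc/cron.hourly)",
    "kernel: EXT3-fs error (device sda1): ext3_find_entry: reading directory",
    "sshd[812]: pam_unix(sshd:session): session closed for user alice",
]

for line in exceptions_only(sample):
    print(line)
```

With this sample only the kernel error is printed. A false positive is corrected by adding one more pattern to the routine list, which is a much smaller job than spotting a change in pages of routine output.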
Regards,
Daniel
Footnotes:
[1] This is the same problem that afflicts CCTV-style monitoring, as
well as baggage X-ray solutions: people get bored, quickly, and stop
seeing anything, even important things, if they get "a lot" of
uninteresting events. "A lot" is not a big number, in this case.
[2] Things that don't matter, which you see every single day.
[3] ...any "exceptions only" model is also prone to this, if allowed to
have many false-positive hits. Generally, though, it is easier to
correct those than the "spot the change in routine data" problem.