[wellylug] High load averages but no apparent cause
Daniel Reurich
daniel at centurion.net.nz
Wed Mar 24 15:44:54 NZDT 2010
Hi David,
Just a hunch: I wonder if your hard drives are going into standby.
It's normally a BIOS setting, but can be overridden by issuing hdparm
-S0 -K1 for each drive. (I suggest you read the manual for hdparm first,
but -S sets the standby timeout, with 0 disabling it, and -K1 asks the
drive to keep that setting across a reset.)
Also make sure you haven't got laptop mode or any other power-saving
utilities installed or configured.
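
As a rough sketch of the hdparm step above, the Python below only builds and
prints the commands so they can be reviewed before anything is run as root.
The device names are an assumption based on the sda, sdb and sdc mentioned in
this thread, and the function name is made up for illustration. The -C call is
an extra read-only check of the drive's current power state.

```python
def hdparm_commands(drives, keep_over_reset=True):
    """Build (but do not run) hdparm invocations for each drive."""
    commands = []
    for dev in drives:
        # -C reports the drive's current power mode (active/idle, standby).
        commands.append(["hdparm", "-C", dev])
        # -S 0 disables the standby (spindown) timeout.
        args = ["hdparm", "-S", "0"]
        if keep_over_reset:
            # -K 1 sets the keep_features_over_reset flag; not every
            # drive supports it.
            args += ["-K", "1"]
        commands.append(args + [dev])
    return commands


# Assumed device names; adjust to match the actual drives.
for argv in hdparm_commands(["/dev/sda", "/dev/sdb", "/dev/sdc"]):
    print(" ".join(argv))
```

Printing rather than executing is deliberate: changing power settings on a
drive that may already be failing is worth a manual look first.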
Regards,
Daniel.
On Wed, 2010-03-24 at 13:56 +1300, David Harrison wrote:
> Hi,
> Thanks for taking an interest.
>
>
> Attached are the "smartctl -a" outputs for the three drives.
>
>
> If you spot anything let me know.
> In the meantime we are arranging to have a server shipped up from
> Wellington to replace this one.
>
>
>
>
> David
>
>
>
>
> On Wed, Mar 24, 2010 at 12:47 PM, Daniel Reurich
> <daniel at centurion.net.nz> wrote:
> Can you post the full output of smartctl -a for each drive
> (offlist maybe?).
>
> Daniel
>
>
>
> On Wed, 2010-03-24 at 11:54 +1300, David Harrison wrote:
> > Just a follow up, this thread describes my problem exactly:
> >
> > http://centos.org/modules/newbb/viewtopic.php?viewmode=flat&order=ASC&topic_id=22554&forum=37
> >
> >
> > The short story is that even though smartctl reports no issue there is
> > probably a hardware issue.
> >
> >
> >
> >
> > Below is the output from iostat on the server when the disk problem is
> > taking place.
> > The utilisation of sda and sdb is 100% even though they are hardly
> > doing anything.
> > This lockup remains for 15-20 seconds until something in the
> > kernel/hardware resets, and then it is happy again.
> >
> >
> > Output of iostat while problem is taking place:
> >
> >
> > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >            0.38    0.00    0.00   99.62    0.00    0.00
> >
> >
> > Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz    await    svctm   %util
> > sda         0.00    1.75    0.25    0.50    2.00   14.00    21.33    10.73  4300.00  1333.33  100.00
> > sdb         0.00    1.75    0.50    0.25    4.00   28.00    42.67    10.10  3340.00  1333.33  100.00
> > sdc         1.50    0.00    0.25    0.50   14.00    4.00    24.00     0.01    10.00    10.00    0.75
> > md0         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00
> > md1         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00
> > md2         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00
> > md3         0.00    0.00    0.25    2.75    2.00   22.00     8.00     0.00     0.00     0.00    0.00
> >
> >
> >
> >
> > Output of iostat when problem is not taking place:
> >
> >
> > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >            0.62    0.00    4.62    6.50    0.00   88.25
> >
> >
> > Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz    await    svctm   %util
> > sda         4.50   14.25    3.75   13.00   66.00  234.00    17.91     0.08     4.93     4.63    7.75
> > sdb         6.50   12.25    5.25   12.00   94.00  210.00    17.62     0.12     7.25     6.09   10.50
> > sdc         7.25   11.00    4.75   12.25   96.00  202.00    17.53     0.10     5.88     4.85    8.25
> > md0         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00
> > md1         0.00    0.00    0.00    0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00
> > md2         0.00    0.00    0.00    1.50    0.00   12.00     8.00     0.00     0.00     0.00    0.00
> > md3         0.00    0.00    1.75   33.00   14.00  264.00     8.00     0.00     0.00     0.00    0.00
> >
> >
> >
> >
> > This suggests that either there's a problem with sda and sdb, or
> > there's an issue with the SATA controller which is leaving both
> > hanging. Annoying either way...
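
The pattern described above (a device pinned at 100% utilisation while
completing almost no requests) can be checked mechanically. The Python below
is a minimal sketch under stated assumptions: the column order matches the
iostat output quoted in this thread, the thresholds are illustrative guesses
rather than tuned values, and the function names are made up for the example.
The sample rows are the sda, sdb and sdc figures from the stalled capture.

```python
# Column order of the extended iostat output as quoted in this thread.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]


def parse_rows(text):
    """Map device name -> {column: float} for each 12-field data row."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue
        try:
            values = [float(f) for f in fields[1:]]
        except ValueError:
            continue  # header row, not data
        stats[fields[0]] = dict(zip(COLUMNS, values))
    return stats


def stalled(stats, min_util=90.0, max_iops=5.0, min_await_ms=1000.0):
    """Devices that look busy yet complete almost no requests.

    The three thresholds are assumptions chosen for illustration.
    """
    return sorted(
        dev for dev, s in stats.items()
        if s["%util"] >= min_util
        and s["r/s"] + s["w/s"] <= max_iops
        and s["await"] >= min_await_ms
    )


SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 1.75 0.25 0.50 2.00 14.00 21.33 10.73 4300.00 1333.33 100.00
sdb 0.00 1.75 0.50 0.25 4.00 28.00 42.67 10.10 3340.00 1333.33 100.00
sdc 1.50 0.00 0.25 0.50 14.00 4.00 24.00 0.01 10.00 10.00 0.75
"""

print(stalled(parse_rows(SAMPLE)))  # prints ['sda', 'sdb']
```

Both flagged drives show waits of several seconds on well under one request
per second, while sdc stays idle and responsive. That fits either reading
given above: two bad disks or a controller that stalls both.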
> >
> >
> >
> > David
>
>
>
> --
> Daniel Reurich.
>
> Centurion Computer Technology (2005) Ltd
> Mobile 021 797 722