[wellylug] hardware errors

kingsley at loaner.com kingsley at loaner.com
Wed Jun 11 20:23:16 NZST 2014


Hi Richard,

I searched Google's indices for the error messages
you wisely linked to.

Evidently:

    they're called "Machine Check Exceptions",

    they can be caused by a bad CPU (though memory, cache or
    bus faults can raise them too), and

    a package named "mcelog" can collect and
    decode them.
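
In case it's useful before you install anything, here is a rough
Python sketch for pulling the machine-check lines out of a saved
kernel log and counting them per CPU. The match pattern and the
sample lines are my own assumptions about what such messages look
like, not copied from your paste, and the function names are just
made up for the example.

```python
import re
from collections import Counter

# Assumption: the kernel tags these lines with "[Hardware Error]" and/or
# the words "machine check". Adjust the pattern to the real messages.
MCE_PATTERN = re.compile(r"\[Hardware Error\]|machine check", re.IGNORECASE)


def find_mce_lines(log_text):
    """Return the lines of log_text that look like machine-check reports."""
    return [line for line in log_text.splitlines() if MCE_PATTERN.search(line)]


def count_by_cpu(lines):
    """Count lines per CPU number, for lines that carry a 'CPU <n>' field."""
    counts = Counter()
    for line in lines:
        match = re.search(r"\bCPU\s*(\d+)", line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = (
        "kernel: [Hardware Error]: Machine check events logged\n"
        "kernel: usb 1-1: new high-speed USB device\n"
        "kernel: [Hardware Error]: CPU 2: Machine Check Exception: 0 Bank 4\n"
    )
    hits = find_mce_lines(sample)
    print(len(hits), "machine-check lines")
    print(dict(count_by_cpu(hits)))
```

That only finds and counts the lines; working out what each one
means is still a job for mcelog.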

I hope that helps,
Kingsley

On 06/11/14 18:51, Richard Hector wrote:
> Hi all,
> 
> I've got a (client's) machine that reports hardware errors, probably
> relating to memory. I'm running memtest86+ (latest version) on it; that
> hasn't shown anything so far. I suspect that may be because ECC corrects the
> errors before memtest86+ sees them, while Linux receives an exception or
> something and that's what it's logging.
> 
> Does anyone know of better tools to diagnose this - perhaps ones that are
> likely to always hit the error, or ones that can get around ECC hiding the
> problem, or (best) ones that will identify a specific flakey DIMM (or
> chipset or cpu or whatever is causing the problem)?
> 
> A sample of the errors is shown here:
> http://paste.debian.net/104039/
> 
> Any hints very welcome :-)
> 
> Thanks,
> Richard
> 
> 
> -- 
> Wellington Linux Users Group Mailing List: wellylug at lists.wellylug.org.nz
> To Leave:  http://lists.wellylug.org.nz/mailman/listinfo/wellylug

-- 
Time is the fire in which we all burn.


