[wellylug] hardware errors

Richard Hector richard at walnut.gen.nz
Wed Jun 11 18:51:58 NZST 2014


Hi all,

I've got a (client's) machine that reports hardware errors, probably 
relating to memory. I'm running memtest86+ (latest version) on it; that 
hasn't shown anything so far. I suspect that may be because ECC corrects 
the errors before memtest86+ sees them, while Linux receives an exception 
or something and that's what it's logging.

Does anyone know of better tools to diagnose this - perhaps ones that 
are likely to always hit the error, or ones that can get around ECC 
hiding the problem, or (best) ones that will identify a specific flakey 
DIMM (or chipset or cpu or whatever is causing the problem)?
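
For reference, one possible way to see whether the kernel is attributing 
corrected errors to a particular DIMM is to read the EDAC counters in 
sysfs. The following is only a rough sketch: it assumes an EDAC driver 
for this machine's memory controller is loaded, that the kernel exposes 
per-DIMM directories under /sys/devices/system/edac/mc, and that the 
per-DIMM counter files (dimm_ce_count, dimm_ue_count) exist, which 
depends on the kernel version and driver. The helper names read_int and 
edac_counts are made up for illustration.

```python
from pathlib import Path

# Assumed location of the EDAC memory-controller tree; only present
# when an EDAC driver for the chipset is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_int(path):
    """Return the integer stored in a sysfs file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def edac_counts(root=EDAC_ROOT):
    """Collect (name, label, corrected, uncorrected) tuples.

    One tuple per memory controller, followed by one per DIMM
    directory found beneath it. Missing counters are reported as None.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    rows = []
    for mc in sorted(root.glob("mc[0-9]*")):
        rows.append((mc.name, "(controller total)",
                     read_int(mc / "ce_count"),
                     read_int(mc / "ue_count")))
        for dimm in sorted(mc.glob("dimm[0-9]*")):
            try:
                label = (dimm / "dimm_label").read_text().strip()
            except OSError:
                label = ""
            rows.append((f"{mc.name}/{dimm.name}", label,
                         read_int(dimm / "dimm_ce_count"),
                         read_int(dimm / "dimm_ue_count")))
    return rows


if __name__ == "__main__":
    for name, label, ce, ue in edac_counts():
        print(f"{name:12} {label:24} corrected={ce} uncorrected={ue}")
```

A non-zero corrected count that keeps growing on a single DIMM entry 
would point at that module (or its slot), though the labels are only 
meaningful if the driver or the distribution has filled them in.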

A sample of the errors is shown here:
http://paste.debian.net/104039/

Any hints very welcome :-)

Thanks,
Richard


