[mythtv-users] Track down unstable hardware?

jarpublic at gmail.com jarpublic at gmail.com
Thu Feb 12 02:10:23 UTC 2009


On Wed, Feb 11, 2009 at 8:16 PM, Brian Wood <beww at beww.org> wrote:

> On Wednesday 11 February 2009 17:42:37 jarpublic at gmail.com wrote:
>
> >
> > At this point I am getting off topic for this list. It is certainly some
> > hardware failure. When it fails I can't get it to reboot. When I try to
> > boot from a live CD I get the same kernel panic. However, I would hate to
> > get rid of the whole system just because I am too ignorant to track down
> > exactly which piece of hardware is failing. Does anybody know a good
> > Linux list that may be able to help me track down which bit of hardware
> > is going bad? It is especially challenging because if I let the system
> > sit for a while it will boot up and work fine for some indeterminate
> > amount of time. I have used lm-sensors to track temps and nothing seems
> > to be hot, all of the fans are running, and I have checked all of the
> > drives for bad blocks. I don't know what else to do at this point. I
> > don't want to bother the list anymore, but does somebody know the right
> > group to bother about troubleshooting Linux hardware?
>
> A machine that always works after being off for a while probably has some
> sort of thermal problem. Sensors are seldom helpful, as this could be on
> just about anything, chips, resistors, or even solder connections.
>
> You might try cooling various components with freeze-spray, that sometimes
> helps identify this sort of trouble. Remember that if the problem is on a
> chip die or the like it will take several seconds at least before things
> start to work after you spray it. Don't be impatient, or you will have
> sprayed lots of components and not know which one it was if it starts
> working.
>
> Otherwise, unless you have a lab full of test gear, the only practical
> troubleshooting method is substitution, replace things one by one with
> known good replacements until you find the problem.
>
> I'd suspect the PSU first, but YMMV.
>

A thermal problem seemed the most likely cause to me too, but I wasn't
sure how to narrow it down. I didn't really consider the power supply
because the machine doesn't completely crash. It just freezes on the
current screen, and I lose all input and network. Even if I had spare
hardware to swap in, the problem is complicated by the fact that even the
bad hardware works some of the time. So it would be hard to say whether
things work after a swap because of the new component or because the
failing component happens to be working at that moment. The kernel panic
comes up immediately after GRUB, before anything else happens. So I was
hoping that it would be simple to narrow it down to a drive, or that there
was some way to get more verbose error messages.
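
One rough way to handle the "was it the swap or just luck?" question is to
decide in advance how long the machine has to run cleanly after a swap
before the swap gets any credit. The following is only a sketch in Python.
It assumes freezes arrive at random at a roughly constant rate (a
simplification that a thermal fault may well violate), and the function
names and the 6-hour figure in the usage note are illustrative, not
measurements from this system.

```python
import math


def soak_hours(mean_hours_between_freezes, confidence=0.95):
    """Freeze-free hours needed before crediting a swap with the fix.

    Assumption: freezes follow a constant-rate (Poisson) model, so the
    chance of surviving t hours by luck alone is exp(-t / mean).
    """
    if mean_hours_between_freezes <= 0:
        raise ValueError("mean_hours_between_freezes must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Solve exp(-t / mean) = 1 - confidence for t.
    return mean_hours_between_freezes * math.log(1.0 / (1.0 - confidence))


def chance_of_lucky_run(hours_without_freeze, mean_hours_between_freezes):
    """Probability that the still-faulty machine survives this long anyway."""
    if mean_hours_between_freezes <= 0:
        raise ValueError("mean_hours_between_freezes must be positive")
    return math.exp(-hours_without_freeze / mean_hours_between_freezes)
```

Under that assumption, a machine that typically freezes about every 6 hours
would need roughly 18 clean hours (`soak_hours(6)`) before a swap looks
convincing at the 95% level, while a single clean 6-hour run would happen by
luck about 37% of the time (`chance_of_lucky_run(6, 6)`).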