[mythtv-users] Need ideas for auto-recovery of locked up backend

Simon Hobson linux at thehobsons.co.uk
Tue Apr 23 13:15:22 UTC 2013


Craig Huff wrote:
>Simon- are you talking about a watchdog timer that exists in the
>native hardware?  IIRC, such a timer exists in Intel CPUs.  Sadly, my
>system is based on an AMD Athlon CPU.  If I'm wrong, I'd gladly take
>more information on how to implement it with AMD CPUs.

Don't know if it's any help, but I spotted this on The Reg:
http://forums.theregister.co.uk/forum/1/2013/04/22/dont_buy_without_ipmi/

Paul Crawford
System watchdog?
Quite a lot of ordinary motherboards have hardware watchdogs built in, for example the w83627 and similar chips that provide hardware monitoring (voltages, temperature, fan speeds, etc). This can provide a last-resort method of rebooting a sick server if you have only SSH access and no lights-out support.
With Linux you can load the corresponding watchdog driver module (they are black-listed by default in Ubuntu), then install the watchdog daemon and configure it to check a few vital signs. Typically you would check that the load averages are not stupidly high (say over 5 per CPU core), maybe that rsyslogd is running, that you can run a simple bash script, etc.
If any of those tests fail then you get a moderately orderly reboot, and the hardware watchdog makes sure you get a reboot even if there is a kernel-panic-style fault. Brutal perhaps, but it gets the system back up, and hopefully either all is OK again or at least you can SSH in to fix it.
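
To make the idea above concrete, here is a rough Python sketch of the kind of "vital signs" loop the watchdog daemon performs. It is not the daemon's own code, only an illustration under some assumptions: the function names are made up for this example, the threshold of 5 per core and the rsyslogd check are the ones suggested in the post, and /dev/watchdog is assumed to be the device node created once a watchdog driver module is loaded. In practice you would use the real watchdog daemon, which also attempts an orderly shutdown before the hardware timer fires.

```python
import os
import time

WATCHDOG_DEVICE = "/dev/watchdog"  # assumed device node; present once a driver is loaded
MAX_LOAD_PER_CORE = 5.0            # threshold suggested in the post above


def load_is_sane(load_avg, cores, max_per_core=MAX_LOAD_PER_CORE):
    """Return True if the load average is within max_per_core times the core count."""
    return load_avg <= max_per_core * cores


def process_running(name, proc_root="/proc"):
    """Return True if any process listed under proc_root has the given command name."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    return True
        except OSError:
            continue  # process exited or is unreadable; skip it
    return False


def system_healthy():
    """Combine the checks: sane 1-minute load and rsyslogd present."""
    one_minute, _, _ = os.getloadavg()
    cores = os.cpu_count() or 1
    return load_is_sane(one_minute, cores) and process_running("rsyslogd")


def run(interval=10):
    """Keep the hardware watchdog alive only while the checks pass.

    The interval must be shorter than the hardware timeout. Needs root.
    """
    with open(WATCHDOG_DEVICE, "wb", buffering=0) as wd:
        while system_healthy():
            wd.write(b"\0")  # any write resets the hardware timer
            time.sleep(interval)
        # Leaving the loop stops the keep-alive writes. What happens on
        # close is driver-dependent: many drivers keep the timer running
        # unless the magic character 'V' was written first, in which case
        # the hardware resets the machine when the timeout expires.
```

Calling run() as root would start the loop. Be aware that opening /dev/watchdog arms the timer on most drivers, so try this only on a machine you don't mind rebooting.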

