[mythtv-users] My solution to prevent system hanging forever at shutdown

Nicolas Krzywinski myth at site7even.de
Fri Aug 29 21:13:46 UTC 2014


Hi all,

It took me some time to first identify and then solve this problem, and I want to share the result now.

As usual, one may skip the story and proceed to the hard facts below.

The Mythbox:
Self-assembled consumer hardware, chosen for low power consumption, low noise and a hi-fi-style look, running mythbackend, mysql, mythfrontend and mythwelcome on the same machine, configured for automatic startup and automatic shutdown.

The Problem:
Without any regular pattern or identifiable reason, my mythbox _sometimes_ "forgot" to power off and hung at the shutdown screen forever. There were stretches of up to four weeks of everyday usage where everything went fine - followed by a couple of days with multiple hangs per day!

The Confusion:
Just to make it even more complicated, the system tended to change its mood whenever I tried to solve the issue. E.g. after adding acpi=force to the boot arguments, the system pretended to be tamed and powered off properly after shutdown for 1.5 weeks - just the day before I wanted to post my "solution" to this list, it kicked my ass and hung at shutdown again!

The Desperation:
Because this hardware setup had caused problems on Microsoft Windows as well a long time ago (bluescreens), which was my original reason to switch this last Windows system to Linux too, I started to give up and plan for new hardware.

The Idea:
Well, tired of trying to beat the beast, I got the idea to lock it down.

Meanwhile, I had figured out that each hang at shutdown was preceded by a kernel stack trace:
Aug 28 23:03:54 htpc7even kernel: [10297.933561] irq 16: nobody cared (try booting with the "irqpoll" option)
Aug 28 23:03:54 htpc7even kernel: [10297.933571] Pid: 0, comm: swapper/0 Tainted: G         C O 3.2.0-65-generic #98-Ubuntu
Aug 28 23:03:54 htpc7even kernel: [10297.933575] Call Trace:
Aug 28 23:03:54 htpc7even kernel: [10297.933578]  <IRQ>  [<ffffffff810dcd4d>] __report_bad_irq+0x3d/0xe0
Aug 28 23:03:54 htpc7even kernel: [10297.933596]  [<ffffffff810dd185>] note_interrupt+0x135/0x190
Aug 28 23:03:54 htpc7even kernel: [10297.933602]  [<ffffffff810da9ca>] handle_irq_event_percpu+0xaa/0x210
Aug 28 23:03:54 htpc7even kernel: [10297.933607]  [<ffffffff810dab81>] handle_irq_event+0x51/0x80
Aug 28 23:03:54 htpc7even kernel: [10297.933613]  [<ffffffff810ddbda>] handle_fasteoi_irq+0x6a/0x110
Aug 28 23:03:54 htpc7even kernel: [10297.933621]  [<ffffffff81016412>] handle_irq+0x22/0x40
Aug 28 23:03:54 htpc7even kernel: [10297.933626]  [<ffffffff8166e7aa>] do_IRQ+0x5a/0xe0
Aug 28 23:03:54 htpc7even kernel: [10297.933634]  [<ffffffff81663aee>] common_interrupt+0x6e/0x6e
Aug 28 23:03:54 htpc7even kernel: [10297.933637]  <EOI>  [<ffffffff81319574>] ? timerqueue_add+0x74/0xc0
Aug 28 23:03:54 htpc7even kernel: [10297.933652]  [<ffffffff81370f0d>] ? intel_idle+0xed/0x150
Aug 28 23:03:54 htpc7even kernel: [10297.933656]  [<ffffffff81370eef>] ? intel_idle+0xcf/0x150
Aug 28 23:03:54 htpc7even kernel: [10297.933664]  [<ffffffff8150db51>] cpuidle_idle_call+0xc1/0x290
Aug 28 23:03:54 htpc7even kernel: [10297.933670]  [<ffffffff8101322a>] cpu_idle+0xca/0x120
Aug 28 23:03:54 htpc7even kernel: [10297.933676]  [<ffffffff81629fae>] rest_init+0x72/0x74
Aug 28 23:03:54 htpc7even kernel: [10297.933684]  [<ffffffff81cfcc06>] start_kernel+0x3b5/0x3c2
Aug 28 23:03:54 htpc7even kernel: [10297.933690]  [<ffffffff81cfc388>] x86_64_start_reservations+0x132/0x136
Aug 28 23:03:54 htpc7even kernel: [10297.933696]  [<ffffffff81cfc140>] ? early_idt_handlers+0x140/0x140
Aug 28 23:03:54 htpc7even kernel: [10297.933701]  [<ffffffff81cfc459>] x86_64_start_kernel+0xcd/0xdc
Aug 28 23:03:54 htpc7even kernel: [10297.933705] handlers:
Aug 28 23:03:54 htpc7even kernel: [10297.933710] [<ffffffff81498080>] usb_hcd_irq
Aug 28 23:03:54 htpc7even kernel: [10297.933717] [<ffffffff8146cb80>] ata_bmdma_interrupt
Aug 28 23:03:54 htpc7even kernel: [10297.933721] Disabling IRQ #16

(Of course I tried everything I could find to solve this crash, but nothing worked. All system functionality remains unaffected when this crash occurs; it seems to impact only the power-off functionality.)

The Solution:
I wrote some little helper scripts that monitor syslog for the above error and react by disabling the automatic shutdown and emailing me what happened:

-------------------------------------------------------------------------------------------------------------

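nsk@htpc7even:~$ cat bin/start-crash-monitor.sh
#!/bin/bash
# Create the FIFO only if it does not exist yet (mkfifo fails otherwise).
[[ -p /tmp/logmonfifo ]] || mkfifo /tmp/logmonfifo
# Start the reader in the background; it waits for lines arriving on the FIFO.
/home/nsk/bin/fiforeader.sh &
# Follow syslog across log rotation and forward only the IRQ error line.
tail -n0 -F /var/log/syslog | grep --line-buffered "irq 16: nobody cared" > /tmp/logmonfifo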

-------------------------------------------------------------------------------------------------------------
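The filter can be tried without waiting for a real crash. The following is only a minimal sketch: the function name filter_irq_errors and the second sample line are made up for illustration, and it uses the same search pattern as start-crash-monitor.sh, just fed from a variable instead of the live syslog.

```shell
#!/bin/bash
# Hypothetical helper: the same search pattern as in start-crash-monitor.sh,
# wrapped in a function so it can be fed test input instead of the live syslog.
filter_irq_errors() {
    grep "irq 16: nobody cared"
}

# Two sample lines: the first is taken from the trace above, the second is made up.
sample='Aug 28 23:03:54 htpc7even kernel: [10297.933561] irq 16: nobody cared (try booting with the "irqpoll" option)
Aug 28 23:03:55 htpc7even kernel: [10298.000000] some unrelated message'

# Only the first line should pass the filter.
matches=$(printf '%s\n' "$sample" | filter_irq_errors)
echo "$matches"
```

If only the "nobody cared" line is printed, the pattern is right and the remaining pieces (tail, FIFO, reader) can be checked separately.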

nsk@htpc7even:~$ cat bin/fiforeader.sh
#!/bin/bash
# Reads matching syslog lines from the FIFO and locks the MythTV shutdown.
pipe=/tmp/logmonfifo

while true
do
    if read -r line <"$pipe"; then
        # Writing 'quit' into the FIFO stops the reader.
        if [[ "$line" == 'quit' ]]; then
            break
        fi
        echo "$0 received data (doing actions now):"
        echo "$line"
        # Lock the automatic shutdown, then mail the status output and the error line.
        mythshutdown --lock
        state=$(mythshutdown --status)
        echo -e "IRQ error occurred on HTPC7even!\n\nShutdown has been disabled:\n$state\n\n\n-------------------------------------------------------------------------------------------\n\nError dump:\n$line" | mail --subject "HTPC7even shutdown disabled" me@mydomain.example
    fi
done
echo "Reader exiting"

-------------------------------------------------------------------------------------------------------------
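The hand-off through the FIFO can also be rehearsed in isolation. The following is a minimal dry-run sketch under a few assumptions: it uses a temporary directory instead of /tmp/logmonfifo, the function handle_line is a made-up stand-in for the real actions (mythshutdown --lock and mail), and, unlike fiforeader.sh, the reader here keeps the FIFO open for the whole loop, so that the short test writer cannot close it between two lines.

```shell
#!/bin/bash
# Dry run of the FIFO hand-off; nothing here touches mythshutdown or mail.
workdir=$(mktemp -d)
fifo="$workdir/logmonfifo"
actions="$workdir/actions.log"
mkfifo "$fifo"
: >"$actions"

# Hypothetical stand-in for the real actions: just record the received line.
handle_line() {
    echo "would lock shutdown because of: $1" >>"$actions"
}

# Reader: opens the FIFO once and stops at the 'quit' sentinel.
reader() {
    while read -r line; do
        if [ "$line" = 'quit' ]; then
            break
        fi
        handle_line "$line"
    done <"$fifo"
}

reader &
reader_pid=$!

# Writer: one fake error line, then the sentinel that ends the reader.
{
    echo 'kernel: [10297.933561] irq 16: nobody cared (try booting with the "irqpoll" option)'
    echo 'quit'
} >"$fifo"

wait "$reader_pid"
result=$(cat "$actions")
rm -rf "$workdir"
echo "$result"
```

The 'quit' sentinel is the same one fiforeader.sh checks for, so writing it into the real FIFO should be a clean way to stop the real reader as well.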

This finally works well!!

I have to reboot manually then, but this is far better than missing recordings because the mythbox hangs for half a day.

Next on the agenda is to first monitor this "solution" for some weeks, then try to automate what I am currently doing by hand: rebooting the system.
It is really interesting that rebooting the machine works without a problem in those situations! When automating, I just have to take care of killing the frontend or something like that, because mythwelcome treats this as a manual startup (which is correct) and therefore starts mythfrontend.
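
One way the automation might look, purely as an untested sketch: everything in it is an assumption, namely the marker file path, the DRY_RUN switch, the function names, the use of "shutdown -r now" for the reboot and "pkill mythfrontend" for getting rid of the frontend. The idea is to leave a marker before rebooting and to kill the frontend after startup only if that marker exists.

```shell
#!/bin/bash
# Hypothetical sketch, not yet in use: all names and paths below are assumptions.
DRY_RUN=1                                # set to 0 to really execute the commands
MARKER=/var/tmp/irq16-reboot.marker      # must survive a reboot, so not in /tmp

# Run a command, or only print it while DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "dry run: $*"
    else
        "$@"
    fi
}

# To be called from the reader after the IRQ error was seen.
reboot_after_irq_error() {
    run touch "$MARKER"
    run shutdown -r now
}

# To be called once after startup: kill the frontend only after such a reboot.
kill_frontend_after_special_reboot() {
    if [ -e "$MARKER" ]; then
        run rm -f "$MARKER"
        run pkill mythfrontend
    fi
}

reboot_after_irq_error
```

With DRY_RUN=1 this only prints what it would do, which seems the safer way to start given the system's mood swings.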

Btw.: is there anything I can do to prevent mythwelcome from starting mythfrontend just for those "special startups"?
This would let me omit killing the frontend.

Regards,
Nicolas

-- 
www.nskcomputing.de

