[mythtv-users] My solution to prevent system hanging forever at shutdown
Nicolas Krzywinski
myth at site7even.de
Fri Aug 29 21:13:46 UTC 2014
Hi all,
It took me some time to identify this problem and then to solve it, and I want to share the result now.
As usual, one may skip the story and proceed to the hard facts below.
The Mythbox:
Self-assembled consumer hardware, focused on low power consumption, low noise and hi-fi-style looks, running mythbackend, mysql, mythfrontend and mythwelcome on the same machine, configured for automatic startup and automatic shutdown.
The Problem:
With no discernible pattern and no identifiable reason, my mythbox _sometimes_ "forgot" to power off and hung at the shutdown screen forever. There were stretches of up to four weeks of everyday usage where everything went fine, followed by a couple of days with multiple hangs per day!
The Confusion:
Just to make it even more complicated, the system tended to change its mood whenever I tried to solve the issue. E.g. after I added acpi=force to the boot arguments, the system pretended to be tamed and powered off properly after shutdown for 1.5 weeks. Just the day before I wanted to post my "solution" to this list, it kicked my ass and hung at shutdown again!
The Desperation:
A long time ago this hardware setup caused problems on Microsoft Windows as well (bluescreens), which was initially my reason for switching this last Windows system to Linux too. So I started to give up and plan for new hardware.
The Idea:
Well, tired of trying to beat the beast, I got the idea to lock it down.
Meanwhile, I had figured out that each hang at shutdown was preceded by a kernel stack trace:
Aug 28 23:03:54 htpc7even kernel: [10297.933561] irq 16: nobody cared (try booting with the "irqpoll" option)
Aug 28 23:03:54 htpc7even kernel: [10297.933571] Pid: 0, comm: swapper/0 Tainted: G C O 3.2.0-65-generic #98-Ubuntu
Aug 28 23:03:54 htpc7even kernel: [10297.933575] Call Trace:
Aug 28 23:03:54 htpc7even kernel: [10297.933578] <IRQ> [<ffffffff810dcd4d>] __report_bad_irq+0x3d/0xe0
Aug 28 23:03:54 htpc7even kernel: [10297.933596] [<ffffffff810dd185>] note_interrupt+0x135/0x190
Aug 28 23:03:54 htpc7even kernel: [10297.933602] [<ffffffff810da9ca>] handle_irq_event_percpu+0xaa/0x210
Aug 28 23:03:54 htpc7even kernel: [10297.933607] [<ffffffff810dab81>] handle_irq_event+0x51/0x80
Aug 28 23:03:54 htpc7even kernel: [10297.933613] [<ffffffff810ddbda>] handle_fasteoi_irq+0x6a/0x110
Aug 28 23:03:54 htpc7even kernel: [10297.933621] [<ffffffff81016412>] handle_irq+0x22/0x40
Aug 28 23:03:54 htpc7even kernel: [10297.933626] [<ffffffff8166e7aa>] do_IRQ+0x5a/0xe0
Aug 28 23:03:54 htpc7even kernel: [10297.933634] [<ffffffff81663aee>] common_interrupt+0x6e/0x6e
Aug 28 23:03:54 htpc7even kernel: [10297.933637] <EOI> [<ffffffff81319574>] ? timerqueue_add+0x74/0xc0
Aug 28 23:03:54 htpc7even kernel: [10297.933652] [<ffffffff81370f0d>] ? intel_idle+0xed/0x150
Aug 28 23:03:54 htpc7even kernel: [10297.933656] [<ffffffff81370eef>] ? intel_idle+0xcf/0x150
Aug 28 23:03:54 htpc7even kernel: [10297.933664] [<ffffffff8150db51>] cpuidle_idle_call+0xc1/0x290
Aug 28 23:03:54 htpc7even kernel: [10297.933670] [<ffffffff8101322a>] cpu_idle+0xca/0x120
Aug 28 23:03:54 htpc7even kernel: [10297.933676] [<ffffffff81629fae>] rest_init+0x72/0x74
Aug 28 23:03:54 htpc7even kernel: [10297.933684] [<ffffffff81cfcc06>] start_kernel+0x3b5/0x3c2
Aug 28 23:03:54 htpc7even kernel: [10297.933690] [<ffffffff81cfc388>] x86_64_start_reservations+0x132/0x136
Aug 28 23:03:54 htpc7even kernel: [10297.933696] [<ffffffff81cfc140>] ? early_idt_handlers+0x140/0x140
Aug 28 23:03:54 htpc7even kernel: [10297.933701] [<ffffffff81cfc459>] x86_64_start_kernel+0xcd/0xdc
Aug 28 23:03:54 htpc7even kernel: [10297.933705] handlers:
Aug 28 23:03:54 htpc7even kernel: [10297.933710] [<ffffffff81498080>] usb_hcd_irq
Aug 28 23:03:54 htpc7even kernel: [10297.933717] [<ffffffff8146cb80>] ata_bmdma_interrupt
Aug 28 23:03:54 htpc7even kernel: [10297.933721] Disabling IRQ #16
(Of course I tried everything I could find to solve this crash, but nothing worked. All other system functionality remains unaffected when this crash occurs; it seems to impact only the power-off functionality.)
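The "handlers:" lines in the trace name usb_hcd_irq and ata_bmdma_interrupt, so a USB host controller and an ATA controller are both registered on IRQ 16. A minimal sketch for checking which drivers share that line on a running system, assuming the usual Linux /proc/interrupts layout (the function name irq_users is made up for this example):

```shell
#!/bin/bash
# Print the /proc/interrupts row for one IRQ number.
# irq_users is a hypothetical helper name, not part of any tool mentioned above.
# Usage: irq_users <irq-number> [interrupts-file]
irq_users() {
    local irq=$1
    local file=${2:-/proc/interrupts}
    # Each row starts with the IRQ number followed by a colon, e.g. "16:".
    # Comparing the whole first field keeps IRQ 16 from matching 160 or 116.
    awk -v irq="$irq" '$1 == irq":" { print }' "$file"
}
```

Called as `irq_users 16`, it prints the row for IRQ 16; the last columns of that row list the drivers attached to the line.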
The Solution:
I wrote a few small helper scripts that monitor syslog for the above error and react by disabling the automatic shutdown and emailing me what happened:
-------------------------------------------------------------------------------------------------------------
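nsk@htpc7even:~$ cat bin/start-crash-monitor.sh
#!/bin/bash
# Create the FIFO only if it is not already there (mkfifo fails on an existing path).
[[ -p /tmp/logmonfifo ]] || mkfifo /tmp/logmonfifo
# Start the reader in the background; it blocks until the writer below opens the FIFO.
/home/nsk/bin/fiforeader.sh &
# Follow syslog from now on and pass only the IRQ error lines into the FIFO.
tail -n0 -F /var/log/syslog | grep --line-buffered "irq 16: nobody cared" > /tmp/logmonfifo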
-------------------------------------------------------------------------------------------------------------
nsk@htpc7even:~$ cat bin/fiforeader.sh
#!/bin/bash
pipe=/tmp/logmonfifo
# Read the FIFO on fd 3, opened once for the whole loop, so that two errors in
# quick succession cannot hit a moment where no reader is attached.
while IFS= read -r line <&3; do
    # A line consisting of 'quit' stops the reader.
    if [[ "$line" == 'quit' ]]; then
        break
    fi
    echo "$0 received data (doing actions now):"
    echo "$line"
    # Block the automatic shutdown, then capture the resulting status for the mail.
    mythshutdown --lock
    state=$(mythshutdown --status)
    printf 'IRQ error occurred on HTPC7even!\n\nShutdown has been disabled:\n%s\n\n\n-------------------------------------------------------------------------------------------\n\nError dump:\n%s\n' "$state" "$line" | mail -s "HTPC7even shutdown disabled" me@mydomain.example
done 3<"$pipe"
echo "Reader exiting"
-------------------------------------------------------------------------------------------------------------
This finally works well!!
I have to reboot manually then, but this is far better than missing recordings because the mythbox hangs for half a day.
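For anyone who wants to try the FIFO hand-off without touching MythTV or mail, here is a minimal stand-alone sketch. It makes a few assumptions that are not part of my setup: the FIFO and a log file live in a throwaway temp directory, the real actions are replaced by a log entry, and the error line is simulated. The reader opens the FIFO once for the whole loop, and the writer keeps it open on fd 3 in the same way the long-running grep keeps it open.

```shell
#!/bin/bash
# Stand-alone demo of the FIFO hand-off; no MythTV, syslog or mail needed.
# demo_dir, fifo and log are placeholders for this sketch only.
demo_dir=$(mktemp -d)
fifo="$demo_dir/logmonfifo"
log="$demo_dir/actions.log"
mkfifo "$fifo"

# Reader: logs every line it receives and stops on the 'quit' sentinel.
reader() {
    local line
    while IFS= read -r line; do
        [[ "$line" == 'quit' ]] && break
        # The real script would lock the shutdown and send a mail here.
        echo "would lock shutdown for: $line" >>"$log"
    done <"$fifo"
    echo "Reader exiting" >>"$log"
}
reader &

# Writer: hold the FIFO open on fd 3 while sending a fake error and the sentinel.
exec 3>"$fifo"
echo 'irq 16: nobody cared (simulated)' >&3
echo 'quit' >&3
wait            # returns once the reader has seen 'quit' and exited
exec 3>&-

cat "$log"
```

The same sentinel works against the real reader: writing the single line quit into /tmp/logmonfifo makes it leave its loop. After the reboot, `mythshutdown --unlock` should release the lock again if it is still set.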
Next on the agenda is to monitor this "solution" for some weeks first, then try to automate what I am currently doing by hand: rebooting the system.
It is really interesting that rebooting the machine works without a problem in those situations! When automating, I just have to take care of killing the frontend or something like that, because mythwelcome treats this as a manual startup (which is correct) and therefore starts mythfrontend.
Btw.: is there anything I can do to prevent mythwelcome from starting mythfrontend just for those "special startups"?
This would let me skip killing the frontend.
Regards,
Nicolas
--
www.nskcomputing.de