[mythtv-users] System freezes

John Pilkington johnpilk222 at gmail.com
Tue Mar 21 11:09:13 UTC 2023


On 21/03/2023 01:59, Stephen Worthington wrote:
> On Mon, 20 Mar 2023 16:22:22 +0000, you wrote:
> 
>> Since the end of February I've been getting occasional freezes of my
>> Fedora 36 / MythTV master box, needing a complete reboot.  Their onset
>> coincided with a major update of the desktop environment (KDE) and I
>> have posted about the problem on the fedora-kde users list; there hasn't
>> been much response.
>>
>> My best clue about the cause seems to be a 'canary' from the
>> rtkit-daemon, where the 'action' has often included the freeze.  My
>> fixes-32 box (still with el7) has no canary events and I've added a few
>> lines relating to its daemon at the end.
>>
>> Any suggestions?  Thanks.  The 'tda1004' refs are to drop tuner reinit
>> events that seem to be routine.
>>
>> John P
>>
>> {{{
>>
>> [john@HPFed ~]$ sudo SYSTEMD_COLORS=false journalctl --since -40d | grep -v tda1004x | grep canary
>>
>> [sudo] password for john:
>> Feb 27 01:39:02 HPFed rtkit-daemon[888]: The canary thread is apparently
>> starving. Taking action.
>> Feb 28 18:29:01 HPFed rtkit-daemon[906]: The canary thread is apparently
>> starving. Taking action.
>> Mar 02 02:41:48 HPFed rtkit-daemon[903]: The canary thread is apparently
>> starving. Taking action.
>> Mar 04 02:34:24 HPFed rtkit-daemon[889]: The canary thread is apparently
>> starving. Taking action.
>> Mar 07 18:54:38 HPFed rtkit-daemon[903]: The canary thread is apparently
>> starving. Taking action.
>> Mar 10 01:58:04 HPFed rtkit-daemon[909]: The canary thread is apparently
>> starving. Taking action.
>> Mar 13 01:19:45 HPFed rtkit-daemon[887]: The canary thread is apparently
>> starving. Taking action.
>> Mar 15 02:18:06 HPFed rtkit-daemon[891]: The canary thread is apparently
>> starving. Taking action.
>> Mar 20 01:57:02 HPFed rtkit-daemon[887]: The canary thread is apparently
>> starving. Taking action.
>> [john@HPFed ~]$
>>
>> Fixes-32 system:
>>
>> [root@HP_Box john]# journalctl -S -4d | grep rtkit
>>
>> Mar 20 15:13:36 HP_Box rtkit-daemon[875]: Supervising 6 threads of 3
>> processes of 1 users.
>> Mar 20 15:13:36 HP_Box rtkit-daemon[875]: Supervising 6 threads of 3
>> processes of 1 users.
>> Mar 20 15:13:36 HP_Box rtkit-daemon[875]: Supervising 6 threads of 3
>> processes of 1 users.
>>
>> Fedora 36 most recent examples:
>>
>> [john@HPFed ~]$ sudo SYSTEMD_COLORS=false journalctl --since -6d | grep -v tda1004x | grep -C 5 canary
>> --
>> Mar 15 02:01:03 HPFed anacron[56959]: Anacron started on 2023-03-15
>> Mar 15 02:01:03 HPFed run-parts[56961]: (/etc/cron.hourly) finished 0anacron
>> Mar 15 02:01:03 HPFed CROND[56947]: (root) CMDEND (run-parts
>> /etc/cron.hourly)
>> Mar 15 02:01:03 HPFed anacron[56959]: Normal exit (0 jobs run)
>> Mar 15 02:09:32 HPFed kioslave5[57197]: kf.coreaddons: Expected a
>> KPluginFactory, got a KIOPluginForMetaData
>> Mar 15 02:18:06 HPFed rtkit-daemon[891]: The canary thread is apparently
>> starving. Taking action.
>>
>>
>> -- Boot 9895bb71d2bb46228c4a08159c32218a --
>> Mar 15 08:32:40 HPFed kernel: Linux version 6.1.15-100.fc36.x86_64
>> (mockbuild@bkernel02.iad2.fedoraproject.org) (gcc (GCC) 12.2.1 20221121
>> (Red Hat 12.2.1-4), GNU ld version 2.37-37.fc36) #1 SMP PREEMPT_DYNAMIC
>> Fri Mar  3 17:22:46 UTC 2023
>> Mar 15 08:32:40 HPFed kernel: Command line:
>> BOOT_IMAGE=(hd0,msdos1)/vmlinuz-6.1.15-100.fc36.x86_64
>> root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/swap
>> rd.lvm.lv=fedora/root rhgb quiet
>> Mar 15 08:32:40 HPFed kernel: x86/fpu: x87 FPU will use FXSAVE
>> Mar 15 08:32:40 HPFed kernel: signal: max sigframe size: 1440
> 
> There are endless different scenarios for system freezes, so it would
> be helpful to have a bit more information about your particular one.

Stephen, thanks for your reply and suggestions.  I'll work on them, but 
here's more info:

The freezes have typically happened at night when the screens are off.
The first warning is that the disk access light is found to be on.  The
keyboard and mouse are unresponsive; the caps lock key doesn't change
its light.  ssh from another box gets no response.
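
The night-time pattern can be checked against the journal.  Here is a
rough sketch that counts canary events per hour of day.  It assumes
journalctl's default short timestamp format, as in the quoted output,
and the function name canary_hours is made up for illustration.

```shell
#!/bin/sh
# Sketch: count rtkit canary events per hour of day from journal text on stdin.
# Assumes lines look like: "Mar 15 02:18:06 host rtkit-daemon[pid]: message".
canary_hours() {
    # Field 3 is HH:MM:SS; keep only the hour and tally it.
    awk '/canary thread is apparently starving/ {
             split($3, t, ":")
             count[t[1]]++
         }
         END { for (h in count) print h, count[h] }' | sort
}
```

Possible use on the affected box:
sudo journalctl --since -40d | canary_hours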

I power down with a long press of the front button.  A reboot with both
the VGA monitor and the HDMI Sony Android TV active looks normal at
first, but on completion the TV screen may be seen as 'off' by
nvidia-settings (470xx) and needs to be reset (1360x768) and
repositioned to abut the DELL monitor.  (I'm still having problems
getting consistent screen assignments as the monitor and TV make their
various power-saving changes of status.)

At that point I have usually run the DB-optimise-and-backup from a 
konsole tab, and an in-and-out mythtv-setup(.real) before restarting the 
backend and frontend(.real), again in konsole tabs.  All back to normal.

The usual boot fsck checks have been ok, but I haven't yet run one from 
a live disk.  The system is still looking ok after yesterday's pipewire 
update and reboot.
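
For the check from a live disk, here is a minimal sketch of how the
commands might be assembled.  It assumes the root volume is the LVM
device named on the kernel command line quoted above
(root=/dev/mapper/fedora-root) and that it holds ext4; neither has been
confirmed here.  The helper names root_from_cmdline and print_check_plan
are invented for illustration, and the script only prints the commands
rather than running them.

```shell
#!/bin/sh
# Sketch: work out a read-only fsck plan for the installed system's root volume.
# Assumptions: LVM root device as given by root= on the kernel command line,
# and an ext4 filesystem (for XFS, xfs_repair -n would be the analogue).

# Print the value of root= from a kernel command line string.
root_from_cmdline() {
    for word in $1; do
        case "$word" in
            root=*) printf '%s\n' "${word#root=}"; return 0 ;;
        esac
    done
    return 1
}

# Print (do not run) the commands for a live USB session.
print_check_plan() {
    dev=$(root_from_cmdline "$1") || { echo "no root= found" >&2; return 1; }
    echo "vgchange -ay"        # activate the LVM volume groups
    echo "e2fsck -fn $dev"     # forced check, read-only, answer 'no' to all
}
```

Possible use, pasting the command line from the installed system's boot
log:
print_check_plan 'root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rhgb quiet'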

There are more details here, from earlier in the chase.  But I dislike 
the HYPERKITTY  archive...

https://lists.fedoraproject.org/archives/list/kde@lists.fedoraproject.org/thread/R4PC7CZWD2JYMSX645KYY6SWXKIITO2Z/

> 
> What exactly happens when it freezes?  Can you still ssh into the box
> and reboot it that way (ie is the display frozen but the box still
> running underneath that)?  Can you still ping it and get a response?
> Do you have a systemd early debug shell enabled?  See here for how to
> do that:
> 
> https://freedesktop.org/wiki/Software/systemd/Debugging/
> 
> If you do have an early debug shell available, does Ctrl-Alt-F9 get
> you to it?  Does it still work and allow you to do a reboot command?
> 
> What sort of video card does it use?  Nvidia driver updates are a
> common cause of freeze problems.
> 
> When you reboot, how are you doing that?  Do you have SysRq support
> enabled so you can use the SysRq REISUB keyboard sequence to do the
> reboot as safely as possible to prevent filesystem corruption?
> 
> https://fedoraproject.org/wiki/QA/Sysrq
> 
> Do you do a full fsck of at least the system partition(s) before
> rebooting the system normally, to ensure the system is not getting
> more and more corrupted?  The easiest way to do that is to boot a live
> USB image and do it from there, or to have another bootable partition
> on the system.
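
On the SysRq question, whether the whole REISUB sequence is usable
depends on the kernel.sysrq setting.  Here is a small sketch for
checking it.  The bit values come from the kernel's sysrq documentation,
and the function name sysrq_allows_reisub is made up for illustration.

```shell
#!/bin/sh
# Sketch: does a given kernel.sysrq value permit every step of REISUB?
# Bit values per the kernel sysrq documentation:
#   4 = keyboard control (R), 64 = signal processes (E and I),
#   16 = sync (S), 32 = remount read-only (U), 128 = reboot (B).
sysrq_allows_reisub() {
    value=$1
    needed=$((4 + 64 + 16 + 32 + 128))   # 244
    # A value of 1 enables every SysRq function.
    [ "$value" -eq 1 ] && return 0
    # Otherwise every needed bit must be set in the mask.
    [ $((value & needed)) -eq "$needed" ]
}
```

Possible use on the affected box:
sysrq_allows_reisub "$(cat /proc/sys/kernel/sysrq)" && echo "REISUB available"

If that reports nothing, "sudo sysctl -w kernel.sysrq=1" enables all
functions until the next reboot.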



