[mythtv-users] minimizing impact of RAID rebuild/resync on MythTV
Tom Metro
tmetro+mythtv-users at gmail.com
Sun Feb 3 19:56:12 UTC 2008
I have a 4-drive software RAID 5 array (mdadm 2.5.6) running on an
Ubuntu Feisty system. On the first Sunday of every month a cron job in
/etc/cron.d/mdadm kicks off this command:
/usr/share/mdadm/checkarray --cron --all --quiet
This is done to ensure the array is in good health. This is all fine and
good, except it stomps on the performance of the machine (AMD Athlon XP
2600+, SATA drives on a PCI controller). Playback of MythTV content from
a front-end running on the same machine or a remote front-end results in
constant pauses.
Research shows that the normal way to address this is to do something like:
echo -n 100 > /proc/sys/dev/raid/speed_limit_min
which will drop the minimum bandwidth used during a rebuild/resync (when
the array is in use) to 100 KB/s per device from the default of 1000 on
Ubuntu. However, while this helps, it isn't enough to eliminate the pauses.
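One way to check whether the lower limit is actually taking effect is to watch the speed the kernel reports in /proc/mdstat while the check runs. A minimal sketch, assuming the progress line carries a field of the form speed=<n>K/sec (the function name is made up for illustration):

```shell
#!/bin/sh
# Sketch: print the current resync/check speed in KB/s as reported by md.
# Assumption: the progress line in /proc/mdstat contains "speed=<n>K/sec".
# The optional argument lets you point it at a saved copy of mdstat.
md_sync_speed() {
    sed -n 's/.*speed=\([0-9][0-9]*\)K\/sec.*/\1/p' "${1:-/proc/mdstat}"
}
```

If the reported figure stays well above speed_limit_min while playback is stuttering, md probably doesn't consider the array busy enough to throttle.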
(On a side note, I also tried accomplishing the same setting by adding:
dev.raid.speed_limit_min = 100
to /etc/sysctl.conf, but on bootup it reports that it doesn't recognize
dev.raid.speed_limit_min (and the setting doesn't change), even though
the key is listed:
# sysctl -a | fgrep -i raid
dev.raid.speed_limit_max = 200000
dev.raid.speed_limit_min = 1000
And running this:
# sysctl -p /etc/sysctl.conf
kernel.printk = 4 4 1 7
dev.raid.speed_limit_min = 100
produces no error. Must be some Ubuntu startup oddity. Perhaps a timing
issue: the system doesn't boot from md, so perhaps the module is
being loaded after sysctl is run?)
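If the load-order guess is right, one workaround would be to apply the setting later in the boot, once the md driver has created the tunable. A minimal sketch under that assumption; the function name, the retry count, and the optional path argument are all illustrative:

```shell
#!/bin/sh
# Sketch: wait for the md tunable to appear, then write the new minimum.
# Arguments: value in KB/s, optional path, optional number of 1-second retries.
apply_md_speed_min() {
    value=$1
    target=${2:-/proc/sys/dev/raid/speed_limit_min}
    tries=${3:-30}
    # The file only exists once the md module is loaded.
    while [ ! -w "$target" ]; do
        [ "$tries" -gt 0 ] || return 1
        tries=$((tries - 1))
        sleep 1
    done
    printf '%s\n' "$value" > "$target"
}

# Possible use from a late boot script such as /etc/rc.local:
#   apply_md_speed_min 100
```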
Today while a rebuild was running I noticed that the md_* processes
(there were two) that supposedly were performing the resync were running
at nice -5. I took the one that was using most of the CPU and
manually reniced it to 3, and then MythTV playback was fine.
I'm wondering whether the problem is CPU-bound rather than I/O-bound
during the rebuild, though I'd expect the resync process to
substantially drop its CPU usage when it is being bandwidth throttled.
So should I try dropping the speed_limit_min even lower, or figure out a
way to drop the execution priority of the resync process?
Doing the latter is a bit tricky, as /usr/share/mdadm/checkarray is
merely a shell script that causes the rebuild to be queued up for
execution by the md driver. So it's not as trivial as sticking nice in
front of a command. Also, any priority-adjusting solution should ideally
distinguish between a preventive rebuild and a required one.
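A rough sketch of how that distinction might be made, assuming the kernel exposes the current operation in /sys/block/<md>/md/sync_action (reading "check" for a scheduled check) and that the resync threads are named like md0_resync. Both assumptions should be verified on the system in question, and the function names and md0 are hypothetical:

```shell
#!/bin/sh
# Sketch: lower the priority of md resync threads only during a scheduled
# check, leaving a required rebuild at its default priority.

# Succeeds only when the sync_action file given as $1 reads "check".
is_preventive_check() {
    [ -r "$1" ] && [ "$(cat "$1")" = "check" ]
}

# Renice every thread named <array>_resync. $1 = array name, $2 = nice value.
# util-linux renice treats -n as an absolute value; POSIX defines it as an
# increment, so check the man page on your system.
renice_resync() {
    for pid in $(pgrep -x "$1_resync"); do
        renice -n "$2" -p "$pid"
    done
}

# Possible use from cron, a few minutes after checkarray has started:
#   is_preventive_check /sys/block/md0/md/sync_action && renice_resync md0 3
```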
Anyone have a good technique for addressing this?
A search of the list archives and the wiki didn't turn up anything
specifically addressing this issue. I'll add a section to the MythTV
wiki page on RAID if I find something that works.
On a related note, another way to minimize the inconvenience of a
rebuild is to make it run faster. It currently seems to be taking about
9 hours (four 320 GB drives). speed_limit_max is set to 200000, which is
already higher than the value recommended in the wiki:
http://www.mythtv.org/wiki/index.php/RAID#Software_Raid_Online_Capacity_Expansion_.28OCE.29_.28for_raid_5_with_XFS.29
so I'm not sure boosting it will have any benefit.
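A quick back-of-the-envelope check supports that doubt. Assuming the check reads each 320 GB member end to end and using decimal units, the 9-hour run works out to an average per-drive rate far below speed_limit_max (the function name is illustrative):

```shell
#!/bin/sh
# Sketch: average per-drive throughput implied by a resync of known length.
# Arguments: drive size in GB (decimal), duration in hours. Prints KB/s.
resync_rate_kbs() {
    echo $(( $1 * 1000 * 1000 / ($2 * 3600) ))
}

resync_rate_kbs 320 9   # prints 9876
```

That is roughly 10 MB/s per drive, about 5% of the 200000 KB/s ceiling, so something other than speed_limit_max appears to be the bottleneck.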
-Tom