On Tue, Mar 6, 2012 at 2:36 PM, Matt Emmott <memmott@gmail.com> wrote:
> On Tue, Mar 6, 2012 at 11:22 AM, Raymond Wagner <raymond@wagnerrp.com> wrote:
>> On 3/6/2012 10:50, Matt Emmott wrote:
>>> On Mon, Mar 5, 2012 at 8:48 PM, Matt Emmott <memmott@gmail.com> wrote:
>>>> On Mar 5, 2012 7:23 PM, "Thomas Boehm" <mythtv-users@lists.boehmi.net> wrote:
>>>>> Raymond Wagner wrote:
>>>>>> Matt Emmott wrote:
>>>>>>> My slave backend has now lost a second 750GB disk, ...
>>>>>>
>>>>>> Did you recently lose a hard drive, or have a corrupt database?
>>>>>
>>>>> Looks like it to me ;-)
>>>>
>>>> Yup! Two Seagates went boom. My MBE is hooked to a RAID 5 fiber
>>>> array, so hopefully that won't be an issue in the future.
>>>
>>> This has been a very slow crawl. I have some SSH screen sessions
>>> open on another monitor at work and have been running a cleanup
>>> every few minutes. I'm down to 1026 recordings. Every time I tell it
>>> to delete orphaned entries, I get a ton of the following on my MBE:
>>>
>>> 2012-03-06 10:43:56.183 ProgramInfo(4746_20120128013000.mpg),
>>> Error: GetPlaybackURL: '4746_20120128013000.mpg' should be local,
>>> but it can not be found.
>>>
>>> Now, I assume this is normal because it's parsing through the
>>> orphaned entries. But I also see a few of the following:
>>>
>>> 2012-03-06 10:43:24.182 Error deleting
>>> 'GetPlaybackURL/UNABLE/TO/FIND/LOCAL/FILE/ON/magicbe/4067_20120209180000.mpg'
>>> could not open
>>> eno: No such file or directory (2)
>>> 2012-03-06 10:43:24.193 Delete Error
>>> 'GetPlaybackURL/UNABLE/TO/FIND/LOCAL/FILE/ON/magicbe/4067_20120209180000.mpg'
>>> eno: No such file or directory (2)
>>>
>>> "Magic" is my SBE and "MagicBE" is my MBE. I've been having ongoing
>>> issues with my HDHR Prime failing on some recordings; that's on the
>>> list to fix later.
>>>
>>> Is there anything I can do to stop the race condition, as Raymond
>>> called it? It's a bit frustrating to have to keep cycling through
>>> the delete function over and over (although I do appreciate the fact
>>> that such a tool exists).
>>
>> The race condition is a consequence of how the bindings are set up to
>> delete. The list of recordings is pulled from the database, but the
>> bindings are not allowed to delete from the database directly (so
>> they don't have to keep track of all the linked sundry tables and
>> files). Instead, they use the slightly inefficient mechanism of
>> requesting a ProgramInfo from the backend, and then passing it right
>> back to the backend to delete.
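>>
>> In bindings terms, each entry goes through something like the loop
>> below (an untested sketch from memory, not find_orphans.py itself;
>> in particular, whether getRecording() returns None for a missing
>> program is an assumption):
>>
>>     from MythTV import MythDB, MythBE
>>
>>     db = MythDB()
>>     be = MythBE(db=db)
>>
>>     # The listing comes straight from the 'recorded' table
>>     # (find_orphans.py narrows it to orphaned entries first)...
>>     for rec in db.searchRecorded():
>>         # ...but deletion goes through the backend: request a
>>         # ProgramInfo, then hand it right back with a delete call.
>>         pg = be.getRecording(rec.chanid, rec.starttime)
>>         if pg is not None:
>>             be.deleteRecording(pg, force=True)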
>>
>> Now, after you've run find_orphans.py the first pass through, it
>> should have queued up all 1056 entries for deletion. When it
>> refreshed, a number of those entries still existed in the database.
>> When you tell it to delete a second time, it tries to pull each
>> ProgramInfo from the backend, but since some have been deleted in the
>> meantime, there is no program to pull, and that method raises an
>> error. At that point, it flushes back through and repopulates the
>> list.
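>>
>> Whether that lookup raises or simply comes back empty is exactly the
>> race, so a second pass that skipped vanished entries instead of
>> erroring out would guard for both, roughly like this (hypothetical;
>> the real find_orphans.py structures its loop differently, and
>> 'orphans' here stands in for the refreshed list from the sketch
>> above):
>>
>>     from MythTV.exceptions import MythBEError
>>
>>     for rec in orphans:
>>         try:
>>             pg = be.getRecording(rec.chanid, rec.starttime)
>>         except MythBEError:
>>             continue  # deleted between passes; nothing to pull
>>         if pg is None:
>>             continue  # same race, reported as an empty result
>>         be.deleteRecording(pg, force=True)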
>>
>> In theory, that should mean all 1026 recordings are queued up in the
>> backend for deletion, and the backend should get to them in
>> relatively short order. Since it's been at least a day now and it has
>> only managed to delete 30 of those entries, it sounds like something
>> wonky is going on in the backend, outside the control of
>> find_orphans.py. You should be seeing logs in the backend as it works
>> through all of those files. The backend currently starts a new thread
>> for each file it intends to delete. Those threads sit idle for
>> roughly three seconds, and then run synchronously, one at a time,
>> through a shared lock. Perhaps there are problems dealing with that
>> volume of threads in that short a time.
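>>
>> The backend side is C++, but the pattern is simple enough to model in
>> a few lines of Python (a toy model only; 'paths' stands in for the
>> queued files):
>>
>>     import os
>>     import time
>>     import threading
>>
>>     delete_lock = threading.Lock()  # one lock shared by all deleters
>>
>>     def delete_file(path):
>>         time.sleep(3)       # the roughly three-second idle period
>>         with delete_lock:   # then deletes run one at a time
>>             try:
>>                 os.remove(path)
>>             except OSError:
>>                 pass        # file already gone
>>
>>     # One thread per file, so a thousand or so spawned nearly at once.
>>     for path in paths:
>>         threading.Thread(target=delete_file, args=(path,)).start()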
>
> Thanks for the info. I assume that when you say things should show up
> in the logs, you mean mythbackend.log. When I run the script, I get
> the errors I quoted above... The "can not be found" error seems to
> occur once for every orphaned recording, but I haven't gone back and
> counted. After it runs and I'm back at the find_orphans menu prompt, I
> don't see anything else happening in mythbackend.log. I've had tail -f
> running all day, and all I see are the standard job queue entries,
> etc. After the 100% spike in mysqld (I have two CPUs, but that process
> never goes over 100%), I don't see any real load on the server.
> Mythbackend happily chugs along at very low CPU usage, and my load
> averages tend to stay around 0.3. Anything else I can try?
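>
> Next time mysqld spikes, I may try catching what it's actually doing
> with something like the snippet below (a guess at the right
> incantation; I'm assuming the bindings' cursor() helper behaves like a
> plain MySQLdb cursor):
>
>     from MythTV import MythDB
>
>     db = MythDB()
>     cursor = db.cursor()
>     # Show which queries mysqld is grinding on during the spike.
>     cursor.execute("SHOW FULL PROCESSLIST")
>     for row in cursor.fetchall():
>         print(row)
>     cursor.close()
>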
I've been plugging away with the console open on another monitor. I run
the cleanup, watch the mythbackend log and mysqld CPU usage, and repeat
the delete when everything dies down. I'm down to 523 orphaned files
after two days.

One thing I notice: as it churns through the "should be local, but it
can not be found" errors, it stops at the following:

2012-03-08 16:09:26.425 Scheduled 2643 items in 18.6 = 0.01 match + 18.54 place
2012-03-08 16:09:26.440 Reschedule requested for id 0.
2012-03-08 16:09:45.127 Scheduled 2643 items in 18.7 = 0.02 match + 18.64 place

This is always the last thing in the log after a delete run. I have a
lot of schedules set up with "max recordings" set. My feeling is that
the script chokes when it deletes an episode of something, dropping it
below its "max recordings" cap, and the reschedule kicks in. Does the
"2643 items" mean that I actually have that many schedules set up?
Seems like a lot.
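If it would help narrow that down, I can count my actual rules with the
bindings, something like this (assuming Record.getAllEntries() is the
right call for the 'record' table):

    from MythTV import MythDB, Record

    db = MythDB()
    # Recording rules, as opposed to the scheduler's "items", which I
    # gather are the upcoming showings it placed across all rules.
    rules = list(Record.getAllEntries(db=db))
    print("%d recording rules" % len(rules))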