[mythtv-users] Auto-delete duplicate recordings

Michael T. Dean mtdean at thirdcontact.com
Wed Dec 18 02:11:28 UTC 2013


On 12/17/2013 10:07 AM, George Nassas wrote:
> On Dec 17, 2013, at 12:56 AM, Jacob Strandlien wrote:
>
>> Hello!  I have been trying to find a way to do this for a couple of days now with no luck.
>>
>> I have a large archive of recordings on a well-used mythbox (0.27) and I know I have accumulated a lot of duplicates.  I need to free some space, but going through and manually getting rid of the duplicates would be a monumental task.
>>
>> Is there any mechanism or script available to track down these duplicates, and preferably delete all but the newest recording?  I've searched the web and poked at my interface with no luck.
> I think your only recourse is a direct update to the database, but the main question is: what counts as a duplicate? A good identifier is the programid column, and using that I whipped up this little bit of SQL:
>
> update recorded r1
> set recgroup = 'Duplicates'
> where r1.programid != ''
>    and substr(r1.programid, 11) != '0000'
>    and exists (select 1
>                  from recorded r2
>                 where r1.starttime < r2.starttime
>                   and r1.programid = r2.programid);
>
> It finds recordings whose program ID is not generic (generic IDs end in 0000) and which have a later recording with the same programid, and moves them to a "Duplicates" group where you can review and delete them at your leisure (you can do mass deletes using playlists; ask if you're not sure how). It's easy enough to undo: just change 'Duplicates' to 'Default' and you're back to before. If you're outside North America and don't have reliable program IDs, the columns would have to change; post back for more info.
>
> For people on master, or who are reading this from the future, the recgroup column has changed to an integer so the update won't work for you without a tweak.

A better approach would probably be to use the Python bindings to write
a script that gets all recording information from the backend, sorts it
by title/subtitle and/or programid and/or whatever criteria you use to
define a duplicate, and then lists any group with more than one matching
episode (possibly even asking interactively whether to delete each
duplicate).
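
As a rough illustration of that grouping step, here is a minimal sketch.
It is not taken from any existing MythTV script. It assumes each
recording is an object with programid and starttime attributes, and the
names programid_key and find_duplicates are made up for this example.
The stand-in data lets it run without a backend; with the bindings
installed, the list would presumably come from MythBE().getRecordings()
instead.

```python
from collections import defaultdict
from datetime import datetime
from types import SimpleNamespace  # only used for the stand-in data below


def programid_key(rec):
    """Duplicate key based on programid (hypothetical helper).

    Returns None for empty or generic IDs (ending in 0000), mirroring
    the filter in the SQL quoted earlier in the thread.
    """
    pid = getattr(rec, "programid", "") or ""
    if not pid or pid.endswith("0000"):
        return None
    return pid


def find_duplicates(recordings, key=programid_key):
    """Return every recording that has a newer recording with the same key."""
    groups = defaultdict(list)
    for rec in recordings:
        k = key(rec)
        if k is not None:
            groups[k].append(rec)

    duplicates = []
    for recs in groups.values():
        recs.sort(key=lambda r: r.starttime)
        duplicates.extend(recs[:-1])  # keep only the newest of each group
    return duplicates


# Stand-in data so the sketch runs without a backend.
# With the bindings installed (assumption, untested here):
#   from MythTV import MythBE
#   sample = list(MythBE().getRecordings())
sample = [
    SimpleNamespace(title="Show", programid="EP012345670001",
                    starttime=datetime(2013, 1, 1, 20, 0)),
    SimpleNamespace(title="Show", programid="EP012345670001",
                    starttime=datetime(2013, 6, 1, 20, 0)),
    SimpleNamespace(title="News", programid="SH076543210000",
                    starttime=datetime(2013, 2, 1, 18, 0)),
    SimpleNamespace(title="News", programid="SH076543210000",
                    starttime=datetime(2013, 2, 2, 18, 0)),
]

duplicates = find_duplicates(sample)
for rec in duplicates:
    print(rec.title, rec.programid, rec.starttime)
```

Passing a different key function (title plus subtitle, for instance)
changes what counts as a duplicate without touching the grouping logic.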

You could use http://www.mythtv.org/wiki/Delete_recordings.py as a
starting point, then change it to use MythBE.getRecordings() instead of
MythDB.searchRecorded().  Just as delete_recordings.py lets the user
specify the search criteria on the command line, your script could let
the user specify the "duplicate-matching" criteria there.  See also
http://www.mythtv.org/wiki/0.26_Python_Bindings and
http://www.mythtv.org/wiki/0.25_Python_Bindings/Connection_Handlers#MythBE .
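
One possible shape for command-line matching criteria is sketched
below. The --match option and the parse_args/make_key names are
hypothetical; they are not options of delete_recordings.py. The sketch
only assumes that recordings expose title, subtitle and programid
attributes.

```python
import argparse
from types import SimpleNamespace  # only used for the example record

# Fields a user may combine to define "duplicate" (illustrative choice).
FIELDS = ("title", "subtitle", "programid")


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="List recordings that look like duplicates.")
    parser.add_argument(
        "--match", action="append", choices=FIELDS,
        help="field to match on; repeat to combine (default: programid)")
    args = parser.parse_args(argv)
    if not args.match:
        # Set the default by hand: with action="append", argparse would
        # append to a default list instead of replacing it.
        args.match = ["programid"]
    return args


def make_key(fields):
    """Build a key function from the chosen field names."""
    def key(rec):
        values = tuple(getattr(rec, f, None) for f in fields)
        # A recording missing any chosen field cannot be matched reliably.
        return values if all(values) else None
    return key


# Example: equivalent to running the script with
#   --match title --match subtitle
args = parse_args(["--match", "title", "--match", "subtitle"])
key = make_key(args.match)
example = SimpleNamespace(title="Show", subtitle="Pilot",
                          programid="EP012345670001")
print(key(example))
```

Recordings that produce the same key would then be treated as
duplicates of each other.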

And, if you want to take the safe route, instead of deleting the 
recordings, use the bindings to change the recording group.
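
A sketch of that safe route follows. It assumes each Program object
from the bindings offers getRecorded(), and that the returned Recorded
object can be changed by setting recgroup and calling update(). Check
both against the bindings documentation for your version before relying
on this, particularly on newer code where, as noted above, the recgroup
column is changing. The Fake classes are stand-ins so the sketch runs
without a backend, and move_to_group is a made-up name.

```python
def move_to_group(programs, group="Duplicates", dry_run=True):
    """Move each program's recording into `group`; return how many moved.

    With dry_run=True nothing is changed, only reported.
    """
    moved = 0
    for prog in programs:
        if dry_run:
            print("would move %r to %s" % (prog.title, group))
            continue
        rec = prog.getRecorded()   # assumption: provided by the bindings
        rec.recgroup = group
        rec.update()               # assumption: writes the change back
        moved += 1
    return moved


# Stand-ins so the sketch runs without a backend.
class FakeRecorded(object):
    def __init__(self):
        self.recgroup = "Default"
        self.saved = False

    def update(self):
        self.saved = True


class FakeProgram(object):
    def __init__(self, title):
        self.title = title
        self._rec = FakeRecorded()

    def getRecorded(self):
        return self._rec


programs = [FakeProgram("Show"), FakeProgram("Show")]
dry_count = move_to_group(programs)                  # reports only
real_count = move_to_group(programs, dry_run=False)  # changes the group
```

Defaulting to a dry run means a first pass can be reviewed before
anything is changed.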

Mike

