[mythtv] myth.rebuilddatabase.pl with new filenames

Sat Oct 22 16:51:40 UTC 2005

Carl Reynolds wrote:

> Michael T. Dean wrote:
>
>> <snip... >
>> _Proposal_:
>>
>> <snip... >
>
Heh.  You figured out that I put my "Proposal" header in the wrong 
place.  ;)

>> IMHO, we should create a script that either a) accepts info as 
>> command-line arguments and/or b) prompts the user for info.  (Whether 
>> it accepts command-line args or not, it needs the ability to prompt 
>> the user for info--otherwise, we're forcing the user to deal with 
>> things like maximum command-line length and shell-based 
>> quoting/escaping issues.)  It would be possible to allow the user to 
>> easily provide information by asking for channel id (even listing the 
>> available channel IDs, channel numbers, and channel names and 
>> allowing the user to select an option or type their own).  Then, the 
>> script should ask for the start date/start time (allowing the user to 
>> provide a complete start time or simply the date).  Then, if the user 
>> provided a chanid/date that exists within the data in oldrecorded, 
>> display a list of shows from that date for the user to select or 
>> allow the user to specify a different time.  If the user selected a 
>> show, verify the info and then insert it into the db (and update the 
>> recstatus in oldrecorded).  If not, then continue prompting for the 
>> additional required information.
>>
>> Comments?  Suggestions?
>>
> Sorry for taking so long to reply to you. My hard drive filled up on 
> Wednesday and I have been trying to manage the data on it since then.
>
> It seems to me that using File::Find::Rule instead of glob to search 
> for files would solve most of the problems you have mentioned. We 
> could hard code a list of extensions to look for, or the user could 
> provide a list of extensions in an ASCII file. File::Find::Rule can be 
> told to filter out directory names and files below a certain size and 
> will look in any sub-directories. I haven't thought of files existing 
> on other machines. Don't know how to solve that one yet.
>
> The user would still need to be prompted for the channel-id, and 
> start-, end-date information.
>
> When dealing with massive amounts of files, I don't want to have to 
> type the name of each I want to process at a command line prompt.

Right.  I completely agree, but as most programs accepting lists of 
filenames will also allow wildcard characters, if we did likewise, it 
wouldn't be too difficult to say:

myth.rebuilddatabase.pl --files *.mpg some_other_file.nuv
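A rough sketch of how that option handling might look.  It is written in 
Python rather than Perl purely for illustration, and parse_args and 
expand_files are hypothetical names, not anything in the existing script.  
It assumes the script should accept both names the shell has already 
expanded and quoted patterns it has to expand itself:

```python
import argparse
import glob
import os


def parse_args(argv):
    """Parse the --format/--files options proposed in this thread."""
    parser = argparse.ArgumentParser(prog="myth.rebuilddatabase.pl")
    parser.add_argument("--format", default=None,
                        help="filename format pattern (optional)")
    parser.add_argument("--files", nargs="+", required=True,
                        help="file names and/or wildcard patterns")
    return parser.parse_args(argv)


def expand_files(patterns):
    """Return existing files named by the arguments, without duplicates.

    An argument that names an existing file is taken literally (so names
    containing '[' or '*' still work); anything else is treated as a
    wildcard pattern, which covers patterns the user quoted on the
    command line.
    """
    files = []
    for pattern in patterns:
        if os.path.exists(pattern):
            matches = [pattern]
        else:
            matches = sorted(glob.glob(pattern))
        for name in matches:
            if name not in files:
                files.append(name)
    return files
```

With something like this, --files *.mpg and --files '*.mpg' would end up 
with the same list, and a pattern that matches nothing simply contributes 
no files.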

> Cutting the info down to a [Y/n] prompt or no prompt at all is ideal, 
> but since the new naming scheme leaves certain critical information 
> out of the file name, I don't see how to eliminate the prompt 
> altogether.

more below...

> It might be possible to derive the start and end dates by looking at 
> the creation and last modified dates for the file, but those are just 
> guesses since these dates would not be accurate if the file was 
> transcoded (for instance).

Yeah.  I think that's too fragile a solution.  It may also fail for moved 
or copied files, or on filesystems where these attributes are not 
reliably maintained.

> But, even if we can derive the start and end dates from the file 
> times, we would still have to prompt the user for the channel-id.
>
> It might also be possible to allow an option for the user to pass the 
> name of an ASCII file to the script. The file would contain a list of 
> the file names and associated channel-ids and start-, end-dates. 
> Creating the list of files would be fairly straightforward for the 
> average user using 'ls' and an editor, but then entering the 
> channel-ids and dates could get really cumbersome because the script 
> would have to allow for some variance in personal style.

The variance part would make that difficult.  It wouldn't be too bad if 
we only accept the filenames and then prompt for info, but...

> I'm not sure what you intend to accomplish by re-writing the entire 
> script.  The basis of what needs to be done is contained in the current 
> script, and the user/machine interfaces need to be tweaked to make the 
> script perform properly and be easier to use. Could you tell me why you 
> think a total re-write is necessary? Also comment on my proposals 
> above and let's see if we can't work out a more usable interface.

I'm suggesting that trying to fit the old script to the new requirements 
may not be the best approach.  If we re-write it--thus throwing away all 
the assumptions it makes--we have much more flexibility.

So, I would like to see it modeled after Chris Petersen's 
mythrename.pl.  Since many users will probably rename files with his 
script, we could provide the same filename parsing constructs allowing 
the user to specify a filename pattern.  Since it's possible that not 
all files that would match any given criterion (e.g. a certain filename 
extension, ...) would match the filename format pattern, we would require 
the user to "list" the files (possibly with wildcards) to import.  Then, the 
user could run myth.rebuilddatabase.pl once for each filename format 
pattern.  For example:

myth.rebuilddatabase.pl --format '%T - %Y%m%d - %H%i - %eH%ei - %S' \
                        --files 'Nova - 200510*.mpg'
myth.rebuilddatabase.pl --format '%Y%m%d - %H%i - %eH%ei - %T - %S' \
                        --files '200507*.mpg'
myth.rebuilddatabase.pl --files '*.myth'

and, for the "classic" approach:

myth.rebuilddatabase.pl --format '%c_%Y%m%d%H%i%s_%eY%em%ed%eH%ei%es.nuv' \
                        --files '*.nuv'

Then, prompt for any missing information (which would be all information 
if no format is specified).  Also, don't worry about checking what's 
already in the database (since files may not exist locally), or about 
checking files that weren't "listed" as ones to import.
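To make the format idea concrete, here is a rough sketch of turning a 
format string into a regular expression and working out what still needs 
prompting.  It is Python rather than Perl, purely for illustration.  The 
token meanings (%T title, %S subtitle, %c chanid, %Y%m%d%H%i%s start time, 
an "e" prefix for end time) are my reading of the examples above, not 
checked against mythrename.pl, and the REQUIRED list and function names 
are hypothetical:

```python
import os
import re

# Assumed token meanings, inferred from the example formats in this mail.
TOKENS = {
    "T": ("title", r".+?"),
    "S": ("subtitle", r".+?"),
    "c": ("chanid", r"\d+"),
    "Y": ("year", r"\d{4}"),
    "m": ("month", r"\d{2}"),
    "d": ("day", r"\d{2}"),
    "H": ("hour", r"\d{2}"),
    "i": ("minute", r"\d{2}"),
    "s": ("second", r"\d{2}"),
}
TIME_TOKENS = set("YmdHis")

# Hypothetical minimum needed before inserting a recording.
REQUIRED = ("chanid", "title", "start_year", "start_month", "start_day",
            "start_hour", "start_minute", "end_hour", "end_minute")


def format_to_regex(fmt):
    """Compile a format string into a regex with one named group per token."""
    parts = []
    pos = 0
    while pos < len(fmt):
        if fmt[pos] != "%":
            parts.append(re.escape(fmt[pos]))  # literal character
            pos += 1
        elif fmt[pos + 1:pos + 2] == "e" and fmt[pos + 2:pos + 3] in TIME_TOKENS:
            name, pattern = TOKENS[fmt[pos + 2]]  # %eX: end-time field
            parts.append("(?P<end_%s>%s)" % (name, pattern))
            pos += 3
        elif fmt[pos + 1:pos + 2] in TOKENS:
            letter = fmt[pos + 1]
            name, pattern = TOKENS[letter]
            if letter in TIME_TOKENS:
                name = "start_" + name
            parts.append("(?P<%s>%s)" % (name, pattern))
            pos += 2
        else:
            raise ValueError("unsupported token at position %d in %r" % (pos, fmt))
    return re.compile("".join(parts))


def parse_filename(fmt, filename):
    """Return the fields found in filename, or {} if it does not match.

    The name is tried without its extension first, then with it, since the
    examples above include the extension only in the "classic" format.
    """
    regex = format_to_regex(fmt)
    base = os.path.basename(filename)
    for candidate in (os.path.splitext(base)[0], base):
        match = regex.fullmatch(candidate)
        if match:
            return match.groupdict()
    return {}


def missing_fields(fields):
    """List the REQUIRED fields the script would still have to prompt for."""
    return [name for name in REQUIRED if name not in fields]
```

For a file that matches the first example format, missing_fields would 
report only chanid, so the user gets a single prompt.  For a file that 
matches nothing (or when no format is given), everything in REQUIRED is 
prompted for.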

The main points I'm set on are that:
    a) we shouldn't be looking for all files that match some generic 
pattern--it makes more sense to let the user specify the files (either 
by name or using their own pattern)
    b) we shouldn't be verifying the existence of files for all database 
entries--they may not exist locally and, with current revs, the user can 
delete them from Myth if the file doesn't exist, so they can scroll 
through the list and see which ones don't play the preview if they're 
really concerned (or just run an appropriate 'SELECT count(*)' and 'ls 
-1 | wc -l').

Everything else is negotiable (of course, since I'm not a committer, I 
guess everything is truly negotiable :).  I guess I should be very 
careful about responding to threads or someone might expect me to write 
the new script...  ;)

And on that note, I recognize that just putting progstart and progend 
into the existing script (along with the basename changes I posted 
before) may be more appropriate since it will work for most cases 
(assuming users properly rename their files with the old filename format 
and don't have any .nuv files with the new format/don't mind saying no 
to prompts for the new nuv files) and the script is not used that often, 
anyway.  (Or, we can change the glob to *.myth--so it doesn't match 
existing files and still have them use the old filename format.)  The 
changes I'm proposing would take quite a bit more time than the simple 
fix, and since the script is not used that often, they may not be 
worthwhile.

Mike