[mythtv-commits] Ticket #278: Update mythlink.sh to use Perl code in mythlink.pl and add extra functionality to mythlink.pl

MythTV mythtv at cvs.mythtv.org
Mon Aug 29 23:00:57 UTC 2005

#278: Update mythlink.sh to use Perl code in mythlink.pl and add extra
functionality to mythlink.pl
 Reporter:  mtdean at thirdcontact.com  |       Owner:  xris
     Type:  patch                    |      Status:  new 
 Priority:  minor                    |   Milestone:      
Component:  mythtv                   |     Version:      
 Severity:  low                      |          Cc:      
 Most importantly, the default behavior of both the mythlink.pl and
 mythlink.sh scripts is preserved.  :)  I'm assigning to xris since he
 wrote mythlink.pl.

 The attached patch removes all the Perl code from mythlink.sh's here
 document and instead uses mythlink.pl to create the links.  The new
 mythlink.sh simply provides a means of grouping calls to mythlink.pl;
 however, making that work required some changes to mythlink.pl.

 Instead of adding complex functionality to the mythlink.sh script to
 locate the mythlink.pl script, the user can specify an environment
 variable with the appropriate command name.  The default value assumes
 that the mythlink.pl script is in the user's PATH.
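
 A minimal sketch of that lookup, assuming a variable named MYTHLINK_CMD
 (the name is hypothetical; this message does not say what the patch
 calls it):

```shell
#!/bin/sh
# MYTHLINK_CMD is a hypothetical name used for illustration only.
# If it is unset or empty, fall back to "mythlink.pl" and let the shell
# find the script through PATH.
resolve_mythlink_cmd() {
    printf '%s\n' "${MYTHLINK_CMD:-mythlink.pl}"
}

# Each group of links would then be one call to the resolved command, e.g.
#   "$(resolve_mythlink_cmd)" --space-char _
resolve_mythlink_cmd
```

 A user whose copy is not in PATH would export the variable with the full
 path to the script before running mythlink.sh.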

 The patch makes the following changes to mythlink.pl:

 Adds --separator-char (or --sepchar) option (to allow removal of trailing
 separator even when separator and replacement characters are different).
 The format string may either use literal separator characters or use the
 "%p" specifier to output separators (although the literal character is
 probably more readable).  The primary reason for this option is to allow
 the script to clean up trailing separator characters or multiple
 separators caused by missing data.

 Adds --replacement-char (or --repchar) option

 Adds --space-char (or --space) option (to allow creation of link names
 without spaces)

 Adds format options for:
   * category (%C)
   * recording group (%U)
   * end time (same format specifiers as for start time, but with "e"
 prepended to the format code, e.g. %eG)
   * original airdate (same format specifiers as for year/month/day, but
 with "o" prepended to the format code, e.g. %oY)  (Handles null values
 for original airdate.)

 Makes explicit variables for default values and provides the defaults to
 Getopt instead of doing the "||=".  The explicit defaults ensure the
 defaults specified in the "--help" message are actually the defaults, even
 if the user provides a value for the given option. (OK, I know; that's a
 bit obsessive...)  More importantly, they also allow me to use the
 previously-specified defaults to replace illegal filename characters
 passed in for replacement-char or space-char.  (We could just die,
 instead, if you don't like this.)  Using Getopt to set the defaults just
 looked prettier, but if there's a reason not to do this, feel free to
 change it back (or let me know and I'll fix the patch).
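
 The fallback idea can be sketched in shell (the patch itself does this in
 Perl with Getopt, which is not reproduced here).  The default values and
 the set of illegal characters below are assumptions for illustration, not
 taken from the patch:

```shell
#!/bin/sh
# Assumed defaults; this message does not list the real ones.
DEFAULT_REPLACEMENT_CHAR='-'
DEFAULT_SPACE_CHAR=' '

# Print $1 unless it is empty or one of the (assumed) illegal filename
# characters, in which case print the default given as $2.
char_or_default() {
    case "$1" in
        ''|'/'|':'|'*'|'?'|'<'|'>'|'|'|'"')
            printf '%s\n' "$2" ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}

# An illegal replacement character falls back to the default ...
char_or_default '*' "$DEFAULT_REPLACEMENT_CHAR"
# ... while a legal space character is kept as given.
char_or_default '_' "$DEFAULT_SPACE_CHAR"
```

 Dying with an error instead, as offered above, would just mean replacing
 the first branch with an error message and a non-zero exit.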

 Way too much detail on the illegal filename character substitution
   * After spending a lot of time looking through the modifications I
 convinced Chris (over IRC) to make in order to get something working, I
 realized that those changes lack some of the functionality of the original
 substitution (specifically grouping of multiple space*charspace*
 sections into one).  Therefore, I went back to a substitution very similar
 to Chris's original (from #7025) and started working from there.
   * Made the substitution that converts multiple spaces into a single
 space global
   * Added support for the $separator_char, $replacement_char, and
 $space_char variables
   * Modified the original illegal character cleanup.  My understanding of
 the desired functionality is:
     * Given: optional space characters (_), separator characters (-),
 replacement characters (=), and illegal characters (*)
        * We want to handle situations where some sections of the data are
 missing such that only one space/separator/space exists in the final name,
        * Separator characters at the end of a link name (and any space
 characters before/after them) should be removed, i.e.:
        * Replacement characters may make reading the data easier, but
 multiple contiguous replacement characters would make reading more
 difficult.  Therefore, back-to-back replacement characters should be
 consolidated into a single character.  However, replacement characters
 separated by spaces are probably important and should not be consolidated.
        * In the event that the same character is used as a separator and a
 replacement character or as a space and a replacement character, the
 separator/space consolidation rules will apply.
     * The regex I've included seems to follow these rules.  I have 171
 shows, some of which have several illegal characters, and it seems to
 work well with them all.  I even modified the data in my database to test
 situations that weren't covered by the existing data.
     * To make this work while also handling illegal characters at the end
 of the name, I removed the positive look-ahead from the substitution.
     * I also used capturing for the space characters surrounding
 separators to allow the use of space-separator-space (i.e. " - ") or just
 separators ("-") without spacing problems.
     * Some of the replacements leave artifacts (multiple contiguous
 spaces, for example) that are later consolidated.
     * I left separators at the beginning of the link name because it
 seemed more understandable than just discarding them.
     * There are probably far more efficient ways of doing this, so feel
 free to modify as desired.  I'd be happy to help test.
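
 The rules above can be approximated with a few sed substitutions.  This is
 only a sketch of the described behavior, not the regex from the patch: it
 hardcodes a space (" "), a "-" separator, and a "=" replacement character,
 and the set of illegal characters is an assumption:

```shell
#!/bin/sh
# Sketch only: characters and the illegal set are assumptions, hardcoded
# to keep the sed expressions readable.
#   space = " "   separator = "-"   replacement = "="
clean_link_name() {
    # 1. replace each illegal character with the replacement character
    # 2. collapse back-to-back replacement characters (those separated by
    #    spaces are left alone)
    # 3. collapse runs of spaces into one
    # 4. collapse repeated separators left by missing data, keeping the
    #    captured outer spaces so both " - " and "-" styles work
    # 5. drop trailing separators and the spaces around them
    printf '%s\n' "$1" | sed -E \
        -e 's,[/:*?<>|"],=,g' \
        -e 's/==+/=/g' \
        -e 's/  +/ /g' \
        -e 's/( *)-( *-)+( *)/\1-\3/g' \
        -e 's/( *-)+ *$//'
}

clean_link_name 'CSI: Miami -  - 2005'   # missing data between separators
clean_link_name 'What?? - '              # trailing separator
clean_link_name 'A * * B'                # replacements separated by spaces
```

 With these inputs the function prints "CSI= Miami - 2005", "What=", and
 "A = = B".  As in the patch, separators at the beginning of a name are
 left in place.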

Ticket URL: <http://cvs.mythtv.org/trac/ticket/278>
MythTV <http://www.mythtv.org/>
