[mythtv-users] MythTV's mysql d/b - should it be UTF-8?

Michael T. Dean mtdean at thirdcontact.com
Tue May 8 13:30:08 UTC 2007


On 05/08/2007 05:13 AM, Neil Bird wrote:
>    Currently, the default seems to [still] be latin1 as the encoding, not 
> utf-8.  This does make manual/scripted access a tad tricky when there are 
> non-ASCII characters present.
>
>    Now, my understanding of it is that it shouldn't actually make any 
> difference to Myth how the table is configured in that respect, as the text 
> will still be UTF-8 encoded, but I may be wrong.  So:
>
> a) *Does* it make a difference?
>   

Yes!!! Very much so.  If you change the charset on your tables, your data 
will become corrupt, whether at conversion, during normal usage (because 
Myth assumes you're using a latin1 database), or during a future upgrade.  
Myth can assume a latin1 database because mc.sql sets the default charset 
for the DB before any tables are created, and 
http://svn.mythtv.org/trac/changeset/8922 resets the charset on database 
upgrades to fix cases where DBMS upgrades change the default charset.  (In 
other words, the only way you won't be using a latin1 database/tables is 
if you change them yourself, which you shouldn't do because Myth was 
designed to use latin1.)
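
To illustrate the "corrupt at conversion" case, here is a minimal Python 
sketch of what a charset conversion does to a value that Myth wrote as 
UTF-8 bytes into a latin1 column.  It assumes the server transcodes the 
stored bytes from latin1 to UTF-8, and it uses Python's latin-1 codec as a 
stand-in for MySQL's latin1 (the two are not byte-for-byte identical).  The 
function name and the sample title are made up for the example.

```python
def simulate_latin1_to_utf8_conversion(stored: bytes) -> bytes:
    """Mimic a server-side latin1 -> utf8 transcode of one column value."""
    # The server believes each stored byte is one latin1 character...
    as_latin1_text = stored.decode("latin-1")
    # ...and re-encodes each of those characters as UTF-8.
    return as_latin1_text.encode("utf-8")


title = "Am\u00e9lie"              # "Amelie" with an e-acute
stored = title.encode("utf-8")     # the bytes Myth puts in the latin1 column
converted = simulate_latin1_to_utf8_conversion(stored)

print(stored)     # b'Am\xc3\xa9lie'
print(converted)  # b'Am\xc3\x83\xc2\xa9lie'

# Read back as UTF-8, the title is no longer the original: the single
# e-acute has become two characters (A-tilde, copyright sign).
assert converted.decode("utf-8") == "Am\u00c3\u00a9lie"
assert converted.decode("utf-8") != title
```

Plain ASCII values pass through unchanged, which is why this kind of 
damage tends to show up only on titles with non-ASCII characters.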

> b) If not, how can I safely convert my tables from latin1 to UTF-8?  I tried 
> a couple of simple recipes that Mr. Google pointed me to, but nothing was 
> quite right.
>   

N/A

> c) Is the latin1 default correct?  I'm going by the mysql.txt (?) that comes 
> with MythTV & contains the d/b creation commands.

Yes.  You /must/ use latin1.  See 
http://svn.mythtv.org/trac/ticket/2775#comment:5 .  Specifically, the line:

----
It is necessary to use latin1 since we use it as 8bit storage for utf-8 
strings to avoid the space penalty of using utf-8 in mysql.
----

If you're having issues with characters, you should fix the Myth code to 
properly handle the characters with a latin1 DB/tables.  If you just 
want utf-8 to make it easier to script access, someone else will have to 
help you with how to retrieve characters properly when the DB is latin1.
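
For scripted access, one possible approach is sketched below in Python.  
It assumes the script connects with a latin1 connection charset, so that 
the client library hands back each stored byte as one latin1 character, and 
that the stored bytes are really UTF-8 as described in the ticket comment 
above.  The helper names are hypothetical, and Python's latin-1 codec is 
only an approximation of MySQL's latin1, so treat this as a starting point 
rather than a tested recipe.

```python
def recover_utf8(text_from_latin1_connection: str) -> str:
    """Undo a latin1 decode of bytes that were really UTF-8."""
    # Turn each character back into the single byte it came from...
    raw = text_from_latin1_connection.encode("latin-1")
    # ...then decode those bytes as the UTF-8 they actually are.
    return raw.decode("utf-8")


def prepare_for_latin1_connection(text: str) -> str:
    """Inverse: build a string whose latin1 encoding is the UTF-8 bytes."""
    return text.encode("utf-8").decode("latin-1")


# What a script sees over a latin1 connection for a title with an e-acute:
seen_by_script = "Am\u00c3\u00a9lie"
assert recover_utf8(seen_by_script) == "Am\u00e9lie"

# Round trip with characters that latin1 itself cannot represent:
original = "\u65e5\u672c"
assert recover_utf8(prepare_for_latin1_connection(original)) == original
```

If a value in the database is not valid UTF-8, `recover_utf8` raises 
`UnicodeDecodeError`, which is a useful signal that the row was written by 
something other than Myth or has already been damaged.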

Mike

