[mythtv-users] Questions on correcting database encoding under Gentoo while still on 0.21-fixes

Tom Dexter digitalaudiorock at gmail.com
Mon Nov 2 20:15:21 UTC 2009


On Mon, Nov 2, 2009 at 2:41 PM, Michael T. Dean <mtdean at thirdcontact.com> wrote:
> On 11/02/2009 01:26 PM, Tom Dexter wrote:
>>
>> I've been running mythtv under Gentoo all along, and have always had
>> the default Gentoo mysql configuration where the default connections
>> etc are utf8.
>>
>> I'm still on 0.21-fixes but wanted to get this addressed now rather
>> than when I decide to upgrade to 0.22, so I did so as per the wiki:
>>
>> http://wiki.mythtv.org/wiki/Fixing_Corrupt_Database_Encoding
>>
>> I've always had the same backend and same configuration, so there's no
>> reason I can think of that I'd have any "Partial" corruption.
>>
>> After changing the my.cnf and restarting mysql on the backend as per this
>> part:
>>
>>
>> http://wiki.mythtv.org/wiki/Fixing_Corrupt_Database_Encoding#Changing_the_MySQL_server_configuration
>>
>> Luckily before doing the restore I connected to the db and checked the
>> 'status' command output.  It was still defaulting the connection etc
>> to utf8.  Rather than removing the "character-set-server" and
>> "default-character-set" lines in my.cnf as per the instructions, I
>> discovered that I needed to expressly change them from utf8 to latin1.
>>
>
> Yeah, just getting rid of the explicit change to utf8 and using the default
> is rather fragile as there may be my.cnf or ~/.my.cnf or scripts or
> something that change the mysql command to use utf8, still.  I'm guessing,
> though, that USE=utf8 must specify --with-charset=utf8 when compiling, so it
> explicitly changes the MySQL default in the binaries (which has always been
> latin1).  I modified the sed to change character sets to explicitly specify
> latin1.
>

Yeah, that's what's happening, though the USE flag is actually
'latin1'.  I just looked at the mysql.eclass code used for that and
found this:

	if mysql_version_is_at_least "4.1" && ! use latin1 ; then
		myconf="${myconf} --with-charset=utf8"
		myconf="${myconf} --with-collation=utf8_general_ci"
	else
		myconf="${myconf} --with-charset=latin1"
		myconf="${myconf} --with-collation=latin1_swedish_ci"
	fi

...so for MySQL 4.1 or later, unless the latin1 USE flag is set, the
binary is always built with utf8 as its default.
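
Given that, it seems worth checking what a my.cnf sets explicitly
rather than trusting the compiled-in default.  Here is a rough sketch
of how I'd list the relevant lines.  The function name
list_charset_settings is just something I made up, and the option names
it looks for are the ones mentioned in this thread plus
collation-server; a given my.cnf may well use others:

```shell
# Hypothetical helper: print every active (uncommented) character-set or
# collation setting in a my.cnf-style file, prefixed with its line number,
# so that none is left at utf8 by accident.
# Both the dash and the underscore spellings of the option names are matched.
list_charset_settings() {
    grep -nEi '^[[:space:]]*(default[-_]character[-_]set|character[-_]set[-_]server|collation[-_]server)[[:space:]]*=' "$1"
}
```

Running it as "list_charset_settings /etc/mysql/my.cnf" (assuming that
is where your my.cnf lives) on both the backend and the frontend should
show at a glance whether anything still says utf8.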

> This is why it would have been /much/ easier for someone who actually had
> this issue to figure out/write up the fix for it.  :)
>
>>  Once I did that and restarted, everything looked correct.  At that
>> point I went ahead with the restore from the corrected backup file and
>> everything seems ok.  Had I not noticed that, I'm assuming future
>> database writes would have caused a mess of "Partial" corruption.
>>
>> I think (though I'm not certain) that this may be related to the fact
>> that Gentoo has a latin1 use flag for mysql (which is described as
>> "Use LATIN1 encoding instead of UTF8") that is off by default.  I
>> think this causes it to use utf8 by default if not otherwise specified
>> in my.cnf.
>>
>> I also noticed that if I connected to the backend from the frontend
>> (on which I have a minimal mysql install with client code only) using
>> the mysql command line, the default client and connection character
>> sets defaulted to utf8 (however the server character set and db
>> character sets were properly set to latin1).  To be safe I made the
>> same change in the my.cnf file on the frontend to ensure that
>> everything connected with latin1 by default.  Once I did that,
>> connecting to the backend from the frontend showed latin1 for
>> everything.  I assumed that not doing this could possibly affect any
>> database writes made directly to the db from the frontend (if that
>> actually happens).
>>
>> I wanted to make sure that there isn't something wrong with anything
>> I've done there.  I'd think the wiki needs something about this or
>> someone could make quite a big mess, unless they're doing this as part
>> of the 0.22 upgrade...as I understand it, none of those connection
>> defaults etc should matter in 0.22.
>>
>> I was also curious why the instructions don't specify using the mc.sql
>> to create the database...that's what I used.
>
> Mainly because the mc.sql adds a GRANT that may mess up the user's DB
> access.  If you're using mythtv/mythtv as the username/password, mc.sql is
> no problem.
>

Ah...that makes sense.  I did make sure to do a grant afterwards to
allow access from my LAN.

> Also, if you use the mc.sql from 0.22-fixes or above it will set the default
> DB character set to utf8.  If you don't it will use the server default,
> which--after the changes mentioned above--should be latin1.  However, the
> backup /should/ handle it OK (as it specifies CHARSET=latin1 for each CREATE
> TABLE), but it's "safer" to have the DB charset set to latin1, too, just in
> case.
>
> Remember, too, that once you upgrade to 0.22-fixes, you can restore your
> original mysql configuration.
>
> Thanks for the feedback.
>
> Mike

Thanks!  Do you know offhand whether I might have run into problems if I
hadn't changed the my.cnf on the frontend?  As I explained above, leaving
it alone appears to mean that connections from the frontend to the
(newly corrected) backend would have this by default:

Server characterset:	latin1
Db     characterset:	latin1
Client characterset:	utf8
Conn.  characterset:	utf8

With the frontend's my.cnf changed the above are all latin1.  Would
missing that have caused an issue, or would data just get properly
translated and stored?  Then again, I guess that would only be an
issue for anything (if there is in fact any situation) where the
frontend writes directly to the database.
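
For what it's worth, here is a rough way to check for that mismatch
without eyeballing the status output every time.  It reads the output
of the mysql client's 'status' command on stdin and prints any
characterset line that isn't the expected value.  The function name
check_charsets is made up, and the sketch assumes the output is laid
out like what I pasted above:

```shell
# Hypothetical helper: read mysql 'status' output on stdin and print every
# "characterset:" line whose value differs from the expected character set
# (first argument, latin1 if omitted).
# Exit status is 0 when all lines match and 1 when any line differs.
check_charsets() {
    expected="${1:-latin1}"
    awk -v want="$expected" '
        /characterset:/ {
            # The character set name is the last field on the line.
            if ($NF != want) { print; bad = 1 }
        }
        END { exit bad }
    '
}
```

Something like "mysql -h BACKEND -u mythtv -p mythconverg -e status |
check_charsets latin1" should then print nothing when everything is
latin1, where BACKEND and the user name are placeholders for whatever
your setup uses.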

Tom

