[mythtv-users] error when fixing database forced by mythtv upgrade
Michael T. Dean
mtdean at thirdcontact.com
Sun Sep 27 19:48:06 UTC 2009
On 09/27/2009 01:59 PM, Petr Stehlik wrote:
> Michael T. Dean píše v Ne 27. 09. 2009 v 08:51 -0400:
>
>>> BTW, the SELECT * FROM mythconverg.people where name like "Samuel Iva%"
>>> looks as follows:
>>>
>>> "person","name"
>>> 2871,"Samuel Ivaška (Slovenská republika 1983)"
>>> 4094,"Samuel Ivaška (Slovenská republika 1988)"
>>> 644,"Samuel IvaÅ¡ka HrajÃ: VladimÃr Hajdu"
>>> 3686,"Samuel IvaÅ¡ka. HrajÃ: Ferdinand Libant"
>>> 37451,"Samuel IvaÅ¡ka. HrajÃ: Ján Havrila. (Slovenská republika
>>> 1988)"
>>>
>>> I cannot see any duplicate entry there.
>>>
>> The duplication is caused by truncation during conversion due to invalid
>> encoding of some of your entries.
>>
>
> See the following output of the same SELECT ran on the original
> 0.21-fixes - no duplicates:
>
> +--------+-------------------------------------------------------------------+
> | person | name |
> +--------+-------------------------------------------------------------------+
> | 2871 | Samuel Ivaška (Slovenská republika 1983) |
> | 4094 | Samuel Ivaška (Slovenská republika 1988) |
> | 644 | Samuel Ivaška Hrají: Vladimír Hajdu |
> | 3686 | Samuel Ivaška. Hrají: Ferdinand Libant |
> | 37451 | Samuel Ivaška. Hrají: Ján Havrila. (Slovenská republika 1988) |
> +--------+-------------------------------------------------------------------+
> 5 rows in set (0.05 sec)
>
If that's what you're seeing from the mysql command-line client when
queried against your 0.21-fixes database, it proves that every single
one of those lines is corrupt. The "š", the "í", and the "á" characters
should not show up properly in the mysql client unless the column or
connection was actually set up as a UTF-8 column or connection at one
time (when the data was inserted). They should show up as 2
characters. ("š" = "Å¡", "í" = "Ã" (which is 2 chars, but one isn't
visible), and "á" = "á") You'll note that in your query after
restoring the "uncorrupted" backup, they're showing the proper characters.
(You never followed the instructions that used to be on the wiki that
changed the database schema, converting some columns to utf8, to try to
"fix" the OSD data, etc., did you? If so, that's how your data got
corrupted.)
> It's probably the right time to show the output of MySQL status:
>
> Server version: 5.0.51a-24+lenny2 (Debian)
> Server characterset: latin1
> Db characterset: latin1
> Client characterset: latin1
> Conn. characterset: latin1
>
> Weird, now I have the database in a different state than before. It used
> to be (sometime around the 0.22 upgrade, not sure if before or after the
> failed upgrade) as follows:
>
> Server characterset: latin1
> Db characterset: utf-8
> Client characterset: latin1
> Conn. characterset: latin1
>
> This is different from the example on the web...
>
The Db characterset will get changed to utf-8 when you first run a
mythtv trunk program or if you use the mythtv trunk mc.sql to create the
DB (instead of using the 0.21-fixes mc.sql to create the DB). It
/should/ not cause any problems if you're using a good/not-corrupt full
DB backup or creating a brand-new DB.
> Funny, so if now I have all in 'latin1' maybe I could re-try the
> upgrade?
>
It's your data that's bad, not your (current) DB config.
>> Since the original upgrade encoding check failure occurred during the
>> oldrecorded table test, that means that the people and oldprogram tables
>> are not corrupt. Therefore, you have the dreaded partial corruption
>> problem. Unfortunately, since one of the corrupt tables is
>> oldrecorded--which is one of the ones you need to restore during a
>> partial restore--even a partial restore won't work.
>>
> Do I really need those tables? All I want is the setup configuration,
> the recorded programs and the recording rules. Oldrecorded sounds like
> something old that I no longer need.
>
oldrecorded is the recording history. If you want to throw away years
of history and start over with duplicate matching, feel free to throw
away that table.
>> Your database likely became partially corrupt due to either switching
>> your MySQL server from a properly-configured one to an improperly
>> configured one (i.e. moving the database between distros)
>>
> well, I have upgraded Etch to Lenny recently... Does that count as
> moving between distros?
>
All that matters is how the database server was configured--if it was
ever misconfigured at any time during your use of MythTV, data during
that time was corrupted.
>> My recommendation is--for now--stay on 0.21-fixes
>>
> I am back on 0.21-fixes now but I'd like to try that upgrading again.
> Could you please suggest which tables could eventually be truncated
> safely given that I don't need anything old, just the actual recordings
> that are currently saved on the disk? For example the "people" table - I
> didn't know that MythTV collects such information and I never used it
> (and likely never will).
The recommended data to keep when doing a partial restore is that in
record (recording rules), recorded (current recordings), oldrecorded
(recording history), recordedprogram and recordedrating (additional
information about current recordings), and recordedmarkup and
recordedseek (bookmarks, commercial flagging, cutlists, and seektables).
recordedprogram and recordedrating aren't /that/ important (especially
for "watch it and delete it" people), and recordedmarkup and
recordedseek can be completely re-created from scratch (with no data
loss) by running either mythtranscode --buildindex or mythcommflag
--rebuild and then running mythcommflag on every recording.
Unfortunately, though, chances are recordedprogram and, especially,
recordedrating won't cause any issues and recordedmarkup/recordedseek
can not be corrupted, so skipping any of those is unlikely to make any
difference.
Mike
More information about the mythtv-users
mailing list