[mythtv-users] MySQL related BE deadlocks - collective wisdom needed

Warpme warpme at o2.pl
Wed Aug 24 11:46:36 UTC 2011


On 8/5/11 9:44 AM, Warpme wrote:
> On 8/4/11 10:39 PM, Michael T. Dean wrote:
>> Thought I'd mention that Daniel K just pushed a change that reworks
>> reconnections.  It removes the MySQL-library provided auto-reconnect
>> that I just put in (for the exact same reasons MySQL changed their API
>> to disable auto-reconnect--because it can't work with certain
>> application designs, and MythTV has changed since 0.21 to a design where
>> auto-reconnect cannot work).  So there's no need for further testing
>> that patch.
>>
>> We have high hopes that this will work around the frequent MySQL
>> connection drop-out issues some users were seeing.  And I'd still be
>> very interested to hear details if anyone can identify the specific
>> MySQL server/client library configuration that's causing these frequent
>> drop outs.
>>
>> If you're running master, please pull an update (to 4dfcdb8dd0c80 or
>> later), and please report back if it helps.  For those of you on
>> 0.24-fixes, I will try to create a backport patch and post it here for
>> testing later today.  We plan to let this sit in unstable for a bit of
>> testing and to prove it actually works before pushing it to -fixes, 
>> though.
>>
>> Mike
>>
> Daniel, Mike,
>
> Many thanks for working on this issue!
> At some point I was almost sure the issue was specific to my setup and 
> that I couldn't expect you to lose time on this particular one.
>
> Browsing the Internet, I can't find any bug reports describing an issue 
> like mine. Taking MySQL's popularity into account, it seems the root 
> cause of my problem isn't within the mysqlclient lib itself but rather 
> in the environment/app which uses this lib.
>
> In the last few days I switched back to MyISAM and upgraded to 5.5.15. 
> With this change I still have at least one deadlock per day (with the 
> famous stack trace).
> So, 2 days ago I decided to do a full reinstall and upgrade of ALL 
> system packages in one step. I'm now in the testing phase.
> As I'm trying to find the root cause, I want to test one change at a 
> time, so I will apply Daniel's commits as soon as I get the first 
> deadlock.
>
> BTW, what I observe: this lib is reentrant and, during a hang, I see 
> other threads successfully accessing the DB. It looks like the lib 
> hangs only in the context of a given thread.
> Looking for hangs in vio_net, I found an interesting thread in the 
> MySQL bug tracker: http://bugs.mysql.com/bug.php?id=33384
> It is about a lib crash, not a hang, but with quite a similar stack 
> trace - so maybe it is somehow correlated?
>
> br 
Daniel, Mike,

Sorry for the long silence - I was on holiday :-)

3 weeks ago I did a full system upgrade.
I have 33 test recording rules configured, plus on average 10-15 user 
recording rules. This gives 45-50 recordings per day.

Running 20110803-ge41e314 (upgraded OS, but without Daniel's recent MySQL 
enhancements) for 5 days, I observed the following:
- so far no #9773 (myth_proto) type deadlocks
- one #9792-like (scheduler) deadlock (but with slightly different 
  symptoms)
- the trace of that deadlock has no references to 
  /usr/lib/libmysqlclient_r.so.16

Next I upgraded to 20110807-gda10d33 (with 4dfcdb8: Fix SQL reconnection 
logic. Refs #9704. Refs #9773. Refs #9792;).
Running it for 5 days of tests:
- no #9773 or #9792 deadlocks
- in that 5-day period, 1 successful DB reconnect was reported.

Next I upgraded to 20110812-g50606cd. This build has been running 
continuously since 08/12.
- no deadlocks at all
- no DB reconnects in the BE log.

My conclusions:

1. It looks like the deadlocks with entries referencing 
/usr/lib/libmysqlclient_r.so.16 are most likely the result of a specific 
combination of OS components: after changing the OS components I wasn't 
able to catch any in 10 days. Before the upgrade I had on average 1 per 
day.
If I read the traces correctly, the components directly involved in the 
stack trace are: kernel, libpthread, mysql, Qt & myth.
The OS upgrade from 08/03 changed: kernel, libpthread (as part of glibc) 
& myth.
I think the best candidate for the root cause is glibc. The upgrade was 
from 2.13 to 2.14.
The second candidate is the kernel. The upgrade was from 2.6.38.7 to 
2.6.39.3.
I plan eventually to test with the kernel reverted, but I'm not sure 
whether I will find the right time to do this, as I'm experimenting on a 
production system.

2. It looks like 4dfcdb8 (Fix SQL reconnection logic) solves deadlocks 
like those in #9773 & #9792, but the report of a successful DB 
reconnection suggests that there is still room for improvement.

3. 20110812-g50606cd so far works great for me (10 days uptime, 500+ 
recordings, so far no deadlocks and no DB reconnects).
I plan to extend the code freeze on this git pull for another 10 days and 
see how it goes.
Currently the only DB-related issues I have with 20110812-g50606cd are 
reflected in the following log entries (on average 1 per day):

2011-08-22 11:07:27.198329 E DB Error (change_program):
Query was:
UPDATE program SET starttime = ?, endtime = ? WHERE chanid = ? AND 
starttime = ?
Bindings were:
:CHANID=12808, :NEWEND=2011-08-24T03:40:00, :NEWSTART=2011-08-23T17:25:00,
:OLDSTART=2011-08-23T16:35:00
Driver error was [2/1062]:
QMYSQL3: Unable to execute statement
Database error was:
Duplicate entry '12808-2011-08-23 17:25:00-0' for key 'PRIMARY'
2011-08-22 11:07:27.239906 E DB Error (change_program):
Query was:
UPDATE program SET starttime = ?, endtime = ? WHERE chanid = ? AND 
starttime = ?
Bindings were:
:CHANID=9305, :NEWEND=2011-08-24T03:40:00, :NEWSTART=2011-08-23T17:25:00,
:OLDSTART=2011-08-23T16:35:00
Driver error was [2/1062]:
QMYSQL3: Unable to execute statement
Database error was:
Duplicate entry '9305-2011-08-23 17:25:00-0' for key 'PRIMARY'
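
For anyone who wants to see this class of error in isolation, here is a 
minimal sketch in Python. It is not MythTV code and rests on assumptions: 
the three-part key in the error message suggests a primary key of 
(chanid, starttime, manualid), which I assume here; it uses an in-memory 
SQLite table rather than MySQL; the end times of the pre-existing rows 
are made up; and try_move is a hypothetical helper. It shows that an 
UPDATE which moves a row's starttime onto a slot already occupied by 
another row of the same channel is rejected as a duplicate key.

```python
import sqlite3


def try_move(conn, chanid, old_start, new_start, new_end):
    """Hypothetical helper: move a programme to a new start time.

    Returns True on success, False if the new (chanid, starttime) slot
    is already taken and the database rejects the UPDATE.
    """
    try:
        conn.execute(
            "UPDATE program SET starttime = ?, endtime = ? "
            "WHERE chanid = ? AND starttime = ?",
            (new_start, new_end, chanid, old_start),
        )
        return True
    except sqlite3.IntegrityError:
        # SQLite's counterpart of MySQL error 1062 (duplicate entry).
        return False


conn = sqlite3.connect(":memory:")
# Assumed, simplified schema: only the columns needed for the demo.
conn.execute(
    "CREATE TABLE program ("
    " chanid INTEGER NOT NULL,"
    " starttime TEXT NOT NULL,"
    " endtime TEXT NOT NULL,"
    " manualid INTEGER NOT NULL DEFAULT 0,"
    " PRIMARY KEY (chanid, starttime, manualid))"
)
# Two programmes on the same channel; the second already starts at 17:25.
conn.executemany(
    "INSERT INTO program (chanid, starttime, endtime) VALUES (?, ?, ?)",
    [
        (12808, "2011-08-23 16:35:00", "2011-08-23 17:25:00"),
        (12808, "2011-08-23 17:25:00", "2011-08-24 03:40:00"),
    ],
)

# Same shape as the logged query: move the 16:35 programme to 17:25.
clashed = not try_move(
    conn, 12808,
    "2011-08-23 16:35:00", "2011-08-23 17:25:00", "2011-08-24 03:40:00",
)
print("duplicate key on update:", clashed)
```

In this sketch the failed UPDATE leaves both rows untouched, so the error 
by itself does not damage the table; whether that also holds for the 
real change_program code path is something I have not checked.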

I want to say that Daniel's recent MySQL-related commits are GREAT for 
making myth production/server-grade software.

Daniel, a really BIG thank you for the impressively accurate problem 
diagnosis and the very quick & 100% effective solution.
This is REALLY impressive, especially taking into account that the system 
in question is a remote install without direct access and that it has 
many 3rd party components.
This is really incredible!

br


