[mythtv-users] Scheduling new recordings requires mythtv restart

Joel mythtv at ebel.mybox.org
Fri Apr 24 08:57:30 UTC 2009


Michael T. Dean wrote:
> The errors you're getting are due to the socket errors.  A lot of people 
> seem to get them.  I don't know that anyone knows any specific reasons 
> why some people get them a lot and others don't.

<short_version>
I started writing this email before I figured out what was going on, so 
here's the short version at the top to prevent TLDR.  If you're curious 
the method followed to arrive at this, feel free to read below.

When mythfilldatabase is running, it's putting a significant load on 
mysql, which persists for a period of time (for me about 22 seconds) 
after mythfilldatabase is done.  Mythbackend is apparently blocking on 
some mysql queries and can't respond to the socket mythfilldatabase 
opened up within the 7 second quick timeout being used.

So, I have three questions/recommendations:
1.  Not being a mysql expert, what are some settings I can give mysql so 
that it will not refuse to respond to the frontend while 
mythfilldatabase is running?

2.  It would be nice to not make the backend block on these mysql 
queries so it can proceed with the socket communication?  I could file a 
bug for this if you like, but I wouldn't know where to begin with a patch.

3.  In case either of these fails, I'd recommend setting the timeout for 
this socket connection to be longer.  Perhaps by just not using the 
quick timeout in this case.

Thanks for your help finding this.  It's been bugging me for a while, so 
I'm glad I got closer to the bottom of it.

Joel

</short_version>

I've taken a closer look at the communication that's going on. 
mythbacked --resched appears to work, where mythfilldatabase fails, so I 
compared the network traffic between the two.  Here are some tcpdumps 
you can open up in wireshark to see what happened:

http://ebel.mybox.org/mythbackend_--resched.dump
http://ebel.mybox.org/mythfilldatabase.dump

Sadly I still don't have a good explanation because from what I can 
tell, the network communication is exactly identical, until mythbackend 
simply chooses not to respond to mythfilldatabase in a timely manner. 
But I can't figure how it can even tell the difference, because the 
payload is bit for bit identical between the two.  I can even telnet to 
port 6543 and send "21      MYTH_PROTO_VERSION 40" and I immediately get 
back "13      ACCEPT[]:[]40"  but for mythfilldatabase, it waits for 
nearly 14 seconds to send back that response, and by then, 
mythfilldatabase has given up and closed the connection after 7 seconds, 
  strangely exactly half the time it took for mythbackend to respond. 
mythfilldatabase tries again and waits seven seconds, only to just miss 
the response by 0.2 seconds.

So here's a summary:

0s: mythfilldatabase opens a connection and sends "21 
MYTH_PROTO_VERSION 40" the server acks the packet

7s: mythfilldatabase times out and sends a FIN.  It opens a new 
connection and sends the same thing again.  The server ACKs the packet. 
  The server then ACKs the FIN on the original connection.

14s: mythbackend finally responds to the first connection with "13 
ACCEPT[]:[]40" but it's too late.  The client sends a RST.
The client also gives up on the second connection and sends a FIN.  the 
server responds with "13      ACCEPT[]:[]40" on the second connection 
0.2 seconds later, and the client sends an RST.

I ran this again, and mythbackend waited 22 seconds before it responded 
to both connections which had both given up by then.  And another time 
it took 21 seconds.  So it seems the backend is blocking on something 
when mythfilldatabase is finishing preventing it from responding to 
mythfilldatabase in a timely fashion.  But what?  The backend logs are 
conspicuously silent for those 22 seconds even with -v all.  Since 
mythbackend --resched responds immediately, I don't understand why it's 
taking so long for mythfilldatabase.

I tried lengthening the quick timout in mythsocket.cpp to 30 seconds, 
and mythfilldatabase successfully requested rescheduling in about 23 
seconds.

Ah.  Now I see what's happening.  mysql is refusing to respond to 
mythbackend's queries in a timely fashion while mythfilldatabase is 
running, and for a period of about 22 seconds (apparently) afterwards. 
once it's freed up to respond, mythbackend is able to catch up and 
respond to the socket.


More information about the mythtv-users mailing list