[mythtv] Slave backend hang after ANN command

Jason Gillis jgillis at acm.org
Wed Apr 6 18:33:39 UTC 2005


I've done a bit more digging into this issue this morning.

I fired up Ethereal on both machines and noted that the master backend is
sending an OK message followed by a QUERY_FREESPACE.  These packets are
arriving at the slave backend.

As far as I can tell, the slave is properly creating its PlaybackSock
object and pushing it onto its list of PlaybackSocks
(reconnectTimeout() completes).  From there, it doesn't seem to notice that
anything has arrived on the socket.

I'm kinda stuck there.  I'm not sure what part of the code is supposed to
detect that there's info available on the socket connection with the master
backend.  Tracing through things hasn't been helping me much.
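
To make the question concrete, here is the generic "is there data waiting
on this socket?" pattern.  This is only a minimal standalone sketch in
Python, not MythTV's code (which is C++/Qt).  The wait_for_message() helper
and the socketpair standing in for the master/slave connection are made up
for illustration; only the "OK" and "QUERY_FREESPACE" strings come from the
capture described above, and the real wire framing is not shown.

```python
import select
import socket


def wait_for_message(sock, timeout=1.0):
    """Return bytes read from sock, or None if nothing arrives in time.

    Hypothetical helper for illustration; not a MythTV function.
    """
    # select() reports the socket as readable once data is queued for it.
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    return sock.recv(4096)


# A local socket pair stands in for the master <-> slave connection.
master, slave = socket.socketpair()

# Nothing has been sent yet, so the wait times out.
nothing = wait_for_message(slave, timeout=0.1)

# The master replies to the announce, then sends its first query.
master.sendall(b"OK")
first = wait_for_message(slave)
master.sendall(b"QUERY_FREESPACE")
second = wait_for_message(slave)

print(nothing, first, second)

master.close()
slave.close()
```

If nothing on the slave ever performs the equivalent of that select() (or
registers the socket with whatever event loop does), the data can sit in
the kernel's receive buffer unread, which would match what the packet
capture shows.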

Can someone provide a high level description of what's supposed to be
happening in the code after a slave backend announces itself?  I'm just not
sure what functions should be getting called next, or what's supposed to go
down.

I haven't been able to get a successful connection today so I haven't been
able to see what is normally supposed to happen.

Thanks,
Jason


----- Original Message ----- 

Hi,

I'm working with CVS as of yesterday (Apr 04 - about 23:00 UTC).  I have two
systems:  A master backend and one slave.

I'm seeing a hang on the slave side when it connects to the master.  I've
been able to debug things enough to see that the slave is sending an "ANN
SlaveBackend" command to the server.  On the slave side, it looks like the
reconnectTimeout() function which sends the announce command completes.  It
just doesn't seem to see the master's QUERY_FREESPACE request that comes
next.

This is mostly reproducible on my systems.  That is, it is intermittent,
but I have more trouble getting it to work than I do getting it to hang.
I'm not able to tell if it's a slave issue or a master issue, either.  I've
been focusing my effort on the slave as it _looks_ like the master is doing
the right thing.

I've tried to do a little bit of debugging, and did get a backtrace from
gdb, but I'm not sure how helpful it really is:

<<THREAD DUMP SNIPPED OUT OF REPLY>>

I followed the instructions for building a debug flavor of mythtv found at
http://www.mythtv.org/docs/mythtv-HOWTO-20.html#ss20.2, but given the
numerous "No symbol table info available" statements there, I'm not sure
it's worked.  Is that the correct procedure now that the build system has
changed (new options in configure, etc.)?

I also did some searches to see what I could find regarding this on the
mailing list, but nothing came up, so I'm either picking bad search terms or finding
something new.  (The discussion about UNKNOWN_COMMAND seemed similar, but
I'm not seeing those.)

I've also attached the "--verbose all" output from both the slave and master
(slave.out and master.out).  When I finally killed the master, it coughed up
the "Mutex destroy failure" message and died completely.

If there's any more useful information I can provide or things I can try,
please let me know.

Master System:
Dual Athlon-MP 1900
1 GB RAM
PVR-250 for recording only

Slave:
Celeron (P4) 2 GHz
512 MB RAM
/mnt/store mounted from master via NFS
PVR-350 for recording and playback


Jason



