[mythtv] SBE core dump due to MythSocket misuse?
Tom Lichti
tom at redpepperracing.com
Fri Nov 18 15:30:38 UTC 2011
On Fri, Nov 18, 2011 at 12:06 AM, faginbagin <mythtv at hbuus.com> wrote:
> I'm in the process of debugging a deadlock that occurs in my MBE. It seems to be caused by my SBE dumping core after it finishes a recording. I should also mention that I'm running 0.23-fixes, version 26863. However, I think I've found a problem that still exists in the master branch. I hope what I've learned will help the project, so here goes.
>
> My SBE is dumping core in the method MainServer::reconnectTimeout() when it calls MythSocket::Unlock() immediately after a call to MythSocket::DownRef(). The call to MythSocket::DownRef() deletes the MythSocket object because the reference count goes negative. When I compare the 0.23 versions of programs/mythbackend/mainserver.cpp and libs/libmythdb/mythsocket.cpp to the master branch versions (mythsocket.cpp has moved to libs/libmythbase), I see the same code, which I think means the master branch is vulnerable to the same core dump I'm seeing on 0.23. In particular:
>
> The master version of mythsocket.cpp contains the following, starting at line 102:
>
> bool MythSocket::DownRef(void)
> {
>     m_ref_lock.lock();
>     int ref = --m_ref_count;
>     m_ref_lock.unlock();
>
>     LOG(VB_SOCKET, LOG_DEBUG, LOC + QString("DownRef: %1").arg(ref));
>
>     if (m_cb && ref == 0)
>     {
>         m_cb = NULL;
>         s_readyread_thread->RemoveFromReadyRead(this);
>         // thread will downref & delete obj
>         return true;
>     }
>     else if (ref < 0)
>     {
>         delete this;
>         return true;
>     }
>
>     return false;
> }
>
> The 0.23 version is almost identical. The only difference is the log statement:
> Master version:
> LOG(VB_SOCKET, LOG_DEBUG, LOC + QString("DownRef: %1").arg(ref));
> 0.23 version:
> VERBOSE(VB_SOCKET, LOC + QString("DownRef: %1").arg(m_ref_count));
>
> The master version of programs/mythbackend/mainserver.cpp contains the following, starting at line 6091 in MainServer::reconnectTimeout():
>
>     if (!masterServerSock->writeStringList(strlist) ||
>         !masterServerSock->readStringList(strlist) ||
>         strlist.empty() || strlist[0] == "ERROR")
>     {
>         masterServerSock->DownRef();
>         masterServerSock->Unlock();
>         masterServerSock = NULL;
>
> The same code can be found in the 0.23 version starting at line 5217.
>
> I have confirmed that my SBE is in fact deleting the MythSocket object in the MythSocket::DownRef() method by turning on socket logging. Here's a sample of the log output:
>
> 2011-11-17 22:01:07.127 MythSocket(9083638:34): readStringList: Error, timed out after 30000 ms.
> 2011-11-17 22:01:07.132 MythSocket(9083638:34): state change Connected -> Idle
> 2011-11-17 22:01:07.141 MythSocket(9083638:-1): DownRef: -1
> 2011-11-17 22:01:07.148 MythSocket(9083638:-1): delete socket
>
> Then mainserver.cpp calls MythSocket::Unlock() on the deleted object and promptly dumps core. I can think of two fixes:
>
> 1) Swap the two calls to MythSocket::DownRef() and MythSocket::Unlock() in mainserver.cpp, so that Unlock() runs before the object can be deleted.
>
> 2) Change MythSocket::DownRef() so it doesn't delete itself when ref < 0, perhaps by replacing the "delete this;" statement with "s_readyread_thread->RemoveFromReadyRead(this);"? I don't know the code well enough to know if that would be appropriate, but seeing an object delete itself makes me very uncomfortable.
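
For illustration only, the ordering in fix 1 can be sketched with a small stand-in class. FakeSocket and ReleaseSocket are hypothetical names invented for this sketch and are not MythTV code; the stand-in uses std::mutex rather than Qt types and only mimics the "delete this when the count goes negative" branch of the quoted DownRef(). Whether unlocking first is safe with respect to other threads in the real backend is not something this sketch establishes.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for MythSocket (assumption: not the real class).
class FakeSocket
{
  public:
    // deleted_flag lets a caller observe that the destructor ran.
    explicit FakeSocket(bool *deleted_flag) : m_deleted(deleted_flag) {}
    ~FakeSocket() { *m_deleted = true; }

    void Lock()   { m_lock.lock(); }
    void Unlock() { m_lock.unlock(); }

    // Mimics the quoted DownRef(): once the count goes negative the
    // object deletes itself, so the caller must not touch it afterwards.
    bool DownRef()
    {
        int ref;
        {
            std::lock_guard<std::mutex> guard(m_ref_lock);
            ref = --m_ref_count;
        }
        if (ref < 0)
        {
            delete this;
            return true;
        }
        return false;
    }

  private:
    std::mutex m_lock;
    std::mutex m_ref_lock;
    int m_ref_count = 0;
    bool *m_deleted;
};

// Fix 1 ordering: unlock while the object is certainly alive, then drop
// the reference, then clear the caller's pointer so it cannot be reused.
bool ReleaseSocket(FakeSocket *&sock)
{
    assert(sock != nullptr);
    sock->Unlock();              // last use of the object's lock
    bool gone = sock->DownRef(); // may delete the object
    sock = nullptr;              // never dereference after DownRef()
    return gone;
}
```

With the calls in the opposite order, as in the quoted mainserver.cpp fragment, Unlock() would run on an object that DownRef() may already have deleted.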
Please post your patches, as I see the same problem with the master
branch. If my master backend is down for whatever reason, my slave
backend will segfault eventually.
Thanks!
Tom