[mythtv] SBE core dump due to MythSocket misuse?

Tom Lichti tom at redpepperracing.com
Fri Nov 18 15:30:38 UTC 2011


On Fri, Nov 18, 2011 at 12:06 AM, faginbagin <mythtv at hbuus.com> wrote:
> I'm in the process of debugging a deadlock that occurs in my MBE. It seems to be caused by my SBE dumping core after it finishes a recording. I should also mention that I'm running 0.23-fixes, version 26863. However, I think I've found a problem that still exists in the master branch. I hope what I've learned will help the project, so here goes.
>
> My SBE is dumping core in the method MainServer::reconnectTimeout() when it calls MythSocket::Unlock() immediately after a call to MythSocket::DownRef(). The call to MythSocket::DownRef() deletes the MythSocket object because the reference count goes negative. When I compare the 0.23 versions of programs/mythbackend/mainserver.cpp and libs/libmythdb/mythsocket.cpp to the master branch versions (mythsocket.cpp has moved to libs/libmythbase), I see the same code, which I think means the master branch is vulnerable to the same core dump I'm seeing on 0.23. In particular:
>
> The master version of mythsocket.cpp contains the following, starting at line 102:
>
> bool MythSocket::DownRef(void)
> {
>    m_ref_lock.lock();
>    int ref = --m_ref_count;
>    m_ref_lock.unlock();
>
>    LOG(VB_SOCKET, LOG_DEBUG, LOC + QString("DownRef: %1").arg(ref));
>
>    if (m_cb && ref == 0)
>    {
>        m_cb = NULL;
>        s_readyread_thread->RemoveFromReadyRead(this);
>        // thread will downref & delete obj
>        return true;
>    }
>    else if (ref < 0)
>    {
>        delete this;
>        return true;
>    }
>
>    return false;
> }
>
> The 0.23 version is almost identical. The only difference is the log statement:
> Master version:
>    LOG(VB_SOCKET, LOG_DEBUG, LOC + QString("DownRef: %1").arg(ref));
> 0.23 version:
>    VERBOSE(VB_SOCKET, LOC + QString("DownRef: %1").arg(m_ref_count));
>
> The master version of programs/mythbackend/mainserver.cpp contains the following, starting at line 6091 in MainServer::reconnectTimeout():
>
>    if (!masterServerSock->writeStringList(strlist) ||
>        !masterServerSock->readStringList(strlist) ||
>        strlist.empty() || strlist[0] == "ERROR")
>    {
>        masterServerSock->DownRef();
>        masterServerSock->Unlock();
>        masterServerSock = NULL;
>
> The same code can be found in the 0.23 version starting at line 5217.
>
> I have confirmed that my SBE is in fact deleting the MythSocket object in the MythSocket::DownRef() method by turning on socket logging. Here's a sample of the log output:
>
> 2011-11-17 22:01:07.127 MythSocket(9083638:34): readStringList: Error, timed out after 30000 ms.
> 2011-11-17 22:01:07.132 MythSocket(9083638:34): state change Connected -> Idle
> 2011-11-17 22:01:07.141 MythSocket(9083638:-1): DownRef: -1
> 2011-11-17 22:01:07.148 MythSocket(9083638:-1): delete socket
>
> Then mainserver.cpp calls MythSocket:Unlock() and promptly dumps core. I can think of two fixes:
>
> 1) Swap the two calls to MythSocket:DownRef() and MythSocket:Unlock() in mainserver.cpp.
>
> 2) Change MythSocket:DownRef() so it doesn't delete itself when ref < 0, perhaps by replacing the "delete this;" statement with "s_readyread_thread->RemoveFromReadyRead(this);"? I don't know the code well enough to know if that would be appropriate, but seeing an object delete itself makes me very uncomfortable.

Please post your patches, as I see the same problem with the master
branch. If my master backend is down for whatever reason, my slave
backend will segfault eventually.

Thanks!
Tom


More information about the mythtv-dev mailing list