[mythtv] SBE core dump due to MythSocket misuse?

faginbagin mythtv at hbuus.com
Fri Nov 18 05:06:02 UTC 2011


I'm in the process of debugging a deadlock that occurs in my MBE. It seems to be caused by my SBE dumping core after it finishes a recording. I should also mention that I'm running 0.23-fixes, version 26863. However, I think I've found a problem that still exists in the master branch. I hope what I've learned will help the project, so here goes.

My SBE is dumping core in MainServer::reconnectTimeout() when it calls MythSocket::Unlock() immediately after a call to MythSocket::DownRef(). The call to DownRef() deletes the MythSocket object because the reference count goes negative, so the Unlock() that follows operates on an object that has already been freed. When I compare the 0.23 versions of programs/mythbackend/mainserver.cpp and libs/libmythdb/mythsocket.cpp to the master branch versions (mythsocket.cpp has moved to libs/libmythbase), I see the same code, which I think means the master branch is vulnerable to the same core dump I'm seeing on 0.23. In particular:

The master version of mythsocket.cpp contains the following, starting at line 102:

bool MythSocket::DownRef(void)
{
    m_ref_lock.lock();
    int ref = --m_ref_count;
    m_ref_lock.unlock();

    LOG(VB_SOCKET, LOG_DEBUG, LOC + QString("DownRef: %1").arg(ref));

    if (m_cb && ref == 0)
    {
        m_cb = NULL;
        s_readyread_thread->RemoveFromReadyRead(this);
        // thread will downref & delete obj
        return true;
    }
    else if (ref < 0)
    {
        delete this;
        return true;
    }

    return false;
}

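The 0.23 version is almost identical. The only difference is the log statement, which in 0.23 uses the VERBOSE macro and prints m_ref_count directly rather than the local copy ref: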
Master version:
    LOG(VB_SOCKET, LOG_DEBUG, LOC + QString("DownRef: %1").arg(ref));
0.23 version:
    VERBOSE(VB_SOCKET, LOC + QString("DownRef: %1").arg(m_ref_count));

The master version of programs/mythbackend/mainserver.cpp contains the following, starting at line 6091 in MainServer::reconnectTimeout():

    if (!masterServerSock->writeStringList(strlist) ||
        !masterServerSock->readStringList(strlist) ||
        strlist.empty() || strlist[0] == "ERROR")
    {
        masterServerSock->DownRef();
        masterServerSock->Unlock();
        masterServerSock = NULL;

The same code can be found in the 0.23 version starting at line 5217.
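To make the failure mode concrete, here is a minimal sketch of the same pattern outside MythTV. ToySocket, its members and the destroyed flag are hypothetical names invented for illustration; only the shape of DownRef() follows the code quoted above. The sketch leaves out the m_cb branch and assumes the count starts at 0, which is what the "DownRef: -1" line in my log suggests.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a self-deleting, reference-counted socket.
// This is not MythSocket; it only mirrors the "ref < 0" branch of DownRef().
class ToySocket
{
  public:
    // 'destroyed' lets the caller observe deletion without touching
    // the object after it may have been freed.
    explicit ToySocket(bool *destroyed) : m_destroyed(destroyed)
    {
        assert(destroyed != nullptr);
    }

    void UpRef(void)
    {
        std::lock_guard<std::mutex> guard(m_ref_lock);
        ++m_ref_count;
    }

    // Returns true when the object has deleted itself.
    // After a true return, the caller must not use the pointer again.
    bool DownRef(void)
    {
        m_ref_lock.lock();
        int ref = --m_ref_count;
        m_ref_lock.unlock();

        if (ref < 0)
        {
            delete this;
            return true;
        }
        return false;
    }

  private:
    // Private destructor: instances live on the heap and die via DownRef().
    ~ToySocket() { *m_destroyed = true; }

    std::mutex m_ref_lock;
    int        m_ref_count = 0;  // assumption: starts at 0, as the log suggests
    bool      *m_destroyed;
};
```

With one UpRef() outstanding, the first DownRef() returns false and the object survives; the second takes the count to -1, deletes the object and returns true. Any member call made through the same pointer after that point is a use-after-free, which is what I believe the Unlock() call in reconnectTimeout() amounts to.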

I have confirmed that my SBE is in fact deleting the MythSocket object in the MythSocket::DownRef() method by turning on socket logging. Here's a sample of the log output:

2011-11-17 22:01:07.127 MythSocket(9083638:34): readStringList: Error, timed out after 30000 ms.
2011-11-17 22:01:07.132 MythSocket(9083638:34): state change Connected -> Idle
2011-11-17 22:01:07.141 MythSocket(9083638:-1): DownRef: -1
2011-11-17 22:01:07.148 MythSocket(9083638:-1): delete socket

Then mainserver.cpp calls MythSocket::Unlock() on the deleted object and promptly dumps core. I can think of two fixes:

1) Swap the two calls to MythSocket::DownRef() and MythSocket::Unlock() in mainserver.cpp, so the socket is unlocked while it still exists.

2) Change MythSocket::DownRef() so it doesn't delete itself when ref < 0, perhaps by replacing the "delete this;" statement with "s_readyread_thread->RemoveFromReadyRead(this);"? I don't know the code well enough to know if that would be appropriate, but seeing an object delete itself makes me very uncomfortable.

For myself, I will try what I think is the least dangerous route and just swap the DownRef and Unlock calls.
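As a rough sketch of what the swap in fix 1 looks like, here is a toy model rather than a patch against mainserver.cpp. ToySocket, ReleaseAfterError() and the two observer flags are hypothetical names made up for this example, and the reference count is left unguarded because the sketch is single-threaded. The point is only the ordering: release the lock while the object is still alive, then drop the reference, then clear the pointer.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for MythSocket: a lock plus a self-deleting DownRef().
class ToySocket
{
  public:
    ToySocket(bool *destroyed, bool *locked_at_delete)
        : m_destroyed(destroyed), m_locked_at_delete(locked_at_delete) {}

    void Lock(void)
    {
        m_lock.lock();
        m_locked = true;
    }

    void Unlock(void)
    {
        assert(m_locked);  // unlocking an unlocked socket is a caller bug
        m_locked = false;
        m_lock.unlock();
    }

    // Returns true when the object has deleted itself.
    bool DownRef(void)
    {
        if (--m_ref_count < 0)
        {
            delete this;
            return true;
        }
        return false;
    }

  private:
    // Record whether the lock was still held at the moment of deletion.
    ~ToySocket()
    {
        *m_destroyed = true;
        *m_locked_at_delete = m_locked;
    }

    std::mutex m_lock;
    bool       m_locked = false;
    int        m_ref_count = 0;  // assumption: starts at 0, as in the log above
    bool      *m_destroyed;
    bool      *m_locked_at_delete;
};

// Error-path cleanup with the two calls swapped.
void ReleaseAfterError(ToySocket *&sock)
{
    sock->Unlock();    // object is still alive here
    sock->DownRef();   // may delete *sock; nothing touches it afterwards
    sock = nullptr;
}
```

Calling ReleaseAfterError() on a locked socket whose count is 0 leaves the pointer null, the object destroyed, and the lock already released at the time of destruction. Whether unlocking before the final DownRef() is safe against other threads in the real backend is something I have not verified.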

HTH,
Helen

