Linuxthreads-2.0 bug in interaction of sem_wait() and pthread_cond_wait()

There was a Linux threads bug that affected all versions of Linux through May, 2001. The details are below, along with the acknowledgement of the bug and the proposed fix for future versions of Linux.

I found the bug in May, 2001. Linuxthreads is integrated in glibc-2 (generally in Linux-2.0 and above). This bug does _not_ affect most threads programs in Linux. However, if you are using both sem_wait() and pthread_cond_wait(), then you might want to look at this.

The bug was found when using my TOP-C/C++ for easy parallelization of new or previously existing sequential code. The bug can be fixed by writing your own routines for sem_wait(), sem_post(), etc. If you are one of the few who experience this bug, you are welcome to use .../src/sem-pthread.h in the TOP-C/C++ distribution, which directly implements semaphores on top of other pthread primitives.

I believe I have found a simple bug in the semaphore implementation
of linuxthreads-2.0.6.  Specifically, it is in semaphore.c:sem_wait().

I also checked linuxthreads-2.2.2, and the same bug appears to be present
in oldsemaphore.c:__old_sem_wait().  The bug does not affect
the new sem_wait of linuxthreads-2.2.2.  Even in the old code, the
bug appears _only_ in certain cases when the application mixes
calls to semaphores with calls to mutexes and condition variables.
If semaphore are used in isolation or if semaphores are not used,
then the bug does not appear.

I will describe the bug with reference to linuxthreads-2.0.6.
If you just want the bug fix, you'll find my proposed bug
fix at the end of this message.

I've confirmed the bug by enabling the ASSERT in linuxthreads:queue.h,
and re-compiling.  The bug in my application is then always shadowed
by a triggering of assert.  I first discovered this bug using my TOP-C/C++
system for writing parallel applications for both distributed and shared memory
(http://www.ccs.neu.edu/home/gene/topc.html).  TOP-C/C++ seems to have
good test coverage for finding some of these pthreads bugs.  :-)
						- Gene Cooperman
						  gene@ccs.neu.edu

The logic of semaphore.c indicates that sem->sem_status is used
to EITHER hold the head of a queue of threads waiting on sem,
or ELSE hold a semaphore count, stored as value ((long)count << 1) + 1.

In the event that sem->sem_status holds the head of a queue of
waiting threads, the forward links from a thread th
is given as th->p_nextwaiting.  The last thread in the waiting queue
has:   th->p_nextwaiting == (pthread_descr)0x1
It is the job of sem_wait() to set:  th->p_nextwaiting = 0x1
before suspending the thread.  It is the job of
sem_restart_list() to set: th->p_nextwaiting = NULL
before restarting the thread.

The bug occurs in the case that the semaphore count is 0,
which is indicated by:  sem->sem_status == 1
and sem_post(sem) and sem_wait(sem) are called at approximately
the same time.  The following race condition can occur.

The invocation of sem_wait(sem) causes the local variable
oldstatus to receive the value 1 (corresponding to sem->sem_status).
Assume that sem_post(sem) then executes in its entirety, causing
sem->sem_status to be incremented to the value 3.
Then sem_wait(sem) continues executing.

The result is that sem_wait(sem) will follow the "else" branch and set:
self->p_nextwaiting = (pthread_descr) oldstatus; where oldstatus is 1.
The following call to sem_compare_and_swap() will then return 0, and
one iterates again through the "do" loop.  On the second iteration,
oldstatus is set to 3 (the new value of sem->sem_status),
the "if" branch is followed, newstatus is set to oldstatus - 2 (or to 1),
and sem_compare_and_swap() returns 1.  So, sem_wait(sem) exits with a
return value of 1, as it should.  But now self->p_nextwaiting has
the value of 1 instead of NULL, while the thread is not suspended.

If that thread later calls pthread_cond_wait(cond, mutex) and is suspended,
the condition variable, cond, will have this thread, th, on its wait
queue, and the value of th->p_nextwaiting will be 1.  The code
for enqueue(), dequeue(), etc. assumes that the value of
th->p_nextwaiting is always either NULL or a valid thread descriptor,
and so the code breaks upon encountering the value 1.

For a bug fix, I propose to restore the correct value of NULL
for self->p_nextwaiting before sem_wait(sem) returns.
I propose changing a code fragment in sem_wait()
FROM:
    if (newstatus & 1)
      /* We got the semaphore. */
      return 0;
TO:
    if (newstatus & 1) {
      /* We got the semaphore. */
      self->p_nextwaiting = NULL;
      return 0;
    }

Acknowledgement of Bug Fix

From drepper@redhat.com  Fri May 25 03:40:03 2001
X-Authentication-Warning: otr.mynet: drepper set sender to drepper@redhat.com us
ing -f
To: Gene Cooperman 
Cc: bug-glibc@gnu.org
Subject: Re: bug in linuxthreads: sem_wait/pthread_cond_wait interaction
X-fingerprint: BE 3B 21 04 BC 77 AC F0  61 92 E4 CB AC DD B9 5A
X-fingerprint: e6:49:07:36:9a:0d:b7:ba:b5:e9:06:f3:e7:e7:08:4a
From: Ulrich Drepper 
Date: 25 May 2001 00:38:55 -0700
User-Agent: Gnus/5.0807 (Gnus v5.8.7) XEmacs/21.2 (Thelxepeia)
MIME-Version: 1.0

Gene Cooperman  writes:

> I believe I have found a simple bug in the semaphore implementation
> of linuxthreads-2.0.6.  Specifically, it is in semaphore.c:sem_wait()_
> [...]

I think it's correct.  I've applied the patch now.  Thanks,

-- 
---------------.                          ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------