Opened 11 years ago

Closed 11 years ago

#1834 closed Bugs (fixed)

boost::condition drops signals with notify_one()

Reported by: Kimon.Hoffmann@…
Owned by: Anthony Williams
Milestone: To Be Determined
Component: threads
Version: Boost 1.35.0
Severity: Regression
Keywords:
Cc:


In a rather simple case, multiple threads wait on a condition variable and are resumed by multiple calls to notify_one(). Since 1.35.0 we have observed that some of the signals are dropped, causing some threads to block forever.

Attached is the (rather minimal) asynchronous queue implementation that we have used for quite some time now, together with a simple test application that reproduces the problem.
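The attachment itself is not reproduced in this ticket text. As a rough illustration of the pattern described (consumers blocked in a wait, one notify_one() per pushed element), a minimal sketch might look like the following. The names `AsyncQueue` and `run_consumers` are hypothetical, and `std::mutex` / `std::condition_variable` are used in place of `boost::mutex` / `boost::condition` so that it builds without Boost; it is not the attached code.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the attached queue. std:: types replace
// boost::mutex / boost::condition (an assumption made for portability).
template <typename T>
class AsyncQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        // One notification per inserted element, as in the report.
        not_empty_.notify_one();
    }

    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate is rechecked under the lock, which guards
        // against spurious wakeups.
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};

// Starts `count` consumers that each pop one element, pushes `count`
// elements, and returns how many consumers finished.
int run_consumers(int count) {
    AsyncQueue<int> queue;
    std::atomic<int> finished{0};
    std::vector<std::thread> consumers;
    for (int i = 0; i < count; ++i) {
        consumers.emplace_back([&queue, &finished] {
            queue.pop();
            ++finished;
        });
    }
    for (int i = 0; i < count; ++i) {
        queue.push(i);
    }
    for (auto& consumer : consumers) {
        consumer.join();  // would never return if a notification were lost
    }
    return finished.load();
}
```

With a correct condition variable, `run_consumers(8)` returns 8. The behavior described in this ticket would instead show up as one of the `join()` calls blocking forever.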

The problem was observed on several machines, both single- and multi-core, running either Windows XP SP2 or Windows Vista Business SP1. In all cases the test application was compiled with the Microsoft Visual Studio.Net 2005 SP1 compiler.

We have observed that the issue is more likely to appear when:

  1. There is some load on the system
  2. The program is executed from within the Visual Studio 2005 debugger.

A concrete scenario where we could reproduce the problem quite reliably:

  1. Start the task manager
  2. Start the test application within the VS 2005 Debugger
  3. Activate the task manager and monitor the thread count within the process list.

Quite often (about 2 out of 3 runs) the thread count stops changing at a certain number, which marks the point at which all notify_one() calls have been made but not all threads have been woken as a result. When the debugger is then used to break into the application, all the remaining threads are waiting in interruptible_wait. Inspecting the element count of the queue shows that all elements have been successfully inserted, which is why we assume that some of the associated notifications have been dropped.

While this problem can easily be avoided by replacing the notify_one() call with a call to notify_all(), we still believe that this is a bug in the new threads implementation, as this queue (and its associated unit tests, which first showed the problem) has worked reliably with previous versions of Boost.
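For illustration, the notify_all() workaround could be sketched as below. This is an assumption-laden example, not the code from the attachment: `BroadcastQueue` and `drain_with_broadcast` are hypothetical names, and `std::` primitives again stand in for the Boost ones. The point it shows is that every waiter wakes on each push, so the wait has to sit in a loop that rechecks the queue.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical queue using the notify_all() workaround.
class BroadcastQueue {
public:
    void push(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(value);
        not_empty_.notify_all();  // wake every waiter, not just one
    }

    int pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // All waiters wake on each push; those that find the queue
        // empty again simply go back to waiting.
        while (items_.empty()) {
            not_empty_.wait(lock);
        }
        int value = items_.front();
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<int> items_;
};

// Pushes 0..count-1 to `count` waiting consumers and returns the sum
// of the values they popped.
int drain_with_broadcast(int count) {
    BroadcastQueue queue;
    std::vector<int> results(count, -1);
    std::vector<std::thread> consumers;
    for (int i = 0; i < count; ++i) {
        // Each consumer writes only to its own slot in `results`.
        consumers.emplace_back([&queue, &results, i] {
            results[i] = queue.pop();
        });
    }
    for (int i = 0; i < count; ++i) {
        queue.push(i);
    }
    for (auto& consumer : consumers) {
        consumer.join();
    }
    int sum = 0;
    for (int value : results) {
        sum += value;
    }
    return sum;
}
```

The trade-off is extra wakeups: with many waiters, each push briefly wakes all of them even though only one can take the element.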

Attachments (1) (3.7 KB) - added by Kimon.Hoffmann@… 11 years ago.
Asynchronous queue implementation and a small test application


Change History (4)

Changed 11 years ago by Kimon.Hoffmann@…

Attachment: added

Asynchronous queue implementation and a small test application

comment:1 Changed 11 years ago by Kimon.Hoffmann@…

I will also test this under Linux tonight and will notify you whether or not I was able to reproduce the problem there.

comment:2 Changed 11 years ago by Kimon.Hoffmann@…

I just retested the bug with the fix committed in revision r44699 of the trunk and was no longer able to reproduce the described behavior.

comment:3 Changed 11 years ago by Anthony Williams

Resolution: fixed
Status: new → closed