Opened 8 years ago

Closed 7 years ago

#3448 closed Bugs (fixed)

interprocess_condition (emulated) can exit with inconsistent m_num_waiters value

Reported by: Zachariah L Young <zachariah.l.young@…>
Owned by: igaztanaga
Milestone: Boost 1.45.0
Component: interprocess
Version: Boost 1.40.0
Severity: Problem
Keywords: interprocess_condition
Cc:

Description

I describe this from the point of view of the 1.39.0 source code, but the problem still exists in the boost development trunk as of today.

Bug:

There is a set of conditions where a process can manage to enter do_timed_wait, increment m_num_waiters, and exit without decrementing it.
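
The failure mode above can be modeled in a few lines. This is only a sketch under stated assumptions, not the Boost source: ConditionModel, buggy_timed_wait, relock_in_time and leaked_waiters_after_timeout are hypothetical names, and only m_num_waiters comes from the real header. The real do_timed_wait is considerably more involved.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, heavily simplified model of the waiter bookkeeping.
// Only m_num_waiters is a name from the ticket; the rest is illustrative.
struct ConditionModel {
    std::atomic<std::uint32_t> m_num_waiters{0};
};

// Models one timed wait. relock_in_time == false stands for the case where
// the waiter was notified but its deadline expired before it could re-lock.
bool buggy_timed_wait(ConditionModel& cond, bool relock_in_time) {
    cond.m_num_waiters.fetch_add(1);      // registered as a waiter on entry
    if (!relock_in_time) {
        return false;                     // BUG: early exit skips the decrement
    }
    cond.m_num_waiters.fetch_sub(1);      // normal exit path
    return true;
}

// Usage: run one failed wait and report how many waiters are still counted.
inline std::uint32_t leaked_waiters_after_timeout() {
    ConditionModel cond;
    bool ok = buggy_timed_wait(cond, false);
    assert(!ok);
    (void)ok;                             // keeps NDEBUG builds warning-free
    return cond.m_num_waiters.load();
}
```

In this model the counter still reports a waiter after the failed wait has returned, so anything that waits for the count to reach zero would wait forever.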

Boost 1.39.0

Sequence of events:

We join our hero, Process A (P_A), in boost/interprocess/sync/emulation/interprocess_condition.hpp.

P_A is executing a do_timed_wait(true, lock, abs_time) call, and is spinning at the while loop at line 124.

tout_enabled == true, and abs_time is a microsecond in the future (about to expire but hasn't yet).

Process B, P_A's trusty sidekick, sends a notify_all on the condition, breaking P_A out of the while loop at line 124.

abs_time arrives (i.e., P_A got to line 149 with microsec_clock::universal_time() >= abs_time and timed_out = false).

With these conditions, P_A gets to line 163 and calls the constructor for scoped_lock.

P_A jumps to boost/interprocess/sync/scoped_lock.hpp line 114.

P_A executes mp_mutex->timed_lock(abs_time) at line 115.

P_A jumps to boost/interprocess/sync/emulation/interprocess_mutex.hpp line 49.

P_A takes a reading of now at line 56.

P_A finds that (now >= abs_time) at line 58 and is sent packing with a return value of false.
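
For readers without the header open, the behaviour P_A just ran into can be sketched as follows. This is an assumption-laden stand-in, not the Boost implementation: SpinMutex, Clock and check_expired_deadline_is_rejected are hypothetical names, and std::chrono replaces the posix_time clock used by the real code. The point it illustrates is that a timed_lock of this shape refuses an expired deadline before it even looks at the lock.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an emulated (spin-based) mutex.
class SpinMutex {
public:
    // Returns false immediately if the deadline has already passed,
    // even when the mutex itself is free.
    bool timed_lock(Clock::time_point abs_time) {
        if (Clock::now() >= abs_time) return false;
        while (locked_.exchange(true, std::memory_order_acquire)) {
            if (Clock::now() >= abs_time) return false;  // gave up while spinning
            std::this_thread::yield();
        }
        return true;
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Usage: an expired deadline fails although nobody holds the mutex.
inline void check_expired_deadline_is_rejected() {
    SpinMutex m;
    bool got = m.timed_lock(Clock::now() - std::chrono::milliseconds(1));
    assert(!got);
    (void)got;  // keeps NDEBUG builds warning-free
}
```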

P_A arrives back in boost/interprocess/sync/emulation/interprocess_condition.hpp on line 163.

P_A gets to line 171 and finds lock is false. He panics! He sets timed_out to true and unlock_enter_mut to true, but in his haste to break out of evil Dr. while(1)'s clutches, he forgets to atomically decrement m_num_waiters!

Maniacal laughter can be heard behind him as he tries in vain to acquire the lock on line 214.

"You fool! You fell into my trap!", shouts Dr. while(1). "Process B grabbed that very lock and attempted to free you again! He is at line 56 of this very header file, waiting for a call from you that will never come, and he's holding your precious lock! Your deadlock is complete! HAHAHAHAHAHAH!!"

Attachments (0)

Change History (2)

comment:1 Changed 8 years ago by anonymous

  • Version changed from Boost 1.39.0 to Boost 1.40.0

Confirmed to still exist in Boost 1.40. My test case: 5 threads timed_send'ing ~1024 byte messages into a message_queue with que_size == 1 and msg_size == 1024. Each thread is set up to send every 100 ms, with a 100 ms timeout.

Windows XP, 4 core processor, ~4 GB RAM

The threads all lock up within 60 seconds of startup.

Based on the description above, I added this at line 174 of boost/interprocess/sync/emulation/interprocess_condition.hpp:

detail::atomic_dec32(const_cast<boost::uint32_t*>(&m_num_waiters));

It solved the problem.
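
The effect of that one-line change can be illustrated with a simplified model. This is a sketch under assumptions, not the patched Boost source: ConditionModel, patched_timed_wait, relock_in_time and check_counter_balanced are hypothetical names, and std::atomic stands in for detail::atomic_dec32. The change actually committed for Boost 1.45 is not shown in this ticket and may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model; only m_num_waiters is a name from the ticket.
struct ConditionModel {
    std::atomic<std::uint32_t> m_num_waiters{0};
};

// Models one timed wait with the decrement added on the failure path.
// relock_in_time == false stands for the timed re-lock failing after a notify.
bool patched_timed_wait(ConditionModel& cond, bool relock_in_time) {
    cond.m_num_waiters.fetch_add(1);          // registered as a waiter on entry
    if (!relock_in_time) {
        cond.m_num_waiters.fetch_sub(1);      // the added decrement
        return false;
    }
    cond.m_num_waiters.fetch_sub(1);          // normal exit path
    return true;
}

// Usage: the counter returns to zero whichever exit path is taken.
inline void check_counter_balanced() {
    ConditionModel cond;
    patched_timed_wait(cond, false);
    patched_timed_wait(cond, true);
    assert(cond.m_num_waiters.load() == 0);
}
```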

comment:2 Changed 7 years ago by igaztanaga

  • Milestone changed from Boost 1.41.0 to Boost-1.45.0
  • Resolution set to fixed
  • Status changed from new to closed

Fixed for Boost 1.45 in release branch
