Modify

Opened 5 years ago

Closed 3 months ago

#6787 closed Bugs (fixed)

boost::thread::sleep() hangs if system time is rolled back

Reported by: Artem Gayardo-Matrosov <boost@…> Owned by: viboes
Milestone: Boost 1.65.0 Component: thread
Version: Boost 1.49.0 Severity: Problem
Keywords: Cc:

Description

Steps to reproduce:

  1. Compile and run the following code
    int i = 0;
    while(true)
    { 
        printf("%d\n", ++i);
        boost::this_thread::sleep( boost::posix_time::milliseconds(1000));
    }
    
  1. While it runs, rewind the system time by 1 hour back.

Expected result: The code continues to print every second.

Actual result: The code stops printing anything. I doesn't continue even after I've put the time back to current.

Attachments (0)

Change History (75)

comment:1 Changed 5 years ago by Artem Gayardo-Matrosov <boost@…>

The problem is specific to Linux; it seems to work fine on Windows.

comment:2 Changed 5 years ago by viboes

Owner: changed from Anthony Williams to viboes
Status: newassigned

boost::this_thread::sleep will be deprecated since 1.50. Do you get the same behavior with

boost::this_thread::sleep_for?

comment:3 Changed 5 years ago by Artem Gayardo-Matrosov <boost@…>

I confirm the problem is no longer there with boost::this_thread::sleep_for(), as it uses the monotonic timer from the chrono lib.

comment:4 Changed 5 years ago by viboes

I'm thinking in deprecating the sleep function. Does sleep_for() respond to your needs? If yes, could we close this ticket as don't fixed as there is a better alternative?

comment:5 Changed 5 years ago by viboes

Milestone: To Be Determined
Resolution: wontfix
Status: assignedclosed

comment:6 Changed 4 years ago by Ki Hong Lee <clydelee@…>

This problem is being reoccurring again in the 1.55.0.

  1. environment
    • O/S : linux x86_64 (glibc 2.15)
    • compiler : gcc 4.7.3

comment:7 in reply to:  6 Changed 4 years ago by Ki Hong Lee <clydelee@…>

i'm missed an important point

  1. Problem function boost::this_thread::sleep_for

comment:8 Changed 4 years ago by boost@…

I would still fix boost::sleep() to use a monotonic timer, as it basically means that any calls to sleep() will hang your application randomly, for example if your system uses automatic time synchronization.

comment:9 Changed 4 years ago by viboes

Resolution: wontfix
Status: closedreopened

system_time is not monotonic. Please could you provide an reproducible example that shows the error on linux with sleep_for? I'm really interested in fixing it if I can reproduce it.

comment:10 Changed 4 years ago by Ki Hong Lee <clydelee@…>

Steps to reproduce:

  1. Compile and run the following code
    #include <iostream>
    #include <boost/thread.hpp>
    
    void some_job()
    {
        while (true) {
            boost::this_thread::sleep_for(boost::chrono::seconds(1));
            std::cout << "i'm awaken !" << std::endl;
        }
    }
    
    
    int main()
    {
        boost::thread thrd(some_job);
        thrd.join();
    
        return 0;
    }
    
  1. While it runs, rewind the system time by 1 hour back.

Expected result: thread(some_job) is an infinite loop.

Actual result : thread(some_job) is blocked.

comment:11 Changed 4 years ago by viboes

Would you example work if you use the underlying platform functions? Do you know of an standard library implementation that don't suffer of this problem?

Last edited 4 years ago by viboes (previous) (diff)

comment:12 Changed 4 years ago by Ki Hong Lee <clydelee@…>

new example:

#include <iostream>
#include <boost/thread.hpp>

// 1.55.0
void boost_func()
{
    while (true) {        
        // libs/thread/src/pthread/thread.cpp : 442
        // while( thread_info->sleep_condition.do_wait_for(lk,ts)) {}

        boost::this_thread::sleep_for(boost::chrono::seconds(1));
        std::cout << "i'm boost_func !" << std::endl;
    }
}

// Similar to the 1.51.0 version of the sleep_for
void posix_func()
{
    ::timespec tm, tm1;
    while (true) {
        tm.tv_sec = 1;
        tm.tv_nsec = 0;
        ::nanosleep(&tm, &tm1);
        std::cout << "\ti'm posix_func !" << std::endl;
    }
}

int main()
{
    boost::thread_group threads;
    threads.create_thread(boost_func);
    threads.create_thread(posix_func);

    threads.join_all();

    return 0;
}

comment:13 Changed 4 years ago by viboes

next follows the whole code

        void BOOST_THREAD_DECL sleep_for(const timespec& ts)
        {
            boost::detail::thread_data_base* const thread_info=boost::detail::get_current_thread_data();

            if(thread_info)
            {
              unique_lock<mutex> lk(thread_info->sleep_mutex);
              while( thread_info->sleep_condition.do_wait_for(lk,ts)) {}
            }
            else
            {

              if (boost::detail::timespec_ge(ts, boost::detail::timespec_zero()))
              {

  #   if defined(BOOST_HAS_PTHREAD_DELAY_NP)
  #     if defined(__IBMCPP__)
                BOOST_VERIFY(!pthread_delay_np(const_cast<timespec*>(&ts)));
  #     else
                BOOST_VERIFY(!pthread_delay_np(&ts));
  #     endif
  #   elif defined(BOOST_HAS_NANOSLEEP)
                //  nanosleep takes a timespec that is an offset, not
                //  an absolute time.
                nanosleep(&ts, 0);
  #   else
                mutex mx;
                unique_lock<mutex> lock(mx);
                condition_variable cond;
                cond.do_wait_for(lock, ts);
  #   endif
              }
            }
        }

sleep_for() is one of the predefined thread interruption points. It is not possible to use nanosleep and make it an interruption point.

I would like to separate standard threads and interruptible threads, but this is not yet there.

What can I can do is define two functions sleep_for and sleep_until on a different namespace

namespace non_interruptible {
    template <class Clock, class Duration>
    void sleep_until(const chrono::time_point<Clock, Duration>& t);
template <class Rep, class Period>
    void sleep_for(const chrono::duration<Rep, Period>& d);
}

What do you think?

comment:14 Changed 4 years ago by viboes

Ping!!!

comment:15 Changed 4 years ago by viboes

Milestone: To Be Determined

comment:16 Changed 3 years ago by viboes

It seems that there is an alternative as described in #7665. (Is it possible to initialize the condition variable to use the monotonic clock as described here).

comment:17 Changed 3 years ago by viboes

Unfortunately pthread_condattr_setclock is not available on MacOs?.

comment:19 Changed 3 years ago by viboes

Resolution: fixed
Status: reopenedclosed

comment:20 Changed 3 years ago by Michael Sternberg

The problem still wasn't fixed when boost::this_thread::sleep is running in thread. Consider the following piece of code, sleep still hangs when system is rolled backwards:

#include <stdio.h> #include <boost/thread.hpp> int i = 0; void testfunc() {

while(true) {

printf("%d\n", ++i); boost::this_thread::sleep( boost::posix_time::milliseconds(800));

}

}

int main() {

boost::thread * th = new boost::thread(&testfunc); th->join(); return 0;

}

comment:21 Changed 3 years ago by viboes

Let me know if the problem is still there with non_interruptible::sleep_for.

comment:22 Changed 3 years ago by Matulis

I am also still seeing the same behavior when using sleep_for and no_interruption_point::sleep_for

std::this_thread::sleep_for seems to be working, however I have to add in a _GLIBCXX_USE_NANOSLEEP define due to a bug in my version of GCC.

Kernel: 2.6.32-431.29.2.el6.x86_64 GCC: 4.7.0 OS: CentOS 6.5

comment:23 Changed 3 years ago by Matulis

Testing with the preprocessor variable: BOOST_THREAD_SLEEP_FOR_IS_STEADY

Seems to produce the same results

comment:24 Changed 3 years ago by viboes

Resolution: fixed
Status: closedreopened

Even with boost::this_thread::no_interruption_point::sleep_for?

Well, it is not easy to manage the specific GCC bugs.

comment:25 in reply to:  24 Changed 3 years ago by Matulis

Replying to viboes:

Even with boost::this_thread::no_interruption_point::sleep_for?

Well, it is not easy to manage the specific GCC bugs.

And what I meant with the GCC comment, is that the only way I have gotten the situation to work correctly is if I used std::this_thread::sleep_for().

Yes, even with boost::this_thread::no_interruption_point::sleep_for:

Here is the sample code(modified from an above example)

#include <iostream>
 
#include <boost/thread.hpp>
using namespace std;
 
void testfunc() {
 
	int i = 0;
	while(true) {
	  //Turn system clock back now and loop will no longer print
		std::cout << "Testing: " << i << std::endl;
		boost::this_thread::no_interruption_point::sleep_for( boost::chrono::milliseconds(800));
		i++;
	}
}
 
int main() {
 
	boost::thread th = boost::thread(&testfunc);
	th.join();
 
	return 0;
}
Last edited 3 years ago by viboes (previous) (diff)

comment:26 Changed 3 years ago by viboes

Did you defined _GLIBCXX_USE_NANOSLEEP when using boost::this_thread::no_interruption_point::sleep_for?

comment:27 in reply to:  26 Changed 3 years ago by Matulis

Replying to viboes:

Did you defined _GLIBCXX_USE_NANOSLEEP when using boost::this_thread::no_interruption_point::sleep_for?

I am seeing the same result whether it is defined or not. In regard to the overall problem, can you reproduce the same results on your machine?

comment:28 Changed 3 years ago by viboes

A test example changing itself the date would be welcome.

comment:29 Changed 3 years ago by Matulis

Here is some test code that modifies the system time backwards by 1 minute, however you need to run it as sudo/root, due to the use of stime. I link to boost_chrono, so I don't have to create edge cases for when the hour/day/month changes.

Expected Result: Loop in thread should continuously keep printing out messages

Actual Result: After system time is changed, the loop does not execute until system time catches back up to present time

If I have time later, I could try to integrate it into the unit tests, let me know if that would be useful.

#include <iostream>
#include <time.h>

#include <boost/thread.hpp>
#include <boost/chrono.hpp>




void testfunc() {

        int i = 0;
        while(true) {
                std::cout << "Testing: " << i << std::endl;
                boost::this_thread::no_interruption_point::sleep_for( boost::chrono::milliseconds(1000));
                i++;
        }
}

int main() {

        boost::thread th = boost::thread(&testfunc);

        //Sleep thread to allow other thread to run for a bit before time is changed
        boost::this_thread::sleep_for(boost::chrono::milliseconds(5000));


        boost::chrono::time_point<boost::chrono::system_clock> current = boost::chrono::system_clock::now();

        //Set time back 1 minute
        current.operator -=(boost::chrono::minutes(1));

        //Create std time structure from boost chrono
        std::time_t now = boost::chrono::system_clock::to_time_t(current);

        //Set the system time
        stime(&now);

        th.join();

        return 0;
}

comment:30 Changed 3 years ago by viboes

Thnaks.

comment:31 Changed 3 years ago by viboes

Could you add

#ifndef BOOST_THREAD_SLEEP_FOR_IS_STEADY
#error
#endif

and replace the call to sleep_for by

  timespec ts = boost::detail::to_timespec(boost::chrono::milliseconds(1000));
  nanosleep( &ts, 0);

?

The program must compile and run as you expect. It it is not the case you need to make it work as you expect before.

Have you re built the Boost library after changing your compiler flags? If not do it, otherwise the code will not use nanosleep :(

comment:32 Changed 3 years ago by viboes

Type: BugsSupport Requests

Moved to support, as I believe this is an issue on the configuration on the user side

comment:33 Changed 3 years ago by Matulis

So I added in the requested code, it seems like there was no error on BOOST_THREAD_SLEEP_FOR_IS_STEADY

And the execution worked correctly, with the nanosleep.

However when I recompile my boost libraries with:

./b2 toolset=gcc-4.8.2 cxxflags="-std=c++11 -std=c++0x -v" define="_GLIBCXX_USE_NANOSLEEP" -a

I still get the same issue when using:

boost::this_thread::no_interruption_point::sleep_for( boost::chrono::milliseconds(800));

or

boost::this_thread::sleep_for( boost::chrono::milliseconds(800));

Any ideas? So you are not seeing this issue when running the program as sudo? Since the system time will not change if you don't have elevated permissions.

comment:34 Changed 3 years ago by Matulis

I think the cause may have been an incorrect namespace when calling:

boost::this_thread::no_interruption_point::sleep_for( boost::chrono::milliseconds(1000));

I created the following pull request: github.com/boostorg/thread/pull/45

comment:35 Changed 3 years ago by viboes

You are right. I've accepted the PR.

comment:36 Changed 3 years ago by viboes

Milestone: Boost 1.57.0Boost 1.58.0

comment:37 Changed 3 years ago by viboes

Type: Support RequestsBugs

comment:38 Changed 3 years ago by viboes

Resolution: fixed
Status: reopenedclosed

comment:39 Changed 2 years ago by ueli.marti@…

Seems the problem still subsists with Boost 1.58.0 with boost::this_thread::sleep_for(). It works with boost::this_thread::no_interruption_point::sleep_for(). Tested platform is: ARM, Linux 2.6.32, gcc 4.7.3, boost 1.58.0

Is boost 1.58.0 fixing the problem for boost::this_thread::sleep_for() or not ?

comment:40 Changed 2 years ago by ueli.marti@…

Resolution: fixed
Status: closedreopened

comment:41 Changed 2 years ago by viboes

Well, the fix is to use the no_interruption_point version. Unfortunately, there is no way to fix it for boost::this_thread::sleep_for(), thus the introduced no_interruption_point version.

comment:42 Changed 2 years ago by ueli.marti@…

Ok, i see. Thanks.

comment:43 Changed 2 years ago by viboes

Milestone: Boost 1.58.0To Be Determined

I'm writing a patch that uses pthread_condattr_setclock. As I said I have no platform providing it (I'm using MacOS). The user will need to define BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC until I find a way to detect it.

Is there any one that can test it?

comment:45 Changed 2 years ago by ueli.marti@…

I should be able to test it on ARM/Linux. Might take me a couple days however because i'm on other urgent issues. What's the best way to integrate you patch ? Can i just take boost 1.59 and replace condition_variable_fwd.hpp with your version ?

comment:46 Changed 2 years ago by viboes

Hi, I guess this should work. IN this case you can also add directly the define at the beginning of the file

#define BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC

comment:47 Changed 2 years ago by viboes

Status: newassigned

comment:48 Changed 2 years ago by ueli.marti@…

I've tested on ARM/Linux. boost::this_thread::sleep_for() hangs forever, even without rolling back system clock.

Integrated the patch the following way:

  • downloaded boost 1.59.0
  • replaced /thread/pthread/condition_variable_fwd.hpp with the version from the patch
  • added #define BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC at the beginning of the file
  • rebuilt boost for target
  • rebuilt test program

comment:49 Changed 2 years ago by ueli.marti@…

However, boost::thread::interrupt() allows to termionate boost::this_thread::sleep_for()

comment:50 in reply to:  49 Changed 2 years ago by viboes

Replying to ueli.marti@…:

However, boost::thread::interrupt() allows to termionate boost::this_thread::sleep_for()

Glad to hear that.

comment:51 in reply to:  48 Changed 2 years ago by viboes

Replying to ueli.marti@…:

I've tested on ARM/Linux. boost::this_thread::sleep_for() hangs forever, even without rolling back system clock.

Integrated the patch the following way:

  • downloaded boost 1.59.0
  • replaced /thread/pthread/condition_variable_fwd.hpp with the version from the patch
  • added #define BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC at the beginning of the file
  • rebuilt boost for target
  • rebuilt test program

Thanks for testing. Hrr, this are bad news. I will need to have access to a platform providing BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC.

comment:52 Changed 2 years ago by ueli.marti@…

Let me know if i can be of any help, if you want me to try out any changes.

comment:53 in reply to:  52 Changed 2 years ago by viboes

Replying to ueli.marti@…:

Let me know if i can be of any help, if you want me to try out any changes.

Thanks for the proposal. I'm a little bit busy fro two weeks. I'll come back to you then.

comment:54 Changed 2 years ago by viboes

Curiously the patch works for another tester #11377.

comment:55 Changed 2 years ago by viboes

Please could you try to replace in thread/v2/thread.hpp line #94

#ifdef BOOST_THREAD_SLEEP_FOR_IS_STEADY

by

#if defined BOOST_THREAD_SLEEP_FOR_IS_STEADY && ! defined BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC

comment:56 Changed 2 years ago by viboes

Milestone: To Be DeterminedBoost 1.60.0

comment:57 Changed 2 years ago by ueli.marti@…

Tested this

https://github.com/boostorg/thread/commit/8f5de1d7c34ff7248261874f273498a622bf76d4

on ARM/Linux. Problem is still the same: boost::this_thread::sleep_for() hangs if system clock is rolled back, returns if clock rolled forward again. Now a question: Is it enough to #define BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC in my code or does it have to be defined when building boost ? In the latter case, where to put this #define ?

Information: With boost 1.59.0 on i386/Ubuntu 12.04 the behavior of boost::this_thread::sleep_for()is correct, no impact when rolling back the system clock. The problem seems specific to ARM.

comment:58 Changed 2 years ago by viboes

You need to define it before including any Boost file.

comment:59 Changed 2 years ago by viboes

Could you test if the pthread interface works in ARM/Linux?

comment:60 Changed 2 years ago by ueli.marti@…

I have tried to #define BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC in my code before any #include <boost/...>.

When compiling my code for ARM/Linux, i get "undefined reference to boost::chrono::steady_clock::now()" in boost/thread/v2/thread.hpp on lines 90 and 130. (it compiles ok for i386/Ubunti 12.04).

What do you mean exactly with testing pthread interface in ARM/Linux? Basically it works, i can greate threads, terminate them, join them etc.

comment:61 Changed 2 years ago by viboes

Sorry for the late response.

I meant if you write the same program using directly the pthread set clock monotonic,

  namespace detail {
    inline int monotonic_pthread_cond_init(pthread_cond_t& cond) {

#ifdef BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC
            pthread_condattr_t attr;
            int res = pthread_condattr_init(&attr);
            if (res)
            {
              return res;
            }
            pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
            res=pthread_cond_init(&cond,&attr);
            pthread_condattr_destroy(&attr);
            return res;
#else
            return pthread_cond_init(&cond,NULL);
#endif
    }
  }

But don't compiling and hanging is not the same.

I guess that that your ARM/linux doesn't supports monotonic clocks.

I need to protect line 130 with BOOST_CHRONO_HAS_CLOCK_STEADY

You can change

        system_clock::time_point c_timeout = system_clock::now() + ceil<nanoseconds>(d);

but no chance to fix the ticket issue on this platform. You will need to use no_interruption_point::sleep_for on this platform which of course is not interruptible.

I will update v2/thread.hpp to make use of system_clock when there is no steady_clock.

comment:63 Changed 2 years ago by ueli.marti@…

I have made some further investigations and made some interesting findings. I didn't report back to you because i'm still not completely finished.
The short story:
The issue seems to be related to glibc/pthread and not the platform and not boost neither.

The long story:
I wrote a test program using directly the pthread api (as you suggested).
I found that on ARM/Linux (platform P), when using CLOCK_REALTIME or default (not using pthread_condattr_setclock() at all), pthread_cond_timedwait() hangs when system clock is rolled back during the wait.
Actually it doesn't really hang, it simply waits until system clock reaches the absolute time specified in the abstime argument of pthread_cond_timedwait().
When using CLOCK_MONOTONIC, pthread_cond_timedwait() returns ETIMEDOUT after the time difference elapsed, independently of the system clock. That's the behavior i expect from this function.

So: CLOCK_MONOTONIC is supported by this platform.
Then, i put my test program on another ARM/Linux platform (platform D) using the same processor but another kernel and another rootfs.

And then it becomes interesting: On platform D, even using CLOCK_REALTIME or default, pthread_cond_timedwait() DOESN'T hang if system clock is rolled back during the wait!!

So i started to check the glibc/pthread versions.
On platform P, i have 2.19
On platform D, i have 2.16

I managed to put version 2.17 on platform P and surprise, pthread_cond_timedwait() DOESN'T hang anymore.

I then went on my Ubuntu 12.04 and checked glibc => 2.15 => pthread_cond_timedwait() DOESN'T hang.
I found an Ubuntu 14.04, glibc 2.19 => pthread_cond_timedwait() HANGS when system clock is rolled back!!

I googled and tried to find any hint on this change of behavior of pthread_cond_timedwait() in the different glibc versions but couldn't find anything.

At this stage, i'm trying to install the latest glibc (2.22) on my Ubuntu 14.04 and check how it behaves.
Then, i think it will be time to contact the glibc maintainers.

In parallel, i plan to make some tests with the standard boost 1.59.0 on the different platforms and different glibc versions.

That's about the state for the moment.

comment:64 Changed 2 years ago by viboes

Thanks for the report.

Please, either do your test with develop branch or wait until boost.1.60 is released as there are some fixes around this issue.

comment:65 Changed 2 years ago by viboes

Resolution: fixed
Status: assignedclosed

comment:66 Changed 2 years ago by ueli.marti@…

I made some tests with develop branch on following platforms:
ARM/Linux, glibc 2.17
ARM/Linux, glibc 2.19
i386/Ubuntu 12.04, glibc 2.17
i386/Ubuntu 14.04, glibc 2.19
i386/Ubuntu 15.04, glibc 2.21

Without any special #define we have the same behavior with boost.1.59 and with develop branch:

  • On all platforms with glibc 2.17, if system clock is rolled back during boost::this_thread::sleep_for() or boost::condition_variable::timedWait(), the sleep/wait terminates after the initially specified duration (expected behavior)
  • On all other platforms (glibc 2.19 or 2.21), if system clock is rolled back during boost::this_thread::sleep_for() or boost::condition_variable::timedWait(), the sleep/wait "hangs", i.e. terminates only when the initial time + duration is expired

With develop branch and #define BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC the behavior is the same for all tested platforms:

  • boost::this_thread::sleep_for() terminates after the initially specified duration (expected behavior)
  • boost::condition_variable::timedWait() does never timeout, even without touching system clock. The wait can be terminated only by signalling the condition variable or by interrupting the thread.

I then tested the native api pthread_cond_timedwait() directly, with pthread_condattr_setclock(CLOCK_MONOTONIC).
On all platforms the behavior is as i expect it: if system clock is rolled back during pthread_cond_timedwait() the wait terminates after the initially specified duration

I will repeat my tests when boost 1.60 is released.

comment:67 Changed 11 months ago by viboes

Resolution: fixed
Status: closedreopened

comment:68 Changed 11 months ago by viboes

Sorry for this. I missed this closed ticket.

comment:69 Changed 7 months ago by viboes

comment:70 Changed 7 months ago by viboes

Milestone: Boost 1.60.0Boost 1.64.0

comment:71 Changed 7 months ago by viboes

I've rolled back the last change as it seems it introduce a regression. I need to check more deeply all the usage of timespace_now :(

https://github.com/boostorg/thread/commit/9bbf9bed80836282263ec0bdea0adf5c1fa3621b

comment:72 Changed 4 months ago by anton.indrawan@…

I encountered the same issue. When I defined BOOST_THREAD_HAS_CONDATTR_SET_CLOCK_MONOTONIC , boost::this_thread::sleep_for blocks indefinitely. Is there any patch for this issue? I use boost 1.63.

comment:73 Changed 4 months ago by anton.indrawan@…

The following patch works for me.

https://github.com/boostorg/thread/commit/c7348b29cf8bfa1272645d04784419d37e1e7db5

Test environment:

  • Ubuntu 14.04 x64
  • Boost 1.63

comment:75 Changed 3 months ago by viboes

Resolution: fixed
Status: reopenedclosed

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain viboes.
The resolution will be deleted.

Add Comment


E-mail address and name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.