Ticket #5045 (closed Bugs: fixed)
epoll_reactor::update_timeout() uses incorrect interrupter if TIMERFD is not available
|Reported by:||Andrew Mann <amann@…>||Owned by:||chris_kohlhoff|
|Milestone:||To Be Determined||Component:||asio|
|Version:||Boost Development Trunk||Severity:||Showstopper|
Fix: Change the final line of epoll_reactor::update_timeout() from: interrupter_.interrupt(); to: interrupt(); so that it calls epoll_reactor::interrupt() instead of writing to the pipe.
The Linux epoll implementation of Asio creates a pipe for interrupting epoll_wait() during exceptional events such as a timed event. Unlike other implementations, the epoll implementation doesn't use a write to the pipe as the triggering mechanism. Instead, a single write is performed on construction and the pipe is never read. Subsequent triggering is handled by calling epoll_ctl() with EPOLL_CTL_MOD and the flags EPOLLIN | EPOLLERR | EPOLLET on the pipe's read descriptor, which effectively resets the event. While this appears to be undocumented behavior of epoll_ctl(), it does appear to work.
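The re-arm behavior described above can be checked in isolation. The following is a minimal sketch, not Asio's code: the function name count_rearmed_wakeups is hypothetical, and it assumes a Linux system with epoll. It writes one byte to a pipe, never reads it, and counts how often re-registering the read end with EPOLL_CTL_MOD produces a new edge-triggered event.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical helper, for illustration only.
// Returns how many of `rounds` EPOLL_CTL_MOD calls produced a fresh wake-up,
// or -1 if setup failed. The pipe is written once and never read.
int count_rearmed_wakeups(int rounds) {
    assert(rounds >= 0);
    int fds[2];
    if (pipe(fds) != 0) return -1;
    int ep = epoll_create(1);

    char byte = 0;
    epoll_event ev = {};
    ev.events = EPOLLIN | EPOLLERR | EPOLLET;
    ev.data.fd = fds[0];
    epoll_event out = {};
    int wakeups = -1;

    if (ep >= 0 && write(fds[1], &byte, 1) == 1 &&
        epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0 &&
        epoll_wait(ep, &out, 1, 0) == 1) {  // consume the initial edge
        wakeups = 0;
        for (int i = 0; i < rounds; ++i) {
            // Edge-triggered and no new data: nothing should be reported.
            if (epoll_wait(ep, &out, 1, 0) != 0) break;
            // Re-register with identical flags; no byte is written.
            if (epoll_ctl(ep, EPOLL_CTL_MOD, fds[0], &ev) != 0) break;
            // Count the round if the re-registration raised a new event.
            if (epoll_wait(ep, &out, 1, 0) == 1) ++wakeups;
        }
    }
    if (ep >= 0) close(ep);
    close(fds[0]);
    close(fds[1]);
    return wakeups;
}
```

If the kernel behaves as this report describes, every round should yield one wake-up even though the pipe's contents never change.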
However, since the code never reads from the pipe and the pipe has a default capacity of 65536 bytes in current Linux kernels, it's vital that the pipe not continue to be written to. epoll_reactor::update_timeout() violates this in the code path where BOOST_ASIO_HAS_TIMERFD is not defined (Linux kernels older than about 2.6.22): it calls interrupter_.interrupt(), which generates an interrupt event by writing a single byte to the pipe, rather than epoll_reactor::interrupt(), which writes nothing.
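To make the difference between the two calls concrete, here is a toy model of the wake-up plumbing. It is a sketch under assumptions, not the Asio source: ToyReactor and its member names are hypothetical, and FIONREAD is used only to observe how many unread bytes sit in the pipe.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <sys/ioctl.h>
#include <unistd.h>

// Hypothetical stand-in for the reactor's interrupt machinery; not Asio's code.
class ToyReactor {
public:
    ToyReactor() : epoll_fd_(epoll_create(1)) {
        int rc = pipe(pipe_fds_);
        assert(rc == 0 && epoll_fd_ >= 0);
        (void)rc;
        write_byte();  // the single write performed at construction
        epoll_event ev = make_event();
        epoll_ctl(epoll_fd_, EPOLL_CTL_ADD, pipe_fds_[0], &ev);
    }
    ~ToyReactor() {
        close(epoll_fd_);
        close(pipe_fds_[0]);
        close(pipe_fds_[1]);
    }
    ToyReactor(const ToyReactor&) = delete;
    ToyReactor& operator=(const ToyReactor&) = delete;

    // Models interrupter_.interrupt(): every call adds one unread byte.
    void interrupt_by_write() { write_byte(); }

    // Models epoll_reactor::interrupt(): re-registers the descriptor only.
    void interrupt_by_rearm() {
        epoll_event ev = make_event();
        epoll_ctl(epoll_fd_, EPOLL_CTL_MOD, pipe_fds_[0], &ev);
    }

    // Number of bytes currently sitting unread in the pipe.
    int pending_bytes() const {
        int n = 0;
        ioctl(pipe_fds_[0], FIONREAD, &n);
        return n;
    }

private:
    epoll_event make_event() const {
        epoll_event ev = {};
        ev.events = EPOLLIN | EPOLLERR | EPOLLET;
        ev.data.fd = pipe_fds_[0];
        return ev;
    }
    void write_byte() {
        char byte = 0;
        ssize_t n = write(pipe_fds_[1], &byte, 1);
        (void)n;
    }

    int epoll_fd_;
    int pipe_fds_[2];
};
```

In this model, any number of interrupt_by_rearm() calls leaves pending_bytes() at 1, while each interrupt_by_write() call increases it by one. That growth is what eventually exhausts the pipe.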
After about 65535 calls to update_timeout() on the same reactor (fewer on earlier kernels), the pipe will be full and further calls will be unable to write to it, so the interrupt does not occur. In certain cases (such as a write-only io_service thread) this can manifest as extremely long periods of non-responsiveness.
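The failure mode can be reproduced without Asio. The sketch below uses hypothetical names (FillResult, fill_unread_pipe) and assumes the write end is non-blocking, so that a full pipe shows up as a failed write rather than a hang. It writes single bytes to a pipe that is never read and reports how many writes were accepted before the first failure.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical result type, for illustration only.
struct FillResult {
    long writes_accepted;  // one-byte writes that succeeded
    int final_errno;       // errno of the first failed call, or 0 if none failed
};

// Writes single bytes to an unread, non-blocking pipe until a write fails
// or `max_writes` is reached.
FillResult fill_unread_pipe(long max_writes) {
    assert(max_writes > 0);
    FillResult result = {0, 0};
    int fds[2];
    if (pipe(fds) != 0) {
        result.final_errno = errno;
        return result;
    }
    // Make the write end non-blocking so a full pipe reports EAGAIN.
    int flags = fcntl(fds[1], F_GETFL, 0);
    fcntl(fds[1], F_SETFL, flags | O_NONBLOCK);

    char byte = 0;
    while (result.writes_accepted < max_writes) {
        ssize_t n = write(fds[1], &byte, 1);
        if (n == 1) {
            ++result.writes_accepted;
            continue;
        }
        if (n < 0 && errno == EINTR) continue;  // retry interrupted writes
        result.final_errno = errno;             // pipe is full (or other error)
        break;
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On a Linux system the loop should stop with final_errno set to EAGAIN once the pipe's capacity is reached. The exact count depends on the kernel's pipe size, so the sketch does not assume a specific number.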