Ticket #7744 (closed Bugs: fixed)

Opened 4 years ago

Last modified 4 years ago

make_u32regex() performs insufficient UTF-8 validation

Reported by: anonymous Owned by: johnmaddock
Milestone: To Be Determined Component: regex
Version: Boost 1.52.0 Severity: Problem
Keywords: Cc:


The program below segfaults when it constructs the regular expression ".*\xf6.*". AFAIK the largest value allowed as the leading byte of a 4-byte UTF-8 sequence is 0xF4, so 0xF6 is invalid. I would expect an exception instead of a crash.

The regular expression ".*\xe4.*" is created without an exception. However, 0xE4 starts a 3-byte sequence and no trailing bytes follow it. I would expect an exception here too.

We use Boost 1.52.0 together with ICU 50.1. The behavior is the same on Linux and Windows.

#include <boost/regex/icu.hpp>

int main(void)
{
    // this line does not throw an exception although this is not valid UTF-8
    boost::make_u32regex(".*\xe4.*");
    // this line segfaults
    boost::make_u32regex(".*\xf6.*");
    return 0;
}
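
For reference, the kind of validation the report expects can be sketched as a standalone check. This is only a minimal sketch following the well-formed byte ranges of RFC 3629, not the fix that was applied to Boost.Regex. The helper name is_valid_utf8 is hypothetical and is not part of Boost or ICU.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Boost or ICU function): returns true if `s`
// consists only of well-formed UTF-8 sequences as defined by RFC 3629.
bool is_valid_utf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t trail = 0;    // continuation bytes required after `lead`
        unsigned char lo = 0x80;  // allowed range of the first continuation byte
        unsigned char hi = 0xBF;

        if (lead <= 0x7F) {       // plain ASCII
            ++i;
            continue;
        } else if (lead >= 0xC2 && lead <= 0xDF) {
            trail = 1;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            trail = 2;
            if (lead == 0xE0) lo = 0xA0;  // reject overlong encodings
            if (lead == 0xED) hi = 0x9F;  // reject UTF-16 surrogates
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            trail = 3;
            if (lead == 0xF0) lo = 0x90;  // reject overlong encodings
            if (lead == 0xF4) hi = 0x8F;  // reject values above U+10FFFF
        } else {
            return false;  // 0x80..0xC1 and 0xF5..0xFF are never valid lead bytes
        }

        if (n - i <= trail) {
            return false;  // sequence is truncated at the end of the input
        }

        const unsigned char first = static_cast<unsigned char>(s[i + 1]);
        if (first < lo || first > hi) {
            return false;
        }
        for (std::size_t k = 2; k <= trail; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if (c < 0x80 || c > 0xBF) {
                return false;  // not a continuation byte
            }
        }
        i += trail + 1;
    }
    return true;
}
```

With such a check, both ".*\xe4.*" (lead byte without its trailing bytes) and ".*\xf6.*" (invalid lead byte) are rejected, while ".*\xc3\xa4.*" (a correctly encoded U+00E4) is accepted.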


Change History

comment:1 Changed 4 years ago by johnmaddock

  • Status changed from new to closed
  • Resolution set to fixed

Fixed in Trunk rev #81614

