Ticket #7744 (closed Bugs: fixed)

Opened 17 months ago

Last modified 17 months ago

make_u32regex() performs insufficient UTF-8 validation

Reported by: anonymous
Owned by: johnmaddock
Milestone: To Be Determined
Component: regex
Version: Boost 1.52.0
Severity: Problem
Keywords:
Cc:

Description

The program below segfaults when it constructs the regular expression ".*\xf6.*". As far as I know, 0xF4 is the largest value allowed as the leading byte of a 4-byte UTF-8 sequence, so 0xF6 is not a valid leading byte. I would expect an exception instead of a crash.

The regular expression ".*\xe4.*" is created without an exception. However, 0xE4 starts a 3-byte sequence, and no continuation bytes follow it. I would expect an exception here too.

We use Boost 1.52.0 together with ICU 50.1. The behavior is the same on Linux and Windows.

#include <boost/regex/icu.hpp>

int main()
{
    // Does not throw, although the pattern is not valid UTF-8:
    // 0xE4 starts a 3-byte sequence but no continuation bytes follow.
    boost::u32regex(boost::make_u32regex(".*\xe4.*"));
    // Segfaults: 0xF6 is not a valid UTF-8 leading byte.
    boost::u32regex(boost::make_u32regex(".*\xf6.*"));
    return 0;
}
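
For reference, the kind of check expected here can be sketched as a standalone validator following the well-formedness rules of RFC 3629. This is only an illustration of why both patterns above are malformed. The function name is_valid_utf8 is hypothetical, it is not part of Boost.Regex or ICU, and it is not meant to describe how the library validates its input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Boost.Regex or ICU API): returns true when
// `s` is well-formed UTF-8 according to RFC 3629.
bool is_valid_utf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        if (lead < 0x80) {          // plain ASCII, single byte
            ++i;
            continue;
        }

        std::size_t extra = 0;      // continuation bytes that must follow
        unsigned long min_cp = 0;   // smallest code point for this length
        unsigned long cp = 0;       // decoded code point
        if (lead >= 0xC2 && lead <= 0xDF) {
            extra = 1; min_cp = 0x80;    cp = lead & 0x1F;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            extra = 2; min_cp = 0x800;   cp = lead & 0x0F;
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            extra = 3; min_cp = 0x10000; cp = lead & 0x07;
        } else {
            // 0x80..0xC1 and 0xF5..0xFF can never start a sequence;
            // this is the case hit by 0xF6.
            return false;
        }

        // Truncated sequence: not enough bytes left in the input.
        if (n - i <= extra) return false;

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            // Continuation bytes must look like 10xxxxxx; this is the
            // case hit by 0xE4 followed by '.'.
            if ((c & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range

        i += extra + 1;
    }
    return true;
}
```

With this sketch, is_valid_utf8(".*\xe4.*") and is_valid_utf8(".*\xf6.*") both return false, while the correctly encoded ".*\xc3\xa4.*" is accepted. A caller could run such a check before calling make_u32regex() and throw on failure.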

Change History

comment:1 Changed 17 months ago by johnmaddock

  • Status changed from new to closed
  • Resolution set to fixed

Fixed in Trunk rev #81614
