Modify

Ticket #12818 (closed Bugs: fixed)

Opened 7 weeks ago

Last modified 4 weeks ago

regex: badly needs fuzzing

Reported by: dvyukov@… Owned by: johnmaddock
Milestone: Boost 1.64.0 Component: regex
Version: Boost 1.63.0 Severity: Problem
Keywords: Cc: kcc@…, aizatsky@…, mmoroz@…

Description

Hello,

I've applied libFuzzer ( http://tutorial.libfuzzer.info) to regexp library and found 5 heap-buffer-overflows, stack overflow, assert failure, use of uninitialized data, SIGSEGV, infinite loop, undefined shift, invalid enum value and a bunch of memory leaks in just half an hour:

SUMMARY: AddressSanitizer?: heap-buffer-overflow boost/regex/v4/perl_matcher.hpp:132:10 in char const* boost::re_detail_106300::re_skip_past_null<char>(char const*)

SUMMARY: AddressSanitizer?: heap-buffer-overflow boost/regex/v4/perl_matcher.hpp:221:29 in gnu_cxx::normal_iterator<char const*, std::string> boost::re_detail_106300::re_is_set_member<gnu_cxx::normal_iterator<char const*, std::string>, char, boost::regex_traits<char, boost::cpp_regex_traits<char> >, unsigned int>(gnu_cxx::normal_iterator<char const*, std::string>, gnu_cxx::normal_iterator<char const*, std::string>, boost::re_detail_106300::re_set_long<unsigned int> const*, boost::re_detail_106300::regex_data<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, bool)

SUMMARY: AddressSanitizer?: heap-buffer-overflow /sanitizer_common_interceptors.inc:278 in interceptor_strlen

SUMMARY: AddressSanitizer?: heap-buffer-overflow boost/regex/v4/perl_matcher.hpp:166:19 in gnu_cxx::normal_iterator<char const*, std::string> boost::re_detail_106300::re_is_set_member<gnu_cxx::normal_iterator<char const*, std::string>, char, boost::regex_traits<char, boost::cpp_regex_traits<char> >, unsigned int>(gnu_cxx::normal_iterator<char const*, std::string>, gnu_cxx::normal_iterator<char const*, std::string>, boost::re_detail_106300::re_set_long<unsigned int> const*, boost::re_detail_106300::regex_data<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, bool)

a.out: boost/regex/v4/perl_matcher_common.hpp:606: bool boost::re_detail_106300::perl_matcher<gnu_cxx::normal_iterator<const char *, std::basic_string<char> >, std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<const char *, std::basic_string<char> > > >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match_backref() = __gnu_cxx::__normal_iterator<const char *, std::basic_string<char> >, Allocator = std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<const char *, std::basic_string<char> > > >, traits = boost::regex_traits<char, boost::cpp_regex_traits<char> >?: Assertion `r.first != r.second' failed.

SUMMARY: MemorySanitizer?: use-of-uninitialized-value boost/regex/v4/perl_matcher.hpp:166:13 in std::1::wrap_iter<char const*> boost::re_detail_106300::re_is_set_member<std::__1::__wrap_iter<char const*>, char, boost::regex_traits<char, boost::cpp_regex_traits<char> >, unsigned int>(std::1::wrap_iter<char const*>, std::1::wrap_iter<char const*>, boost::re_detail_106300::re_set_long<unsigned int> const*, boost::re_detail_106300::regex_data<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, bool)

SUMMARY: AddressSanitizer?: heap-buffer-overflow ./boost/regex/v4/basic_regex_parser.hpp:2599:68 in boost::re_detail_106300::basic_regex_parser<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::parse_perl_extension()

boost/regex/v4/basic_regex_parser.hpp:2599:68: runtime error: load of value 56794092, which is not a valid value for type 'boost::re_detail_106300::syntax_element_type'

Direct leak of 4096 byte(s) in 1 object(s) allocated from:

SUMMARY: AddressSanitizer?: stack-overflow ./boost/regex/v4/basic_regex_creator.hpp:1054 in boost::re_detail_106300::basic_regex_creator<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::create_startmap(boost::re_detail_106300::re_syntax_base*, unsigned char*, unsigned int*, unsigned char)

SUMMARY: AddressSanitizer?: SEGV

ALARM: working on the last Unit for 17 seconds

boost/regex/v4/basic_regex_parser.hpp:904:49: runtime error: shift exponent 325804978 is too large for 32-bit type 'unsigned int'

Full reports and triggering inputs for each bug are attached.

Test that I used is simply:

int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {

try {

std::string str((char*)Data, Size); boost::regex e(str); boost::match_results<std::string::const_iterator> what; boost::regex_match(str, what, e, boost::match_default | boost::match_partial);

} catch (const std::exception&) {} return 0;

}

I would suggest to rerun the fuzzer after fixing these bugs as fuzzer was mostly choking on the existing bugs as they are easy to trigger.

Also it can make sense to set up continuous fuzzing using  https://github.com/google/oss-fuzz which will automatically test latest code.

Attachments

regexp.crashes.zip Download (17.9 KB) - added by anonymous 7 weeks ago.

Change History

Changed 7 weeks ago by anonymous

comment:1 Changed 7 weeks ago by anonymous

  • Summary changed from regexp: badly needs fuzzing to regex: badly needs fuzzing

comment:2 follow-up: ↓ 3 Changed 6 weeks ago by anonymous

I strongly suggest running static analysis over regex before trying to find bugs by fuzzing. Coverity is free for open source and IME works well to identify the source of vulnerabilities, rather than providing N crashers that exercise some M<N bugs which need to be reverse-engineered with gdb.

comment:3 in reply to: ↑ 2 Changed 6 weeks ago by choller@…

Replying to anonymous:

I strongly suggest running static analysis over regex before trying to find bugs by fuzzing. Coverity is free for open source and IME works well to identify the source of vulnerabilities, rather than providing N crashers that exercise some M<N bugs which need to be reverse-engineered with gdb.

Working as a Security Engineer I strongly disagree. Fuzzing is far more valuable than static analysis (if testcases are provided by the fuzzer) because static analysis often causes lots of false positives and its results are often hardly actionable for developers (we have tried both with Firefox). The OP already attached a set of testcases, the only thing that is better than that is patches for fixes. You should setup some fuzzing CI as it was suggested already and oss-fuzz even offers the resources for free.

comment:4 Changed 4 weeks ago by johnmaddock

  • Status changed from new to closed
  • Resolution set to fixed
  • Milestone changed from To Be Determined to Boost 1.64.0

Confirmed and fixed in develop: most of the issues you identified were duplicates of a couple of core issues, but further fuzzing with a dictionary revealed many more. Thanks for the heads up on this!

View

Add a comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
The resolution will be deleted. Next status will be 'reopened'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.