Opened 5 years ago

#8831 new Feature Requests

Reuse capacity from user containers in order to prevent superfluous allocations

Reported by: Evgeny Panasyuk <evgeny.panasyuk@…> Owned by: Marshall Clow
Milestone: To Be Determined Component: string_algo
Version: Boost 1.54.0 Severity: Optimization
Keywords: Cc:

Description

Inside boost::algorithm::split, a new variable of the container type is created and then swapped with the result once it is fully filled.

See http://www.boost.org/doc/libs/1_54_0/boost/algorithm/string/iter_find.hpp, which contains code like:

SequenceSequenceT Tmp(itBegin, itEnd);
Result.swap(Tmp);

Maybe that was done in pursuit of the strong exception safety guarantee, but I don't see much value in it here, because split is supposed to replace the values in the original container (see https://svn.boost.org/trac/boost/ticket/5915). I think the basic guarantee would be enough.

Often SequenceSequenceT is a container like std::vector that already has capacity from previous uses, which could be reused to avoid costly allocations. For example:

Result.assign(itBegin, itEnd);

That might impose stricter requirements on SequenceSequenceT. Alternatively, an overload or traits specialization could be used for common containers such as std::vector and boost::container::vector, or offered as a customization point.
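A rough sketch of how such a dispatch might look, assuming C++11 is available (Boost 1.54 itself also targets C++03, so a real patch would need a different detection mechanism). The names has_range_assign and store_range are hypothetical and are not part of Boost.StringAlgo. Containers with a range assign() get the in-place path, everything else falls back to the current construct-and-swap:

```cpp
#include <set>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: true when C has a member assign(It, It).
template <class C, class It, class = void>
struct has_range_assign : std::false_type {};

template <class C, class It>
struct has_range_assign<
    C, It,
    decltype(void(std::declval<C&>().assign(std::declval<It>(),
                                            std::declval<It>())))>
    : std::true_type {};

// In-place path: may reuse capacity, gives only the basic guarantee.
template <class C, class It>
void store_range_impl(C& result, It first, It last, std::true_type) {
    result.assign(first, last);
}

// Fallback path: same construct-and-swap idiom as in iter_find.hpp.
template <class C, class It>
void store_range_impl(C& result, It first, It last, std::false_type) {
    C tmp(first, last);
    result.swap(tmp);
}

// Hypothetical entry point that picks one of the two paths at compile time.
template <class C, class It>
void store_range(C& result, It first, It last) {
    store_range_impl(result, first, last, has_range_assign<C, It>());
}

typedef std::vector<std::string>::iterator StrIt;
static_assert(has_range_assign<std::vector<std::string>, StrIt>::value,
              "std::vector provides assign(first, last)");
static_assert(!has_range_assign<std::set<std::string>, StrIt>::value,
              "std::set has no assign, so it takes the swap fallback");
```

With this shape, existing users of containers without assign() would keep the current behaviour, and only the exception-safety guarantee for assign-capable containers would change.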

Here is a proof of concept that avoids the allocations and shows the speed difference: http://coliru.stacked-crooked.com/view?id=bf5dd9f2d9d20d61470e73a6b2940333-9a9914b3e2b7ed07c206d6accecccdb6

On my machine I get the following results:

start
end
0.85 s

64000000
start
end
1.55 s

64000000

I.e. the version with allocations is ~1.8x slower (1.55 s vs. 0.85 s).
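The allocation difference itself can be shown without Boost. The following is a minimal sketch, not the attached benchmark: CountingAllocator, fill_by_swap, fill_by_assign and allocations_during are hypothetical helpers, and only allocations made by the outer vector are counted. It assumes a standard library whose vector::assign does not reallocate when the new size fits into the existing capacity, which is how the common implementations behave.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical instrumentation: number of buffer allocations so far.
static std::size_t g_allocations = 0;

// Minimal allocator that forwards to std::allocator and counts allocate().
template <class T>
struct CountingAllocator {
    typedef T value_type;
    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) {
        std::allocator<T>().deallocate(p, n);
    }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) {
    return false;
}

typedef std::vector<std::string, CountingAllocator<std::string> > Tokens;
typedef void (*FillFn)(Tokens&, const std::vector<std::string>&);

// Current idiom: the temporary allocates a fresh buffer, the old one is freed.
void fill_by_swap(Tokens& result, const std::vector<std::string>& parts) {
    Tokens tmp(parts.begin(), parts.end());
    result.swap(tmp);
}

// Proposed idiom: write into the buffer that result already owns.
void fill_by_assign(Tokens& result, const std::vector<std::string>& parts) {
    result.assign(parts.begin(), parts.end());
}

// Counts outer-vector allocations made while filling a pre-reserved result.
std::size_t allocations_during(FillFn fill) {
    const std::vector<std::string> parts(3, "tok");
    Tokens result;
    result.reserve(16);  // stands in for capacity left by an earlier call
    const std::size_t before = g_allocations;
    fill(result, parts);
    assert(result.size() == parts.size());
    return g_allocations - before;
}
```

Calling allocations_during(fill_by_swap) and allocations_during(fill_by_assign) should report at least one allocation for the swap idiom and none for assign. When split is called repeatedly on the same result container, that per-call allocation is the cost the proposal aims to remove.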

Other algorithms may have similar issues; I haven't checked.

Attachments (1)

boost_string_algo_reuse_capacity.cpp (2.0 KB) - added by Evgeny Panasyuk <evgeny.panasyuk@…> 5 years ago.


Change History (1)

Changed 5 years ago by Evgeny Panasyuk <evgeny.panasyuk@…>
