Modify

Ticket #5735 (closed Feature Requests: fixed)

Opened 3 years ago

Last modified 2 years ago

proto should force functions to be inline

Reported by: mgaunard Owned by: eric_niebler
Milestone: To Be Determined Component: proto
Version: Boost 1.49.0 Severity: Problem
Keywords: Cc:

Description

Some functions in Boost.Proto should really always be inlined; unfortunately, they don't always properly get inlined, sometimes due to compiler bugs (I can think of problems with a certain GCC version on 64-bit for example).

I suggest that Proto does what it can to force those functions to be inline (using attribute((always_inline)) on GCC and forceinline with MSVC). This will also have the effect of yielding warnings when a function can't be inlined, and at least with GCC, functions will be inlined even in debug mode.

Attachments

proto_forceinline.tar.gz Download (17.9 KB) - added by mgaunard 2 years ago.
Adds BOOST_FORCEINLINE in relevant parts of Boost.Proto

Change History

comment:1 Changed 3 years ago by eric_niebler

  • Status changed from new to closed
  • Resolution set to invalid

This is none of Proto's business. If there is a compiler bug leading it to poorly inline, then it sould be fixed in the compiler, not worked around in Proto. Forcing inlining might actually degrade performance for compilers that have good inlining heuristics.

comment:2 Changed 3 years ago by mgaunard

Since Proto is used to build many performance sensitive libraries and applications, and there is no downside to implementing this feature, I don't understand why you're discarding this request so easily.

Proto is a library for others to build upon, each having their own uses, requirements, compilers and toolchains. No compiler is perfect, and inlining is certainly not a perfect feature and is very dependent of the flags used -- which themselves may have other negative effects. That means that you cannot rely on the compiler to do it reliably, especially when large codebases are involved. There are always inlining issues, different ones with each version of GCC for example, including recent and widely deployed ones.

Some Proto constructs make no sense to not be inlined, so it seems like a good idea to tell the compiler that yes, they should always be inlined, and if it cannot for some reason, it should emit an error. It is also important in order to get decent performance when running the code in debug mode or in modes that are not too aggressive with inlining, such as size-optimized builds, popular on embedded systems.

I believe doing this would only improve the quality of Proto and make people more eager to rely on it.

I think the following paper, in section 9, sums up the importance of inlining with expression templates pretty well:  http://arxiv.org/pdf/1104.1729

comment:3 follow-up: ↓ 5 Changed 3 years ago by eric_niebler

Wow, the paper is from this year, and they think that Blitz++ and uBLAS represent the bleeding edge of ET technology. That's laughable.

But the perf measurements are interesting. I can't see where they say what compiler they're using for their perf benchmarks, can you? If they're using that buggy GCC compler version, then yes, it will grossly screw up the results. They talk about cache issues, but only consider data size and not code size.

Joel Falcou is the one closest to this issue. He's filed specific bugs against Proto (function X should be changed in way Y for better inlining), which I have addressed promptly. He hasn't noticed any problems that would require any across-the-board change like the one you're suggesting. Or if he has, he hasn't said anything about it. If you can show me a specific case where a (non-buggy, mainstream) compiler generates slow Proto code because of failed inlining, I'd be more easily convinced.

But you should bring this up on the Proto list and make sure Joel sees it. I'd be curious to hear his thoughts.

comment:4 Changed 3 years ago by mgaunard

  • Type changed from Bugs to Feature Requests

Yes, that paper is "not terribly good", yet it has a couple of wise things in it, and the conclusion is interesting, which is why I put it there. (it's from Thomas Heller's advisor, by the way)
I unfortunately don't have more information on how they got those results.

I haven't personally run into problems with inlining and Boost.Proto, since I haven't done things that were very performance critical with it yet that required me to look at the produced asm, but I have run into issues in NT2 (in the non-ET parts) where some very small and trivial functions didn't get inlined on some GCC versions or settings, which was catastrophic to performance (hundreds of times slower).
Some other NT2 users ran into similar problems on MSVC, albeit on different functions.

I believe putting force inline attributes to functions is an even more reliable solution than making code "compiler-friendly", since this way we can have a guarantee that things will be inlined rather than analyzing asm and finding out a workaround on a case by case basis.

I'm only suggesting that a BOOST_PROTO_FORCE_INLINE be put instead of inline in the cases where it appears to be a good idea. Is that considered to be too intrusive in the source code?

In any case I'll get Joel in the loop.

comment:5 in reply to: ↑ 3 Changed 3 years ago by anonymous

Replying to eric_niebler:

Wow, the paper is from this year, and they think that Blitz++ and uBLAS represent the bleeding edge of ET technology. That's laughable.

Yeah quite sad ...

As for the problem at hand, I have mgaunard handy in my office, and as he said, we got some sporadic MSVC 10 with some size aggressive optimization. Now, I dont think we need an all-over-your-face approach but just have a force inline macro handy that me/you can apply where and when necessary.

comment:6 Changed 2 years ago by mgaunard

I suggest at least proto::value(), proto::child, make_expr and related transforms be marked force inline. Looking at assembly, it happens relatively often that boost::proto::value is not inlined, even though it definitely should always be.

boost/config.hpp now contains a BOOST_FORCEINLINE macro that can be used for this purpose.

Changed 2 years ago by mgaunard

Adds BOOST_FORCEINLINE in relevant parts of Boost.Proto

comment:7 Changed 2 years ago by mgaunard

The attached patch adds BOOST_FORCEINLINE to the following functions:

  • as_lvalue, ignore_unused, dont_care
  • as_child, as_expr
  • proto_base, address_of_hack
  • child, child_c, value
  • make_expr, unpack_expr
  • operators and operators defined by extends, extends constructor
  • built-in generators
  • the transform operator() overloads that give dummy state and data
  • call, make, when and pass_through transforms

This patch adds a significant number of BOOST_FORCEINLINE, but all of them make sense, because the functions in question are lightweight wrappers or only forward non-recursively to another function.

comment:8 Changed 2 years ago by eric_niebler

  • Status changed from closed to reopened
  • Version changed from Boost 1.47.0 to Boost 1.49.0
  • Resolution invalid deleted

This is fixed on trunk in [75578]. I'll merge this to release after the tests have cycles. Thanks, Mathias.

comment:9 Changed 2 years ago by eric_niebler

  • Status changed from reopened to closed
  • Resolution set to fixed

(In [76026]) merge [75578] from trunk, fixes #5735

View

Add a comment

Modify Ticket

Change Properties
<Author field>
Action
as closed
The resolution will be deleted. Next status will be 'reopened'
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.