Boost.Graph Version 2

Since its initial release in September of 2000, the Boost Graph Library has grown considerably and gained many users. In this time we have noticed certain usability issues, found new techniques and libraries that replace old ones, and extended it in ways not originally considered. It is time to revisit the design of the library to create an even better Boost Graph Library.

We're evolving Boost.Graph in a separate part of the Boost sandbox at  http://svn.boost.org/svn/boost/sandbox/graph-v2/

Here are some ideas for changes we would like to make in version 2 of the library. Additional suggestions are welcome:

  • Namespaces: move everything into the namespace boost::graph
  • Named parameters: use the Boost.Parameter library instead of the BGL named parameters mechanism.
  • Documentation: Port to Quickbook (part of ImprovingBoostDocs)
  • Usability: Try to address some of the usability issues that keep coming up:
    • Property maps: Create a facility that makes it easier to create external property maps for a graph.
    • Simple graph types: From Andrew Sutton's user-friend graphs Summer of Code project
    • Named vertices: From the Parallel BGL
  • Should we consider bringing in parts of the  Parallel BGL?

The following sections are intended to record notes, ideas and perhaps some discussion for the BGL.

Property Maps

One of the first things on my (Andrew Sutton) list is revisiting the property map library. It's an incredibly powerful little library that, despite its ubiquity in Boost.Graph, doesn't seem to get the respect it deserves. Anything that can completely abstract the ability to read and write data is just cool. Fortunately, I don't actually have to change anything except a little cosmetic restructuring like giving it its own directory. I'm also looking to add a new type of property map (const_property_map) that simply returns the same value. Also part of this effort is completely rewriting the docs (ugh).

We also need to be clear on the best ways to use the library. Despite the fact that its so useful, almost all of its uses in Boost.Graph tend to be custom property maps that simply wrap the original ones. This is mostly done because we have to provide a graph object for a lot of function calls in the library. This seems to actually be a useful pattern, wrapping or re-defining simple or basic property maps for a specific application.

Named Parameters

During my (Andrew Sutton) work on the BGL this summer, I actually experimented with the new parameter library for a couple of the graph measures that I had implemented - and actually ran into a couple of issues. There are actually two "modes" for using named parameters. The first mode uses named parameters to overload default arguments to a function. In other words, the Boost.Parameter function simply delegates to a single dispatch function passing either user-supplied or the default values. The second mode is using the named parameter to select which function is actually called based on which arguments are actually supplied. The second mode is actually problematic because it becomes nearly impossible to document in any consistent fashion.

My feeling is that some of the older interfaces use the old named parameter code to do similar things - function selection based on how the function is called. I have a gut feeling that most of the algorithms in the BGL can actually be written without named parameters at all. It may be that we have to change some of the algorithm names a little bit, but I really don't have any problems with that.

In my experience there are basically three reasons that an algorithm has named parameters:

  • To supply pre-allocated storage of exterior properties - This can act as an optimization or as a means of getting extra information out of an algorithm.
  • To provide default values - Sometimes there are just a lot of parameters to an algorithm and all of them have defaults.
  • To emulate other types of graphs - For example, some graphs are naturally unweighted, but some algorithms require a weight map. If you want to use the algorithm, you need to provide edge weights (see Floyd-Warshall). This also happens for every graph that needs to, but doesn't, provide built-in vertex indices (e.g., adjacency lists with non-vector vertex storage).
  • To select different computations - All shortest path algoriths, for example, take a predecessor map and a distance map that enable (marginally) different behaviors if supplied. These could be provided as parameters to a visitor instead.

My suggestion is to look very carefully at every algorithm that uses named parameters and try to determine why and if they can be written without them. My feeling is that one function should do one thing. It's a lot easier to document that way too.