C++ Reference: ebert_graph

Note: This documentation is automatically generated.

A few variations on a theme of the "star" graph representation by Ebert, as described in J. Ebert, "A versatile data structure for edge-oriented graph algorithms." Communications of the ACM 30(6):513-519 (June 1987). http://portal.acm.org/citation.cfm?id=214769

In this file there are three representations that have much in common. The general one, called simply EbertGraph, contains both forward- and backward-star representations. The other, called ForwardEbertGraph, contains only the forward-star representation of the graph, and is appropriate for applications where the reverse arcs are not needed.

The point of including all the representations in this one file is to capitalize, where possible, on the commonalities among them, and those commonalities are mostly factored out into base classes as described below. Despite the commonalities, however, each of the three representations presents a somewhat different interface because of their different underlying semantics. A quintessential example is that the AddArc() method, very natural for the EbertGraph representation, cannot exist for an inherently static representation like ForwardStaticGraph.

Many clients are expected to use the interfaces to the graph objects directly, but some clients are parameterized by graph type and need a consistent interface for their underlying graph objects. For such clients, a small library of class templates is provided to give a consistent interface to clients where the underlying graph interfaces differ. Examples are the AnnotatedGraphBuildManager<> template, which provides a uniform interface for building the various types of graphs; and the TailArrayManager<> template, which provides a uniform interface for applications that need to map from arc indices to arc tail nodes, accounting for the fact that such a mapping has to be requested explicitly from the ForwardStaticGraph and ForwardStarGraph representations.

There are two base class templates, StarGraphBase, and EbertGraphBase; their purpose is to hold methods and data structures that are in common among their descendants. Only classes that are leaves in the following hierarchy tree are eligible for free-standing instantiation and use by clients. The parentheses around StarGraphBase and EbertGraphBase indicate that they should not normally be instantiated by clients:
                     (StarGraphBase)                       |
                       /         \                         |
                      /           \                        |
                     /             \                       |
                    /               \                      |
           (EbertGraphBase)     ForwardStaticGraph         |
            /            \                                 |
           /              \                                |
      EbertGraph     ForwardEbertGraph                     |
In the general EbertGraph case, the graph is represented with three arrays. Let n be the number of nodes and m be the number of arcs. Let i be an integer in [0..m-1], denoting the index of an arc.
   * head_[i] contains the end-node of arc i,
   * head_[-i-1] contains the start-node of arc i. Note that in two's-complement arithmetic, -i-1 = ~i.

   * head_[~i] contains the end-node of the arc reverse to arc i,
   * head_[i] contains the start-node of the arc reverse to arc i.
   Note that if arc (u, v) is defined, then the data structure also stores
   (v, u).
   Arc ~i thus denotes the arc reverse to arc i.
   This is what makes this representation useful for undirected graphs and for
   implementing algorithms like bidirectional shortest paths.
   Also note that the representation handles multi-graphs. If several arcs
   going from node u to node v are added to the graph, they will be handled as
   separate arcs.

Now, for an integer u in [0..n-1] denoting the index of a node:

   * first_incident_arc_[u] denotes the first arc in the adjacency list of u.
   * going from an arc i, the adjacency list can be traversed using
   j = next_adjacent_arc_[i].

The EbertGraph implementation has the following benefits:

   * It is able to handle both directed or undirected graphs.
   * Being based on indices, it is easily serializable. Only the contents
   of the head_ array need to be stored. Even so, serialization is
   currently not implemented.
   * The node indices and arc indices can be stored in 32 bits, while
   still allowing to go a bit further than the 4-gigabyte
   limitation (48 gigabytes for a pure graph, without capacities or
   * The representation can be recomputed if edges have been loaded from
   * The representation can be recomputed if edges have been loaded from
   external memory or if edges have been re-ordered.
   * The memory consumption is: 2 * m * sizeof(NodeIndexType)
   + 2 * m * sizeof(ArcIndexType)
   + n * sizeof(ArcIndexType)
   plus a small constant.

The EbertGraph implementation differs from the implementation described in

[Ebert 1987] in the following respects:
   * arcs are represented using an (i, ~i) approach, whereas Ebert used
   (i, -i). Indices for direct arcs thus start at 0, in a fashion that is
   compatible with the index numbering in C and C++. Note that we also tested
   a (2*i, 2*i+1) storage pattern, which did not show any speed benefit, and
   made the use of the API much more difficult.
   * because of this, the 'nil' values for nodes and arcs are not 0, as Ebert
   first described. The value for the 'nil' node is set to -1, while the
   value for the 'nil' arc is set to the smallest integer representable with
   ArcIndexSize bytes.
   * it is possible to add arcs to the graph, with AddArc, in a much simpler
   way than described by Ebert.
   * TODO(user) although it is already possible, using the
   GroupForwardArcsByFunctor method, to group all the outgoing (resp.
   incoming) arcs of a node, the iterator logic could still be improved to
   allow traversing the outgoing (resp. incoming) arcs in O(out_degree(node))
   (resp. O(in_degree(node))) instead of O(degree(node)).
   * TODO(user) it is possible to implement arc deletion and garbage collection
   in an efficient (relatively) manner. For the time being we haven't seen an
   application for this.

The ForwardEbertGraph representation is like the EbertGraph case described

above, with the following modifications:
   * The part of the head_[] array with negative indices is absent. In its
   place is a pointer tail_ which, if assigned, points to an array of tail
   nodes indexed by (nonnegative) arc index. In typical usage tail_ is NULL
   and the memory for the tail nodes need not be allocated.
   * The array of arc tails can be allocated as needed and populated from the
   adjacency lists of the graph.
   * Representing only the forward star of each node implies that the graph
   cannot be serialized directly nor rebuilt from scratch from just the head_
   array. Rebuilding from scratch requires constructing the array of arc
   tails from the adjacency lists first, and serialization can be done either
   by first constructing the array of arc tails from the adjacency lists, or
   by serializing directly from the adjacency lists.
   * The memory consumption is: m * sizeof(NodeIndexType)
   + m * sizeof(ArcIndexType)
   + n * sizeof(ArcIndexType)
   plus a small constant when the array of arc tails is absent. Allocating
   the arc tail array adds another m * sizeof(NodeIndexType).

The ForwardStaticGraph representation is restricted yet farther

than ForwardEbertGraph, with the benefit that it provides higher performance to those applications that can use it.
   * As with ForwardEbertGraph, the presence of the array of arc
   tails is optional.
   * The outgoing adjacency list for each node is stored in a
   contiguous segment of the head_[] array, obviating the
   next_adjacent_arc_ structure entirely and ensuring good locality
   of reference for applications that iterate over outgoing
   adjacency lists.
   * The memory consumption is: m * sizeof(NodeIndexType)
   + n * sizeof(ArcIndexType)
   plus a small constant when the array of arc tails is absent. Allocating
   the arc tail array adds another m * sizeof(NodeIndexType).