C++ Reference: min_cost_flow

This documentation is automatically generated.

An implementation of a cost-scaling push-relabel algorithm for the min-cost flow problem.

In the following, we consider a graph G = (V,E) where V denotes the set of nodes (vertices) in the graph, E denotes the set of arcs (edges). n = |V| denotes the number of nodes in the graph, and m = |E| denotes the number of arcs in the graph.

With each arc (v,w) is associated a nonnegative capacity u(v,w) (where 'u' stands for "upper bound") and a unit cost c(v,w). With each node v is associated a quantity named supply(v), which represents a supply of fluid (if >0) or a demand (if <0). Furthermore, no fluid is created in the graph so sum_{v in V} supply(v) = 0.

A flow is a function from E to R such that: a) f(v,w) <= u(v,w) for all (v,w) in E (capacity constraint). b) f(v,w) = -f(w,v) for all (v,w) in E (flow antisymmetry constraint). c) sum on v f(v,w) + supply(w) = 0 (flow conservation).

The cost of a flow is sum on (v,w) in E ( f(v,w) * c(v,w) ) [Note: It can be confusing to beginners that the cost is actually double the amount that it might seem at first because of flow antisymmetry.]

The problem to solve: find a flow of minimum cost such that all the fluid flows from the supply nodes to the demand nodes.

The principles behind this algorithm are the following: 1/ handle pseudo-flows instead of flows and refine pseudo-flows until an epsilon-optimal minimum-cost flow is obtained, 2/ deal with epsilon-optimal pseudo-flows.

1/ A pseudo-flow is like a flow, except that a node's outflow minus its inflow can be different from its supply. If it is the case at a given node v, it is said that there is an excess (or deficit) at node v. A deficit is denoted by a negative excess and inflow = outflow + excess. (Look at ortools/graph/max_flow.h to see that the definition of preflow is more restrictive than the one for pseudo-flow in that a preflow only allows non-negative excesses, i.e., no deficit.) More formally, a pseudo-flow is a function f such that: a) f(v,w) <= u(v,w) for all (v,w) in E (capacity constraint). b) f(v,w) = -f(w,v) for all (v,w) in E (flow antisymmetry constraint).

For each v in E, we also define the excess at node v, the algebraic sum of all the incoming preflows at this node, added together with the supply at v. excess(v) = sum on u f(u,v) + supply(v)

The goal of the algorithm is to obtain excess(v) = 0 for all v in V, while consuming capacity on some arcs, at the lowest possible cost.

2/ Internally to the algorithm and its analysis (but invisibly to the client), each node has an associated "price" (or potential), in addition to its excess. It is formally a function from E to R (the set of real numbers.). For a given price function p, the reduced cost of an arc (v,w) is: c_p(v,w) = c(v,w) + p(v) - p(w) (c(v,w) is the cost of arc (v,w).) For those familiar with linear programming, the price function can be viewed as a set of dual variables.

For a constant epsilon >= 0, a pseudo-flow f is said to be epsilon-optimal with respect to a price function p if for every residual arc (v,w) in E, c_p(v,w) >= -epsilon.

A flow f is optimal if and only if there exists a price function p such that no arc is admissible with respect to f and p.

If the arc costs are integers, and epsilon < 1/n, any epsilon-optimal flow is optimal. The integer cost case is handled by multiplying all the arc costs and the initial value of epsilon by (n+1). When epsilon reaches 1, and the solution is epsilon-optimal, it means: for all residual arc (v,w) in E, (n+1) * c_p(v,w) >= -1, thus c_p(v,w) >= -1/(n+1) >= 1/n, and the solution is optimal.

A node v is said to be *active* if excess(v) > 0. In this case the following operations can be applied to it:
   - if there are *admissible* incident arcs, i.e. arcs which are not saturated,
   and whose reduced costs are negative, a PushFlow operation can
   be applied. It consists in sending as much flow as both the excess at the
   node and the capacity of the arc permit.
   - if there are no admissible arcs, the active node considered is relabeled, This is implemented in Discharge, which itself calls PushFlow and Relabel.

Discharge itself is called by Refine. Refine first saturates all the admissible arcs, then builds a stack of active nodes. It then applies Discharge for each active node, possibly adding new ones in the process, until no nodes are active. In that case an epsilon-optimal flow is obtained.

Optimize iteratively calls Refine, while epsilon > 1, and divides epsilon by alpha (set by default to 5) before each iteration.

The algorithm starts with epsilon = C, where C is the maximum absolute value of the arc costs. In the integer case which we are dealing with, since all costs are multiplied by (n+1), the initial value of epsilon is (n+1)*C. The algorithm terminates when epsilon = 1, and Refine() has been called. In this case, a minimum-cost flow is obtained.

The complexity of the algorithm is O(n^2*m*log(n*C)) where C is the value of the largest arc cost in the graph.

IMPORTANT: The algorithm is not able to detect the infeasibility of a problem (i.e., when a bottleneck in the network prohibits sending all the supplies.) Worse, it could in some cases loop forever. This is why feasibility checking is enabled by default (FLAGS_min_cost_flow_check_feasibility=true.) Feasibility checking is implemented using a max-flow, which has a much lower complexity. The impact on performance is negligible, while the risk of being caught in an endless loop is removed. Note that using the feasibility checker roughly doubles the memory consumption.

The starting reference for this class of algorithms is: A.V. Goldberg and R.E. Tarjan, "Finding Minimum-Cost Circulations by Successive Approximation." Mathematics of Operations Research, Vol. 15, 1990:430-466. http://portal.acm.org/citation.cfm?id=92225

Implementation issues are tackled in: A.V. Goldberg, "An Efficient Implementation of a Scaling Minimum-Cost Flow Algorithm," Journal of Algorithms, (1997) 22:1-29 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

A.V. Goldberg and M. Kharitonov, "On Implementing Scaling Push-Relabel Algorithms for the Minimum-Cost Flow Problem", Network flows and matching: First DIMACS implementation challenge, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, (1993) 12:157-198. ftp://dimacs.rutgers.edu/pub/netflow/submit/papers/Goldberg-mincost/scalmin.ps and in: U. Bunnagel, B. Korte, and J. Vygen. “Efficient implementation of the Goldberg-Tarjan minimum-cost flow algorithm.” Optimization Methods and Software (1998) vol. 10, no. 2:157-174. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

We have tried as much as possible in this implementation to keep the notations and namings of the papers cited above, except for 'demand' or 'balance' which have been replaced by 'supply', with the according sign changes to better accommodate with the API of the rest of our tools. A demand is denoted by a negative supply.

TODO(user): See whether the following can bring any improvements on real-life problems. R.K. Ahuja, A.V. Goldberg, J.B. Orlin, and R.E. Tarjan, "Finding minimum-cost flows by double scaling," Mathematical Programming, (1992) 53:243-266. http://www.springerlink.com/index/gu7404218u6kt166.pdf

An interesting general reference on network flows is: R. K. Ahuja, T. L. Magnanti, J. B. Orlin, "Network Flows: Theory, Algorithms, and Applications," Prentice Hall, 1993, ISBN: 978-0136175490, http://www.amazon.com/dp/013617549X

Keywords: Push-relabel, min-cost flow, network, graph, Goldberg, Tarjan, Dinic, Dinitz.