Mathematical background for PDLP

This page contains mathematical background for developers and advanced users of PDLP, a linear and quadratic programming solver available in OR-Tools. It serves as reference for parts of the code and is not intended to be read on its own. Interested readers should first familiarize themselves with the paper, "Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient", then look at the code, then come back to this document when the code references it.

Primal

PDLP considers the following convex quadratic programming problem,

$$ \begin{align} \min_x & \, c^Tx + \frac{1}{2}x^TQx \\ \text{s.t.}\; & l^{c} \le Ax \le u^{c} \\ & l^{v} \le x \le u^{v} \end{align} $$

where $A$ is an $m \times n$ matrix and $Q$ is a diagonal non-negative $n \times n$ matrix¹. The upper bound vectors $u^{c}$ and $u^{v}$ have entries in $\mathbb{R} \cup \{ \infty\}$, and the lower bound vectors $l^{c}$ and $l^{v}$ have entries in $\mathbb{R} \cup \{ -\infty\}$. We assume $l^{c} \le u^{c}$ and $l^{v} \le u^{v}$.

Dual

Let $a \in \mathbb{R}$. Let $[a]_+$ denote its positive part and $[a]_-$ denote its negative part, that is, $a = [a]_+ - [a]_-$. When applied to a vector, the positive and negative parts are computed element-wise.

The dual of the earlier primal problem is over $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$, and $r \in \mathbb{R}^n$. The vector $y$ contains dual multipliers on the linear constraints ($l^{c} \le Ax \le u^{c}$). The vector $r$, called the reduced costs, contains dual multipliers on the bound constraints ($l^{v} \le x \le u^{v}$).

$$ \begin{align} \max_{x, y, r} & \, -\frac{1}{2}x^TQx + \left((l^{c})^T[y]_+ - (u^{c})^T[y]_- \right) + \left((l^{v})^T[r]_+ - (u^{v})^T[r]_- \right) \\ \text{s.t.}\; & Qx + c - A^Ty = r \end{align} $$

When $Q = 0$, $x$ can be dropped from the dual, recovering LP duality.

Dual variable bounds

We say that $y$ satisfies the dual variable bounds if the $y$-term in the objective is finite, that is:

$$ y_i \geq 0 \qquad \text{if }u^c_i = \infty, \\ y_i \leq 0 \qquad \text{if }l^c_i = -\infty. $$

Derivation using conjugate duality

Preliminaries

Let $a \in \mathbb{R} \cup \{-\infty\}$ and $b \in \mathbb{R} \cup \{\infty\}$ with $b \ge a$ and consider the interval $[a, b] \subseteq \mathbb{R} \cup \{-\infty, \infty\}$.

Let $\mathcal{I}_{[a, b]} : \mathbb{R} \to \mathbb{R} \cup \{ \infty\}$ be the indicator function of the interval, that is, $\mathcal{I}_{[a, b]}(x)$ is zero when $x \in [a, b]$ and $\infty$ otherwise.

Define $p(y; a, b): \mathbb{R} \to \mathbb{R} \cup \{\infty\}$ as:

$$ p(y; a, b) = \begin{cases} ay & \text{ if } y < 0 \\ 0 & \text{ if } y = 0 \\ by & \text{ if } y > 0 \end{cases}. $$

When $a$ or $b$ are infinite, follow standard extended real arithmetic.

Basic result: $p(y; a, b) = (\mathcal{I}_{[a, b]})^*(y)$ where $(\cdot)^*$ denotes the convex conjugate.

For vectors $l \subseteq (\mathbb{R} \cup \{-\infty\})^n$ and $u \subseteq (\mathbb{R} \cup \{\infty\})^n$, the indicator function $\mathcal{I}_{[l, u]} : \mathbb{R}^n \to \mathbb{R} \cup \{ \infty\}$ is defined analogously as $\mathcal{I}_{[l,u]}(x) = \sum_{i=1}^n \mathcal{I}_{[l_i, u_i]}(x_i)$, and similarly $p(y; l, u) = (\mathcal{I}_{[l, u]})^*(y) = \sum_{i=1}^n p(y_i; l_i, u_i)$.

Derivation

Introducing auxiliary variables $\tilde a \in \mathbb{R}^m$ and $\tilde x \in \mathbb{R}^n$, we restate the primal problem as:

$$ \begin{align} \min_{x, \tilde x, \tilde a} & \, c^Tx + \frac{1}{2}x^TQx + \mathcal{I}_{[l^c,u^c]}(\tilde a) + \mathcal{I}_{[l^v,u^v]}(\tilde x) \\ \text{s.t.}\; & \tilde a = Ax \\ & \tilde x = x \end{align} $$

Dualizing the equality constraints, we obtain:

$$ \min_{x, \tilde x, \tilde a} \max_{y, r} c^Tx + \frac{1}{2}x^TQx + y^T\tilde a - y^TAx + r^T\tilde x - r^Tx + \mathcal{I}_{[l^c,u^c]}(\tilde a) + \mathcal{I}_{[l^v,u^v]}(\tilde x) $$

Exchanging min with max and regrouping:

$$ \max_{y, r} \min_{x, \tilde x, \tilde a} c^Tx + \frac{1}{2}x^TQx - y^TAx - r^Tx + \left( \mathcal{I}_{[l^c,u^c]}(\tilde a) + y^T\tilde a \right) + \left(\mathcal{I}_{[l^v,u^v]}(\tilde x) + r^T\tilde x \right) $$

The joint minimization over $x$, $\tilde x$, and $\tilde a$ decomposes. For $x$ we see that a minimizer, if one exists, satisfies $Qx + c - A^Ty = r$, in which case the minimum value is $-\frac{1}{2} x^TQx$. For $\tilde x$ and $\tilde a$ we apply the definition of convex conjugates with minor adjustments for signs.

This yields the dual:

$$ \begin{align} \max_{x, y, r} & \, -\frac{1}{2}x^TQx - p(-y, l^c, u^c) - p(-r, l^v, u^v) \\ \text{s.t.}\; & Qx + c - A^Ty = r \end{align} $$

By expanding the definition of $p$, we obtain the dual stated at the top.

Saddle-point formulation

Primal-dual hybrid gradient (see Chambolle and Pock) targets a primal problem of the form

$$ \begin{align} \min_x f(x) + g(x) + h(Kx) \end{align} $$

which by conjugate duality is equivalent to the saddle-point problem

$$ \begin{align} \min_x \max_y f(x) + g(x) + y^TKx - h^*(y) \end{align} $$

PDLP coerces the convex quadratic programming problem into this form by setting:

$f(x) = 0$
$g(x) = c^T x + \frac{1}{2} x^T Q x + \mathcal{I}_{[l^v, u^v]}(x)$
$h(a) = \mathcal{I}_{[-u^c,-l^c]}(a)$
$K = -A$

As derived earlier, $h^*(y) = p(y; -u^c,-l^c)$ is a piecewise linear convex function. Both $g$ and $h^*$ can take infinite values, which effectively limits the domains of $x$ and $y$ respectively.

Note that the proximal operator of $g$ is computable in closed form under PDLP's assumption that $Q$ is diagonal, using the fact that $g$ is separable and the following property that holds for any functions $f_1, f_2$:

$$ f_2(t) = f_1(t) + \frac{\mu}{2} t^2 \Rightarrow \mathrm{prox}_{\lambda f_2}(t) = \mathrm{prox}_{\frac{\lambda}{1 + \lambda \mu} f_1}\left( \frac{t}{1 + \lambda \mu} \right). $$

For a proof of this fact, see for example Theorem 6.13 in First-Order Methods in Optimization. The resulting expression is given by

$$ \begin{equation} \mathrm{prox}_{\lambda g}(x) = \mathrm{proj}_{[l^v, u^v]}\left( (I + \lambda Q)^{-1} (x - \lambda c) \right) \end{equation} $$

Reduced costs, dual residuals, and the corrected dual objective

The saddle point formulation only explicitly works with $(x,y)$; the reduced costs $r$ are implicit. In order to return $(x,y,r)$ when solving the saddle-point formulation, we define $r$ as $r = Qx + c - A^Ty$. The corrected dual objective is the objective value of the dual problem, and always gives a lower bound on the objective value, but is $-\infty$ whenever there is a non-zero reduced cost on an infinite bound.

The reduced costs and dual objective reported by PDLP were changed between OR-tools versions 9.7 and 9.8. To check which version applies, check the comment describing SolverResult::reduced_costs in primal_dual_hybrid_gradient.h and see if it mentions version 9.8.

Version 9.7 and earlier

In order to have a more meaningful dual value when the corrected dual objective is $-\infty$, we also report a dual objective which ignores the infinite terms in the objective value. The dual residuals are the values of $r$ from infinite terms in the corrected dual objective, with 0 in the other components, and the reduced costs returned by PDLP are the values of $r$ from the finite terms in the corrected dual objective, with 0 in the other components (so that $r = \mbox{residuals} + \mbox{reduced costs}$).

Version 9.8 and later

In order to have a more meaningful dual value when the corrected dual objective is $-\infty$, we also report a dual objective which replaces the infinite terms in the objective value with finite ones as follows: if one of the bounds is finite, that bound is used instead of the infinite one; otherwise zero is used for the bound. This choice preserves the concavity of the dual objective, but does not necessarily give a lower bound on the objective value. The dual residuals are the values of $r$ from infinite terms in the corrected dual objective. The reduced costs returned by PDLP are $r$.

Treatment of some variable bounds as infinite

In both versions, if the solver option handle_some_primal_gradients_on_finite_bounds_as_residuals is true (the default), additional variable bounds may be treated as infinite when computing the dual objective and dual residuals. In particular, if $|x_i - l^v_i| > |x_i|$, $l^v_i$ is treated as if it were infinite, and similarly if $|x_i - u^v_i| > |x_i|$, $u^v_i$ is treated as if it were infinite.

Note that handle_some_primal_gradients_on_finite_bounds_as_residuals does not affect the iterates computed; it only affects dual objective and residuals used in termination tests and reported statistics.

Rescaling

Suppose we are given a diagonal (column) rescaling matrix $C$ and a diagonal (row) rescaling matrix $R$ with positive entries on the diagonal. Applying rescaling as in ShardedQuadraticProgram::RescaleQuadraticProgram, we obtain the following transformed problem:

$$ \begin{align} \min_{\tilde x} & \, (Cc)^T{\tilde x} + \frac{1}{2}{\tilde x}^T(CQC){\tilde x} \\ \text{s.t.}\; & Rl^{c} \le RAC\tilde x \le Ru^{c} \\ & C^{-1}l^{v} \le \tilde x \le C^{-1}u^{v} \end{align} $$

A solution to the original problem is recovered as $x = C\tilde x$. If $\tilde y$ is a dual solution and $\tilde r$ are reduced costs for the transformed problem, then $y = R\tilde y$ is a dual solution and $r = C^{-1}\tilde r$ are reduced costs for the original problem (derivation omitted).

Infeasibility identification

A certificate of primal infeasibility is a point $(y, r) \in \mathbb{R}^m \times \mathbb{R}^n$ satisfying:

$$ \begin{equation} \left((l^{c})^T[y]_+ - (u^{c})^T[y]_- \right) + \left((l^{v})^T[r]_+ - (u^{v})^T[r]_- \right) > 0 \\ -A^T y = r . \end{equation} $$

The existence of such a point implies that the primal problem doesn't have a solution.

Similarly, a certificate of dual infeasibility is a point $x \in \mathbb{R}^n$ such that:

$$ \begin{equation} Qx = 0 \\ c^T x < 0 \\ (Ax)_i \begin{cases} = 0 & \text{if }l^c_i , u^c_i \in \mathbb{R}, \\ \geq 0 & \text{if }l^c_i \in \mathbb{R}, u^c_i = \infty, \\ \leq 0 & \text{if }l^c_i = -\infty, u^c_i \in \mathbb{R}, \end{cases} \\ x_i \begin{cases} = 0 & \text{if }l^v_i , u^v_i \in \mathbb{R}, \\ \geq 0 & \text{if }l^v_i \in \mathbb{R}, u^v_i = \infty, \\ \leq 0 & \text{if }l^v_i = -\infty, u^v_i \in \mathbb{R}. \end{cases} \end{equation} $$

Note that the certificates for a linear program can be obtained by setting $Q=0$.

A symmetric positive semidefinite objective matrix $S$ can be converted into this form by factoring $S$ as $S = R^T R$ (for example, a Cholesky factorization), introducing additional variables $z$ defined by constraints $R x - z = 0$, so that $x^T S x = z^T z$. ↩