Mathematical background for PDLP

AI-generated Key Takeaways

This page provides mathematical background on PDLP, a linear and quadratic programming solver in OR-Tools, intended for developers and advanced users.
PDLP addresses convex quadratic programming problems, offering solutions for both primal and dual formulations, with details on dual variable bounds and derivation using conjugate duality.
It explains the saddle-point formulation used in the primal-dual hybrid gradient method, the computation of reduced costs, dual residuals, and the corrected dual objective.
Rescaling techniques for the problem are discussed, along with methods for identifying primal and dual infeasibility through certificates.

This page contains mathematical background for developers and advanced users of PDLP, a linear and quadratic programming solver available in OR-Tools. It serves as reference for parts of the code and is not intended to be read on its own. Interested readers should first familiarize themselves with the paper, "Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient", then look at the code, then come back to this document when the code references it.

Primal

PDLP considers the following convex quadratic programming problem,

$$ \begin{align} \min_x & \, c^Tx + \frac{1}{2}x^TQx \\ \text{s.t.}\; & l^{c} \le Ax \le u^{c} \\ & l^{v} \le x \le u^{v} \end{align} $$

where $A$ is an $m \times n$ matrix and $Q$ is a diagonal non-negative $n \times n$ matrix¹. The upper bound vectors $u^{c}$ and $u^{v}$ have entries in $\mathbb{R} \cup \{ \infty\}$, and the lower bound vectors $l^{c}$ and $l^{v}$ have entries in $\mathbb{R} \cup \{ -\infty\}$. We assume $l^{c} \le u^{c}$ and $l^{v} \le u^{v}$.

Dual

Let $a \in \mathbb{R}$. Let $[a]_+$ denote its positive part and $[a]_-$ denote its negative part, that is, $a = [a]_+ - [a]_-$. When applied to a vector, the positive and negative parts are computed element-wise.

The dual of the earlier primal problem is over $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$, and $r \in \mathbb{R}^n$. The vector $y$ contains dual multipliers on the linear constraints ($l^{c} \le Ax \le u^{c}$). The vector $r$, called the reduced costs, contains dual multipliers on the bound constraints ($l^{v} \le x \le u^{v}$).

$$ \begin{align} \max_{x, y, r} & \, -\frac{1}{2}x^TQx + \left((l^{c})^T[y]_+ - (u^{c})^T[y]_- \right) + \left((l^{v})^T[r]_+ - (u^{v})^T[r]_- \right) \\ \text{s.t.}\; & Qx + c - A^Ty = r \end{align} $$

When $Q = 0$, $x$ can be dropped from the dual, recovering LP duality.

Dual variable bounds

We say that $y$ satisfies the dual variable bounds if the $y$-term in the objective is finite, that is:

$$ y_i \geq 0 \qquad \text{if }u^c_i = \infty, \\ y_i \leq 0 \qquad \text{if }l^c_i = -\infty. $$

Derivation using conjugate duality

Preliminaries

Let $a \in \mathbb{R} \cup \{-\infty\}$ and $b \in \mathbb{R} \cup \{\infty\}$ with $b \ge a$ and consider the interval $[a, b] \subseteq \mathbb{R} \cup \{-\infty, \infty\}$.

Let $\mathcal{I}_{[a, b]} : \mathbb{R} \to \mathbb{R} \cup \{ \infty\}$ be the indicator function of the interval, that is, $\mathcal{I}_{[a, b]}(x)$ is zero when $x \in [a, b]$ and $\infty$ otherwise.

Define $p(y; a, b): \mathbb{R} \to \mathbb{R} \cup \{\infty\}$ as:

$$ p(y; a, b) = \begin{cases} ay & \text{ if } y < 0 \\ 0 & \text{ if } y = 0 \\ by & \text{ if } y > 0 \end{cases}. $$

When $a$ or $b$ are infinite, follow standard extended real arithmetic.

Basic result: $p(y; a, b) = (\mathcal{I}_{[a, b]})^*(y)$ where $(\cdot)^*$ denotes the convex conjugate.

For vectors $l \subseteq (\mathbb{R} \cup \{-\infty\})^n$ and $u \subseteq (\mathbb{R} \cup \{\infty\})^n$, the indicator function $\mathcal{I}_{[l, u]} : \mathbb{R}^n \to \mathbb{R} \cup \{ \infty\}$ is defined analogously as $\mathcal{I}_{[l,u]}(x) = \sum_{i=1}^n \mathcal{I}_{[l_i, u_i]}(x_i)$, and similarly $p(y; l, u) = (\mathcal{I}_{[l, u]})^*(y) = \sum_{i=1}^n p(y_i; l_i, u_i)$.

Derivation

Introducing auxiliary variables $\tilde a \in \mathbb{R}^m$ and $\tilde x \in \mathbb{R}^n$, we restate the primal problem as:

$$ \begin{align} \min_{x, \tilde x, \tilde a} & \, c^Tx + \frac{1}{2}x^TQx + \mathcal{I}_{[l^c,u^c]}(\tilde a) + \mathcal{I}_{[l^v,u^v]}(\tilde x) \\ \text{s.t.}\; & \tilde a = Ax \\ & \tilde x = x \end{align} $$

Dualizing the equality constraints, we obtain:

$$ \min_{x, \tilde x, \tilde a} \max_{y, r} c^Tx + \frac{1}{2}x^TQx + y^T\tilde a - y^TAx + r^T\tilde x - r^Tx + \mathcal{I}_{[l^c,u^c]}(\tilde a) + \mathcal{I}_{[l^v,u^v]}(\tilde x) $$

Exchanging min with max and regrouping:

$$ \max_{y, r} \min_{x, \tilde x, \tilde a} c^Tx + \frac{1}{2}x^TQx - y^TAx - r^Tx + \left( \mathcal{I}_{[l^c,u^c]}(\tilde a) + y^T\tilde a \right) + \left(\mathcal{I}_{[l^v,u^v]}(\tilde x) + r^T\tilde x \right) $$

The joint minimization over $x$, $\tilde x$, and $\tilde a$ decomposes. For $x$ we see that a minimizer, if one exists, satisfies $Qx + c - A^Ty = r$, in which case the minimum value is $-\frac{1}{2} x^TQx$. For $\tilde x$ and $\tilde a$ we apply the definition of convex conjugates with minor adjustments for signs.

This yields the dual:

$$ \begin{align} \max_{x, y, r} & \, -\frac{1}{2}x^TQx - p(-y, l^c, u^c) - p(-r, l^v, u^v) \\ \text{s.t.}\; & Qx + c - A^Ty = r \end{align} $$

By expanding the definition of $p$, we obtain the dual stated at the top.

Saddle-point formulation

Primal-dual hybrid gradient (see Chambolle and Pock) targets a primal problem of the form

$$ \begin{align} \min_x f(x) + g(x) + h(Kx) \end{align} $$

which by conjugate duality is equivalent to the saddle-point problem

$$ \begin{align} \min_x \max_y f(x) + g(x) + y^TKx - h^*(y) \end{align} $$

PDLP coerces the convex quadratic programming problem into this form by setting:

$f(x) = 0$
$g(x) = c^T x + \frac{1}{2} x^T Q x + \mathcal{I}_{[l^v, u^v]}(x)$
$h(a) = \mathcal{I}_{[-u^c,-l^c]}(a)$
$K = -A$

As derived earlier, $h^*(y) = p(y; -u^c,-l^c)$ is a piecewise linear convex function. Both $g$ and $h^*$ can take infinite values, which effectively limits the domains of $x$ and $y$ respectively.

Note that the proximal operator of $g$ is computable in closed form under PDLP's assumption that $Q$ is diagonal, using the fact that $g$ is separable and the following property that holds for any functions $f_1, f_2$:

$$ f_2(t) = f_1(t) + \frac{\mu}{2} t^2 \Rightarrow \mathrm{prox}_{\lambda f_2}(t) = \mathrm{prox}_{\frac{\lambda}{1 + \lambda \mu} f_1}\left( \frac{t}{1 + \lambda \mu} \right). $$

For a proof of this fact, see for example Theorem 6.13 in First-Order Methods in Optimization. The resulting expression is given by

$$ \begin{equation} \mathrm{prox}_{\lambda g}(x) = \mathrm{proj}_{[l^v, u^v]}\left( (I + \lambda Q)^{-1} (x - \lambda c) \right) \end{equation} $$

Reduced costs, dual residuals, and the corrected dual objective

The saddle point formulation only explicitly works with $(x,y)$; the reduced costs $r$ are implicit. In order to return $(x,y,r)$ when solving the saddle-point formulation, we define $r$ as $r = Qx + c - A^Ty$. The corrected dual objective is the objective value of the dual problem, and always gives a lower bound on the objective value, but is $-\infty$ whenever there is a non-zero reduced cost on an infinite bound.

The reduced costs and dual objective reported by PDLP were changed between OR-tools versions 9.7 and 9.8. To check which version applies, check the comment describing SolverResult::reduced_costs in primal_dual_hybrid_gradient.h and see if it mentions version 9.8.

Version 9.7 and earlier

In order to have a more meaningful dual value when the corrected dual objective is $-\infty$, we also report a dual objective which ignores the infinite terms in the objective value. The dual residuals are the values of $r$ from infinite terms in the corrected dual objective, with 0 in the other components, and the reduced costs returned by PDLP are the values of $r$ from the finite terms in the corrected dual objective, with 0 in the other components (so that $r = \mbox{residuals} + \mbox{reduced costs}$).

Version 9.8 and later

In order to have a more meaningful dual value when the corrected dual objective is $-\infty$, we also report a dual objective which replaces the infinite terms in the objective value with finite ones as follows: if one of the bounds is finite, that bound is used instead of the infinite one; otherwise zero is used for the bound. This choice preserves the concavity of the dual objective, but does not necessarily give a lower bound on the objective value. The dual residuals are the values of $r$ from infinite terms in the corrected dual objective. The reduced costs returned by PDLP are $r$.

Treatment of some variable bounds as infinite

In both versions, if the solver option handle_some_primal_gradients_on_finite_bounds_as_residuals is true (the default), additional variable bounds may be treated as infinite when computing the dual objective and dual residuals. In particular, if $|x_i - l^v_i| > |x_i|$, $l^v_i$ is treated as if it were infinite, and similarly if $|x_i - u^v_i| > |x_i|$, $u^v_i$ is treated as if it were infinite.

Note that handle_some_primal_gradients_on_finite_bounds_as_residuals does not affect the iterates computed; it only affects dual objective and residuals used in termination tests and reported statistics.

Rescaling

Suppose we are given a diagonal (column) rescaling matrix $C$ and a diagonal (row) rescaling matrix $R$ with positive entries on the diagonal. Applying rescaling as in ShardedQuadraticProgram::RescaleQuadraticProgram, we obtain the following transformed problem:

$$ \begin{align} \min_{\tilde x} & \, (Cc)^T{\tilde x} + \frac{1}{2}{\tilde x}^T(CQC){\tilde x} \\ \text{s.t.}\; & Rl^{c} \le RAC\tilde x \le Ru^{c} \\ & C^{-1}l^{v} \le \tilde x \le C^{-1}u^{v} \end{align} $$

A solution to the original problem is recovered as $x = C\tilde x$. If $\tilde y$ is a dual solution and $\tilde r$ are reduced costs for the transformed problem, then $y = R\tilde y$ is a dual solution and $r = C^{-1}\tilde r$ are reduced costs for the original problem (derivation omitted).

Infeasibility identification

A certificate of primal infeasibility is a point $(y, r) \in \mathbb{R}^m \times \mathbb{R}^n$ satisfying:

$$ \begin{equation} \left((l^{c})^T[y]_+ - (u^{c})^T[y]_- \right) + \left((l^{v})^T[r]_+ - (u^{v})^T[r]_- \right) > 0 \\ -A^T y = r . \end{equation} $$

The existence of such a point implies that the primal problem doesn't have a solution.

Similarly, a certificate of dual infeasibility is a point $x \in \mathbb{R}^n$ such that:

$$ \begin{equation} Qx = 0 \\ c^T x < 0 \\ (Ax)_i \begin{cases} = 0 & \text{if }l^c_i , u^c_i \in \mathbb{R}, \\ \geq 0 & \text{if }l^c_i \in \mathbb{R}, u^c_i = \infty, \\ \leq 0 & \text{if }l^c_i = -\infty, u^c_i \in \mathbb{R}, \end{cases} \\ x_i \begin{cases} = 0 & \text{if }l^v_i , u^v_i \in \mathbb{R}, \\ \geq 0 & \text{if }l^v_i \in \mathbb{R}, u^v_i = \infty, \\ \leq 0 & \text{if }l^v_i = -\infty, u^v_i \in \mathbb{R}. \end{cases} \end{equation} $$

Note that the certificates for a linear program can be obtained by setting $Q=0$.

A symmetric positive semidefinite objective matrix $S$ can be converted into this form by factoring $S$ as $S = R^T R$ (for example, a Cholesky factorization), introducing additional variables $z$ defined by constraints $R x - z = 0$, so that $x^T S x = z^T z$. ↩

Mathematical background for PDLP Stay organized with collections Save and categorize content based on your preferences.