Linear Assignment Solver

Overview

This section describes the linear assignment solver, a specialized solver for the simple assignment problem, which can be faster than either the MIP or CP-SAT solver. However, the MIP and CP-SAT solvers can handle a much wider array of problems, so in most cases they are the best option.

Cost matrix

The costs for workers and task are given in the table below.

0 1 2 3
0 90 76 75 70
1 35 85 55 65
2 125 95 90 105
3 45 110 95 115

The following sections present a Python program that solves an assignment problem using the linear assignment solver.

Import the library

The code that imports the required library is shown below.

from ortools.graph import pywrapgraph

Create the data

The code that creates the data for the problem is shown below.

def create_data_array():
  cost = [[90, 76, 75, 70],
          [35, 85, 55, 65],
          [125, 95, 90, 105],
          [45, 110, 95, 115]]
  return cost

The array is the cost matrix, whose i, j entry is the cost for worker i to perform task j. There are four workers, corresponding to the rows of the matrix, and four tasks, corresponding to the columns.

Create the solver

The program uses the linear assignment solver, a specialized solver for the assignment problem. The following code creates the solver.

from ortools.graph import pywrapgraph
import time

def main():
  cost = create_data_array()
  rows = len(cost)
  cols = len(cost[0])

  assignment = pywrapgraph.LinearSumAssignment()
The program imports the Python wrapper pywrapgraph, and then uses the method LinearSumAssignment to create the solver.

Note: The linear assignment solver only accepts integer values for the weights and values. The section Using a solver with non-integer data shows how to use the solver if your data contains non-integer values.

To compare how long different solvers take to solve the same problem, all the programs in this section have a timer (which uses the time package). The following code creates the timer.

if __name__ == "__main__":
  start_time = time.clock()
  main()
  print()
  print("Time =", time.clock() - start_time, "seconds")

Add the costs to the solver

The following code adds the costs to the solver by looping over workers and tasks.

  for worker in range(rows):
    for task in range(cols):
      if cost[worker][task]:
        assignment.AddArcWithCost(worker, task, cost[worker][task])

Invoke the solver

The following code invokes the solver and displays the solution.

  solve_status = assignment.Solve()
  if solve_status == assignment.OPTIMAL:
    print('Total cost = ', assignment.OptimalCost())
    print()
    for i in range(0, assignment.NumNodes()):
      print('Worker %d assigned to task %d.  Cost = %d' % (
            i,
            assignment.RightMate(i),
            assignment.AssignmentCost(i)))
  elif solve_status == assignment.INFEASIBLE:
    print('No assignment is possible.')
  elif solve_status == assignment.POSSIBLE_OVERFLOW:
    print('Some input costs are too large and may cause an integer overflow.')
The output below shows the optimal assignment of workers to tasks.

$ python my_projects/linear_sum_assignment.py
Total cost = 265

Worker 0 assigned to task 3.  Cost = 70
Worker 1 assigned to task 2.  Cost = 55
Worker 2 assigned to task 1.  Cost = 95
Worker 3 assigned to task 0.  Cost = 45

Time = 0.000147 seconds
The following graph shows the solution as the dashed edges in the graph. The numbers next to the dashed edges are their costs. The total wait time of this assignment is the sum of the costs for the dashed edges, which is 265.

Workers Tasks 0 1 2 3 0 1 2 3 70 55 95 45

In graph theory, a set of edges in a bipartite graph that matches every node on the left with exactly one node on the right is called a perfect matching.

The entire program

Here is the entire program.

from __future__ import print_function
from ortools.graph import pywrapgraph
import time

def main():
  cost = create_data_array()
  rows = len(cost)
  cols = len(cost[0])

  assignment = pywrapgraph.LinearSumAssignment()
  for worker in range(rows):
    for task in range(cols):
      if cost[worker][task]:
        assignment.AddArcWithCost(worker, task, cost[worker][task])
  solve_status = assignment.Solve()
  if solve_status == assignment.OPTIMAL:
    print('Total cost = ', assignment.OptimalCost())
    print()
    for i in range(0, assignment.NumNodes()):
      print('Worker %d assigned to task %d.  Cost = %d' % (
            i,
            assignment.RightMate(i),
            assignment.AssignmentCost(i)))
  elif solve_status == assignment.INFEASIBLE:
    print('No assignment is possible.')
  elif solve_status == assignment.POSSIBLE_OVERFLOW:
    print('Some input costs are too large and may cause an integer overflow.')
def create_data_array():
  cost = [[90, 76, 75, 70],
          [35, 85, 55, 65],
          [125, 95, 90, 105],
          [45, 110, 95, 115]]
  return cost
if __name__ == "__main__":
  start_time = time.clock()
  main()
  print()
  print("Time =", time.clock() - start_time, "seconds")

Solution when workers can't perform all tasks

In the previous example, we assumed that all workers can perform all tasks. But this not always the case — an worker might be unable to perform one or more tasks for various reasons. However, it is easy to modify the program above to handle this.

As an example, suppose that worker 0 is unable to perform task 3. To modify the program to take this into account, make the following changes:

  1. Change the 0, 3 entry of the cost matrix to the string 'NA'. (Any string will work.)
    cost = [[90, 76, 75, 'NA'],
            [35, 85, 55, 65],
            [125, 95, 90, 105],
            [45, 110, 95, 115]]
  2. In the section of the code that assigns costs to the solver, add the line if cost[worker][task] != 'NA':, as shown below.
    for worker in range(0, rows):
        for task in range(0, cols):
          if cost[worker][task] != 'NA':
            assignment.AddArcWithCost(worker, task, cost[worker][task])
    The added line prevents any edge whose entry in the cost matrix is 'NA' from being added to the solver.

After making these changes and running the modified code, you see the following output:

Total cost = 276

Worker 0 assigned to task 1.  Cost = 76
Worker 1 assigned to task 3.  Cost = 65
Worker 2 assigned to task 2.  Cost = 90
Worker 3 assigned to task 0.  Cost = 45

Notice that the total cost is higher now than it was for the original problem. This is not surprising, since in the original problem the optimal solution assigned worker 0 to task 3, while in the modified problem that assignment is not allowed.

To see what happens if more workers are unable to perform tasks, you can replace more entries of the cost matrix with 'NA', to denote additional workers who can't perform certain tasks:

cost = [[90, 76, 'NA', 'NA'],
        [35, 85, 'NA', 'NA'],
        [125, 95, 'NA','NA'],
        [45, 110, 95, 115]]
When you run the program this time, you get a negative result:
No assignment is possible.

This means there is no way to assign workers to tasks so that each worker performs a different task. You can see why this is so by looking at the graph for the problem (in which there are no edges corresponding to values of 'NA' in the cost matrix).

Worker Task 0 1 2 3 0 1 2 3

Since the nodes for the three workers 0, 1, and 2 are only connected to the two nodes for tasks 0 and 1, it not possible to assign distinct tasks to these workers.

The Marriage Theorem

There is a well-known result in graph theory, called The Marriage Theorem, which tells us exactly when you can assign each node on the left to a distinct node on the right, in a bipartite graph like the one above. Such an assignment is called a perfect matching. In a nutshell, the theorem says this is possible if there is no subset of nodes on the left (like the one in the previous example) whose edges lead to a smaller set of nodes on the right.

More precisely, the theorem says that a bipartite graph has a perfect matching if and only if for any subset S of the nodes on the left side of the graph, the set of nodes on the right side of the graph that are connected by an edge to a node in S is at least as large as S.