Benders Decomposition
John Mitchell

See Nemhauser and Wolsey, section II.3.7 and III.5.4, for more information.

Benders Decomposition reduces a mixed integer optimization problem with p continuous variables and n integer variables to one with just one continuous variable, and still n integer variables, but typically with an enormous number of constraints.

The initial problem is

maxx,y      cTx  +   hTy
subject to  Ax   +    Gy  ≤   b
                          n       p
              x  ∈ X  ⊆ ℤ +,y ∈ ℝ +
(1)

where b m and c, h, A, and G are all dimensioned appropriately.

For each possible choice of x X, we could find the best choice for y by solving a linear program, so we could regard y as a function of x. We can then replace the contribution of y to the objective by a scalar variable representing the value of the best choice for a given x. We start out with a crude approximation to this value, and then generate a sequence of dual solutions to tighten up the approximation.

Let x X. We denote the value of the best choice for y by zLP (x). We have

zLP(x) :=  maxy        hTy
           subject to   Gy   ≤   b - Ax
                                 p
                         y  ∈   ℝ+
(2)

By LP duality, we can also write

                               T
zLP (x) =  minu       (b - Ax )Tu
           subject to        G  u  ≥   h
                                u  ∈   ℝm+
(3)

Note that the feasible region Q for the dual problem does not depend on x. We denote the extreme points and extreme rays of Q as K and J respectively:

extreme  points:  uk,  k ∈ K
  extreme  rays:  rj,  j ∈ J

If the inner product between (b - Ax) and any ray rj is negative then z LP (x) = -∞. Equivalently, in this situation, problem (2) is infeasible, so x does not allow a feasible solution to the original problem (1). Thus, we have the valid constraints

(b - Ax )Trj ≥ 0  for j ∈ J
(4)

that must be satisfied by any x that is feasible in (1).

If x satisfies (4) then the value of zLP (x) is given by

                       T k
zLP (x) = mki∈nK (b - Ax ) u .

Thus, problem (1) can be written equivalently as

                  T
maxx,t           c x + t
subject to  (b - Ax )T uk  ≥  t   for k ∈ K
            (b - Ax )Trj  ≥  0   for j ∈ J
                t ∈ ℝ, x  ∈  X
(5)

This problem has fewer variables than the original formulation (1), but it may have a huge number of constraints. Thus, these constraints are generated as needed, as cutting planes.

Let ˆK K and Ĵ J denote the current known extreme points and extreme rays of Q, respectively. The current relaxation of (1) and (5) is then

maxx,t           cTx + t
subject to  (b - Ax )T uk  ≥  t   for k ∈ ˆK
                     T j                 ˆ
            (b - Ax ) r   ≥  0   for j ∈ J
                t ∈ ℝ, x  ∈  X
(6)

The scalar variable t represents an estimate of zLP (x).

The algorithm can then be written:

1.
Determine (possibly empty) initial sets  ˆ
K of extreme points and Ĵ of extreme rays of Q.
2.
Solve problem (6), giving solution x and corresponding t.
3.
Determine zLP (x) by solving (3).
4.
If zLP (x) = -∞, an extreme ray of Q has been found. Add the extreme ray to Ĵ and return to Step 2.
5.
If zLP (x) < t and finite, an extreme point of Q has been found. Add the extreme point to ˆK and return to Step 2.
6.
If zLP (x) = t then x solves (1), with optimal y equal to the solution to (2) with x = x.