The simplex method
In this series (18 parts)
- What is optimization and why ML needs it
- Convex sets and convex functions
- Optimality conditions: first order
- Optimality conditions: second order
- Line search methods
- Least squares: the closed-form solution
- Steepest descent (gradient descent)
- Newton's method for optimization
- Quasi-Newton methods: BFGS and L-BFGS
- Conjugate gradient methods
- Constrained optimization and Lagrangian duality
- KKT conditions
- Penalty and barrier methods
- Interior point methods
- The simplex method
- Frank-Wolfe method
- Optimization in dynamic programming and optimal control
- Stochastic gradient descent and variants
A factory scheduling problem
A factory makes two products. Each product uses machine time and labor, but in different amounts. The factory has limited resources. How do you maximize profit?
| | Machine hours per unit | Labor hours per unit | Profit per unit |
|---|---|---|---|
| Product A | 3 | 2 | $5 |
| Product B | 1 | 4 | $4 |
| Available | 12 hours | 16 hours | |
You could try every combination of products A and B, but the constraints carve out a region of feasible options. That region is a polygon. The optimal answer always sits at a corner.
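You can check the corner claim by brute force. Here is a quick sketch (numbers taken from the table above; the intersection point is computed, not assumed): enumerate every pairwise intersection of constraint boundaries, keep the feasible ones, and evaluate the profit at each.

```python
import itertools
import numpy as np

# Factory LP: maximize 5A + 4B subject to
# 3A + B <= 12 (machine), 2A + 4B <= 16 (labor), A >= 0, B >= 0.
A_ub = np.array([[3.0, 1.0], [2.0, 4.0], [-1.0, 0.0], [0.0, -1.0]])
b_ub = np.array([12.0, 16.0, 0.0, 0.0])
profit = np.array([5.0, 4.0])

corners = []
# A corner of a 2-D feasible polygon lies where two constraint boundaries meet.
for i, j in itertools.combinations(range(len(b_ub)), 2):
    M = A_ub[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue  # parallel boundaries never intersect
    p = np.linalg.solve(M, b_ub[[i, j]])
    if np.all(A_ub @ p <= b_ub + 1e-9):  # keep only feasible intersections
        corners.append(p)

best = max(corners, key=lambda p: profit @ p)
print(best, profit @ best)  # optimal corner (3.2, 2.4) with profit 25.6
```

Only four intersections survive the feasibility check, and the best of them is the answer, which is exactly what the simplex method finds without enumerating everything.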
The feasible region
```mermaid
graph TD
    A["Constraint 1: 3A + 1B <= 12 (machine hours)"]
    B["Constraint 2: 2A + 4B <= 16 (labor hours)"]
    C["Constraint 3: A >= 0, B >= 0"]
    A --> D["Feasible region: a polygon (intersection of half-planes)"]
    B --> D
    C --> D
    D --> E["Optimal solution is at a corner (vertex) of this polygon"]
```
The simplex method walks along the edges of this polygon, checking one corner at a time, always moving to a better corner until no better neighbor exists. It never needs to search the interior.
Now let’s see the math behind this.
Prerequisites
You should understand Lagrangian duality and the basics of linear algebra, especially matrix operations. Knowing what a convex set is will help you see why the feasible region of an LP has a nice structure.
Linear programs
A linear program (LP) minimizes a linear objective subject to linear constraints. Every LP can be written in standard form:

$$\min_x \; c^T x \quad \text{subject to} \quad Ax = b, \; x \ge 0$$

where $A$ is an $m \times n$ matrix with $m < n$ (more variables than constraints), $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$.
Any LP with inequality constraints can be converted to this form by adding slack variables. For example, a constraint $a_i^T x \le b_i$ becomes $a_i^T x + s_i = b_i$ with the slack variable $s_i \ge 0$.
Converting to standard form
```mermaid
graph TD
    A["Original LP with inequalities"] --> B["Add slack variable for each <= constraint"]
    B --> C["All constraints become equalities"]
    C --> D["Standard form: min c^T x, Ax = b, x >= 0"]
```
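The conversion is mechanical: append one identity column per inequality row and give the slacks zero cost. A minimal sketch (the helper name `to_standard_form` is my own):

```python
import numpy as np

def to_standard_form(c, A_ub, b_ub):
    """Convert min c^T x, A_ub x <= b_ub, x >= 0 into
    min c_std^T z, A_std z = b_ub, z >= 0 by appending one slack per row."""
    m, n = A_ub.shape
    A_std = np.hstack([A_ub, np.eye(m)])      # [A | I]
    c_std = np.concatenate([c, np.zeros(m)])  # slacks cost nothing
    return c_std, A_std, b_ub

# The factory constraints: 3A + B <= 12, 2A + 4B <= 16, with max 5A + 4B
# written as min -5A - 4B.
c_std, A_std, b = to_standard_form(np.array([-5.0, -4.0]),
                                   np.array([[3.0, 1.0], [2.0, 4.0]]),
                                   np.array([12.0, 16.0]))
print(A_std)  # original A with a 2x2 identity block appended
```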
Why standard form matters
The constraint $Ax = b$ defines a flat surface (an affine subspace), and $x \ge 0$ intersects it with the non-negative orthant. The result is a polyhedron: a region with flat faces, edges, and vertices (a polytope when bounded). The simplex method exploits the fact that if an LP has a finite optimum, some vertex of this polyhedron attains it.
Basic feasible solutions
A basic feasible solution (BFS) is a vertex of the feasible polytope. Here is the formal definition.
Given $Ax = b$ with $A$ being $m \times n$, choose $m$ columns of $A$ that form a nonsingular submatrix $B$. Call these the basic columns, and the remaining columns are nonbasic. Set the nonbasic variables to zero and solve for the basic variables $x_B = B^{-1} b$. If $x_B \ge 0$, then this is a basic feasible solution.
The simplex method moves from one BFS to an adjacent one (they share $m - 1$ basic variables) by swapping one variable in and one variable out of the basis.
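The vertex–basis correspondence can be checked by enumeration: try every choice of $m$ columns, solve $B x_B = b$, and keep the feasible ones. A sketch on a small standard-form system (assumed here purely for illustration):

```python
import itertools
import numpy as np

# Standard-form system (assumed for illustration):
# 6x1 + 4x2 + s1 = 24,  x1 + 2x2 + s2 = 6,  all variables >= 0.
A = np.array([[6.0, 4.0, 1.0, 0.0], [1.0, 2.0, 0.0, 1.0]])
b = np.array([24.0, 6.0])
m, n = A.shape

vertices = []
for cols in itertools.combinations(range(n), m):  # candidate bases
    B = A[:, cols]
    if abs(np.linalg.det(B)) < 1e-12:
        continue  # singular submatrix: these columns form no basis
    x_B = np.linalg.solve(B, b)
    if np.all(x_B >= -1e-9):  # basic *feasible* solution
        x = np.zeros(n)
        x[list(cols)] = x_B
        vertices.append(x[:2])  # the (x1, x2) part is a polygon vertex

print(vertices)  # each BFS projects to a vertex of the feasible polygon
```

Of the six candidate bases, two are infeasible (a basic variable goes negative); the remaining four are exactly the vertices of the feasible region.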
The simplex tableau
The simplex method is most easily tracked with a tableau. Given a current basis $B$, the tableau organizes:
- The basic variables and their values
- The reduced costs $\bar{c}_j = c_j - c_B^T B^{-1} A_j$ for each nonbasic variable
- The constraint coefficients after elimination
The algorithm:
- Optimality check: If all reduced costs $\bar{c}_j \ge 0$ for nonbasic variables, the current BFS is optimal. Stop.
- Pivot selection (entering variable): Choose a nonbasic variable $x_j$ with $\bar{c}_j < 0$. This variable will enter the basis.
- Ratio test (leaving variable): Among basic variables, find which one hits zero first as $x_j$ increases. This is the minimum ratio test.
- Pivot: Swap the entering and leaving variables. Update the tableau.
- Go to step 1.
Simplex iteration at a glance
```mermaid
graph TD
    A["Current vertex (BFS)"] --> B["Check: all reduced costs >= 0?"]
    B -- Yes --> C["Optimal! Stop."]
    B -- No --> D["Pick entering variable (negative reduced cost)"]
    D --> E["Ratio test: find leaving variable"]
    E --> F["Pivot: move to adjacent vertex"]
    F --> A
```
Example 1: Solving a 2-variable LP by simplex
Problem:

$$\max \; 5x_1 + 4x_2$$

subject to:

$$6x_1 + 4x_2 \le 24, \qquad x_1 + 2x_2 \le 6, \qquad x_1, x_2 \ge 0$$

(We maximize here. To fit the “minimize” convention, negate the objective: $\min \; -5x_1 - 4x_2$.)
Step 1: Add slack variables.

$$6x_1 + 4x_2 + s_1 = 24, \qquad x_1 + 2x_2 + s_2 = 6$$

Variables: $x_1, x_2, s_1, s_2$. All $\ge 0$.
Step 2: Initial BFS.
Set $x_1 = x_2 = 0$. Then $s_1 = 24$, $s_2 = 6$. Basis = $\{s_1, s_2\}$.
Objective value: $0$.
Step 3: Compute reduced costs.
For maximization, we want to increase a variable with a positive coefficient in the objective (equivalently, negative reduced cost in the min formulation). $x_1$ has coefficient $5$ and $x_2$ has coefficient $4$. Choose $x_1$ (largest coefficient, Dantzig’s rule).
Step 4: Ratio test.
Increase $x_1$. From constraint 1: $x_1 \le 24/6 = 4$. From constraint 2: $x_1 \le 6/1 = 6$. Minimum ratio: $4$.
So $s_1$ leaves the basis, $x_1$ enters. New basis = $\{x_1, s_2\}$.
Pivot: Divide row 1 by 6:

$$x_1 + \tfrac{2}{3} x_2 + \tfrac{1}{6} s_1 = 4$$

Update row 2: subtract row 1 from row 2:

$$\tfrac{4}{3} x_2 - \tfrac{1}{6} s_1 + s_2 = 2$$

New BFS: $x_1 = 4$, $s_2 = 2$, $x_2 = s_1 = 0$.
Objective: $5 \cdot 4 = 20$.
Step 5: Check optimality.
Update the objective row. Original objective: $z = 5x_1 + 4x_2$. After substituting $x_1 = 4 - \tfrac{2}{3} x_2 - \tfrac{1}{6} s_1$:

$$z = 20 + \tfrac{2}{3} x_2 - \tfrac{5}{6} s_1$$

Increasing $s_1$ cannot help (its coefficient, $-\tfrac{5}{6}$, is negative), but $x_2$ has a positive coefficient ($\tfrac{2}{3}$), so we can still improve. Enter $x_2$.
Step 6: Ratio test for $x_2$.
Row 1: $x_1 + \tfrac{2}{3} x_2 + \tfrac{1}{6} s_1 = 4$, so $x_2 \le 4 \div \tfrac{2}{3} = 6$.
Row 2: $\tfrac{4}{3} x_2 - \tfrac{1}{6} s_1 + s_2 = 2$, so $x_2 \le 2 \div \tfrac{4}{3} = \tfrac{3}{2}$.
Minimum ratio: $\tfrac{3}{2}$. So $s_2$ leaves, $x_2$ enters. New basis = $\{x_1, x_2\}$.
Pivot: Multiply row 2 by $\tfrac{3}{4}$:

$$x_2 - \tfrac{1}{8} s_1 + \tfrac{3}{4} s_2 = \tfrac{3}{2}$$

Update row 1: subtract $\tfrac{2}{3} \times$ the new row 2:

$$x_1 + \tfrac{1}{4} s_1 - \tfrac{1}{2} s_2 = 3$$

New BFS: $x_1 = 3$, $x_2 = \tfrac{3}{2}$, $s_1 = s_2 = 0$.
Objective: $5 \cdot 3 + 4 \cdot \tfrac{3}{2} = 21$.
Check optimality. Update the objective:

$$z = 20 + \tfrac{2}{3} x_2 - \tfrac{5}{6} s_1$$

Substitute $x_2 = \tfrac{3}{2} + \tfrac{1}{8} s_1 - \tfrac{3}{4} s_2$:

$$z = 21 - \tfrac{3}{4} s_1 - \tfrac{1}{2} s_2$$

Both nonbasic variables ($s_1$, $s_2$) have negative coefficients. No improvement possible. Optimal!
Solution: $x_1 = 3$, $x_2 = 1.5$. Optimal value: $21$.
Verification: $6(3) + 4(1.5) = 24$ ✓. $3 + 2(1.5) = 6$ ✓. Both constraints tight.
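As a sanity check, assuming SciPy is available, `scipy.optimize.linprog` solves the same LP ($\max 5x_1 + 4x_2$ with $6x_1 + 4x_2 \le 24$, $x_1 + 2x_2 \le 6$) once the objective is negated to fit its minimize convention:

```python
from scipy.optimize import linprog

# max 5x1 + 4x2  ->  min -5x1 - 4x2; bounds default to x >= 0 anyway
res = linprog(c=[-5, -4],
              A_ub=[[6, 4], [1, 2]],
              b_ub=[24, 6],
              bounds=[(0, None), (0, None)],
              method="highs")
print(res.x, -res.fun)  # optimal point (3, 1.5), value 21
```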
Vertices of the feasible region
The feasible region of the LP above is a polygon in 2D. Its vertices are the points where pairs of constraint boundaries intersect while remaining feasible:
| Vertex | $x_1$ | $x_2$ | Objective |
|---|---|---|---|
| Origin | 0 | 0 | 0 |
| $x_1$-axis | 4 | 0 | 20 |
| Intersection | 3 | 1.5 | 21 |
| $x_2$-axis | 0 | 3 | 12 |
The simplex method visited: $(0, 0) \to (4, 0) \to (3, 1.5)$. It walked along two edges of the polygon to reach the optimum.
The simplex method moves along vertices of the feasible polytope: starting from $(0, 0)$, it moves to $(4, 0)$ and then to the optimal vertex $(3, 1.5)$, where the objective $5x_1 + 4x_2 = 21$ is maximized.
Example 2: Starting with a full tableau
Problem:

$$\min \; -x_1 - 2x_2$$

subject to:

$$x_1 + x_2 \le 5, \qquad x_1 \le 4, \qquad x_2 \le 3, \qquad x_1, x_2 \ge 0$$

Add slacks $s_1, s_2, s_3$:

$$x_1 + x_2 + s_1 = 5, \qquad x_1 + s_2 = 4, \qquad x_2 + s_3 = 3$$
Initial tableau:
| Basis | $x_1$ | $x_2$ | $s_1$ | $s_2$ | $s_3$ | RHS |
|---|---|---|---|---|---|---|
| $s_1$ | 1 | 1 | 1 | 0 | 0 | 5 |
| $s_2$ | 1 | 0 | 0 | 1 | 0 | 4 |
| $s_3$ | 0 | 1 | 0 | 0 | 1 | 3 |
| $-z$ | -1 | -2 | 0 | 0 | 0 | 0 |
BFS: $s_1 = 5$, $s_2 = 4$, $s_3 = 3$, $x_1 = x_2 = 0$.
Iteration 1: Most negative reduced cost: $x_2$ with $-2$. Enter $x_2$.
Ratios: $5/1 = 5$ for row 1, no entry for row 2 (coefficient is 0), $3/1 = 3$ for row 3. Minimum: 3. $s_3$ leaves.
Pivot on row 3, column $x_2$. Row 3 stays as is (pivot coefficient already 1):
| Basis | $x_1$ | $x_2$ | $s_1$ | $s_2$ | $s_3$ | RHS |
|---|---|---|---|---|---|---|
| $s_1$ | 1 | 0 | 1 | 0 | -1 | 2 |
| $s_2$ | 1 | 0 | 0 | 1 | 0 | 4 |
| $x_2$ | 0 | 1 | 0 | 0 | 1 | 3 |
| $-z$ | -1 | 0 | 0 | 0 | 2 | 6 |
BFS: $x_2 = 3$, $s_1 = 2$, $s_2 = 4$, an improvement.
In the min formulation $z = -x_1 - 2x_2$, so at $(0, 3)$ the current value is $z = -6$. The bottom row tracks the reduced costs, and its right-hand entry stores $-z$, which is why it reads $6$.
Iteration 2: $x_1$ has reduced cost $-1$. Enter $x_1$.
Ratios: $2/1 = 2$ for row 1, $4/1 = 4$ for row 2 (row 3 has coefficient 0). Minimum: 2. $s_1$ leaves.
Pivot on row 1, column $x_1$:
| Basis | $x_1$ | $x_2$ | $s_1$ | $s_2$ | $s_3$ | RHS |
|---|---|---|---|---|---|---|
| $x_1$ | 1 | 0 | 1 | 0 | -1 | 2 |
| $s_2$ | 0 | 0 | -1 | 1 | 1 | 2 |
| $x_2$ | 0 | 1 | 0 | 0 | 1 | 3 |
| $-z$ | 0 | 0 | 1 | 0 | 1 | 8 |
After this pivot the reduced costs are $(0, 0, 1, 0, 1)$. All reduced costs non-negative. Optimal!
Solution: $x_1 = 2$, $x_2 = 3$. Objective: $z = -2 - 2(3) = -8$.
Check: $2 + 3 = 5$ ✓, $2 \le 4$ ✓, $3 \le 3$ ✓.
Example 3: Identifying degeneracy
Problem:

$$\max \; x_1 + x_2$$

subject to:

$$x_1 \le 2, \qquad x_2 \le 2, \qquad x_1 + x_2 \le 4, \qquad x_1, x_2 \ge 0$$

The optimum is $(2, 2)$ with objective $4$. But notice something: at $(2, 2)$, all three constraints are active ($x_1 = 2$, $x_2 = 2$, $x_1 + x_2 = 4$). We have three active constraints but only two variables. This means the vertex is degenerate: more constraints are active than necessary to pin down the point.
Why degeneracy matters:
In a degenerate BFS, some basic variables equal zero. During a pivot, the ratio test might give a ratio of zero, meaning the objective does not improve. The simplex method can cycle: it pivots through a sequence of degenerate bases that all represent the same point, never making progress.
How to handle it:
- Bland’s rule: Always choose the entering and leaving variables with the smallest index. This guarantees termination.
- Lexicographic rule: Break ties in the ratio test using a lexicographic comparison.
- Perturbation: Add tiny random perturbations to $b$ to make the problem non-degenerate.
In practice, cycling is extremely rare. Most solvers use Bland’s rule as a fallback.
Let us trace through. Starting from $(0, 0)$ with slacks $s_1 = 2$, $s_2 = 2$, $s_3 = 4$:
Iteration 1: Enter $x_1$ (reduced cost $-1$). Ratios: $2/1 = 2$ from row 1, $4/1 = 4$ from row 3 (row 2 has coefficient 0). $s_1$ leaves. New BFS: $(2, 0)$.
Iteration 2: Enter $x_2$ (reduced cost $-1$). Ratios: $2/1 = 2$ from row 2, $2/1 = 2$ from row 3. Tie! Both $s_2$ and $s_3$ give ratio 2. Pick $s_2$ by Bland’s rule (smaller index).
New BFS: $(2, 2)$. All slacks are zero. This is a degenerate vertex: three constraints are active, and the basis $\{x_1, x_2, s_3\}$ carries the slack $s_3$ at value zero.
Iteration 3: Check reduced costs. Both $s_1$ and $s_2$ have non-negative reduced costs. Optimal.
Solution: $(2, 2)$, objective $4$.
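The tie-breaking above can be sketched in code. Both halves of Bland's rule appear here (the helper names are my own); the leaving-variable tie-break is what prevents cycling:

```python
import numpy as np

def blands_entering(reduced_costs, tol=1e-10):
    """Bland's rule: enter the lowest-index variable with negative reduced cost.
    Returns None when no candidate exists (the BFS is optimal)."""
    for j, rc in enumerate(reduced_costs):
        if rc < -tol:
            return j
    return None

def blands_leaving(col, rhs, basis, tol=1e-10):
    """Among rows achieving the minimal ratio, pick the one whose
    *basic variable* has the smallest index."""
    ratios = [(rhs[i] / col[i], basis[i], i)
              for i in range(len(rhs)) if col[i] > tol]
    if not ratios:
        return None  # unbounded direction
    best = min(r for r, _, _ in ratios)
    # break ties by smallest basic-variable index
    return min((bv, i) for r, bv, i in ratios if abs(r - best) <= tol)[1]

# A tie like the one in the trace: two rows both give ratio 2.
print(blands_entering(np.array([0.0, -1.0, 0.0])))   # index 1 enters
print(blands_leaving(np.array([0.0, 1.0, 1.0]),
                     np.array([0.0, 2.0, 2.0]),
                     basis=[2, 3, 4]))               # row 1 (smaller basis index)
```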
Complexity
Worst case: The simplex method can take exponentially many steps. The classic Klee-Minty cube example constructs a polytope in $n$ dimensions where the simplex method (with Dantzig’s rule) visits all $2^n$ vertices.
Average case: Smoothed analysis by Spielman and Teng showed that the simplex method runs in polynomial expected time when the input is slightly perturbed. In practice, for an LP with $m$ constraints and $n$ variables, the simplex method typically takes on the order of $2m$ to $3m$ iterations.
Each iteration costs $O(mn)$ for a dense tableau update, but sparse implementations can be much faster.
This is why the simplex method remains competitive with interior point methods despite its exponential worst case: the bad cases are pathological and rarely show up in real problems.
Duality in the simplex method
The simplex method implicitly maintains dual variables. At a BFS with basis $B$, the dual variables are:

$$y^T = c_B^T B^{-1}$$

The reduced cost of a nonbasic variable $x_j$ is:

$$\bar{c}_j = c_j - y^T A_j$$

When all reduced costs are non-negative, $y$ is dual feasible and complementary slackness holds. The primal and dual objectives are equal, confirming optimality. This is a direct application of Lagrangian duality.
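These formulas are easy to check numerically. A sketch on a small standard-form LP (assumed for illustration: $\min -5x_1 - 4x_2$ with $6x_1 + 4x_2 + s_1 = 24$, $x_1 + 2x_2 + s_2 = 6$) at the basis $\{x_1, x_2\}$:

```python
import numpy as np

# Standard-form data: min c^T x s.t. Ax = b, x >= 0 (assumed for illustration)
A = np.array([[6.0, 4.0, 1.0, 0.0], [1.0, 2.0, 0.0, 1.0]])
c = np.array([-5.0, -4.0, 0.0, 0.0])
basic = [0, 1]                       # basis: the columns of x1 and x2
B = A[:, basic]

y = np.linalg.solve(B.T, c[basic])   # y^T = c_B^T B^{-1}
reduced = c - y @ A                  # reduced costs c_j - y^T A_j, all columns
print(y)        # dual variables (-0.75, -0.5)
print(reduced)  # zero on basic columns, non-negative on the slacks
```

The reduced costs on the slack columns come out positive, so this basis is optimal and $y$ is dual feasible, exactly the certificate described above.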
Big-M and two-phase method
Sometimes finding an initial BFS is not obvious (for example, when the original constraints are equalities or $\ge$ inequalities). Two approaches:
Big-M method: Add artificial variables $a_i$ with a huge cost $M$ in the objective. If the optimal solution has all $a_i = 0$, the original problem is feasible.
Two-phase method:
- Phase I: Minimize the sum of artificial variables. If the minimum is zero, you have a BFS for the original problem.
- Phase II: Use that BFS to start the simplex method on the original problem.
The two-phase method is numerically more stable because it avoids the large coefficient $M$.
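A minimal sketch of the Phase I setup (the helper name `phase_one` and the example constraints are my own):

```python
import numpy as np

def phase_one(A, b):
    """Build the Phase I LP: min sum(a) s.t. Ax + a = b, x >= 0, a >= 0.
    Assumes b >= 0 (multiply rows by -1 first if needed)."""
    m, n = A.shape
    A1 = np.hstack([A, np.eye(m)])                  # append artificial columns
    c1 = np.concatenate([np.zeros(n), np.ones(m)])  # minimize sum of artificials
    basis = list(range(n, n + m))                   # artificials: an obvious BFS
    return c1, A1, basis

# Two equality constraints, made up for the sketch: x1 + x2 = 4, 2x1 + x2 = 5
c1, A1, basis = phase_one(np.array([[1.0, 1.0], [2.0, 1.0]]),
                          np.array([4.0, 5.0]))
# If Phase I's optimum is 0, the artificials can be dropped and the
# remaining basis starts Phase II on the original objective.
```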
When to use simplex vs interior point
| Scenario | Recommended |
|---|---|
| Small LP (< 1000 variables) | Simplex |
| Large sparse LP | Interior point |
| Need many solves with small changes | Simplex (warm-starting) |
| Quadratic or conic program | Interior point |
| Need exact vertex solution | Simplex |
Simplex excels at warm-starting: if you change one constraint or one cost coefficient, the previous basis is often still nearly optimal, and a few pivots suffice. Interior point methods essentially start from scratch each time.
Python implementation sketch
```python
import numpy as np

def simplex_tableau(c, A, b):
    """Solve min c^T x s.t. Ax <= b, x >= 0 using the simplex method.

    Assumes b >= 0, so the all-slack basis is a feasible starting point.
    """
    m, n = A.shape
    # Add slack variables: tableau columns are [x | slacks | RHS]
    tableau = np.zeros((m + 1, n + m + 1))
    tableau[:m, :n] = A
    tableau[:m, n:n+m] = np.eye(m)
    tableau[:m, -1] = b
    tableau[-1, :n] = c  # objective row (for min)
    basis = list(range(n, n + m))  # slack variables
    while True:
        # Find entering variable (most negative reduced cost)
        reduced_costs = tableau[-1, :-1]
        j = np.argmin(reduced_costs)
        if reduced_costs[j] >= -1e-10:
            break  # optimal
        # Ratio test
        col = tableau[:m, j]
        rhs = tableau[:m, -1]
        ratios = np.full(m, np.inf)
        for i in range(m):
            if col[i] > 1e-10:
                ratios[i] = rhs[i] / col[i]
        i = np.argmin(ratios)
        if ratios[i] == np.inf:
            raise ValueError("Unbounded")
        # Pivot
        pivot = tableau[i, j]
        tableau[i] /= pivot
        for k in range(m + 1):
            if k != i:
                tableau[k] -= tableau[k, j] * tableau[i]
        basis[i] = j
    x = np.zeros(n)
    for i, var in enumerate(basis):
        if var < n:
            x[var] = tableau[i, -1]
    return x, -tableau[-1, -1]  # bottom-right entry stores -z
```
What comes next
The simplex method walks along edges of the feasible polytope. The Frank-Wolfe method borrows a similar idea for nonlinear objectives: at each step, it solves a linear subproblem over the constraint set and takes a step toward that solution. This yields sparse iterates and works well for large structured problems.