I wrote this attempt at an equationless explanation of the principles of convex optimization a while back, partly because I promised it, and partly because I’d intended to contribute to the Carnival of Mathematics. I had it sitting in the queue, intending to revise and fact check it before posting, but … it seems like I’m going to be shitting work for a while, so here it goes with caveats. It’s based on an optimization course I took two terms ago.
“…in fact, the great watershed in optimization isn’t between linearity and nonlinearity, but convexity and nonconvexity” - R. Tyrrell Rockafellar, in SIAM Review, 1993
Convex optimization problems take the form of minimizing some convex cost function
over a feasible set defined as the points which satisfy
inequalities
where
are convex, and
affine equalities
. Standard linear programs are one example of convex optimization problems, and the most powerful general class of convex optimization problem that I saw as a standard ‘class’ of programs in the course is the semidefinite program, to minimize an affine cost function given linear equalities
and the component wise inequalities
where the
are symmetric matrices. Semidefinite programs (SDPs) subsume several popular classes of convex optimization problems.
Newton’s algorithm for the unconstrained minimization of differentiable objective functions is extremely easy to use: you construct a minimizing sequence
, and as
,
, a minimizer of the cost. Of course, this is given some limiting assumptions on your function. The first cool thing about convex problems is that convergence to such a minimizer is guaranteed, and under some stronger but reasonable conditions, the minimizer is unique.
Constrained convex optimization problems, say minimizing
subject to the inequality constraints
can be approximated by unconstrained convex problems of the form
where
is a smooth convex barrier function which approximates the indicator function
when
and
when
, and such that the approximation improves as
. Intuitively, then, as
, the solution to the unconstrained convex optimization problem will approach that of the original. The problem is, given that you want an answer that is accurate to within
of the true optimal cost, how do you determine
? Is it even possible?
Digression to cool fact number two about convex optimization problems– there exists a dual problem to the primal problem, namely the problem of maximizing a dual function
. This dual function is concave, so in the same way that a convex function always has a minimum value,
always has a maximal value. Furthermore, this maximal value is pretty darn useful, in that it provides a lower bound on the optimal value of the primal problem. If certain technical conditions are satisfied (and they usually are, from what I’ve seen, in practical problems), the optimal values of the primal and dual problems are identical.
Now we have all the pieces to construct an interior point solver for convex optimization problems. First, we need a starting value of
and a strictly feasible (all the inequality constraints are strictly satisfied) starting guess for x,
, then we solve the first approximating unconstrained optimization problem using Newton’s method. We then increase
and solve the next approximate optimization problem, and repeat until we decide that the solution to the approximation problem is close enough to the optimal value of the original problem. How do we do this? Well, we choose our
cleverly so that each guess
also gives us a guess for a feasible point in the dual problem
. Since we know that
is an underestimate of the optimal value and
is an overestimate, we can bound the error at each step, and simply stop when this error has decreased sufficiently. This barrier method is one of a class of interior point methods– so called because the constructed minimizing sequence is strictly in the interior of the feasible regions, as opposed to say the simplex method, where the minimizing sequence is on the vertices of some polyhedron. The path
traced out by the successive approximations is known as the central path.
Algorithmically and theoretically, interior point methods are impressive. Theoretically because they show that not only can a linear program be solved in polynomial time, all convex optimization problems can (at least in principle– I imagine it might be difficult to construct appropriate barrier functions for different classes of programs). Algorithmically because they allow you to solve convex optimization problems in just roughly 40 times the amount of time it’d take to solve a least squares problem of equivalent size (the cost of the algorithm is dominated by computation of the product
, and it takes only about 40 newton steps to solve a convex problem, largely independent of the problem size).
Pretty incredible! For more details, check out Boyd and Vandenberghe’s freely available textbook on convex optimization.