Convex Optimization and the Barrier Method

April 1st, 2008 ~ Posted in: Mathematics

I wrote this attempt at an equationless explanation of the principles of convex optimization a while back, partly because I promised it, and partly because I’d intended to contribute to the Carnival of Mathematics. I had it sitting in the queue, intending to revise and fact check it before posting, but … it seems like I’m going to be shitting work for a while, so here it goes with caveats. It’s based on an optimization course I took two terms ago.

“…in fact, the great watershed in optimization isn’t between linearity and nonlinearity, but convexity and nonconvexity” - R. Tyrrell Rockafellar, in SIAM Review, 1993

Convex optimization problems take the form of minimizing some convex cost function f_0 over a feasible set defined as the points which satisfy m inequalities f_i(x) \leq 0 where f_i are convex, and p affine equalities a_i^Tx + b_i = 0. Standard linear programs are one example of convex optimization problems, and the most powerful general class of convex optimization problem that I saw as a standard ‘class’ of programs in the course is the semidefinite program, to minimize an affine cost function given linear equalities Ax = b and the component wise inequalities F(x) = F_0 + x_1 F_1 + \ldots + x_n F_n \leq 0 where the F_i are symmetric matrices. Semidefinite programs (SDPs) subsume several popular classes of convex optimization problems.

Newton’s algorithm for the unconstrained minimization of differentiable objective functions is extremely easy to use: you construct a minimizing sequence x_{k+1} = x_k - t_k \left(\nabla^2 f\right)^{-1} \nabla f(x_k) , and as  k \rightarrow \infty, x_k \rightarrow x^\star, a minimizer of the cost. Of course, this is given some limiting assumptions on your function. The first cool thing about convex problems is that convergence to such a minimizer is guaranteed, and under some stronger but reasonable conditions, the minimizer is unique.

Constrained convex optimization problems, say minimizing f_0 subject to the inequality constraints f_i(x) \leq 0 can be approximated by unconstrained convex problems of the form f_0(x) + 1/t \sum_i I(f_i(x)) where I/t is a smooth convex barrier function which approximates the indicator function  f(x) = 0 when x \leq 0 and  f(x) = \infty when  x > 0, and such that the approximation improves as t \rightarrow \infty. Intuitively, then, as t \rightarrow \infty , the solution to the unconstrained convex optimization problem will approach that of the original. The problem is, given that you want an answer that is accurate to within \epsilon of the true optimal cost, how do you determine t? Is it even possible?

Digression to cool fact number two about convex optimization problems– there exists a dual problem to the primal problem, namely the problem of maximizing a dual function g. This dual function is concave, so in the same way that a convex function always has a minimum value, g always has a maximal value. Furthermore, this maximal value is pretty darn useful, in that it provides a lower bound on the optimal value of the primal problem. If certain technical conditions are satisfied (and they usually are, from what I’ve seen, in practical problems), the optimal values of the primal and dual problems are identical.

Now we have all the pieces to construct an interior point solver for convex optimization problems. First, we need a starting value of t and a strictly feasible (all the inequality constraints are strictly satisfied) starting guess for x, x_0, then we solve the first approximating unconstrained optimization problem using Newton’s method. We then increase t and solve the next approximate optimization problem, and repeat until we decide that the solution to the approximation problem is close enough to the optimal value of the original problem. How do we do this? Well, we choose our I cleverly so that each guess x_k also gives us a guess for a feasible point in the dual problem \lambda_k. Since we know that g(\lambda_k) is an underestimate of the optimal value and f_0(x_k) is an overestimate, we can bound the error at each step, and simply stop when this error has decreased sufficiently. This barrier method is one of a class of interior point methods– so called because the constructed minimizing sequence is strictly in the interior of the feasible regions, as opposed to say the simplex method, where the minimizing sequence is on the vertices of some polyhedron. The path x_k traced out by the successive approximations is known as the central path.

Algorithmically and theoretically, interior point methods are impressive. Theoretically because they show that not only can a linear program be solved in polynomial time, all convex optimization problems can (at least in principle– I imagine it might be difficult to construct appropriate barrier functions for different classes of programs). Algorithmically because they allow you to solve convex optimization problems in just roughly 40 times the amount of time it’d take to solve a least squares problem of equivalent size (the cost of the algorithm is dominated by computation of the product \left(\nabla^2 f\right)^{-1} \nabla f, and it takes only about 40 newton steps to solve a convex problem, largely independent of the problem size).

Pretty incredible! For more details, check out Boyd and Vandenberghe’s freely available textbook on convex optimization.

This entry was posted on Tuesday, April 1st, 2008 at 1:16 pm and is filed under Mathematics. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

Leave a Reply