
Introduction

Convex optimization is a subfield of mathematical optimization that deals with the minimization of convex functions over convex sets. In a convex optimization problem, every local minimum is a global minimum, which is a highly desirable property for optimization algorithms. The field has many applications in areas such as machine learning, signal processing, statistics, finance, and operations research.

Usually the problem is to find an $x$ that minimizes a function $f_0(x)$ subject to the constraints $f_i(x) \le 0, \; i = 1, \dots, m$ and $h_i(x) = 0, \; i = 1, \dots, p$. Mathematically, the optimization problem is written as follows:

$$\text{minimize } f_0(x) \quad \text{subject to } f_i(x) \le 0, \; i = 1, \dots, m, \quad h_i(x) = 0, \; i = 1, \dots, p$$

We call $x \in \mathbf{R}^n$ the optimization variable and the function $f_0 : \mathbf{R}^n \to \mathbf{R}$ the objective function. The inequalities $f_i(x) \le 0$ are the inequality constraints and the corresponding functions $f_i : \mathbf{R}^n \to \mathbf{R}$ are the inequality constraint functions. Without any constraints ($m = p = 0$), the problem is unconstrained.

The set of points for which all the functions are defined is called the domain $\mathcal{D}$. A point $x \in \mathcal{D}$ is feasible if it satisfies the constraints. If there is at least one feasible point, the problem is feasible, and infeasible otherwise. The set of all feasible points is the feasible set or constraint set. The optimal value $p^\star$ is defined as:

$$p^\star = \inf \{ f_0(x) \mid f_i(x) \le 0, \; i = 1, \dots, m, \; h_i(x) = 0, \; i = 1, \dots, p \}$$

If $p^\star = -\infty$, the problem is unbounded below. We say $x^\star$ is an optimal point if $x^\star$ is feasible and $f_0(x^\star) = p^\star$. The set of all optimal points is denoted:

$$X_{\mathrm{opt}} = \{ x \mid f_0(x) = p^\star, \; f_i(x) \le 0, \; i = 1, \dots, m, \; h_i(x) = 0, \; i = 1, \dots, p \}$$

If an optimal point exists, the problem is solvable; a problem that is unbounded below is not solvable. A feasible point $x$ with $f_0(x) \le p^\star + \epsilon$, $\epsilon > 0$, is called $\epsilon$-suboptimal, and the set of all such points is the $\epsilon$-suboptimal set.
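To make these definitions concrete, here is a minimal sketch using the cvxpy modeling library (the problem data are invented for the example): it solves a small constrained problem and reads off the solver status, the optimal value $p^\star$, and an optimal point.

```python
import cvxpy as cp

# A small problem in the form: minimize f0(x) s.t. f1(x) <= 0, h1(x) = 0.
# The data are invented for illustration only.
x = cp.Variable(2)
f0 = cp.sum_squares(x)        # objective f0(x) = x1^2 + x2^2
f1 = 1 - x[0] - x[1]          # inequality constraint f1(x) = 1 - x1 - x2 <= 0
h1 = x[0] - 2 * x[1]          # equality constraint h1(x) = x1 - 2*x2 = 0

prob = cp.Problem(cp.Minimize(f0), [f1 <= 0, h1 == 0])
prob.solve()

print(prob.status)  # "optimal": the problem is feasible and solvable
print(prob.value)   # the optimal value p*
print(x.value)      # an optimal point, i.e. a member of X_opt
```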

A locally optimal point is a feasible point that minimizes $f_0$ over nearby points in the feasible set:

$$f_0(x) = \inf \{ f_0(z) \mid f_i(z) \le 0, \; i = 1, \dots, m, \; h_i(z) = 0, \; i = 1, \dots, p, \; \|z - x\|_2 \le R \}, \quad R > 0$$

In other words, $x$ minimizes $f_0(z)$ subject to $f_i(z) \le 0, \; i = 1, \dots, m$, $h_i(z) = 0, \; i = 1, \dots, p$, and $\|z - x\|_2 \le R$.

A problem is in standard form when the functions and inequalities are arranged so that the right-hand side is 0. For example, an equality constraint $f_i(x) = g_i(x)$ can be turned into $h_i(x) = 0$ where $h_i(x) = f_i(x) - g_i(x)$. Similarly, $f_i(x) \ge 0$ can be written as $-f_i(x) \le 0$. The box constraints $m_i \le x_i \le n_i$ can be expressed as $m_i - x_i \le 0$, $x_i - n_i \le 0$. Maximizing $f_0(x)$ can be written as minimizing $-f_0(x)$.
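For instance, a small made-up problem, maximize $x_1 + 2x_2$ subject to $0 \le x_1 \le 1$ and $0 \le x_2 \le 1$, becomes in standard form:

$$\text{minimize } -(x_1 + 2x_2) \quad \text{subject to } -x_1 \le 0, \; x_1 - 1 \le 0, \; -x_2 \le 0, \; x_2 - 1 \le 0$$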

Equivalence

Two problems are equivalent if, from the solution of one, the solution of the other is easily found, and vice versa. One way to form an equivalent problem is to scale the objective and constraint functions:

$$\text{minimize } \tilde f_0(x) = \alpha_0 f_0(x) \quad \text{subject to } \tilde f_i(x) = \alpha_i f_i(x) \le 0, \; i = 1, \dots, m, \quad \tilde h_i(x) = \beta_i h_i(x) = 0, \; i = 1, \dots, p$$

where $\alpha_i > 0, \; i = 0, \dots, m$ and $\beta_i \neq 0, \; i = 1, \dots, p$.

As a result of the scaling, the feasible sets of the scaled problem and the original one are the same. There are other transformations that produce equivalent problems:

  • Change of variables: when we substitute $x = \phi(z)$, with $\phi$ one-to-one, the problem becomes an equivalent one:

$$\text{minimize } \tilde f_0(z) = f_0(\phi(z)) \quad \text{subject to } \tilde f_i(z) = f_i(\phi(z)) \le 0, \; i = 1, \dots, m, \quad \tilde h_i(z) = h_i(\phi(z)) = 0, \; i = 1, \dots, p$$

If $x$ solves the original problem, then $z = \phi^{-1}(x)$ solves the transformed problem, and vice versa.

  • Transformation of objective and constraint functions: let $\tilde f_i(x) = \psi_i(f_i(x)), \; i = 0, \dots, m$ and $\tilde h_i(x) = \psi_{m+i}(h_i(x)), \; i = 1, \dots, p$, where $\psi_0$ is monotone increasing, $\psi_1, \dots, \psi_m$ satisfy $\psi_i(u) \le 0$ if and only if $u \le 0$, and $\psi_{m+1}, \dots, \psi_{m+p}$ satisfy $\psi_i(u) = 0$ if and only if $u = 0$. We obtain an equivalent problem:

$$\text{minimize } \tilde f_0(x) \quad \text{subject to } \tilde f_i(x) \le 0, \; i = 1, \dots, m, \quad \tilde h_i(x) = 0, \; i = 1, \dots, p$$

For example, the unconstrained Euclidean norm minimization problem with objective $\|Ax - b\|_2$ is equivalent to minimizing $(Ax - b)^T (Ax - b)$. This makes it easier to solve, since the latter objective is differentiable for all $x$ (it is quadratic), while the former is not differentiable at any $x$ with $Ax - b = 0$.

  • Adding slack variables so that the inequalities become equalities: minimize $f_0(x)$ subject to $s_i \ge 0, \; i = 1, \dots, m$, $f_i(x) + s_i = 0, \; i = 1, \dots, m$, and $h_i(x) = 0, \; i = 1, \dots, p$.

  • Eliminating equality constraints: parameterize the solution set of the equality constraints as $x = \phi(z)$, so that the problem becomes minimizing $\tilde f_0(z) = f_0(\phi(z))$ subject to $\tilde f_i(z) = f_i(\phi(z)) \le 0, \; i = 1, \dots, m$.

  • Introducing new equality constraints $y_i = A_i x + b_i, \; i = 0, \dots, m$ when $f_i(x) = \tilde f_i(A_i x + b_i)$: minimize $\tilde f_0(y_0)$ subject to $\tilde f_i(y_i) \le 0, \; i = 1, \dots, m$, $y_i = A_i x + b_i, \; i = 0, \dots, m$, and $h_i(x) = 0, \; i = 1, \dots, p$.

  • Optimizing sequentially over variables. For example, we can optimize over $y$ first, then $x$: $\inf_{x, y} f(x, y) = \inf_x g(x)$ with $g(x) = \inf_y f(x, y)$.

  • Changing into epigraph form (see the sketch after this list): minimize $t$ over the variables $(x, t)$ subject to $f_0(x) - t \le 0$, $f_i(x) \le 0, \; i = 1, \dots, m$, and $h_i(x) = 0, \; i = 1, \dots, p$.

  • Making explicit constraints implicit by restricting the domain: define a new unconstrained objective $F(x)$ that agrees with $f_0(x)$ on the feasible set, but whose domain is restricted to the points satisfying the original constraints.
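As an example of how such equivalences are used in practice, here is a minimal sketch of the epigraph reformulation, using cvxpy with made-up data: minimizing $f_0$ directly and minimizing $t$ subject to $f_0(x) - t \le 0$ give the same optimal value.

```python
import cvxpy as cp
import numpy as np

# Invented data for a small norm-minimization problem.
np.random.seed(0)
A = np.random.randn(5, 3)
b = np.random.randn(5)

# Original form: minimize ||Ax - b||_2 subject to x >= 0.
x = cp.Variable(3)
direct = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 2)), [x >= 0])
direct.solve()

# Epigraph form: minimize t over (x, t) subject to f0(x) - t <= 0 and x >= 0.
x_epi = cp.Variable(3)
t = cp.Variable()
epigraph = cp.Problem(cp.Minimize(t),
                      [cp.norm(A @ x_epi - b, 2) - t <= 0, x_epi >= 0])
epigraph.solve()

print(direct.value, epigraph.value)  # equal up to solver tolerance
```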

Convex optimization

Compared to the standard optimization problem, a convex optimization problem requires extra conditions:

  • The objective function must be convex

  • The inequality constraint functions must be convex

  • The equality constraint functions $h_i(x) = a_i^T x - b_i$ must be affine

The domain of the problem is convex, and since the feasible set is the intersection of this domain with the convex sublevel sets $\{x \mid f_i(x) \le 0\}$ and the affine sets $\{x \mid a_i^T x = b_i\}$, the feasible set of a convex optimization problem is also convex.

For a convex optimization problem, any locally optimal point is also globally optimal. We prove this by contradiction. Suppose $x$ is feasible and locally optimal, i.e., $f_0(x) = \inf \{ f_0(z) \mid z \text{ feasible}, \|z - x\|_2 \le R \}$ for some $R > 0$. Suppose $x$ is not globally optimal; then there exists a feasible $y$ with $f_0(y) < f_0(x)$. We must have $\|y - x\|_2 > R$, since otherwise $f_0(x) \le f_0(y)$. Consider the point $z = (1 - \theta) x + \theta y$ with $\theta = \frac{R}{2 \|y - x\|_2}$. Then $\|z - x\|_2 = R/2 < R$. By convexity of the feasible set, $z$ is feasible. By convexity of $f_0$: $f_0(z) \le (1 - \theta) f_0(x) + \theta f_0(y) < f_0(x)$, which contradicts the local optimality of $x$. So there is no feasible $y$ with $f_0(y) < f_0(x)$, and $x$ is indeed globally optimal.
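To make the three conditions concrete, here is a minimal cvxpy sketch with an invented toy problem. cvxpy's DCP check, which enforces a verifiable form of these conditions, accepts a problem with a convex objective, convex inequality constraints, and affine equality constraints, and rejects a nonconvex equality constraint.

```python
import cvxpy as cp

x = cp.Variable(3)

# Convex objective, convex inequality constraints, affine equality constraint.
convex_prob = cp.Problem(cp.Minimize(cp.sum_squares(x)),
                         [cp.norm(x, 1) <= 1,   # convex function <= constant
                          cp.sum(x) == 1])      # affine equality
print(convex_prob.is_dcp())  # True: satisfies the three conditions

# A nonconvex equality constraint violates the third condition.
bad_prob = cp.Problem(cp.Minimize(cp.sum_squares(x)),
                      [cp.sum_squares(x) == 1])
print(bad_prob.is_dcp())     # False: equality constraints must be affine
```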

Linear optimization

When the objective and constraint functions are all affine, the problem is called a linear program (LP). A general linear program has the form:

$$\text{minimize } c^T x + d \quad \text{subject to } Gx \preceq h, \; Ax = b$$

where $G \in \mathbf{R}^{m \times n}$ and $A \in \mathbf{R}^{p \times n}$. Note that the constant $d$ does not affect the optimal set (it only shifts the optimal value), so it can be dropped.

Image: geometric interpretation of an LP. The point $x^\star$ is optimal; it is the feasible point farthest in the direction of $-c$.

There are two special cases of LP that are widely used:

  • A standard form LP: minimize $c^T x$ subject to $Ax = b$, $x \succeq 0$.

  • An inequality form LP: minimize $c^T x$ subject to $Ax \preceq b$.

Here is how to convert a general LP to standard form (a numerical sketch follows these steps):

  • The first step is to introduce slack variables $s_i$ for the inequalities: minimize $c^T x + d$ subject to $Gx + s = h$, $Ax = b$, $s \succeq 0$.

  • The second step is to express the variable $x$ as the difference of two nonnegative variables $y$ and $z$ (that is, $x = y - z$ with $y, z \succeq 0$): minimize $c^T y - c^T z + d$ subject to $Gy - Gz + s = h$, $Ay - Az = b$, $y \succeq 0$, $z \succeq 0$, $s \succeq 0$.
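Here is a minimal numerical sketch of this conversion (the LP data $c$, $G$, $h$, $A$, $b$ are invented for the example); the general LP and its standard-form reformulation have the same optimal value.

```python
import cvxpy as cp
import numpy as np

# Invented LP data: feasible set is {x : x >= 0, sum(x) <= 2, x1 = x2}.
c = np.array([-1.0, -2.0, 0.5])
G = np.vstack([-np.eye(3), np.ones((1, 3))])
h = np.concatenate([np.zeros(3), [2.0]])
A = np.array([[1.0, -1.0, 0.0]])
b = np.array([0.0])

# General LP: minimize c^T x subject to Gx <= h, Ax = b.
x = cp.Variable(3)
general = cp.Problem(cp.Minimize(c @ x), [G @ x <= h, A @ x == b])
general.solve()

# Standard form: slack s turns Gx <= h into an equality, and x = y - z
# with y, z >= 0 replaces the free variable x.
y, z, s = cp.Variable(3), cp.Variable(3), cp.Variable(4)
standard = cp.Problem(cp.Minimize(c @ y - c @ z),
                      [G @ y - G @ z + s == h,
                       A @ y - A @ z == b,
                       y >= 0, z >= 0, s >= 0])
standard.solve()

print(general.value, standard.value)  # the same optimal value
```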

Quadratic optimization

A convex optimization problem is called a quadratic program (QP) if the objective function is (convex) quadratic and the constraint functions are affine:

$$\text{minimize } \tfrac{1}{2} x^T P x + q^T x + r \quad \text{subject to } Gx \preceq h, \; Ax = b$$

where $P \in \mathbf{S}^n_+$, $G \in \mathbf{R}^{m \times n}$, and $A \in \mathbf{R}^{p \times n}$. A convex quadratic function is minimized over a polyhedron.

Image: geometric illustration of a QP. The point $x^\star$ is optimal.
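Here is a minimal cvxpy sketch of a QP in exactly this form; the data $P$, $q$, $r$, $G$, $h$, $A$, $b$ are invented for the example, with $P$ built as $M^T M$ so that it is positive semidefinite.

```python
import cvxpy as cp
import numpy as np

# Invented QP data; P = M^T M is symmetric positive semidefinite.
np.random.seed(0)
n = 3
M = np.random.randn(n, n)
P = M.T @ M
q = np.random.randn(n)
r = 1.0
G = np.vstack([-np.eye(n), np.ones((1, n))])   # encodes x >= 0 and sum(x) <= 1
h = np.concatenate([np.zeros(n), [1.0]])
A = np.ones((1, n))                            # equality: sum(x) = 1
b = np.array([1.0])

x = cp.Variable(n)
objective = cp.Minimize(0.5 * cp.quad_form(x, P) + q @ x + r)
prob = cp.Problem(objective, [G @ x <= h, A @ x == b])
prob.solve()

print(prob.value)  # optimal value of the QP
print(x.value)     # minimizer of the convex quadratic over the polyhedron
```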

Geometric programming

Geometric programming covers a family of optimization problems that are not convex in their natural form but can be transformed into convex problems by a change of variables and a transformation of the objective and constraint functions.

A function $f : \mathbf{R}^n \to \mathbf{R}$ with domain $\mathbf{R}^n_{++}$ of the form $f(x) = c\, x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$, where $c > 0$ and $a_i \in \mathbf{R}$, is called a monomial function. A sum of monomials is called a posynomial function with $K$ terms: $f(x) = \sum_{k=1}^{K} c_k\, x_1^{a_{1k}} x_2^{a_{2k}} \cdots x_n^{a_{nk}}$ where $c_k > 0$.

Then the optimization problem of the form

$$\text{minimize } f_0(x) \quad \text{subject to } f_i(x) \le 1, \; i = 1, \dots, m, \quad h_i(x) = 1, \; i = 1, \dots, p$$

where $f_0, \dots, f_m$ are posynomials and $h_1, \dots, h_p$ are monomials, is called a geometric program (GP). The domain of the problem is $\mathcal{D} = \mathbf{R}^n_{++}$, so there is an implicit constraint $x \succ 0$.

  • To transform the GP into a convex optimization problem, we first change the variables:

Let $y_i = \log x_i$, so that $x_i = e^{y_i}$. If $f(x) = c\, x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$, then $f(x) = f(e^{y_1}, \dots, e^{y_n}) = c (e^{y_1})^{a_1} \cdots (e^{y_n})^{a_n} = e^{a^T y + b}$ where $b = \log c$. This turns a monomial function into the exponential of an affine function. We do the same for a posynomial: $f(x) = \sum_{k=1}^{K} e^{a_k^T y + b_k}$ where $a_k = (a_{1k}, \dots, a_{nk})$ and $b_k = \log c_k$. The posynomial also becomes a sum of exponentials of affine functions. The geometric program can then be written as:

$$\text{minimize } \sum_{k=1}^{K_0} e^{a_{0k}^T y + b_{0k}} \quad \text{subject to } \sum_{k=1}^{K_i} e^{a_{ik}^T y + b_{ik}} \le 1, \; i = 1, \dots, m, \quad e^{g_i^T y + h_i} = 1, \; i = 1, \dots, p$$

where $a_{ik} \in \mathbf{R}^n$ for $i = 0, \dots, m$ and $g_i \in \mathbf{R}^n$ for $i = 1, \dots, p$.

  • The second step is to transform the objective and constraint functions by taking logarithms:

$$\text{minimize } \tilde f_0(y) = \log\left( \sum_{k=1}^{K_0} e^{a_{0k}^T y + b_{0k}} \right) \quad \text{subject to } \tilde f_i(y) = \log\left( \sum_{k=1}^{K_i} e^{a_{ik}^T y + b_{ik}} \right) \le 0, \; i = 1, \dots, m, \quad \tilde h_i(y) = g_i^T y + h_i = 0, \; i = 1, \dots, p$$

Since the log-sum-exp functions $\tilde f_i$ are convex and the equality constraints are affine, this is a convex optimization problem, called a geometric program in convex form.
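The convex form above can be written directly with cvxpy's log_sum_exp atom. Below is a minimal sketch with invented exponents and coefficients; the variable is $y = \log x$, and exponentiating the solution recovers the positive variable $x$ of the original GP.

```python
import cvxpy as cp
import numpy as np

# Invented exponents/coefficients for a tiny GP already in convex (log) form.
A0 = np.array([[1.0, 1.0], [-1.0, 2.0]]); b0 = np.array([0.0, np.log(2.0)])  # objective: 2 terms
A1 = np.array([[2.0, -1.0]]);             b1 = np.array([np.log(0.5)])       # posynomial constraint
g  = np.array([1.0, 1.0]);                h0 = -np.log(3.0)                  # monomial equality

y = cp.Variable(2)   # y = log(x); x = exp(y) is automatically positive

objective = cp.Minimize(cp.log_sum_exp(A0 @ y + b0))
constraints = [cp.log_sum_exp(A1 @ y + b1) <= 0,  # log of posynomial constraint f1(x) <= 1
               g @ y + h0 == 0]                   # log of monomial constraint h1(x) = 1
prob = cp.Problem(objective, constraints)
prob.solve()

print(np.exp(y.value))  # recover the optimal x > 0 of the original GP
```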