Journal of Applied Mathematics
Volume 2012, Article ID 946893, 21 pages
http://dx.doi.org/10.1155/2012/946893
Research Article

An Interior Point Method for Solving Semidefinite Programs Using Cutting Planes and Weighted Analytic Centers

John Machacek1 and Shafiu Jibrin2

1Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455, USA
2Department of Mathematics and Statistics, Northern Arizona University, Flagstaff, AZ 86011-5717, USA

Received 11 October 2011; Revised 10 May 2012; Accepted 24 May 2012

Academic Editor: James Buchanan

Copyright © 2012 John Machacek and Shafiu Jibrin. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

We investigate solving semidefinite programs (SDPs) with an interior point method called SDP-CUT, which utilizes weighted analytic centers and cutting plane constraints. SDP-CUT iteratively refines the feasible region to achieve the optimal solution. The algorithm uses Newton's method to compute the weighted analytic center. We investigate different stepsize-determining techniques and found that Newton's method with exact line search is generally the best implementation of the algorithm. We have also compared our algorithm to the SDPT3 method and found that SDP-CUT initially gets into the neighborhood of the optimal solution in fewer iterations on all our test problems. SDP-CUT also took fewer iterations to reach optimality on many of the problems. However, SDPT3 required fewer iterations on most of the test problems and less time on all of them. Some theoretical properties of the convergence of SDP-CUT are also discussed.

1. Introduction

We consider the semidefinite programming problem (SDP) as in [1]:
$$\text{minimize } c^T x, \quad (1.1)$$
$$\text{subject to } A^{(j)}(x) \succeq 0 \quad \text{for } j = 1, \ldots, q, \quad (1.2)$$
where $A^{(j)}(x) = A_0^{(j)} + \sum_{i=1}^{n} x_i A_i^{(j)}$. Also, $x \in \mathbb{R}^n$, $c \in \mathbb{R}^n$, and each $A_i^{(j)}$ is an $m_j \times m_j$ symmetric matrix. The constraint $A^{(j)}(x) \succeq 0$ is known as a linear matrix inequality or LMI. For a given matrix $A$, $A \succeq 0$ denotes that $A$ is positive semidefinite; the notation $A \succ 0$ is used when the matrix is positive definite. Semidefinite programs (SDPs) are convex optimization problems [2], and they generalize many other convex optimization problems, including linear programming. SDPs have applications in engineering, combinatorial optimization, and other fields (see [2, 3]).

The dual of problem (1.1) is the optimization problem [1]:
$$\text{maximize } -\sum_{j=1}^{q} A_0^{(j)} \bullet Z_j \quad (1.3)$$
$$\text{subject to } \sum_{j=1}^{q} A_i^{(j)} \bullet Z_j = c_i \quad \text{for } i = 1, \ldots, n, \quad (1.4)$$
$$Z_j \succeq 0 \quad \text{for } j = 1, \ldots, q. \quad (1.5)$$

In the above, $A \bullet B$ denotes the matrix dot product. If $A = [a_{ij}]$ and $B = [b_{ij}]$ are matrices of size $m$, then $A \bullet B = \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} b_{ij} = \mathrm{Tr}[A B^T]$, where $\mathrm{Tr}$ denotes the trace. We let $\mathcal{R} = \{x : A^{(j)}(x) \succeq 0 \text{ for all } j = 1, \ldots, q\}$ denote the feasible region and $\mathrm{int}(\mathcal{R}) = \{x : A^{(j)}(x) \succ 0 \text{ for all } j = 1, \ldots, q\}$ the interior of the feasible region. The set $\mathrm{int}(\mathcal{R})$ is precisely the set of all strictly feasible points. Let $p^*$ be the optimal value of the primal problem (1.1), and let $d^*$ be the optimal value of the dual problem (1.3). The duality gap $p^* - d^*$ is zero if either SDP (1.1)-(1.2) or its dual (1.3)–(1.5) is strictly feasible [2].
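For illustration, the following MATLAB sketch (our experiments were run in MATLAB, but none of that code is reproduced here; the data layout and function name below are illustrative choices only) stores the LMI system (1.2) in cell arrays, with AA{j}{1} holding $A_0^{(j)}$ and AA{j}{i+1} holding $A_i^{(j)}$, and tests strict feasibility of a point. In MATLAB, the dot product $A \bullet B = \mathrm{Tr}(AB^T)$ is simply sum(sum(A.*B)).

function feas = is_strictly_feasible(AA, x)
% Returns true if A^(j)(x) is positive definite for every j, i.e. x is in int(R).
q = numel(AA); n = numel(x);
feas = true;
for j = 1:q
    Aj = AA{j}{1};                        % A_0^(j)
    for i = 1:n
        Aj = Aj + x(i) * AA{j}{i+1};      % + sum_i x_i A_i^(j)
    end
    if min(eig(Aj)) <= 0                  % strict feasibility needs A^(j)(x) > 0
        feas = false; return;
    end
end
end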

Several interior point methods have been developed for solving SDPs (see [2, 4, 5]). In developing our interior point method in this paper, we assume a strictly feasible interior point $x_0$ is known. There exist methods for finding feasible or nearly feasible points (see [6–8]). The algorithm we develop uses cutting planes and weighted analytic centers, and it iteratively refines the feasible region until the weighted analytic center approaches the optimal solution. We call our algorithm SDP-CUT. The cutting plane technique was pioneered by Gomory [9] for solving integer programming problems and by Kelley [10] and Cheney and Goldstein [11] for convex programming. An algorithm similar to our SDP-CUT for linear programs (LPs) was proposed by Renegar in his 1988 paper [12].

SDP-CUT is implemented using Newton's method and different line search techniques. We found that Newton's method with exact line search is the best implementation of our algorithm. The effects of a weight $w \in \mathbb{R}_+$ on the algorithm are also studied. A larger weight yields a faster and more accurate solution in theory, but in practice too large a weight may cause numerical errors. We also experienced numerical errors in computing the Hessian matrices in SDP-CUT when solving very large problems.

Since finding an interior point for an SDP problem is equivalent to solving another SDP problem, we decided to consider test problems with a known interior point and used that point as a starting point for SDP-CUT. We find SDPT3 to be an ideal method for comparison with SDP-CUT because it is known to be efficient and it allows the user to input a starting point. We found that SDP-CUT was closer to the actual solution than SDPT3 over the initial iterations on all our test problems. SDP-CUT seems to slow down during later iterations as it approaches optimality. On the other hand, SDPT3 took less time on all the problems and fewer iterations on most of them. All codes were written in MATLAB version 7.9.0.

2. Weighted Analytic Center

This section discusses the weighted analytic center as given in [13] and how to compute it. Other notions of weighted centers for semidefinite programming are described in [14].

Given a weight vector $\omega \in \mathbb{R}^q_+$, the weighted barrier function for our system of LMIs (1.2) is defined as follows:
$$\phi_\omega(x) = \begin{cases} -\sum_{j=1}^{q} \omega_j \log\det A^{(j)}(x) & \text{if } A^{(j)}(x) \succ 0,\ j = 1, \ldots, q, \\ +\infty & \text{otherwise.} \end{cases} \quad (2.1)$$

Note that as $x$ approaches the boundary of the feasible region, $\phi_\omega(x)$ approaches $+\infty$. We assume the set $\{\mathrm{diag}(A_1^{(1)}, \ldots, A_1^{(q)}), \mathrm{diag}(A_2^{(1)}, \ldots, A_2^{(q)}), \ldots, \mathrm{diag}(A_n^{(1)}, \ldots, A_n^{(q)})\}$ is linearly independent. The function $\phi_\omega$ is analytic and strictly convex over $\mathrm{int}(\mathcal{R})$ (see [13, 15]). The weighted analytic center is defined to be $x_{ac}(\omega) = \operatorname{argmin}\{\phi_\omega(x) : x \in \mathbb{R}^n\}$. We call $x_{ac}(\mathbf{1})$ the analytic center, where $\mathbf{1} = [1, 1, \ldots, 1]$. In the case of linear constraints and some other more general LMIs, each weight pushes the analytic center away from the boundary of the corresponding constraint. The gradient $\nabla\phi_\omega(x)$ and the Hessian $\nabla^2\phi_\omega(x)$ are given by the following: for $i = 1, \ldots, n$ and $k = 1, \ldots, n$,
$$\nabla\phi_\omega(x)_i = -\sum_{j=1}^{q} \omega_j\, A^{(j)}(x)^{-1} \bullet A_i^{(j)}, \qquad \nabla^2\phi_\omega(x)_{ik} = \sum_{j=1}^{q} \omega_j \left[A^{(j)}(x)^{-1} A_i^{(j)}\right]^T \bullet \left[A^{(j)}(x)^{-1} A_k^{(j)}\right]. \quad (2.2)$$
We will use the following specific example, SDP (2.3), to describe our method:
$$\begin{aligned} \text{minimize } \ & z = x_1 + 2x_2 \\ \text{subject to } \ & A^{(1)}(x) = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} + x_1 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 + x_1 & x_2 \\ x_2 & 1 \end{bmatrix} \succeq 0, & (1) \\ & A^{(2)}(x) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} + x_1 \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} x_1 & 0 \\ 0 & x_2 \end{bmatrix} \succeq 0. & (2) \end{aligned} \quad (2.3)$$
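As a concrete illustration of the formulas in (2.2), the following MATLAB sketch evaluates the gradient and Hessian of $\phi_\omega$ at a strictly feasible $x$, using the cell-array layout introduced in Section 1; the function name is ours, and the commented data for SDP (2.3) follow the reconstruction above.

function [g, H] = wac_grad_hess(AA, omega, x)
% Gradient and Hessian (2.2) of the weighted barrier at a strictly feasible x.
% Illustrative data for SDP (2.3):
%   AA{1} = {[2 0; 0 1], [1 0; 0 0], [0 1; 1 0]};
%   AA{2} = {[0 0; 0 0], [1 0; 0 0], [0 0; 0 1]};
q = numel(AA); n = numel(x);
g = zeros(n, 1); H = zeros(n, n);
for j = 1:q
    Aj = AA{j}{1};
    for i = 1:n
        Aj = Aj + x(i) * AA{j}{i+1};           % A^(j)(x)
    end
    B = cell(n, 1);
    for i = 1:n
        B{i} = Aj \ AA{j}{i+1};                % A^(j)(x)^{-1} A_i^(j)
        g(i) = g(i) - omega(j) * trace(B{i});  % gradient entry in (2.2)
    end
    for i = 1:n
        for k = i:n
            hik = omega(j) * sum(sum(B{i}.' .* B{k}));  % [B_i]^T (dot) B_k = Tr(B_i B_k)
            H(i, k) = H(i, k) + hik;
            H(k, i) = H(i, k);                 % the Hessian is symmetric
        end
    end
end
end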

Weighted analytic centers for the LMI system (1.2) can be computed with the WAC-NEWTON algorithm described below (Algorithm 1).

Algorithm 1: WAC-NEWTON.

An efficient way to calculate the search vector $s_l$ in WAC-NEWTON is to find the Cholesky factorization of $\nabla^2\phi_\omega(y_l)$ and use it to solve the linear system $\nabla^2\phi_\omega(y_l)\, s_l = -\nabla\phi_\omega(y_l)$. This is how $s_l$ is calculated in the SDP-CUT algorithm discussed in the next section and throughout this paper. Figure 1 shows the contours of $\phi_\omega(x)$, the effect of the weight vector $\omega = [3, 1]$ on the barrier function, and the weighted analytic center. The figure uses the feasible region defined by SDP (2.3) and the associated barrier function. The weight of 3 on constraint (1) pushes the analytic center away from the boundary of constraint (1). A MATLAB sketch of this Newton iteration is given after Figure 1.

Figure 1: $\phi_\omega(x)$ contours and $x_{ac}(\omega)$ with various weights $\omega$.
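Algorithm 1's listing is not reproduced in this version of the text; the following MATLAB sketch shows the Newton iteration just described, with the Cholesky-based solve for $s_l$. The names and the stopping rule are our reconstruction from the surrounding description.

function y = wac_newton(AA, omega, y0, tol, maxit)
% WAC-NEWTON (sketch): Newton's method for x_ac(omega) = argmin phi_omega(x).
y = y0;
for l = 1:maxit
    [g, H] = wac_grad_hess(AA, omega, y);  % gradient and Hessian from (2.2)
    if norm(g) < tol, break; end           % (near-)zero gradient: at the center
    R = chol(H);                           % H = R'*R; fails if H is not positive definite
    s = -(R \ (R' \ g));                   % solve H*s = -g by two triangular solves
    alpha = 1;                             % CONSTANT stepsize; see Section 3.1 for line searches
    y = y + alpha * s;                     % with alpha = 1 the iterate can leave int(R)
end
end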

3. The SDP-CUT Algorithm

This section describes the development of SDP-CUT. We also discuss WAC-NEWTON*, which implements Newton’s method for finding the weighted analytic center for the new system defined in SDP-CUT. The section finishes with algorithms for computing the Newton stepsize (CONSTANT, ELS, and BACKTRACKING).

Refer again to the example SDP (2.3). Here, we illustrate one iteration of SDP-CUT. Figure 2 shows the setup: we have the feasible region, an initial point $x_0 = [1.2633, 0.2292]^T$, and a cutting constraint of the form $c^T x_k - c^T x + \epsilon \geq 0$ that accompanies the point. Figure 3 shows the movement of the point $x_0$ to $x_1 = x_{ac}(\mathbf{1})$, the (weighted) analytic center of our new feasible region made up of the cutting constraint and the original LMI constraints. Figure 3 also shows the contour lines of the barrier function. The weight on the cutting constraint can be changed from 1 to other, larger values.

Figure 2: The first cutting constraint and an initial point $x_0$.
Figure 3: The weighted analytic center of the new feasible region.

Given a system of LMIs (1.2) and an objective function as in the primal SDP problem (1.1), we can numerically solve the problem by iteratively reducing the feasible set. Denote the current feasible region determined by our system of LMIs by $\mathcal{R}_k$, and suppose we know a strictly feasible point $x_k$ in $\mathcal{R}_k$. Initially, we set $\mathcal{R}_0 = \mathcal{R}$. We can find a new feasible region $\mathcal{R}_{k+1} \subset \mathcal{R}_k$ such that $c^T x \leq c^T x_k$ for all $x \in \mathcal{R}_{k+1}$. We do this by adding a new cutting constraint $c^T x_k - c^T x + \epsilon \geq 0$. The $\epsilon$ is added to $c^T x_k$ to ensure that $x_k$ is "strictly feasible" in the new system. Given a weight $w \in \mathbb{R}_+$, we define a new barrier function to account for this new constraint:
$$\phi_w(x) = -w \log\left(c^T x_k - c^T x + \epsilon\right) - \sum_{j=1}^{q} \log\det A^{(j)}(x). \quad (3.1)$$
The gradient $\nabla\phi_w(x)$ and the Hessian $\nabla^2\phi_w(x)$ for this new barrier function are given by the following: for $i = 1, \ldots, n$ and $k = 1, \ldots, n$,
$$\nabla\phi_w(x)_i = \frac{w\, c_i}{c^T x_k - c^T x + \epsilon} - \sum_{j=1}^{q} A^{(j)}(x)^{-1} \bullet A_i^{(j)}, \quad (3.2)$$
$$\nabla^2\phi_w(x)_{ik} = \frac{w\, c_i c_k}{\left(c^T x_k - c^T x + \epsilon\right)^2} + \sum_{j=1}^{q} \left[A^{(j)}(x)^{-1} A_i^{(j)}\right]^T \bullet \left[A^{(j)}(x)^{-1} A_k^{(j)}\right]. \quad (3.3)$$

Remark 3.1. Note that in the definition of $\phi_w(x)$ in (3.1), the weight $[1, 1, \ldots, 1]$ is used on the original LMI constraints. The weight $w$ is used only on the new cutting constraint, in order to push our point toward the optimal solution (see Algorithm 2).

Algorithm 2: SDP-CUT.

If successful, SDP-CUT terminates with an optimal solution $x_{cut}$ and optimal objective function value $p_{cut}$. Rather than moving the plane in Step 1 of SDP-CUT, the point $x_k$ could also be moved in the direction of $-c$, so that a different starting point $x_k - \gamma c$, for some small $\gamma > 0$, is used instead of $x_k$. However, this can be problematic as we approach the optimal solution and the feasible region gets small: the point $x_k - \gamma c$ may pass over the entire remaining feasible region and thus fail to be feasible. If instead we move the plane as originally suggested in SDP-CUT, $x_k$ will always be strictly feasible. Note that since the objective is to minimize $c^T x$ and $x_k$ is an interior point, $-c$ is a feasible descent direction from $x_k$.

We denote by WAC-NEWTON* the WAC-NEWTON algorithm applied to $\phi_w(x)$ in (3.1) for determining $x_{k+1}$ in the SDP-CUT algorithm. WAC-NEWTON* returns the weighted analytic center of the current feasible region, which becomes the next SDP-CUT iterate. In WAC-NEWTON*, the stepsize $\alpha_l$ can be computed in a variety of ways, as discussed in the next section.
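Algorithm 2's listing is likewise not reproduced here, so the following MATLAB sketch records the outer loop as we read it from the description above. Here wac_newton_star is a placeholder for WAC-NEWTON* (Newton's method applied to the barrier (3.1), with derivatives (3.2) and (3.3)), and all names, including the parameter names that anticipate Section 4, are ours.

function [x_cut, p_cut] = sdp_cut(AA, c, x0, w, eps_cut, STOL, MAX)
% SDP-CUT (sketch): repeatedly cut with c'*x_k - c'*x + eps_cut >= 0 and recentre.
x = x0;
for k = 1:MAX
    xk = x;
    % WAC-NEWTON*: weighted analytic center of the current region R_k,
    % i.e. the minimizer of phi_w in (3.1) for the cut through xk.
    x = wac_newton_star(AA, c, xk, w, eps_cut);
    if norm(x - xk) < STOL, break; end     % stopping criterion ||x_k - x_{k+1}|| < STOL
end
x_cut = x;
p_cut = c' * x_cut;                        % estimate of the optimal objective value
end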

3.1. Line Searches: Computing the Newton Stepsize in WAC-NEWTON* Algorithm

We describe different options for computing the Newton stepsizes in the WAC-NEWTON* algorithm. The algorithm first computes the direction vector $s_l$. The Newton stepsize $\alpha_l$ determines how far we should move in the direction of $s_l$ from the point $y_l$.

The pure Newton's method uses a constant stepsize $\alpha_l = 1$. We will refer to this method of choosing the stepsize as "CONSTANT." The CONSTANT algorithm has the advantage of not spending computational time deciding what stepsize to use. The disadvantage of Newton's method with CONSTANT is that it usually results in more iterations of WAC-NEWTON*, and it is possible to move out of the feasible region. To get $\alpha_l$ with exact line search, we solve the one-dimensional optimization problem
$$\text{minimize } \{g(\alpha) : \alpha > 0\}, \quad (3.4)$$
where $g(\alpha) = \phi_w(y_l + \alpha s_l)$. Solving (3.4) is relatively easy and may cut the number of WAC-NEWTON* iterations in SDP-CUT, which are computationally harder in comparison:
$$g(\alpha) = \phi_w(y_l + \alpha s_l) = -w \log\left[c^T x_k - c^T (y_l + \alpha s_l) + \epsilon\right] - \sum_{j=1}^{q} \log\det A^{(j)}(y_l + \alpha s_l). \quad (3.5)$$
Let $a_l = -w \log\left[c^T x_k - c^T (y_l + \alpha s_l) + \epsilon\right]$. Then
$$g(\alpha) = a_l - \sum_{j=1}^{q} \log\det A^{(j)}(y_l + \alpha s_l) = a_l - \sum_{j=1}^{q} \log\det\left[A^{(j)}(y_l) + \sum_{i=1}^{n} \alpha (s_l)_i A_i^{(j)}\right]. \quad (3.6)$$
Let $M_0^{(j)} = A^{(j)}(y_l)$ and $M_s^{(j)} = \sum_{i=1}^{n} (s_l)_i A_i^{(j)}$. Then
$$g(\alpha) = a_l - \sum_{j=1}^{q} \log\det\left[M_0^{(j)} + \alpha M_s^{(j)}\right]. \quad (3.7)$$
Also,
$$\begin{aligned} g(\alpha) &= a_l - \sum_{j=1}^{q} \log\det\left[M_0^{(j)} + \alpha M_s^{(j)}\right] \\ &= a_l - \sum_{j=1}^{q} \log\det\left[\left(M_0^{(j)}\right)^{1/2} \left(I + \alpha \left(M_0^{(j)}\right)^{-1/2} M_s^{(j)} \left(M_0^{(j)}\right)^{-1/2}\right) \left(M_0^{(j)}\right)^{1/2}\right] \\ &= a_l - \sum_{j=1}^{q} \log\left[\det\left(M_0^{(j)}\right)^{1/2} \det\left(I + \alpha \left(M_0^{(j)}\right)^{-1/2} M_s^{(j)} \left(M_0^{(j)}\right)^{-1/2}\right) \det\left(M_0^{(j)}\right)^{1/2}\right] \\ &= a_l - \sum_{j=1}^{q} \log\det M_0^{(j)} - \sum_{j=1}^{q} \log\det\left[I + \alpha \left(M_0^{(j)}\right)^{-1/2} M_s^{(j)} \left(M_0^{(j)}\right)^{-1/2}\right]. \end{aligned} \quad (3.8)$$
Let $\lambda_i^{(j)}$, $i = 1, \ldots, m_j$, be the eigenvalues of $\left(M_0^{(j)}\right)^{-1/2} M_s^{(j)} \left(M_0^{(j)}\right)^{-1/2}$. Then
$$g(\alpha) = a_l - \sum_{j=1}^{q} \log\det M_0^{(j)} - \sum_{j=1}^{q} \sum_{i=1}^{m_j} \log\left(1 + \alpha \lambda_i^{(j)}\right). \quad (3.9)$$
The function $g(\alpha)$ is convex over its domain, and the exact stepsize $\alpha_l$ can be found using Newton's method. Equations (3.7) and (3.9) give two different ways of computing $g(\alpha)$ when finding the exact stepsize. If one decides to use (3.7), the first and second derivatives of $g(\alpha)$ are required and are given by
$$g'(\alpha) = \frac{w\, c^T s_l}{c^T x_k - c^T (y_l + \alpha s_l) + \epsilon} - \sum_{j=1}^{q} \left[M_0^{(j)} + \alpha M_s^{(j)}\right]^{-1} \bullet M_s^{(j)}, \quad (3.10)$$
$$g''(\alpha) = \frac{w \left(c^T s_l\right)^2}{\left(c^T x_k - c^T (y_l + \alpha s_l) + \epsilon\right)^2} + \sum_{j=1}^{q} \left[\left(M_0^{(j)} + \alpha M_s^{(j)}\right)^{-1} M_s^{(j)}\right]^T \bullet \left[\left(M_0^{(j)} + \alpha M_s^{(j)}\right)^{-1} M_s^{(j)}\right]. \quad (3.11)$$
If (3.9) is used, the derivatives are simply given by
$$g'(\alpha) = \frac{w\, c^T s_l}{c^T x_k - c^T (y_l + \alpha s_l) + \epsilon} - \sum_{j=1}^{q} \sum_{i=1}^{m_j} \frac{\lambda_i^{(j)}}{1 + \alpha \lambda_i^{(j)}}, \quad (3.12)$$
$$g''(\alpha) = \frac{w \left(c^T s_l\right)^2}{\left(c^T x_k - c^T (y_l + \alpha s_l) + \epsilon\right)^2} + \sum_{j=1}^{q} \sum_{i=1}^{m_j} \frac{\left(\lambda_i^{(j)}\right)^2}{\left(1 + \alpha \lambda_i^{(j)}\right)^2}. \quad (3.13)$$
We will use ELS-MAT to denote the exact line search computations done using (3.7), and ELS-EIG if computations are done from (3.9).
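To make the ELS-EIG computation concrete, the following MATLAB sketch computes the eigenvalues $\lambda_i^{(j)}$ once per line search and then runs Newton's method on the scalar equation $g'(\alpha) = 0$ using (3.12) and (3.13). A practical implementation must also keep $\alpha$ inside the domain (all $1 + \alpha\lambda_i^{(j)} > 0$ and a positive cut slack), a safeguard we omit here; the eigenvalues are obtained from the Cholesky factor $M_0^{(j)} = LL^T$, since $L^{-1} M_s^{(j)} L^{-T}$ has the same spectrum as $(M_0^{(j)})^{-1/2} M_s^{(j)} (M_0^{(j)})^{-1/2}$.

function alpha = els_eig(AA, c, xk, yl, sl, w, eps_cut)
% ELS-EIG (sketch): minimize g(alpha) in (3.9) by Newton's method.
q = numel(AA); n = numel(yl);
lam = [];                                   % all eigenvalues lambda_i^(j), pooled over j
for j = 1:q
    M0 = AA{j}{1}; Ms = zeros(size(M0));
    for i = 1:n
        M0 = M0 + yl(i) * AA{j}{i+1};       % M_0^(j) = A^(j)(y_l)
        Ms = Ms + sl(i) * AA{j}{i+1};       % M_s^(j) = sum_i (s_l)_i A_i^(j)
    end
    L = chol(M0, 'lower');                  % M_0^(j) = L*L'
    lam = [lam; eig(L \ Ms / L')];          % same spectrum as M0^{-1/2} Ms M0^{-1/2}
end
cTs = c' * sl;
slack0 = c' * xk - c' * yl + eps_cut;       % cut slack at alpha = 0
alpha = 0;
for it = 1:50                               % Newton's method on g'(alpha) = 0
    d = slack0 - alpha * cTs;               % c'x_k - c'(y_l + alpha s_l) + eps
    g1 = w * cTs / d - sum(lam ./ (1 + alpha * lam));             % (3.12)
    g2 = w * cTs^2 / d^2 + sum(lam.^2 ./ (1 + alpha * lam).^2);   % (3.13)
    if abs(g1 / g2) < 1e-12, break; end
    alpha = alpha - g1 / g2;
end
end

An ELS-MAT variant would instead evaluate (3.10) and (3.11) directly with matrix inverses and dot products, trading the one-time eigenvalue decomposition for a linear solve at each Newton step on $\alpha$.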

In addition to the constant stepsize $\alpha_l = 1$ and exact line search, a third way of computing the Newton stepsize is backtracking. This method involves starting with a stepsize $\alpha \geq 1$ and decreasing it until a stopping condition is met (see [5, 16]). In practice, this technique guarantees a sufficient decrease in $g(\alpha)$, usually starting from $\alpha = 1$ (see Algorithm 3).

Algorithm 3: Backtracking.
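Algorithm 3's listing is also not reproduced in this version of the text. The MATLAB sketch below shows a standard backtracking scheme of the kind described in [5, 16]; the constants beta and sigma are conventional Armijo choices rather than values from the paper, and the paper's stopping condition additionally involves an eigenvalue computation (see Section 4.1), which we omit.

function alpha = backtracking(gfun, g0, gp0, beta, sigma)
% Backtracking line search (sketch): shrink alpha until sufficient decrease holds.
% gfun(alpha) evaluates g(alpha) = phi_w(y_l + alpha*s_l) and should return +Inf
% outside the feasible region, so infeasible steps are rejected automatically.
% g0 = g(0); gp0 = g'(0) < 0 along the Newton direction; e.g. beta = 0.5, sigma = 1e-4.
alpha = 1;                                      % start from the pure Newton step
while gfun(alpha) > g0 + sigma * alpha * gp0    % sufficient-decrease (Armijo) test
    alpha = beta * alpha;                       % decrease the stepsize
    if alpha < 1e-12, break; end                % give up rather than loop forever
end
end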

It is known that Newton's method converges quadratically close to the solution. In our case, WAC-NEWTON* converges rapidly for starting points that are not close to the boundary of the feasible region $\mathcal{R}_k$. Sometimes, we encounter difficulty when the starting point is too close to the boundary. Note that each time we make a new cut, our starting point is near the boundary; the same is true when the SDP-CUT iterates approach optimality. When close to the boundary, our direction vector $s_l$ is often very small, and thus a stepsize of $\alpha = 1$ may not make much progress. In these cases, the CONSTANT stepsize may spend many iterations while making little progress, with each iteration wasting gradient and Hessian computations. Using the ELS algorithm, we find the proper stepsize and move out of this problem area in just one iteration. Consider the plots in Figure 4, which show the situation described above. Here, again, we are using the example SDP (2.3) and the point shown in Figure 2.

Figure 4: Need for different stepsizes.

As we can see from Figure 4 on the left, the best choice of stepsize is much greater than 1. In Figure 4 on the right, we can see that we have now moved into an area where a stepsize of 1 is reasonable. BACKTRACKING will not help with this problem: it only helps when the optimal stepsize is less than 1. If we try to adapt BACKTRACKING to help in the circumstance described above, we must initialize the BACKTRACKING stepsize to a large number, which creates two problems. First, when the optimal stepsize is close to 1, or smaller, we will waste time backtracking. Second, a large initial stepsize may cause BACKTRACKING to send our point outside the feasible region, causing our algorithm to diverge. As will be shown further in the next section, ELS has a positive effect on SDP-CUT, while BACKTRACKING does not. Figure 5 has plots comparing how far CONSTANT and ELS move our point towards the optimal solution at each iteration. The plots highlight the "wasted iteration" problem with CONSTANT, which occurs when the optimal stepsize is much greater than 1.

Figure 5: The "wasted iteration" problem.

4. Experiment I: SDP-CUT Implementations

We have implemented SDP-CUT in four ways, varying in how the Newton stepsize is computed: CONSTANT, ELS-MAT, ELS-EIG, and BACKTRACKING. We will compare the performance of these four implementations on various test problems. For each problem, the SDP-CUT parameters were set at $\epsilon = 10^{-6}$, STOL $= 10^{-12}$, WTOL $= 10^{-10}$, $w = 7$, and MAX $= 100$.
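For concreteness, with the outer-loop sketch from Section 3 (whose names are our own reconstruction), these settings correspond to a call such as the following; WTOL would be the tolerance handed to the inner WAC-NEWTON* iteration.

% Experiment I settings (illustrative; WTOL = 1e-10 is the inner WAC-NEWTON* tolerance).
eps_cut = 1e-6;  STOL = 1e-12;  w = 7;  MAX = 100;
[x_cut, p_cut] = sdp_cut(AA, c, x0, w, eps_cut, STOL, MAX);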

4.1. Performance as 𝑞 Varies

Here, we left the number of variables constant at $n = 2$ and varied the number of constraints $q$ from 2 to 10. The matrix sizes were fixed at $m_j = 5$ for all $j$. For each value of $q$, 10 SDP problems were randomly generated, and SDP-CUT was used to solve each problem. The number of SDP-CUT iterations (which we also call WAC-NEWTON* iterations) and the run time of SDP-CUT were recorded for each problem. Figures 6 and 7 give plots of the averages with $q$ on the horizontal axis. The vertical axis in Figure 6 shows the total SDP-CUT iterations; in Figure 7, it gives the total run time of SDP-CUT in seconds.

Figure 6: SDP-CUT iterations against the number of constraints $q$.
Figure 7: Time as it varies with the number of constraints $q$.

Figure 6 shows that there is no correlation between the number of LMI constraints and the number of SDP-CUT iterations (WAC-NEWTON* iterations) needed to find the optimal solution. Figure 6 also shows that ELS-EIG and ELS-MAT both effectively reduce the number of SDP-CUT iterations needed, while BACKTRACKING takes about the same number of iterations as CONSTANT. Figure 7 shows that, in the case of $n = 2$, ELS-MAT and ELS-EIG run the fastest, while BACKTRACKING runs far slower than anything else. Time increases as $q$ increases due to the gradient and Hessian computations: as the formulas (3.2) and (3.3) show, for each constraint we must perform matrix inverses, dot products, and multiplications. This makes each iteration computationally harder, but it does not affect the number of iterations needed, as seen in Figure 6. BACKTRACKING's time increases the most with $q$ due to the eigenvalue computation needed for its stopping condition. BACKTRACKING performs these extra computations, but SDP-CUT does not benefit from them; consequently, BACKTRACKING has the highest run time.

4.2. Performance as 𝑛 Varies

Here, we left the number of constraints constant at $q = 3$ and varied the number of variables $n$ from 2 to 10. The matrix sizes were set at $m_j = 5$ for all $j$. For each value of $n$, 10 SDP problems were randomly generated, and SDP-CUT was used to solve each problem. The number of SDP-CUT iterations (WAC-NEWTON* iterations) and the run time of SDP-CUT were recorded for each problem. Figures 8, 9, and 10 are plots of the averages with $n$ on the horizontal axis. The vertical axis shows the SDP-CUT iterations (WAC-NEWTON* iterations) needed in Figure 8 and the run time of SDP-CUT in seconds in Figure 9. A third plot, Figure 10, shows $n$ on the horizontal axis and the average time required to perform an iteration of SDP-CUT on the vertical axis.

Figure 8: SDP-CUT iterations against the number of variables $n$.
Figure 9: SDP-CUT time as it varies with the number of variables $n$.
Figure 10: Time required to perform an iteration of SDP-CUT.

Figure 8, again, shows that ELS-EIG and ELS-MAT are effective in reducing the number of SDP-CUT iterations needed, while BACKTRACKING fails to reduce the number of iterations. Figure 8 also shows a positive correlation between the number of SDP-CUT iterations (WAC-NEWTON* iterations) and $n$. From Figure 9, we see that the exact line search reduces the total time needed for our algorithm, especially as $n$ gets larger. This is important because, as $n$ grows, the number of SDP-CUT iterations and the time required per iteration both increase (see Figure 10). For small $n$, CONSTANT is quicker per SDP-CUT iteration, but as $n$ grows there is no noticeable difference in the time per iteration between CONSTANT and either ELS implementation. This is because the computations needed for the one-dimensional problem (3.4) become small compared to the computations needed for $\nabla\phi_w(x)$ and $\nabla^2\phi_w(x)$.

4.3. Performance as 𝑚 Varies

Here, we left the number of constraints constant at $q = 2$ and the number of variables fixed at $n = 2$. We varied the matrix sizes $m_j = m$ from 5 to 60. The number of SDP-CUT iterations (WAC-NEWTON* iterations) and the run time were recorded for each problem. Below are plots of the averages with $m$ on the horizontal axis. The vertical axis in Figure 11 shows the total SDP-CUT iterations (WAC-NEWTON* iterations) used to solve the problem; in Figure 12, it gives the total run time of SDP-CUT in seconds. A third plot, Figure 13, shows $m$ on the horizontal axis and the time required to perform an iteration of SDP-CUT on the vertical axis.

Figure 11: SDP-CUT iterations with matrix size.
Figure 12: SDP-CUT time as it varies with matrix size.
Figure 13: Time required to perform an iteration of WAC-NEWTON*.

Figure 11, again, shows that ELS reduces the required number of SDP-CUT iterations (WAC-NEWTON* iterations). However, as the constraint matrices become large, the computations needed for the exact line search grow. Figure 12 shows that ELS-MAT is the fastest in the range of matrix sizes we tested. Unlike in our other experiments, CONSTANT beats ELS-EIG. The reason is that the time per iteration of ELS-EIG grows very rapidly, as seen in Figure 13, because the computation of eigenvalues becomes increasingly difficult as the matrices grow in size.

ELS appears to be the best implementation of SDP-CUT: it is very effective in reducing the number of WAC-NEWTON* iterations needed. ELS-MAT outperformed ELS-EIG in general. Once the eigenvalues are obtained, they allow fast computation of the derivatives of $g(\alpha)$ (see (3.12) and (3.13)); however, the large cost of computing the eigenvalues outweighs this benefit, especially as the matrices become large. ELS-MAT uses a slower means of computing the derivatives of $g(\alpha)$ but has no eigenvalue-calculation overhead (see (3.10) and (3.11)). Its slowest operation is matrix inversion, which becomes harder as the matrices get large but scales much better than finding the eigenvalues. It is conceivable that very large matrices could arise for which both ELS-EIG and ELS-MAT are too expensive; in this case it may be best to use CONSTANT. We were unable to find any cases among our test problems in which BACKTRACKING was beneficial. From now on, SDP-CUT will always refer to SDP-CUT with the ELS-MAT implementation.

4.4. Convergence and the Weight $w$: The "Leveling Off" Effect

Increasing the weight $w$ will decrease the number of SDP-CUT iterations needed to find the optimal solution, and a larger weight will also allow us to get closer to the optimal solution. However, if $w$ becomes too large, numerical errors will prevent convergence. Here, we discuss the role $w$ plays in the rate of convergence, as well as how large we can make $w$ before SDP-CUT fails. Figure 14 graphically demonstrates the convergence of SDP-CUT on our example (SDP (2.3)) as the weight $w$ varies.

From the plots in Figure 14, we can see that increasing the weight makes SDP-CUT move toward the optimal solution faster. To see how fast SDP-CUT converges and what effect the weight has, we performed the following experiment. We ran SDP-CUT on the example and kept track of the iterates $x_k$ at each iteration. Figure 15 plots $\|x^* - x_k\|$ on a log scale against the iteration count for the weights $w = 1, 3, 5, 7, 9$, where $x^*$ is the optimal solution. We see in Figure 15 that the distance from the estimate to the actual solution decreases exponentially, appearing linear on the log scale. We found that in most cases our implementation of SDP-CUT breaks down around $w = 10$.

Figure 14: Convergence of SDP-CUT for $w = 1$ and $w = 5$.
Figure 15: Convergence for various weights $w$ on a log scale.

As we can see, increasing the weight means quicker convergence. However, Figure 15 only shows the first 10 iterations. As we approach the optimal solution, the rate of convergence slows and "levels off." This is caused by the feasible region becoming very small while the cutting constraint is moved back by $\epsilon$, so that the SDP-CUT iterates stay in approximately the same place. This problem is demonstrated in Figure 16. We see that larger weights allow us to get closer to the optimal solution, but even with a larger weight our convergence "levels off."

Figure 16: The "leveling off" effect.

One way to get closer to the optimal solution before the weighted analytic centers start to "level off" is to use a smaller $\epsilon$. The difficulty with decreasing $\epsilon$ is that, if $\epsilon$ becomes too small, our SDP-CUT iterates come too close to the boundary, giving rise to numerical problems, especially in computing the Hessian matrices. We found that SDP-CUT worked with $\epsilon = 10^{-6}$ but failed with $\epsilon = 10^{-7}$ on the example SDP (2.3). The numerical problems are due to the fact that, near the boundary, one of the LMI constraint matrices is near-singular, and we need to compute matrix inverses. Figure 17 shows what the "leveling off" effect looks like over the feasible region of SDP (2.3). Notice that the sequence of iterates $x_k$ from SDP-CUT reaches a limit before it reaches the boundary of the region.

Figure 17: The "leveling off" effect zoomed in on the feasible region.

5. Experiment II: SDP-CUT versus SDPT3

In this section, the SDP-CUT algorithm and the well-known SDPT3 method [17] are compared on a variety of SDP test problems. Since SDPT3 is known to be efficient, we also used it to find the best available estimates of the actual optimal objective function values. Another reason for using SDPT3 is its flexibility in allowing the user to input a starting point.

For the experiment, we tested 20 SDP problems and solved each with SDP-CUT and SDPT3. Each SDP problem was randomly generated, where the dimension $n$ and the number of constraints $q$ are random integers on the intervals $[2, 20]$ and $[1, 20]$, respectively. The size $m_j$ of each LMI is a random integer on $[1, 10]$. The values of $n$ and $q$ are listed in Table 1. The values of $m_j$ are not listed because there are too many of them; for example, SDP 1 has $m = [7, 5, 9, 8, 5, 1]$. The problems were generated in a way that makes the origin an interior point for all of them. For all problems, we used tolerances STOL $= 10^{-3}$, WTOL $= 10^{-6}$, $\epsilon = 10^{-7}$, and a weight of $w = 7$ with SDP-CUT. For the first 10 test problems, the iteration limit was MAX $= 5$; for the remaining 10 problems, it was raised to MAX $= 100$. We record the total iterations performed by each algorithm as well as the time taken. The last two columns of Table 1 show the error in the calculated optimal objective function value compared to the actual value $p_{actual}$ obtained from SDPT3 with no restriction on the number of iterations. In the table, the subscripts cut and SDPT3 distinguish values pertaining to SDP-CUT and SDPT3, respectively, and $N$, $T$, and $E$ denote the number of iterations, the time, and the absolute error. For example, $E_{cut} = |p_{cut} - p_{actual}|$.

We see in Table 1 that after 5 iterations, SDP-CUT gives a more accurate answer on all of the first 10 test problems. When allowed to run for more iterations, all the algorithms successfully found the optimal solutions, but SDPT3 took fewer iterations to reach optimality in 6 out of the 10 problems. Also, SDPT3 took less time on all 20 test problems. We believe the difference in times is partly because SDPT3 has been optimized over the years to run efficiently, while SDP-CUT is at the beginning of its development. It is interesting to note that SDP-CUT took fewer iterations in 4 out of the 10 problems.

Table 1: Data for experiment II: SDP-CUT versus SDPT3.

We need to do more work to make SDP-CUT more efficient. We noticed that the Hessian matrix $\nabla^2\phi_w(x)$ in SDP-CUT is prone to errors on very large problems: $\nabla^2\phi_w(x)$ loses its positive definiteness, and its Cholesky factorization then fails. This is the main reason why we could not run SDP-CUT on the SDP test problems in SDPLIB [18]. For example, SDP-CUT was not successful in solving truss2 from SDPLIB, which has $n = 58$ variables and $q = 133$ constraints, due to numerical problems when computing the Hessian matrices. However, SDP-CUT was successful in solving truss1, but with $w = 5$.

6. Convergence of SDP-CUT

Let $\{x_k\}$ be the sequence of estimates for $x^*$ generated by SDP-CUT. Let $\mathcal{R}_k$ be the feasible region after $k$ cuts; thus, $x_k$ is the weighted analytic center of $\mathcal{R}_k$ for all $k > 0$. Let $\{\partial_k\}$ be the sequence defined by $\partial_k = c^T(x_k - x_{k+1})$.

The following theorem shows that when $\|c\|$ and $\|x_k - x_{k+1}\|$ are both small, $\partial_k$ is also small. Note that when $\partial_k$ becomes very small, SDP-CUT is no longer making progress on the objective function values. This is especially true in the case of the "leveling off" effect discussed in Section 4. This also indicates that our stopping criterion $\|x_k - x_{k+1}\| < \text{STOL}$ does not always mean convergence to the optimal solution.

Theorem 6.1. If $\|x_k - x_{k+1}\| < \text{STOL}$, then $\partial_k < \|c\| \, \text{STOL}$.

Proof. Assume $\|x_k - x_{k+1}\| < \text{STOL}$ and use the Cauchy-Schwarz inequality:
$$\partial_k = c^T\left(x_k - x_{k+1}\right) \leq \|c\| \left\|x_k - x_{k+1}\right\| < \|c\| \, \text{STOL}. \quad (6.1)$$

How close we can get to the optimal solution depends on the parameters $w$ and $\epsilon$. As $\epsilon$ becomes smaller, we are able to obtain smaller regions, which allows SDP-CUT to get closer to the optimal solution. For a given region, a larger $w$ pushes the estimate $x_k$ of the optimal solution further in the $-c$ direction. For any choice of iterate $x_k$, in the limit as $w \to \infty$, we should be able to push $x_k$ as close to the optimal solution $x^*$ as we like. In practice, both the decrease $\epsilon \to 0$ and the increase $w \to \infty$ cause numerical problems. A question we propose is: in the limit as $w \to \infty$, how many iterations of SDP-CUT are needed for convergence?

Since the cutting plane is a linear constraint, it is possible that only one iteration is needed as $w \to \infty$ (at least in some cases). Consider the following SDP (LP) example:
$$\text{minimize } x_2 \quad \text{subject to } 1 - x_1 \geq 0, \quad 1 + x_1 \geq 0, \quad 1 - x_2 \geq 0, \quad 1 + x_2 \geq 0. \quad (6.2)$$

The feasible region of this example is given in Figure 18. Consider the point $(0, 0)$ in the feasible region and the cutting constraint through this point. Also consider the corresponding barrier function $\phi_w(x)$ (3.1) and its gradient $\nabla\phi_w(x)$ (3.2). At the weighted analytic center, the gradient is zero. This gives
$$\frac{1}{1 - x_1} - \frac{1}{1 + x_1} = 0, \qquad \frac{w}{\epsilon - x_2} + \frac{1}{1 - x_2} - \frac{1}{1 + x_2} = 0. \quad (6.3)$$
It is clear from Figure 18 that the weighted analytic center has a negative $x_2$ value. So the above equations show that the weighted analytic center is
$$x_1^* = 0, \qquad x_2^* = \frac{\epsilon - \sqrt{\epsilon^2 + 2w + w^2}}{2 + w}. \quad (6.4)$$
Since $\sqrt{\epsilon^2 + 2w + w^2} \to w + 1$ as $w \to \infty$, the weighted analytic center $(x_1^*, x_2^*)$ converges to the optimal solution $(0, -1)$ as $w \to \infty$. This example suggests the following conjecture.

Figure 18: Feasible region of the LP and the cut through the point $(0, 0)$.

Conjecture 6.2. SDP-CUT converges to $x^*$ as $w \to \infty$.
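As a quick numerical check of the closed form (6.4) supporting this conjecture (an illustrative MATLAB sketch of our own, not part of the original experiments), one can tabulate $x_2^*$ for increasing $w$ and watch it approach the optimal value $-1$:

% x2*(w) from (6.4) with epsilon fixed; x2* -> -1 as w -> infinity.
epsilon = 1e-6;
for w = [1 10 100 1000 10000]
    x2 = (epsilon - sqrt(epsilon^2 + 2*w + w^2)) / (2 + w);
    fprintf('w = %7g:  x2* = %.6f\n', w, x2);
end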

6.1. Scaling the Vector 𝑐

Theorem 6.1 shows the dependence of $\partial_k$ on $\|c\|$. Here, we consider the effect of scaling $c$ on SDP-CUT. If we scale $c$ by $\nu > 1$, so that we are now optimizing $\nu c^T x$, we do not change the optimal solution $x^*$. We compare how close SDP-CUT gets when optimizing with various values of $\nu$. Figure 19 plots the results, with $\|x^* - x_{cut}\|$ on the vertical axis and the scaling factor $\nu$ on the horizontal axis. We used the example SDP (2.3) with STOL $= 10^{-12}$, MAX $= 100$, and $w = 10$. We can see in Figure 19 that scaling the objective vector $c$ by a factor $\nu > 1$ has a positive effect on the accuracy of SDP-CUT.

Figure 19: The effect of scaling $c$.

7. Conclusion

We have developed and studied a new interior point method, SDP-CUT, for solving SDPs. We found that implementing SDP-CUT using Newton's method with exact line search gives, in general, the best performance. SDP-CUT appears to be very effective in getting into the neighborhood of the optimal solution, as seen in problems 1–10 in Table 1. However, there is the "leveling off" effect shown in Figure 16. Future work on SDP-CUT can involve investigating Conjecture 6.2. Implementations that can handle a larger weight $w$ or a smaller $\epsilon$ will result in better performance.

Another area for improvement is the stopping criterion. Currently, the stopping criterion is $\|x_k - x_{k-1}\| < \text{STOL}$, which tells us when we are no longer making progress; however, this does not always guarantee optimality of the solution. We have also shown that scaling the objective vector $c$ by some factor $\nu > 1$ can allow SDP-CUT to give a better solution. It is also important to investigate how to make all the other components of SDP-CUT work well and efficiently. For example, we would like to find a way to compute the Hessian $\nabla^2\phi_w(x)_{ik}$ more efficiently and without numerical errors. The way we compute the stepsizes in Newton's method is another area that requires further attention: the exact line search we preferred over the others is probably too expensive, and a better approach may be possible. These improvements would allow us to test SDP-CUT on larger problems and compare it with other methods. It is also of interest to study on what kinds of problems SDP-CUT works well or fails.

Acknowledgment

The first author would like to thank the National Science Foundation (NSF) for the opportunity to start initial work on this paper through their funding of the Research Experience for Undergraduates (REU) program at Northern Arizona University.

References

1. S. Jibrin, Redundancy in Semidefinite Programming: Detection and Elimination of Redundant Linear Matrix Inequalities, VDM, Saarbrücken, Germany, 2009.
2. L. Vandenberghe and S. Boyd, "Semidefinite programming," SIAM Review, vol. 38, no. 1, pp. 49–95, 1996.
3. L. Vandenberghe and S. Boyd, "Applications of semidefinite programming," Applied Numerical Mathematics, vol. 29, no. 3, 1999.
4. F. Alizadeh, "Interior point methods in semidefinite programming with applications to combinatorial optimization," SIAM Journal on Optimization, vol. 5, no. 1, pp. 13–51, 1995.
5. S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, New York, NY, USA, 2004.
6. R. J. Caron, T. Traynor, and S. Jibrin, "Feasibility and constraint analysis of sets of linear matrix inequalities," INFORMS Journal on Computing, vol. 22, no. 1, pp. 144–153, 2010.
7. J. W. Chinneck, "The constraint consensus method for finding approximately feasible points in nonlinear programs," INFORMS Journal on Computing, vol. 16, no. 3, pp. 255–265, 2004.
8. W. Ibrahim and J. W. Chinneck, "Improving solver success in reaching feasibility for sets of nonlinear constraints," Computers & Operations Research, vol. 35, no. 5, pp. 1394–1411, 2008.
9. R. E. Gomory, "Outline of an algorithm for integer solutions to linear programs," Bulletin of the American Mathematical Society, vol. 64, no. 5, pp. 275–278, 1958.
10. J. E. Kelley, Jr., "The cutting-plane method for solving convex programs," Journal of the Society for Industrial and Applied Mathematics, vol. 8, no. 4, pp. 703–712, 1960.
11. E. W. Cheney and A. A. Goldstein, "Newton's method for convex programming and Tchebycheff approximation," Numerische Mathematik, vol. 1, no. 1, pp. 253–268, 1959.
12. J. Renegar, "A polynomial-time algorithm, based on Newton's method, for linear programming," Mathematical Programming, vol. 40, no. 1–3, pp. 59–93, 1988.
13. I. S. Pressman and S. Jibrin, "The weighted analytic center for linear matrix inequalities," Journal of Inequalities in Pure and Applied Mathematics, vol. 2, no. 3, article 29, 2002.
14. C. B. Chua, "A new notion of weighted centers for semidefinite programming," SIAM Journal on Optimization, vol. 16, no. 4, pp. 1092–1109, 2006.
15. S. Jibrin and J. W. Swift, "The boundary of weighted analytic centers for linear matrix inequalities," Journal of Inequalities in Pure and Applied Mathematics, vol. 5, no. 1, article 14, 2004.
16. J. Nocedal and S. J. Wright, Numerical Optimization, Springer, New York, NY, USA, 2nd edition, 2006.
17. K. C. Toh, M. J. Todd, and R. H. Tütüncü, On the Implementation and Usage of SDPT3, a Matlab Software Package for Semidefinite-Quadratic-Linear Programming, Version 4, 2006.
18. B. Borchers and L. Vandenberghe, "SDPLIB 1.2, a library of semidefinite programming test problems," Optimization Methods and Software, vol. 11-12, no. 1–4, pp. 683–690, 1999.