Abstract

We present an efficient method for solving linearly constrained convex programming. Our algorithmic framework employs an implementable proximal step by a slight relaxation to the subproblem of proximal point algorithm (PPA). In particular, the stepsize choice condition of our algorithm is weaker than some elegant PPA-type methods. This condition is flexible and effective. Self-adaptive strategies are proposed to improve the convergence in practice. We theoretically show under mild conditions that our method converges in a global sense. Finally, we discuss applications and perform numerical experiments which confirm the efficiency of the proposed method. Comparisons of our method with some state-of-the-art algorithms are also provided.

1. Introduction

In this paper, we consider the following generic convex programming: where is a convex (not necessary smooth) function, , , and is a closed convex set. Problem (1) generalizes a wide range of problems that frequently arise in signal and image processing and reconstruction, mechanics, statistics, operations research, and other fields, for example, basis pursuit [14], nearest correlation matrix [57], matrix completion problem [2, 3, 810], and so forth. Before we begin, some assumptions should be presented for problem (1).

Assumption 1. The solution set of (1) is denoted by , and it is assumed to be nonempty.

Assumption 2. The objective function is simple. This means that, for a given constant , the following proximal problem admits a closed-form solution or can be solved efficiently with high precision: where is any given vector. At first sight, this assumption seems to be quite restrictive, but this is indeed for many problems in practice. For example, nuclear norm function in matrix completion problem, -norm function in basis pursuit problem, and so forth.

Many fundamental methods have been developed over the past decades to solve problem (1). Proximal point algorithm (PPA) is one of the leading approaches for solving convex optimization problems. It is earlier used for regularized linear equations and has been applied to convex optimization by Martinet [11]. There are some significant theoretical achievements [1219] in the field of PPA for convex optimization and monotone variational inequalities (VIs). Nowadays, it is still the object of intensive investigation [20] and leads to a variety of primal and dual methods. Common to PPA and its variants is the difficulty of their subproblems; this restricts the practical interest. Augmented Lagrangian method (ALM) [21] is a powerful method for linearly constrained problems. It can be regarded as a variant of PPA applied to the dual problem of (1). However, with the additional regularized term , its subproblems require an inverse operator of the form which is hard to implement in some cases. Particularly, is general or large scale, so the computation of inverse operator may fail. Hence, ALM is not sufficiently competitive when the objective function is “simple.” Extragradient method (EGM) [22] is a practical method for (1) which employs the information of current iteration. In fact, EGM is an explicit type method and requires two calls to the gradient per iteration; therefore, it might suffer slow convergence. Recently, He and Yuan [23] proposed a contraction method based on PPA (PPA-CM) to solve (1), which is elegant and simple. Inspired by PPA-CM, a Lagrangian PPA-based (LPPA) contraction method is presented in [24] which employs an asymmetrical proximal term [25]. These two PPA-based methods have nice convergence properties that are similar in many ways to PPA, but they both require a quite restrictive condition for convergence and are sensitive to the initial choice of stepsizes.

In this paper, we focus on development of PPA-type method for solving (1). Based on LPPA, we propose a general self-adaptive relaxed-PPA method (SRPPA) which is simple yet efficient. Our algorithm capitalizes certain features of PPA, hence, we term it relaxed-PPA. The proposed algorithm has several nice fronts. First, our method is a PPA-type method with asymmetrical linear term, which is clearly a different nature to classical PPA. It relaxes the jointly structure of subproblem to a tractable one. Second, we provide two simple search directions for new iterate. Third, the stepsize choice is flexible, and the condition for convergence guarantee is weaker than both PPA-CM and LPPA. Finally, simple adaptive strategies are employed to choose stepsize, and this appealing aspect is significantly important in practice. We also demonstrate that our method is relevant for various applications whose practical success is made possible by our efficient algorithm.

This paper is organized as follows. In Section 2, we provide some notations and preliminaries which are useful for subsequent analysis. In Section 3, we review some related works. The general relaxed-PPA and its variant are formally presented in Section 4. Self-adaptive strategies to choose stepsize are also described. Next, in Section 5, the convergence analysis is provided. In Section 6, we present some concrete applications of (1) and elaborate on the implementation of our method; preliminary numerical results are also reported to verify the efficiency of our proposed method. Finally, in Section 7, we conclude the paper with a discussion about the future research directions.

2. Preliminaries

In this section, we first establish some important notations used throughout this paper. Then, we describe the variational inequality formulation of (1) which is convenient for the convergence analysis.

denotes the subdifferential set of the convex function : and is called a subgradient of , see [26]. Let and , by the convexity of the function , we have which indicates that the mapping is monotone.

Now, we show that (1) can be characterized by a variational inequality; see, for example, [27]. By attaching a Lagrange multiplier vector to the linear constraint , the Lagrangian function of (1) is here, and is defined on . Then, by the optimality condition, we can easily see that (1) amounts to finding a pair of which satisfies where . Denoting the system (7) can be characterized by the following variational inequality denoted by : Recalling the monotonicity of , it is easy to get that (9) is monotone. Since the solution set of (1) is assumed to be nonempty, the solution set of , denoted by , is also nonempty. Our analysis will be built upon this equivalent VI formulation.

There are basically two lines of research for (9), either deal with it by implicit methods that are in general computationally intractable or concentrate on relaxing it with explicit methods. In this section, we first briefly review the well-known classical PPA and EGM. And then, PPA-CM [23] and LPPA [24] are discussed, which will provide motivation for our general self-adaptive relaxed-PPA.

3.1. Classical PPA for the Equivalent Variational Inequality

PPA and its variants are implicit methods which have fast asymptotical convergence rate and produce highly accurate solutions. At each iteration, the subproblem of classical PPA consists of a regularized term, which can be expressed as follows: given any iterate , find such that Then, the update step is taken as follows: PPA has a nice convergence property Although classical PPA is conceptually appealing, subproblem (10) presents certain computational challenges. More specifically, primal variable and dual variable are tied together, and their subproblems are treated as a joint problem. In most cases, this joint subproblem may be as difficult as the original problem (9). As a result, PPA is “conceptual” rather than implementable. It lacks capability in exploiting potential decomposable/specific structure of subproblem. Variants of classical PPA have been explored in the literature, in order to solve the proximal subproblem (10), inexactly, see, for example, [14, 15, 17, 19]. Unfortunately, inexact variants take expensive computation for obtaining approximative solutions.

3.2. The Methods Based on the Simplest Relaxation

To overcome the drawbacks of the classical PPA, it is instinctive to relax subproblem (10) to a solvable one. The most straightforward and simplest relaxation is to replace with in the proximal subproblem (10), which amounts to the following subproblem: The solution of the relaxed problem (13) is given by . It is clear that methods with such relaxation are explicit type methods. However, cannot be accepted directly as the new iterate. Using the terminology “predictor-corrector,” such point can be viewed as a predictor. Here, we list two simple methods which employ predictor to obtain corrector as the new iterate. (i)The extragradient method (EGM) updates the new iterate (corrector) by (ii)The projection and contraction methods (PCM) [2830] perform update as follows: where The sequence generated by the above mentioned EGM or PCM satisfies which is similar to PPA. Both EGM and PCM use the simplest relaxation to obtain in th iteration, hence are computationally practical. These methods have appealing practical aspects; however, such simplest relaxation does not exploit the inner structure of (9). This observation prompts the need for dedicated relaxations.

3.3. PPA-Type Contraction Method

The algorithms that are closely related to ours are PPA-CM [23] and LPPA [24]. The PPA-CM obtains the predictor by solving the following subproblem: find such that where And perform the update The framework of LPPA is as follows: where And the new iterate is defined by Both procedures are simple and can solve subproblem efficiently; but their nice convergence results require a quite restrictive condition, that is; in PPA-CM and in LPPA, respectively. The stepsizes are directly determined by such condition; hence, it is important to estimate . Overestimation may lead to poor stepsizes and slow convergence, while underestimation may result in divergence. In addition, they are both quite sensitive to the choice of . To overcome those drawbacks, we propose a general self-adaptive relaxed-PPA, and as mentioned earlier, it can provide improved guarantee for convergence and has potential progress in the choice of stepsize. Furthermore, self-adaptive strategies are designed to improve performance.

4. The General Self-Adaptive Relaxed PPA-Method

In this section, we weave together the ideas of the previous section to present general self-adaptive relaxed-PPA method (SRPPA) which is mostly inspired by LPPA [24]. At first sight, the predictor applied in SRPPA is much the same as LPPA, but the stepsize choice condition for convergence is quite different; moreover, we prove that it is weaker than LPPA. Self-adaptive strategies are elaborately designed to ensure the robustness of our algorithm. Two simple yet efficient constructions of new iterate are also presented which will provide some inspirations for designing various search directions.

4.1. General Relaxed-PPA Method

The general primal-dual relaxed-PPA method with implementable structure for (1) is summarized in Algorithm 1. Note that the order of and can be changed to obtain a variant, which is summarized in Algorithm 2. Our relaxed-PPA is intended to blend the implementable properties of EGM (or PCM) with the fast convergence performance of PPA. Now, it is helpful to introduce additional notations that will be used in the rest of this paper. Let be a positive symmetry definite matrix (we will specify it later), The relaxed-PPA described here involves two steps. First, we solve the relaxed subproblem (∗), (∗∗) to obtain predictor, which is nice and efficient for the nature of the problem under study. Note that the -predictor step (∗) involves minimizing plus a convex quadratic function, and under Assumption 2, it can be efficiently solved or it admits a closed form solution. And then, -predictor step (∗∗) is just a projection onto which is tractable and computationally efficient. It is clear that the prediction step employs a Gauss-Seidel manner to update information efficiently. The correction step ($) only involves matrix-vector multiplication which is very simple and straightforward.

Step  1. Initialization. Let and pick , set .
Step  2. Predictor. Let
      ,     (*)
and
          .                (**)
Step  3. If the stepsizes and are chosen satisfying
          ,                  (#)
then go to Step 4. Otherwise, increase and go back to Step 2.
Step  4. Corrector and updating
          .                ($)
Step  5. Adjustment
   .
Set .

Step  1. Initialization. Let and pick , set .
Step  2. Predictor. Let
           ,                 (†)
and
      .        (††)
Step  3. If the stepsize and are chosen satisfying
           ,                    (‡)
then go to Step 4. Otherwise, increase and go back to Step 2.
Step  4. Corrector and updating
          .               (∧)
Step  5. Adjustment
   .
Set .

Remark 3. We first make some insight into the correction step in Algorithm 1. The obtained plays no direct role as the new iterate. Instead, we need some kind of “corrector” defined in ($). Although matrix in ($) is just a required positive symmetry definite, our goal here is to fully integrate the information of and to construct effective, simple search direction for the corrector. Based on this consideration, we elaborately provide two simple choices of .

Case 1. It is natural to set to induce a simple update form. Then, it is easy to get that

Case 2. Let . This case is a little less intuitive, but it can lead to a simple update form as well as Case 1. The underlying derivation is a little more complicate. Applying in ($), we get

Recalling is a lower triangular matrix, by the fact that its inverse is also a lower triangular, we have

Plugging the previous relationship to (31), we have

In fact, this is a scheme of Gaussian back substitution.

Both cases only involve one matrix-vector multiplication which makes the update form simple. And the computational cost is usually inexpensive.

Remark 4. We now study the subproblem described in (∗), (∗∗), and the stepsize choice condition (#). For easy analysis, we characterize (∗), (∗∗) as the following VI formulation.
Find such that and its compact form We observe that subproblem (32) is similar to (10) in PPA, except for the construction of asymmetry matrix . As mentioned before, (32) is the same as the prediction subproblem in [24]. Even though they are closely related, the stepsize choice here is quite different. We provide more specific and weaker condition for stepsize . It is clear that condition (#) does not need prior knowledge of matrix . Furthermore, it only involves matrix-vector multiplication, and so, it is easy to verify, and it is amenable to large-scale . If fail to meet this convergence condition (#), one should appropriately increase . In the following subsection, we will elaborate on the self-adaptive strategies to increase the stepsizes. At this point, condition (#) may be seen somewhat unmotivated. Some insight into this will be provided later, as we proceed with the convergence analysis. The convergence condition in [24] has a quite different feature: satisfy It is stronger than condition (#). The following lemma is devoted to the proof of this result.

Lemma 5. Let be the sequence generated by Algorithm 1, , , and defined in (24), (25), and (26), respectively. Suppose that condition (33) is satisfied. Then, condition (#) holds immediately.

Proof. Note that Recall that matrix described in Algorithm 1 can be designed in two different cases.
Case  1. If , we immediately have
According to Cauchy-Schwarz inequality, we get
The last inequality follows directly from condition (33).
Case  2. If , by the definition of , we obtain
Then, we get
The first inequality follows from the Cauchy-Schwarz inequality, and the last one follows directly from condition (33).
Note that, in both cases, we have that condition (#) holds if .

Condition (33) is not only stronger than Condition (#), but it also requires that matrix is positive semidefinite, while condition (#) does not. Furthermore, condition (33) may require the explicit expression of or knowledge of . Despite these drawbacks, condition (33) is appealing to the problems in which is known beforehand or easy to compute/obtain. For instance, is small scale, an identity matrix or a projection operator, and so forth. It is clear that both condition (#) and (33) are more flexible than the one in PPA-CM [23]. The most aggressive condition (#) may lead to further improvement in stepsize choice. Moreover, it is worthwhile to notice that condition (#) is elegantly designed and provides with favourable property. In fact, for general matrix , condition (#) also can guarantee convergence.

Remark 6. The update stepsize plays an important role here. In fact, it can be regarded as an optimal stepsize which will be illustrated in the following section.

Remark 7. We should restrict the adjustment in Step 5 of Algorithm 1 within a limited number to avoid divergence.

In Algorithm 1, we carry out the -predictor before performing -predictor. The roles of and are symmetric; hence, sweeping the order will not break the Gauss-Seidel structure. We switch and and obtain a variant of relaxed-PPA with the order of the -predictor step and -predictor step reversed. This variant is illustrated in Algorithm 2. However, there is no a priori information to know which algorithm is superior. Here, we let

4.2. Adaptive Enhancements

Both PPA-CM and LPPA employ fixed stepsizes . Experiments reveal that they will suffer slow convergence when stepsizes are chosen inappropriately. A natural question is, how to choose the proper initial stepsizes . Here, we propose self-adaptive strategies with the goal of improving the convergence in practice, as well as making performance less dependent on the initial choice of stepsizes. Our strategies dynamically incorporate the information of the current iteration to perform more informative choice of stepsizes for the next iteration [31]. When doing so, the algorithm will be adaptive and free from the initial choice. Denote and then, (31) can be rewritten as Under -norm, the quantity can be viewed as a residual for the dual feasibility condition, and can be viewed as a primal residual. These two residuals converge to zero as relaxed-PPA proceeds. Note that And this implies that small values of tend to reduce the primal residual but have potential to enlarge violations of dual feasibility and tend to produce larger dual residual. This observation motivates us to balance primal and dual residuals. When condition (#) fails, we increase stepsizes properly according to the adaptation shown in Algorithm 3.

;
;
else
.

Here, , . This adaptation strategy increases when the dual residual appears large compared to the primal residual and increases when the dual residual seems too small relative to the primal residual .

As mentioned, condition (33) is stronger than condition (#). If one chooses condition (33), our RPPA also converges. It must have predetermined stepsizes satisfying (here, ). However, there is no priority knowledge of the choice of individual or . Here, we can also adjust automatically when choosing condition (33). Intuitively, we consider expansion of the entire residual under -norm: there are three terms involving or , and we intend to balance these terms. For fixed , take ; then Applying (44) into (43), clearly we have Now, we consider adjusting stepsize to balance and and obtain another adaptation strategy (see Algorithm 4).

;
;
else
.

It is worth noting that too many adjustments of stepsizes by Algorithm 4 might cause the algorithm to diverge in practice, and we therefore restrict these adaptations within a limited number of iterations. If one chooses Algorithm 4, there is no need to carry out Step 5 in Algorithm 1 (or Algorithm 2). These techniques embedded into relaxed-PPA automatically choose a “better” stepsize for the next iteration.

5. Convergence Analysis

In this section, we analyze convergence of our primal-dual relaxed-PPA. The convergence analysis of dual-primal scheme can follow a similar procedure.

Let be any solution point, setting in (32) yields Since , we have . Consequently, by using the monotonicity of , the right hand side of (46) is nonnegative, and thus Now, we are writing the update as and Here, can be viewed as a progress function. The following lemma shows that is a lower bound of .

Lemma 8. Let and be defined in (49); then one has

Proof. Let be any solution, from the definition of , we have Applying (47) to the first term of (51) gives Substituting (52) into (51), we immediately obtain the assertion.

We note that is a quadratic function of and it is natural to maximize to obtain an “optimal” : We now show that the “optimal” is bounded above from zero in the following Lemma.

Lemma 9. Let sequence be produced by Algorithm 1, defined in (53); then, one has

Proof. Using the definition of in (53), we have, for all , The inequality follows from condition (#).

Setting in (50) yields Combining Lemmas 8 and 9, we immediately obtain the following convergence theorem.

Theorem 10. Let sequence be produced by Algorithm 1; then one gets

Theorem 11. Let sequence be generated by Algorithm 1. Then, converges to some which is a solution of   (9).

Proof. First, for each , we have It follows from (57) that is a bounded sequence and Consequently, is also bounded. Since , it follows from (58) that Because is bounded, it has at least a cluster point. Let be a cluster point of and let the subsequence converge to . It follows that and consequently, This means that is a solution of . Note that the inequality (57) is true for all solution points of , and hence, we have Since and , for any given , there exists an integer such that Therefore, for any , it follows from (63) and (64) that This implies that the sequence converges to , which is a solution of (9).

6. Applications and Preliminary Numerical Experiments

The general self-adaptive relaxed-PPA (SRPPA) offers a flexible framework for solving many interesting problems. We illustrate our algorithm with different applications: basis pursuit problem, nearest correlation matrix problem. In this section, we describe the results of experiments whose goal is to demonstrate the efficiency of general relaxed-PPA (RPPA) and its self-adaptive version. To that end, we compare RPPA with certain state-of-the-art algorithms on different problems. Our experiments focus on efficiency and speed of convergence and evaluate the methods in terms of their number of iterations and computational times.

All the codes were written by Matlab R2009b version, and all the numerical experiments were performed on a Lenovo desktop computer with Intel (R) Core (TM) i5 CPU with 3.2 GHz and 3.5 GB RAM.

6.1. Basis Pursuit Problem

Basis pursuit (BP) finds signal representations in overcomplete dictionaries by equality-constrained minimization problem. Formally, one solves the problem And here, denotes the norm defined as . BP is a fundamental problem in image processing and modern statistical signal processing, particularly the theory of compressed sensing; see, for example, [14] for intensive study. We now discuss our approach to BP problem of over-complete representations. Our experiments in this subsection use synthetic data which were mainly designed to illustrate the nice performance of our RPPA. The synthetic problem that we test here is similar to the one employed in [32]. We generate the data as follows: matrix is a random matrix, with Gaussian i.i.d. entries of zero mean and variance , with . is the original sparse signal, constructed with nonzero values, randomly from standard normal distribution. We use to generate the measurements as . It is desirable to use test problems that have a precisely known solution. In fact, when is very sparse, it is the solution to the minimization problem (66). Hence, in our synthetic problem, is exactly the solution.

In our first experiment, we compared general RPPA using two different ’s mentioned in Section 4.1. For BP problem, we use condition (#) and Algorithm 3. Since constructed here is a general random matrix, and when is large scale, might be obtained costly. A simple stopping criterion was used in this experiment, and the stopping tolerance Tol was set to . In all the tests, initial stepsizes were set as , , the primal variable was initialized as zeros, and the dual was ones in Matlab. Table 1 summarizes the performance of general SRPPA. Here, SPDRPPAG-I(SPDRPPAG-II) denotes self-adaptive primal-dual RPPA with Algorithm 3 (Algorithm 4), , if , and , if . DPRPPA (DPRPPA) denotes dual-primal RPPA version.

Basically, SRPPAs converge very quickly and achieved tight error in a few hundred iterations. For this experiment, one can see that SDPRPPAG1-I is fastest in all cases. Both SDPRPPAG1-I and SPDRPPAG1-I are Gaussian type methods, with , and they exhibit very similar performance. SDPRPPAG2-I and SPDRPPAG2-I with are Gaussian back substitute form methods and perform a little slower than Gaussian type methods. We also plot a figure to graphically illustrate the performance of four SRPPAs. Figure 1 shows the results from the test with and , depicting error versus CPU time. Quality-wise, SPDRPPAG1-I was on par with SDPRPPAG1-I.

In the second experiment, we compare the performance of SPDRPPAG1-I with TFOCS (source code can be found at http://cvxr.com/tfocs/) [32], ADMM (source code can be found at http://www.stanford.edu/~boyd/papers/admm/) [33], and PPA-CM. To make the comparison independent of the stopping criterion for each algorithm, we first run TFOCS to get its solution and set a benchmark error and then run other algorithms until they obtain smaller errors than this benchmark. TFOCS was stopped upon Since we found that is small enough to guarantee very high accuracy, we set in TFOCS. The parameters of TFOCS and ADMM were taken with their defaults. To guarantee the convergence, fixed stepsizes were set to , for PPA-CM. In SPDRPPAG1-I, we also choose the same convergence condition (#) and initial step size , as the previous experiment. We varied the size of from () to . The results of this experiment are summarized in Table 2. There, we report the run time in seconds, the number of iterations, and the error of the recovery solution. In Table 2, “—” means “out of memory.”

We observe from Table 2 that four algorithms reach high accuracy around . SPDRPPAG1-I is about two times faster than the first-order method implemented in the TFOCS package, and moreover, it usually outperforms TFOCS in terms of iterations. For medium size problems, SPDRPPAG1-I is clearly faster than ADMM. Even for small size problems, SPDRPPAG1-I shows its superior performance. The main reason lies in that ADMM computed to solve its subproblem exactly which would take expensive computational cost. Not surprisingly, the general SPDRPPAG1-I performs better than the primary PPA-CM. Here, the total iterations of SPDRPPAG1-I are less than of PPA-CM. As we have mentioned, “optimal” update stepsize and more flexible condition for convergence may provide SPDRPPAG1-I improved performance. SPDRPPAG1-I is faster than PPA-CM in terms of CPU times. However, the superiority of CPU time is not so significant as iteration number. For the cases , it is just about of PPA-CM. This is not particularly surprising; compared to PPA-CM, SPDRPPAG1-I has to take extra computation for convergence condition and “optimal” in each iteration.

6.2. Nearest Correlation Matrix Problem

The nearest correlation matrix problem is solving the problem where is the vector whose entries are all , denotes the cone of positive definite symmetric matrices, is the vector of diagonal elements of , and denotes the matrix Fröbenius norm .

Here, we apply PPA-CM, LPPA, and SDPRPPA1-II for solving (70). The standard Matlab Mex interface mexsvd is used to conduct the eigenvalue decomposition. We constructed test data sets and stopping criterion like those of [24]. As mentioned in the prequel, we expect our SRPPA to produce robust performance. To assess the effectiveness of the adaptive strategies proposed in Section 6, we now move on to the description of experiments that focus on the consequences of the initial stepsizes. For investigating, we used dimensions and varied from to , and initial points were set to in all cases. Note that ; we fixed for PPA-CM, for LPPA and chose as initial start for SDPRPPAG1-II. Since the experiments with other values of give qualitatively similar results, we therefore do not plot those results to avoid clutter in the figures. The respective numerical results are plotted in Figure 2.

It is clear that, for PPA-CM and LPPA, the convergence performance was a result of the stepsize selection. They are both fairly sensitive to initial stepsize (or ). The results confirm that, with inappropriate stepsizes, both PPA-CM and LPPA become significantly slow. SDPRPPAG1-II yields significantly robust performance with adaptive strategy. And it is independent of the initial stepsizes and illustrates its superior performance. Furthermore, SDPRPPAG1-II yields competitive results even when PPA-CM and LPPA chose the “good” initial stepsize. This underlies the importance of adaptive strategy in producing good performance. Of course, care should be taken. For instance, the cost of computing optimal stepsize here is negligible, compared to the computation of SVD; when they are more costly, general LPPA will be expected to perform slower than PPA-CM.

7. Conclusions

In this paper, we proposed an efficient general self-adaptive relaxed-PPA method for linearly constrained convex programming and provided theoretical convergence analysis for this method. The stepsizes choice condition is flexible and simple. Self-adaptive strategies are provided to make our method more efficient and robust. Experiments of the method in comparison to other state-of-art methods are provided to confirm the efficiency of the proposed method. Our numerical results suggest that SRPPA is effective and simple to implement. There are a few directions for further research, but we list here only two. The first is the question of whether we may modify the algorithm to work with more general constrained convex problems. Second, we aim to provide various relaxations of the subproblem for the practical purpose.

Acknowledgments

The author is grateful to Caihua Chen, Wenxing Zhang, and Xiangfeng Wang for interesting discussions on nearest correlation matrix problem. Xiaoling Fu was supported by the NSFC Grant 70901018.