Special Issue: Some Recent Trends in Variational Inequalities and Optimization Problems with Applications
Research Article | Open Access
Proximal Alternating Direction Method with Relaxed Proximal Parameters for the Least Squares Covariance Adjustment Problem
Abstract
We consider the problem of finding a symmetric positive semidefinite matrix in a closed convex set that best approximates a given matrix. This problem arises in several areas of numerical linear algebra, in finance, and in statistics, and thus has many applications. Many methods for solving this class of matrix optimization problems have been proposed in the literature. The proximal alternating direction method is one of them and is easy to apply to these problems. The proximal parameters of the proximal alternating direction method are usually required to be greater than zero. In this paper, we show that this restriction on the proximal parameters can be relaxed for this kind of matrix optimization problem. Numerical experiments also show that the proximal alternating direction method with relaxed proximal parameters is convergent and generally performs better than the classical proximal alternating direction method.
1. Introduction
This paper concerns the problem of finding the matrix nearest, in the Frobenius norm, to a given symmetric matrix among all symmetric positive semidefinite matrices that satisfy a set of linear equality and inequality constraints. The symmetric coefficient matrices and the scalars appearing in these trace constraints are the problem data, and the feasible set is assumed to be nonempty. Throughout this paper, we assume that Slater's constraint qualification holds, so that there is no duality gap when Lagrangian techniques are used to find the optimal solution to problem (1).
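As a concrete reading of the description above, the problem can be sketched in the following form. The symbol names ($C$, $A_i$, $G_j$, $b_i$, $d_j$, $S^n_+$, $S_B$), the factor $\tfrac{1}{2}$, and the direction of the inequality constraints are assumptions introduced here for illustration; the inequality convention follows the one commonly used for least squares covariance adjustment, as in [2].

```latex
\min_{X}\ \Bigl\{ \tfrac{1}{2}\,\|X - C\|_F^2 \;:\; X \in S^n_+ \cap S_B \Bigr\},
\qquad
S^n_+ = \{ H \in \mathbb{R}^{n\times n} : H^{T} = H,\ H \succeq 0 \},
\\[4pt]
S_B = \bigl\{ H \in \mathbb{R}^{n\times n} :
      \operatorname{Tr}(A_i H) = b_i,\ i = 1,\dots,p;\ \
      \operatorname{Tr}(G_j H) \le d_j,\ j = 1,\dots,m \bigr\},
\\[4pt]
\|H\|_F = \bigl(\operatorname{Tr}(H^{T}H)\bigr)^{1/2}
        = \Bigl(\sum_{i,j} H_{ij}^{2}\Bigr)^{1/2}.
```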
Problem (1) is a matrix nearness problem, that is, the problem of finding a matrix that satisfies some properties and is nearest to a given one. It is also called the least squares covariance adjustment problem or the least squares semidefinite programming problem, and it can be solved by many methods [1–4]. In a least squares covariance adjustment problem, we adjust a symmetric matrix so that it is consistent with prior knowledge or assumptions and is a valid covariance matrix [2, 5, 6]. The matrix nearness problem has many applications, especially in numerical linear algebra, finance, and statistics (see [6]). A survey of matrix nearness problems can be found in [7]. The matrix nearness problem considered here is clearly a convex optimization problem. Since the feasible set is closed and convex and, by strict feasibility, nonempty, and the objective function is strictly convex and coercive, the minimum of (1) is attained and unique.
In the literature on interior-point algorithms, the set of symmetric positive semidefinite matrices is called the semidefinite cone, and problem (1) belongs to the class of semidefinite programming (SDP) and second-order cone programming (SOCP) problems [8]. In fact, it is possible to reformulate problem (1) as a mixed SDP and SOCP, as in [3, 9].
Thus, problem (1) can be solved efficiently by standard interior-point methods such as SeDuMi [10] and SDPT3 [11] when the number of variables (i.e., entries in the matrix) is modest, say under 1000 (corresponding to a matrix of order about 32), and the number of equality and inequality constraints is not too large (say 5,000) [2, 3, 12].
In particular, let the constraint set be $\{X : \operatorname{diag}(X) = e\}$, where $\operatorname{diag}(X)$ is the vector of diagonal elements of $X$ and $e$ is the vector of ones. Then problem (1) can be viewed as the nearest correlation matrix problem. For the nearest correlation matrix problem, a quadratically convergent Newton algorithm was presented by Qi and Sun [13] and improved by Borsdorf and Higham [1]. For problem (1) with both equality and inequality constraints, one difficulty in finding an efficient method is the presence of the inequality constraints. In [3], Gao and Sun overcome this difficulty by reformulating the problem as a system of semismooth equations with two-level metric projection operators and then designing an inexact smoothing Newton method to solve the resulting semismooth system. For problem (1) with a large number of equality and inequality constraints, the numerical experiments in [14] show that the alternating direction method (hereafter abbreviated as ADM) requires less computing time than the inexact smoothing Newton method, which additionally requires solving a large system of linear equations at each iteration. The ADM has many applications in solving optimization problems [15, 16]. The papers by Zhang, Han, and Li, by Zhang, Han, and Yuan, and by Bauschke and Borwein show that the ADM can be applied to solve convex feasibility problems [17–19].
The proximal ADM is a class of ADM-type methods that can also be easily applied to matrix optimization problems. The proximal parameters of the proximal ADM (i.e., the parameters in (14) and (15)) are usually required to be greater than zero. In this paper, we show that this restriction on the proximal parameters can be relaxed when the proximal ADM is used to solve problem (1). Numerical experiments also show that the proximal ADM with relaxed proximal parameters generally performs better than the classical proximal ADM.
The paper is organized as follows. In Section 2, we give some preliminaries on the proximal alternating direction method. In Section 3, we convert problem (1) to a structured variational inequality and apply the proximal ADM to solve it. The basic analysis and convergence results for the proximal ADM with relaxed proximal parameters are established in Section 4. Preliminary numerical results are reported in Section 5. Finally, we give some conclusions in Section 6.
2. Proximal Alternating Direction Method
In order to introduce the proximal ADM, we first consider the following structured variational inequality problem which includes two separable subvariational inequality problems: find such that where and are monotone; that is, , , and ; and are closed convex sets. Studies of such variational inequality can be found in Glowinski [20], Glowinski and Le Tallec [21], Eckstein and Fukushima [22–24], He and Yang [25], He et al. [26], and Xu [27].
By attaching a Lagrange multiplier vector to the linear constraint , problem (6)-(7) can be written in the following form (see [20, 21, 24]): find such that where For solving (9)-(10), Gabay [28] and Gabay and Mercier [29] proposed the ADM. In the classical ADM, the new iterate is generated from a given triple via the following procedure.
First, is found by solving the following problem: where . Then, is obtained by solving where . Finally, the multiplier is updated by where is a given penalty parameter for the linear constraint . Most existing ADMs require that the subvariational inequality problems (11)-(12) be solved exactly at each iteration. Note that the subvariational inequality problems (11)-(12) may not be well-conditioned without strong monotonicity assumptions on and . Hence, in many cases it is difficult to solve these subvariational inequality problems exactly. To improve the conditioning of the subproblems arising in the ADM, several proximal ADMs have been proposed (see, e.g., [26, 27, 30–34]). The classical proximal ADM is one of the attractive ADMs. From a given triple , the classical proximal ADM produces the new iterate by the following procedure.
First, is obtained by solving the following variational inequality problem: where is the given proximal parameter and . Then, is found by solving where is the given proximal parameter and . Finally, the multiplier is updated by
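For orientation, the three steps just described can be sketched in the form in which proximal ADMs for structured variational inequalities are usually stated. All symbols here are assumptions introduced for illustration: $f$ and $g$ are the monotone operators, $\mathcal{X}$ and $\mathcal{Y}$ the closed convex sets, $Ax + By = b$ the linear constraint, $\lambda$ the multiplier, $\beta > 0$ the penalty parameter, and $r$, $s$ the proximal parameters.

```latex
\text{Find } x^{k+1} \in \mathcal{X} \text{ such that, for all } x' \in \mathcal{X}:
\quad
(x' - x^{k+1})^{T}\Bigl\{ f(x^{k+1})
   - A^{T}\bigl[\lambda^{k} - \beta\,(A x^{k+1} + B y^{k} - b)\bigr]
   + r\,(x^{k+1} - x^{k}) \Bigr\} \ge 0,
\\[4pt]
\text{Find } y^{k+1} \in \mathcal{Y} \text{ such that, for all } y' \in \mathcal{Y}:
\quad
(y' - y^{k+1})^{T}\Bigl\{ g(y^{k+1})
   - B^{T}\bigl[\lambda^{k} - \beta\,(A x^{k+1} + B y^{k+1} - b)\bigr]
   + s\,(y^{k+1} - y^{k}) \Bigr\} \ge 0,
\\[4pt]
\lambda^{k+1} = \lambda^{k} - \beta\,(A x^{k+1} + B y^{k+1} - b).
```

In this sketch, setting $r = s = 0$ removes the proximal terms and gives back the classical ADM steps.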
In this paper, we will show that problem (1) can be solved by the proximal ADM and that the positivity restriction on the proximal parameters can be relaxed when the proximal ADM is applied to problem (1). Our numerical experiments also show that the proximal ADM with smaller proximal parameters generally performs better than the proximal ADM with comparatively larger proximal parameters.
3. Converting Problem (1) to a Structured Variational Inequality
In order to solve the problem (1) with proximal ADM, we convert problem (1) to the following equivalent one: Following the KKT condition of (17), the solution to (17) can be found by finding such that where
It is easy to see that problem (18)(19) is a special case of the structured variational inequality (9)(10) and thus can be solved by proximal ADM. For given , it is fortunate that the can be exactly obtained by the proximal ADM in the following way: where the projection of on a nonempty closed convex set of under Frobenius norm, denoted by , is the unique solution to the following problem; that is, It follows that the solution to is called the projection of on and denoted by . Using the fact that matrix Frobenius norm is invariant under unitary transform, it is known (see [35]) that where is the symmetric Schur decomposition of ( is an orthogonal matrix whose column vector , , is the eigenvector of , and , , is the related eigenvalue), In order to obtain the projection , we need to solve the following quadratic program: The dual problem of (28) can be written as where is positive semidefinite and and have the following form, respectively:
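The eigenvalue-based projection onto the semidefinite cone described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' Matlab code; the function name `project_psd` is hypothetical, and the input is symmetrized first as a guard against round-off.

```python
import numpy as np

def project_psd(A):
    """Frobenius-norm projection of a symmetric matrix onto the PSD cone.

    Computes A = Q diag(w) Q^T and returns Q diag(max(w, 0)) Q^T.
    """
    A = (A + A.T) / 2.0                 # guard against round-off asymmetry
    w, Q = np.linalg.eigh(A)            # eigenvalues w, orthonormal eigenvectors Q
    w_clipped = np.maximum(w, 0.0)      # discard the negative eigenvalues
    return (Q * w_clipped) @ Q.T        # scale columns of Q, then multiply by Q^T

# Small example: eigenvalues of A are -1 and 3.
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
P = project_psd(A)
```

Only the eigenvalue 3, with eigenvector proportional to (1, 1), survives the clipping, so every entry of `P` equals 1.5.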
Problem (29) is often a medium-scale quadratic programming (QP) problem. A variety of methods are commonly used for solving such QPs, including interior-point methods and active-set algorithms (see [36, 37]).
Particularly, if is the following special case: where expresses that each element of is nonnegative, and are given symmetric matrices, and means that ; then is easy to be carried out and is given by where and compute the elementwise maximum and minimum of matrix and , respectively.
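When the second constraint set is a box, both projections are cheap, and one way to put the pieces together is sketched below. Several things in this sketch are assumptions rather than the paper's stated formulas: the splitting min ½‖X − C‖² + ½‖Y − C‖² subject to X = Y, with X in the semidefinite cone and Y in the box; the closed-form updates, which were derived from that splitting; the stopping test; and all names (`proximal_adm_box`, `beta`, `r`, `s`, `lower`, `upper`). The defaults `r = s = 0` correspond to the classical ADM.

```python
import numpy as np

def project_psd(A):
    """Projection onto the PSD cone via an eigenvalue decomposition."""
    A = (A + A.T) / 2.0
    w, Q = np.linalg.eigh(A)
    return (Q * np.maximum(w, 0.0)) @ Q.T

def project_box(A, lower, upper):
    """Projection onto {H : lower <= H <= upper} by elementwise max/min."""
    return np.minimum(np.maximum(A, lower), upper)

def proximal_adm_box(C, lower, upper, beta=1.0, r=0.0, s=0.0,
                     tol=1e-8, max_iter=20000):
    """Sketch of a proximal ADM for the assumed splitting described above."""
    n = C.shape[0]
    X, Y, Z = np.eye(n), np.eye(n), np.zeros((n, n))
    for k in range(1, max_iter + 1):
        # X-step: minimizer over the PSD cone of
        # 0.5||X-C||^2 - <Z,X> + (beta/2)||X-Y||^2 + (r/2)||X-X_k||^2
        X_new = project_psd((C + Z + beta * Y + r * X) / (1.0 + beta + r))
        # Y-step: minimizer over the box of
        # 0.5||Y-C||^2 + <Z,Y> + (beta/2)||X_new-Y||^2 + (s/2)||Y-Y_k||^2
        Y_new = project_box((C - Z + beta * X_new + s * Y) / (1.0 + beta + s),
                            lower, upper)
        # Multiplier update for the constraint X - Y = 0
        Z_new = Z - beta * (X_new - Y_new)
        # Assumed stopping test: largest entrywise change of the iterate
        change = max(np.abs(X_new - X).max(),
                     np.abs(Y_new - Y).max(),
                     np.abs(Z_new - Z).max())
        X, Y, Z = X_new, Y_new, Z_new
        if change < tol:
            break
    return X, Y, Z, k

# Illustrative data: random symmetric C, unit diagonal, entries in [-1, 1].
rng = np.random.default_rng(0)
n = 5
M = rng.uniform(-1.0, 1.0, size=(n, n))
C = (M + M.T) / 2.0
lower, upper = -np.ones((n, n)), np.ones((n, n))
np.fill_diagonal(lower, 1.0)
np.fill_diagonal(upper, 1.0)
X, Y, Z, iters = proximal_adm_box(C, lower, upper)
```

Other values of `r` and `s` can be passed to compare iteration counts on the same data; which negative values still guarantee convergence is the subject of the analysis in Section 4 and is not encoded in this sketch.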
4. Main Results
Let be the sequence generated by applying the procedure (14)–(16) to problem (18)(19); then for any , we have that Further, letting where is the unit matrix, and then we can get the following lemmas.
Lemma 1. Let be the sequence generated by applying the proximal ADM to problem (18)(19) and let be any solution to problem (18)(19); then one has
Proof. From (22) and (35), we have Since (9) and are a solution to problem (18)(19) and , , we have From (38), it follows that Thus, we have Substituting (40) into (37), we get the assertion of this lemma.
Lemma 2. Let be the sequence generated by applying the proximal ADM to problem (18)(19) and let be any solution to problem (18)(19); then one has where
Proof. It follows from (33) that Thus, we have From the above inequality, we get Hence, (41) holds and the proof is completed.
Theorem 3. Let be the sequence generated by applying the proximal ADM to problem (18)(19) and let be any solution to problem (18)(19); then one has where and .
Proof. From (41), we have Rearranging the inequality above, we find that Using the Cauchy-Schwarz inequality on the last term of the right-hand side of (49), we obtain Substituting (50) into (49), we get Thus, the proof is completed.
Based on Theorem 3, we get the following lemma.
Lemma 4. Let be the sequence generated by applying the proximal ADM to problem (18)-(19), any solution to problem (18)-(19), , and ; then one has the following.
(1) The sequence is nonincreasing;
(2) The sequence is bounded;
(3) ;
(4) and are both symmetric positive definite matrices.
Proof. Since
it is easy to check that if , , then and are symmetric positive definite matrices.
Let be the smallest eigenvalue of matrix . Then, from (46), we have
Following (53), we immediately have that is nonincreasing and thus the sequence is bounded. Moreover, we have
So, we get
then
Thus, the proof is completed.
With Lemma 4 in hand, we are now in a position to give the main convergence result of the proximal ADM with and for problem (18)-(19).
Theorem 5. Let be the sequence generated by applying proximal ADM to problem (18)(19), , and ; then converges to a solution point of (18)(19).
Proof. Since the sequence is bounded (see point (2) of Lemma 4), it has at least one cluster point. Let be a cluster point of and let the subsequence converge to . It follows from (33) that Following point (3) of Lemma 4, we have This means that is a solution point of (18)-(19). Since converges to , we have that, for any given , there exists an integer such that Furthermore, using the inequality (53), we have Combining (59) and (60), we get that This implies that the sequence converges to . So the proof is completed.
5. Numerical Experiments
In this section, we implement the proximal ADM to solve problem (1) and report its numerical performance with different proximal parameters. Additionally, we compare the classical ADM (i.e., the proximal ADM with both proximal parameters equal to zero) numerically with the alternating projections method proposed by Higham [6] and show that the alternating projections method is not equivalent to the proximal ADM with zero proximal parameters. All the codes were written in Matlab 7.1 and run on an IBM R400 notebook PC.
Example 6. In the first numerical experiment, we set the as an matrix whose entries are generated randomly in . Let and further let the diagonal elements of be 1 that is, , . In this test example, we simply let be in the form of (31) and Moreover, let , , , , and , where and are both the Matlab functions. For different problem size and different proximal parameters and , Table 1 shows the computational results. There, we report the number of iterations (It.) and the computing time in seconds (CPU.) it takes to reach convergence. The stopping criterion of the proximal ADM is where is the maximum absolute value of the elements of the matrix .

Remark 7. Note that if both proximal parameters are equal to zero, then the proximal ADM reduces to the classical ADM.
Example 8. All the data are the same as in Example 6 except that is an matrix whose entries are generated randomly in . The computational results are reported in Table 2.

Example 9. Let be in the form of (31) and , , . Assume that , , , , , , and the stopping criterion are the same as those in Example 6, but the diagonal elements of matrix are replaced by where is a given number, is the Matlab function generating a number randomly in . In the following numerical experiments, we let . For different problem size and different proximal parameters and , Table 3 shows the number of iterations and the computing time in seconds it takes to reach convergence.

Example 10. All the data are the same as in Example 9 except that . The computational results are reported in Table 4.

Example 11. Let be an matrix whose entries are generated randomly in , , and let the diagonal elements of be . And let
where ,, are subsets of denoting the indexes of such entries of that are constrained by equality, lower bounds, and upper bounds, respectively. In this test example, we let the index sets ,, and be the same as in Example 5.4 of [3]; that is, and consist of the indices of randomly generated elements at the th row of , with and , respectively. We take for , for , and for .
Moreover, let , , , , , and the stopping criterion be the same as those in Example 6. For different problem size , different proximal parameters and , and different values of , Tables 5(a) and 5(b) show the number of iterations and the computing time in seconds it takes to reach convergence, respectively.
The numerical experiments show that the proximal ADM with relaxed parameters is convergent. Moreover, when solving problem (1), the proximal ADM with smaller proximal parameters generally converges more quickly than the proximal ADM with comparatively larger proximal parameters.
Table 5: (a) Number of iterations. (b) Computing time in seconds.
Example 12. In this test example, we apply the proximal ADM with , (i.e., the classical ADM) to solve the nearest correlation matrix problem, that is, problem (1) with in the form of (5), and compare the classical ADM numerically with the alternating projections method (APM) [6]. The APM computes the nearest correlation matrix to a symmetric by the following process: ,
;
for ;
;
;
;
end.
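The alternating projections iteration of [6] can be sketched in NumPy as follows. It alternates between the semidefinite cone and the set of symmetric matrices with unit diagonal, and it applies Dykstra's correction to the semidefinite projection. The function names, the tolerance, and the stopping test (maximum absolute change between successive unit-diagonal iterates) are assumptions made for this illustration and are not the criterion used in the experiments.

```python
import numpy as np

def project_psd(A):
    """Projection onto the PSD cone via an eigenvalue decomposition."""
    A = (A + A.T) / 2.0
    w, Q = np.linalg.eigh(A)
    return (Q * np.maximum(w, 0.0)) @ Q.T

def project_unit_diagonal(A):
    """Projection onto symmetric matrices with unit diagonal."""
    B = A.copy()
    np.fill_diagonal(B, 1.0)
    return B

def nearest_correlation_apm(A, tol=1e-8, max_iter=1000):
    """Alternating projections with Dykstra's correction (sketch)."""
    Y = (A + A.T) / 2.0
    dS = np.zeros_like(Y)                # Dykstra correction, starts at zero
    for k in range(1, max_iter + 1):
        R = Y - dS                       # remove the previous correction
        X = project_psd(R)               # project onto the PSD cone
        dS = X - R                       # update the correction
        Y_new = project_unit_diagonal(X) # restore the unit diagonal
        done = np.abs(Y_new - Y).max() <= tol   # assumed stopping test
        Y = Y_new
        if done:
            break
    return X, Y, k

# An indefinite "correlation-like" matrix with unit diagonal.
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
X, Y, iters = nearest_correlation_apm(A)
```

The correction `dS` is what distinguishes this scheme from plain alternating projections: without it, the iteration would still reach a point in the intersection of the two sets, but not in general the nearest one.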
In this numerical experiment, the stopping criterion of the APM is
Let the matrix and the initial parameters of classical ADM be the same as those in Example 6. Table 6(a) reports the numerical performance of proximal ADM and the APM for computing the nearest correlation matrix to .
Further, let be an matrix whose entries are generated randomly in and . The other data are the same as above. Table 6(b) reports the numerical performance of the classical ADM and the APM for computing the nearest correlation matrix to the matrix . Numerical experiments show that the classical ADM generally exhibits a better numerical performance than the APM for the test problems above.
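Table 6: (a) Classical ADM and APM for the test matrix of Example 6. (b) Classical ADM and APM for the second test matrix.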
6. Conclusions
In this paper, we apply the proximal ADM to a class of matrix optimization problems and find that the restriction on the proximal parameters can be relaxed. Moreover, numerical experiments show that the proximal ADM with relaxed parameters generally performs better on this matrix optimization problem than the classical proximal alternating direction method.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors sincerely thank the referees for their valuable suggestions and careful reading of the paper. This research is financially supported by a research grant from the Research Grant Council of China (Project no. 10971095).
References
[1] R. Borsdorf and N. J. Higham, “A preconditioned Newton algorithm for the nearest correlation matrix,” IMA Journal of Numerical Analysis, vol. 30, no. 1, pp. 94–107, 2010.
[2] S. Boyd and L. Xiao, “Least-squares covariance matrix adjustment,” SIAM Journal on Matrix Analysis and Applications, vol. 27, no. 2, pp. 532–546, 2005.
[3] Y. Gao and D. Sun, “Calibrating least squares semidefinite programming with equality and inequality constraints,” SIAM Journal on Matrix Analysis and Applications, vol. 31, no. 3, pp. 1432–1457, 2009.
[4] S. Gravel and V. Elser, “Divide and concur: a general approach to constraint satisfaction,” Physical Review E, vol. 78, Article ID 036706, 2008.
[5] N. J. Higham, “Computing a nearest symmetric positive semidefinite matrix,” Linear Algebra and its Applications, vol. 103, pp. 103–118, 1988.
[6] N. J. Higham, “Computing the nearest correlation matrix: a problem from finance,” IMA Journal of Numerical Analysis, vol. 22, no. 3, pp. 329–343, 2002.
[7] N. J. Higham, “Matrix nearness problems and applications,” in Applications of Matrix Theory, M. Gover and S. Barnett, Eds., vol. 22, pp. 1–27, Oxford University Press, Oxford, UK, 1989.
[8] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004.
[9] L. Vandenberghe and S. Boyd, “Semidefinite programming,” SIAM Review, vol. 38, no. 1, pp. 49–95, 1996.
[10] J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optimization Methods and Software, vol. 11/12, no. 1–4, pp. 625–653, 1999.
[11] R. H. Tütüncü, K. C. Toh, and M. J. Todd, “Solving semidefinite-quadratic-linear programs using SDPT3,” Mathematical Programming, vol. 95, no. 2, pp. 189–217, 2003.
[12] J. Malick, “A dual approach to semidefinite least-squares problems,” SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 1, pp. 272–284, 2004.
[13] H. Qi and D. Sun, “A quadratically convergent Newton method for computing the nearest correlation matrix,” SIAM Journal on Matrix Analysis and Applications, vol. 28, no. 2, pp. 360–385, 2006.
[14] B. He, M. Xu, and X. Yuan, “Solving large-scale least squares semidefinite programming by alternating direction methods,” SIAM Journal on Matrix Analysis and Applications, vol. 32, no. 1, pp. 136–152, 2011.
[15] P. M. Pardalos and M. G. C. Resende, Handbook of Applied Optimization, Oxford University Press, Oxford, UK, 2002.
[16] P. M. Pardalos, T. M. Rassias, and A. A. Khan, Eds., Nonlinear Analysis and Variational Problems: In Honor of George Isac, vol. 35 of Springer Optimization and Its Applications, Springer, New York, NY, USA, 2010.
[17] H. H. Bauschke and J. M. Borwein, “On projection algorithms for solving convex feasibility problems,” SIAM Review, vol. 38, no. 3, pp. 367–426, 1996.
[18] W. Zhang, D. Han, and Z. Li, “A self-adaptive projection method for solving the multiple-sets split feasibility problem,” Inverse Problems, vol. 25, no. 11, 2009.
[19] W. Zhang, D. Han, and X. Yuan, “An efficient simultaneous method for the constrained multiple-sets split feasibility problem,” Computational Optimization and Applications, vol. 52, no. 3, pp. 825–843, 2012.
[20] R. Glowinski, Numerical Methods for Nonlinear Variational Problems, Springer, New York, NY, USA, 1984.
[21] R. Glowinski and P. Le Tallec, Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics, vol. 9 of SIAM Studies in Applied Mathematics, SIAM, Philadelphia, Pa, USA, 1989.
[22] J. Eckstein, “Some saddle-function splitting methods for convex programming,” Optimization Methods and Software, vol. 4, pp. 75–83, 1994.
[23] J. Eckstein and M. Fukushima, “Some reformulations and applications of the alternating direction method of multipliers,” in Large Scale Optimization: State of the Art, W. W. Hager, D. W. Hearn, and P. M. Pardalos, Eds., pp. 115–134, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1994.
[24] M. Fukushima, “Application of the alternating direction method of multipliers to separable convex programming problems,” Computational Optimization and Applications, vol. 1, no. 1, pp. 93–111, 1992.
[25] B. He and H. Yang, “Some convergence properties of a method of multipliers for linearly constrained monotone variational inequalities,” Operations Research Letters, vol. 23, no. 3–5, pp. 151–161, 1998.
[26] B. He, L.-Z. Liao, D. Han, and H. Yang, “A new inexact alternating directions method for monotone variational inequalities,” Mathematical Programming, vol. 92, no. 1, pp. 103–118, 2002.
[27] M. H. Xu, “Proximal alternating directions method for structured variational inequalities,” Journal of Optimization Theory and Applications, vol. 134, no. 1, pp. 107–117, 2007.
[28] D. Gabay, “Applications of the method of multipliers to variational inequalities,” in Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, M. Fortin and R. Glowinski, Eds., pp. 299–331, North-Holland, Amsterdam, The Netherlands, 1983.
[29] D. Gabay and B. Mercier, “A dual algorithm for the solution of nonlinear variational problems via finite element approximations,” Computers and Mathematics with Applications, vol. 2, pp. 17–40, 1976.
[30] O. Güler, “New proximal point algorithms for convex minimization,” SIAM Journal on Optimization, vol. 2, no. 4, pp. 649–664, 1992.
[31] W. W. Hager and H. Zhang, “Asymptotic convergence analysis of a new class of proximal point methods,” SIAM Journal on Control and Optimization, vol. 46, no. 5, pp. 1683–1704, 2007.
[32] W. W. Hager and H. Zhang, “Self-adaptive inexact proximal point methods,” Computational Optimization and Applications, vol. 39, no. 2, pp. 161–181, 2008.
[33] R. T. Rockafellar, “Monotone operators and the proximal point algorithm,” SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.
[34] M. Teboulle, “Convergence of proximal-like algorithms,” SIAM Journal on Optimization, vol. 7, no. 4, pp. 1069–1083, 1997.
[35] W. K. Glunt, “An alternating projections method for certain linear problems in a Hilbert space,” IMA Journal of Numerical Analysis, vol. 15, no. 2, pp. 291–305, 1995.
[36] J. Nocedal and S. J. Wright, Numerical Optimization, Springer, New York, NY, USA, 1999.
[37] N. Karmarkar, “A new polynomial-time algorithm for linear programming,” Combinatorica, vol. 4, no. 4, pp. 373–395, 1984.
Copyright
Copyright © 2014 Minghua Xu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.