An Asymmetric Proximal Decomposition Method for Convex Programming with Linearly Coupling Constraints

Fu, Xiaoling; Wang, Xiangfeng; Wang, Haiyan; Zhai, Ying

doi:https://doi.org/10.1155/2012/281396

Advances in Operations Research


Special Issue

Numerical Methods for Solving Variational Inequalities and Complementarity Problems


Research Article | Open Access

Volume 2012 | Article ID 281396 | https://doi.org/10.1155/2012/281396


Xiaoling Fu,¹ Xiangfeng Wang,² Haiyan Wang,¹ and Ying Zhai³

Academic Editor: Abdellah Bnouhachem

Received 17 Nov 2011

Accepted 10 Jan 2012

Published 12 Apr 2012

Abstract

The problems studied are separable variational inequalities with linearly coupling constraints. Some existing decomposition methods are very problem specific, and their computational load is quite heavy. Combining the ideas of the proximal point algorithm (PPA) and the augmented Lagrangian method (ALM), we propose an asymmetric proximal decomposition method (AsPDM) to solve a wide variety of separable problems. By adding an auxiliary quadratic term to the general Lagrangian function, our method can take advantage of the separable structure. We also present an inexact version of AsPDM to reduce the computational load of each iteration. In the computation process, the inexact version uses only function values. Moreover, the inexact criterion and the step size can be implemented in parallel. The convergence of the proposed method is proved, and numerical experiments are used to show the advantage of AsPDM.

1. Introduction

The original model considered here is the convex minimization problem with linearly coupling constraints: where , are given matrices, is a given -vector, and are the th block convex differentiable functions for each . This special problem is called a convex separable problem. Problems possessing such separable structure arise in discrete-time deterministic optimal control and in the scheduling of hydroelectric power generation [1]. Since are differentiable, setting and applying the well-known minimum principle in nonlinear programming, it is easy to obtain an equivalent form of problem (1.1): find such that where Problems of this type are called separable variational inequalities (VIs). We will utilize this equivalent formulation and provide a method for solving the separable VI.

One of the best-known algorithms for solving convex programs or the equivalent VI is the proximal point algorithm (PPA), first proposed by Martinet [2] and later studied in depth by Rockafellar [3, 4]. PPA and its dual version, the method of multipliers, draw on a large volume of prior work by various authors [5–9]. However, classical PPA and most of its successors cannot take advantage of the separability of the original problem, which makes them inefficient on problems with separable structure. One major direction in the study of PPA is to develop decomposition methods for separable convex programming and VI. The motivations for decomposition techniques are to split the problem into isolated smaller or easier subproblems and to parallelize computations on parallel computing devices. Decomposition-type methods [10–14] for large-scale problems have been widely studied in optimization as well as in variational problems and are explicitly or implicitly derived from PPA. However, most of those methods can only solve separable problems with special equality constraints: Two very well-known methods for solving equality-constrained convex problems and VI are the augmented Lagrangian method (ALM) [15, 16] and the alternating direction method (ADM) [17]. The classic ALM has been studied in depth and has many advantages over general Lagrangian methods; see [18] for more detail. However, it cannot preserve separability. ADM is a different method, closely related to ALM, that essentially preserves separability for problems with two operators (). More recently, the separable augmented Lagrangian method (SALM) [19, 20] overcame the nonseparability of ALM. For example, to solve problem (1.1) with equality constraints, Hamdi and Mahey [19] allocated a resource quantity to each block, leading (1.1) to an enlarged problem in It is worth mentioning that (1.5) is only equivalent to problem (1.1) with equality constraints.
The expression of the augmented Lagrangian function of (1.5) is: with SALM finds a saddle point of problem (1.5) by the following stages:(i);(ii);(iii).

Note that the process in SALM for allows one to solve subproblems in parallel. This has great practical importance from the computational point of view. In fact, SALM belongs to the family of splitting algorithms and ADM for solving the special convex problem (1.4) with and SALM has to introduce an additional variable to exploit the inner separable structure of the problem, which makes the problem larger. Moreover, SALM is suited to equality-constrained problems and runs into difficulties on inequality-constrained problems.

To the best of our knowledge, there are few dedicated methods for solving the inequality-constrained problem (1.1) or VI(1.2)-(1.3), except the decomposition method proposed by Tseng [21] and the PPA-based contraction method by He et al. [22]. The decomposition method in [21] decomposes the computation of at a fine level without introducing an additional variable . But, in each iteration of this method, the minimization subproblems for depend on the step size of the multiplier, which greatly restricts the computation of the subproblems. The PPA-based contraction method in [22] has a nice decomposable structure; however, it has to solve the subproblem exactly. To solve (1.1) or VI(1.2)-(1.3), motivated by the PPA-based contraction method and SALM, we propose an asymmetric proximal decomposition method (AsPDM) that preserves the separability of the problem. Besides, it does not need to introduce resource variables as SALM does, and the subproblems do not depend on the step size of the multiplier. In the following, we briefly describe our method for (1.1): we add an auxiliary quadratic term to the general Lagrangian function: with The general framework of AsPDM is as follows:

Phase I

Phase II

Here, , , , and are properly chosen parameters, which will be detailed in later sections. Note that the first phase consists of isolated subproblems, each involving , only; that is, it can be partitioned into independent lower-dimensional subproblems. Hence, this method can take advantage of the operators’ separability. Since we mainly focus on solving the equivalent separable VI, we present this method under the VI framework and analyze its convergence in the following sections.

2. The Asymmetric Proximal Decomposition Method

2.1. Structured VI

The separable VI(1.2)-(1.3) consists of partitioned sub-VIs. Introducing a Lagrange multiplier vector associated with the linearly coupling constraint , we equivalently formulate the separable VI(1.2)-(1.3) as an enlarged VI: find such that where VI(2.1)-(2.2) is referred to as a structured variational inequality (SVI), denoted as SVI. Here,

2.2. Preliminaries

We summarize some basic properties and related definitions which will be used in the following discussions.

Definition 2.1. (i) The mapping is said to be monotone if and only if
(ii) A function is said to be Lipschitz continuous if there is a constant such that
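The displayed formulas of Definition 2.1 did not survive extraction. As a reading aid only, the two properties are written below in their standard textbook form. The symbols $F$ (the mapping), $\Omega$ (its domain), and $L$ (the Lipschitz constant) are assumptions chosen for illustration and are not recovered from the source.

```latex
% (i) Monotonicity of F on \Omega (assumed notation)
(u - v)^{T}\bigl(F(u) - F(v)\bigr) \;\ge\; 0
\qquad \forall\, u, v \in \Omega .

% (ii) Lipschitz continuity of F on \Omega with constant L > 0 (assumed notation)
\|F(u) - F(v)\| \;\le\; L\,\|u - v\|
\qquad \forall\, u, v \in \Omega .
```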

The projection onto a closed convex set is a basic concept in this paper. Let be any closed convex set. We use to denote the projection of onto under the Euclidean norm; that is, The following lemmas are useful for the convergence analysis in this paper.

Lemma 2.2. Let be a closed convex set in , then one has (1)(2)

Proof. See [23].
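The two statements of Lemma 2.2 were lost in extraction. The sketch below checks numerically the two standard properties of a Euclidean projection that such a lemma usually records: the obtuse-angle inequality and nonexpansiveness. The box-shaped set and all function names are assumptions made for illustration; the paper's set is a general closed convex set.

```python
import random

def project_box(v, lo, hi):
    """Euclidean projection of v onto the box {u : lo[i] <= u[i] <= hi[i]}."""
    return [min(max(vi, l), h) for vi, l, h in zip(v, lo, hi)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def norm(a):
    return dot(a, a) ** 0.5

def check_projection_properties(trials=200, n=5, seed=0):
    """Return True if both properties hold on random samples (illustrative check only)."""
    rng = random.Random(seed)
    lo, hi = [0.0] * n, [1.0] * n
    for _ in range(trials):
        v = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        w = [rng.uniform(-3.0, 3.0) for _ in range(n)]
        u = [rng.uniform(0.0, 1.0) for _ in range(n)]  # a point inside the box
        pv, pw = project_box(v, lo, hi), project_box(w, lo, hi)
        # Obtuse-angle property: (v - P(v))^T (u - P(v)) <= 0 for every u in the set.
        if dot(sub(v, pv), sub(u, pv)) > 1e-12:
            return False
        # Nonexpansiveness: ||P(v) - P(w)|| <= ||v - w||.
        if norm(sub(pv, pw)) > norm(sub(v, w)) + 1e-12:
            return False
    return True
```

A random check of this kind is no substitute for the proof in [23]; it only makes the geometric content of the properties concrete.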

Lemma 2.3. Let be a closed convex set in , then is a solution of VI if and only if

Proof. See [10, page 267].

Hence, solving VI is equivalent to finding a zero point of the residue function Generally, the term (denotes ) is referred to as the error bound of VI, since it measures the distance of from the solution set.
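The residue function itself was lost in extraction. A minimal sketch follows, assuming the usual projection-type form, in which the residue at u is u minus the projection of u − βF(u) onto the feasible set. The affine mapping, the nonnegative orthant, and every name in the code are hypothetical choices used only to show that the residue vanishes at a solution and is nonzero elsewhere.

```python
def affine_map(u):
    """Hypothetical monotone mapping F(u) = M u + q with M symmetric positive definite."""
    M = [[2.0, 1.0], [1.0, 2.0]]
    q = [-3.0, -3.0]
    return [sum(M[i][j] * u[j] for j in range(2)) + q[i] for i in range(2)]

def project_nonneg(v):
    """Projection onto the nonnegative orthant (assumed feasible set)."""
    return [max(vi, 0.0) for vi in v]

def residual(u, F, project, beta=1.0):
    """Assumed residue: e(u, beta) = u - P[u - beta * F(u)]."""
    Fu = F(u)
    p = project([ui - beta * fi for ui, fi in zip(u, Fu)])
    return [ui - pi for ui, pi in zip(u, p)]

def inf_norm(v):
    return max(abs(x) for x in v)
```

For this toy mapping, F vanishes at u = (1, 1), so the residue there is zero, while at the origin it is not.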

2.3. The Presentation of the Exact AsPDM

In each iteration, by our construction, our method solves independent sub-VIs involving each individual variable only, so that can be obtained in parallel. In what follows, to illustrate our method’s practical significance, we interpret the algorithm as a system that has a central authority and local administrators. Each administrator attempts to unilaterally solve a certain problem under the presumption that the instructions given by the authority are parametric inputs and that the other administrators’ actions are not available; that is, the local administrators act synchronously and independently once they receive the information given by the central authority. We briefly describe our method, which consists of two main phases.

Phase I: For arbitrary given by the central authority, each local administrator uses his own way to offer the solution (denoted as ) of his individual problem: find , such that

Phase II: After the local administrators accomplish their tasks, the central authority collects the resulting , together with the corresponding , which can be viewed as the feedback information from the local administrators, and sets: Here, is suitably chosen by the central authority. The central authority aims to use this feedback information effectively to provide which will be beneficial for the next iteration loop. In this paper, our proposed methods update the new iterate in one of the following two forms: or where is a specific step size and
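The authority/administrator pattern can be illustrated with a small runnable sketch. Because the paper's subproblem and update formulas were lost in extraction, the code below is not AsPDM. It swaps in plain dual decomposition with a projected multiplier update, applied to a hypothetical separable quadratic program: minimize the sum of 0.5·q_j·x_j² − c_j·x_j subject to the sum of a_j·x_j being at least b. All names, the fixed step size, and the problem data are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def local_solve(block, lam):
    """Local administrator: minimise 0.5*q*x^2 - c*x - lam*a*x over x (closed form)."""
    q, c, a = block
    return (c + lam * a) / q

def central_update(lam, xs, blocks, b, step):
    """Central authority: projected multiplier step for the constraint sum(a_j x_j) >= b."""
    slack = sum(a * x for (_, _, a), x in zip(blocks, xs)) - b
    return max(0.0, lam - step * slack)

def dual_decomposition(blocks, b, step=0.25, iters=200):
    """Alternate parallel local solves (Phase I analogue) and a central update (Phase II analogue)."""
    lam = 0.0
    with ThreadPoolExecutor() as pool:
        for _ in range(iters):
            # Each block only needs its own data and the broadcast multiplier.
            xs = list(pool.map(lambda blk: local_solve(blk, lam), blocks))
            lam = central_update(lam, xs, blocks, b, step)
        xs = list(pool.map(lambda blk: local_solve(blk, lam), blocks))
    return xs, lam
```

With two identical blocks (q = 1, c = 0, a = 1) and b = 2, the multiplier and both block variables approach 1. The point of the sketch is the communication pattern: the local solves never see each other's variables, only the multiplier sent by the authority.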

We make the standard assumptions to guarantee that the problem under consideration is solvable and the proposed methods are well defined.

Assumption A. is monotone and Lipschitz continuous, .

By this assumption, it is easy to get that is monotone.

2.4. Presentation of the Inexact AsPDM

In this section, the inexact version of AsPDM is presented, and some brief remarks are made.

For convenience in the later analysis, we denote At each iteration, we solve sub-VIs (see (2.11)) independently. The computational load for an exact solution of (2.11) is usually expensive. Hence, it is desirable to solve (2.11) inexactly under a relatively relaxed inexactness criterion. We now describe and analyze our inexact method. Each iteration consists of two main phases, one of which provides an inexact solution of (2.11) and the other of which uses the inexact solution to produce a new iterate for the next iteration.

The first phase of our method works as follows. At the beginning of the th iteration, an iterate is given. If , then is the exact solution of the th sub-VI, and there is nothing more to do for the th sub-VI. Otherwise, we should find such that with Here, the obtained should satisfy the following two inexactness criteria:

If either of the above criteria fails to be satisfied, we increase by and return to solve the th sub-VI of (2.17) with this updated . It should be noted that both criteria are quite easy to check since they do not contain any unknown variables. Another favorable characteristic of these criteria is that they are independent; that is, they only involve , irrelevant to .
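The "increase the parameter and re-solve" loop can be sketched as follows. The two criteria of the paper were lost in extraction, so the acceptance test used here is an assumption patterned on the prediction-correction literature the paper cites [7]: a trial point obtained by projecting x − F(x)/r is accepted once the norm of F(x) − F(trial) is at most ν·r times the norm of x − trial. The mapping, the set, the growth factor `mu`, and all names are hypothetical.

```python
def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def norm(a):
    return sum(x * x for x in a) ** 0.5

def affine_map(u):
    """Hypothetical monotone mapping F(u) = M u + q; its Lipschitz constant is 3."""
    M = [[2.0, 1.0], [1.0, 2.0]]
    q = [-3.0, -3.0]
    return [sum(M[i][j] * u[j] for j in range(2)) + q[i] for i in range(2)]

def project_nonneg(v):
    return [max(vi, 0.0) for vi in v]

def predict_with_adaptive_r(x, F, project, r=0.1, nu=0.9, mu=2.0, max_tries=60):
    """Enlarge r until the assumed inexactness test holds; return (trial point, accepted r)."""
    Fx = F(x)
    for _ in range(max_tries):
        x_trial = project([xi - fi / r for xi, fi in zip(x, Fx)])
        diff = norm(sub(x, x_trial))
        if diff == 0.0:
            return x_trial, r  # x already solves the sub-problem exactly
        if norm(sub(Fx, F(x_trial))) <= nu * r * diff:
            return x_trial, r  # assumed criterion satisfied
        r *= mu  # criterion failed: enlarge r and try again
    raise RuntimeError("no acceptable r found within max_tries")
```

Under this assumed test the loop must stop once r reaches the Lipschitz constant divided by ν, which is why only function values, and no derivatives, are needed. The test also involves only one block's data, in line with the independence noted above.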

In what follows, let us describe the second phase. We require where (here, ) is suitably chosen to satisfy Now we use this (or ) to construct the new iterate. Here, we provide two simple forms for the new iterate: or where In fact, each iteration of the proposed method consists of two main phases. Viewing the problem as a system with a central authority and administrators, the first phase is accomplished by the administrators based on the instruction given by the authority. That is, the th sub-VI involves only the th administrator’s activities. On the other hand, the second phase is implemented by the central authority to give a new instruction for the next iteration.

Remark 2.4. In the inexact AsPDM, the main task of Phase I is to find a solution for (2.17). From (2.17), it is easy to get that It seems that equality (2.22) is an implicit form since both sides of (2.22) contain . In fact, we can transform equality (2.22) to an explicit form. Using the property of the projection, we have Consequently, using the above formula, we can compute quite easily.

Remark 2.5. Combining (2.22) and (2.19), we then find that If , it yields an exact version. In this special case, it is clear that We find that this formula is quite similar to the iterates produced by the classic PPA [3], which employs as the new iterate; here, is a symmetric positive definite matrix. On closer inspection, our method does not appear to fit into any of the known PPA frameworks. It is not equivalent to PPA even if is positive definite. The reason why our method cannot be viewed as PPA is that is asymmetric and, moreover, may not be positive definite. This lack of symmetry makes it impossible to introduce an inner product as . Consequently, if one sets as the new iterate, one may fail to obtain convergence. Due to the asymmetric feature of , we call our method the asymmetric proximal decomposition method.

Remark 2.6. Recalling that is obtained by (2.19), it is easy to get that Combining (2.17) and (2.27), we have
Since is generated by (2.17)–(2.20) from a given , we have that implies and . According to (2.28), we have In other words, is a solution of Problem (2.1)-(2.2) if () and . Hence, we use as the stopping criterion in the proposed method.

Remark 2.7. The update form (2.21*a) is based on the fact that is a descent direction of the unknown distance function at point . This property will be proved in Section 3.1. in (2.21) is the “optimal” step length, which will be detailed in Section 3.2. We can also use (2.21*b) to update the new iterate. For fast convergence, the practical step length should be multiplied by a relaxation factor .

Remark 2.8. Note that if and only if . In the case , by choosing a suitable , (2.20) will be satisfied. We state this fact in the following lemma.

Lemma 2.9. Let and be defined in (2.15) and (2.16), respectively. If , for all , one has

Proof. According to Definition (2.16), we have and the assertion is obtained.

Setting in the above lemma, we get If one chooses , then Condition (2.20) is always satisfied; hence, can be regarded as a safe upper bound for this condition. Note that we use an inequality in the proof of Lemma 2.9, so this bound may not be tight. As a result, rather than fixing , we let be a smaller value and check whether Condition (2.20) is satisfied. If not, we increase by and try again. This process enables us to reach a suitable that meets (2.20).

Note that, in our proposed method, the problems VI (2.17) produce in parallel. In addition, instead of taking the solution of the subproblems, the new iterate in the proposed methods is updated by a simple manipulation, for example, (2.18*a)-(2.18*b).

3. Convergence of AsPDM

In the proposed methods, the first phase (accomplished by the local administrators) offers a descent direction of the unknown distance function, and the second phase (accomplished by the central authority) determines the “optimal” step length along this direction. This section gives the supporting theoretical analysis.

3.1. The Descent Direction in the Proposed AsPDM

For any , is the gradient of the unknown distance function at point . A direction is called a descent direction of at point if and only if the inner product . Let be generated by (2.17)–(2.20) from a given . A goal of this subsection is to elucidate that, for any , It guarantees that is a descent direction of at point . The above inequality plays an important role in the convergence analysis.

Lemma 3.1. Assume that is generated by (2.17)–(2.20) from a given , then for any one has

Proof. Since , substituting in (2.28), we obtain Using the monotonicity of and applying with in (2.1), it is easy to get Combining (3.3) and (3.4), we then find Note that Criterion (2.18*a) holds; we have The last inequality follows directly from the result of Lemma 2.9. Consequently, Using the preceding inequality and (2.32) in (3.5) yields completing the proof.

Now, we state the main properties of in the lemma below.

Lemma 3.2. Let be generated by (2.17)–(2.20) from given . Then for any one has

Proof. Recalling that , substituting in inequality (2.1), we have, immediately, By some manipulations, our assertion holds immediately.

3.2. The Step Size and the New Iterate

Since is a descent direction of at point , the new iterate will be determined along this direction by choosing a suitable step length. In order to explain why we have the “optimal” step as defined in (2.21), we let be the step-size-dependent new iterate, and let be the profit function of the th iteration. Because includes the unknown vector , it cannot be maximized directly. The following lemma offers a lower bound of that is a quadratic function of .

Lemma 3.3. Let be generated by (2.17)–(2.20) from a given . Then one has where

Proof. It follows from Definition (3.12) and inequality (3.5) that Let us deal with which seems more complicated: Since is a quadratic function of , it reaches its maximum at which is exactly the step defined in (2.21). In practical computation, it is wise to use a relaxation factor for fast convergence. Note that for any , it follows from (3.13), (3.14), and (2.21) that In order to guarantee that the right-hand side of (3.18) is positive, we take .
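The step-length argument can be illustrated with a generic concave quadratic. The symbols $\varphi$, $\psi$, and $\gamma$ below are placeholders chosen for this illustration and are not the paper's quantities, which were lost in extraction. Only the shape of the argument is shown: maximize a quadratic lower bound, then relax the maximizer by a factor.

```latex
% Generic concave quadratic lower bound (placeholder symbols, \psi > 0)
q(\alpha) = 2\alpha\,\varphi - \alpha^{2}\,\psi .

% Stationarity gives the maximiser
q'(\alpha) = 2\varphi - 2\alpha\,\psi = 0
\;\Longrightarrow\;
\alpha^{*} = \frac{\varphi}{\psi} .

% Value at the relaxed step \gamma\alpha^{*}
q(\gamma\alpha^{*})
= \frac{2\gamma\,\varphi^{2}}{\psi} - \frac{\gamma^{2}\varphi^{2}}{\psi}
= \gamma\,(2-\gamma)\,\frac{\varphi^{2}}{\psi} .
```

For this generic form, the guaranteed progress is positive exactly when $0 < \gamma < 2$ and $\varphi \neq 0$, which is the usual reason a relaxation factor is restricted to that interval.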

In fact, is bounded below by a positive amount which is the subject of the following lemma.

Lemma 3.4. Assume that is generated by (2.17)–(2.20) from a given , then one has

Proof. Using the fact that square matrix is positive symmetric definite, we have Moreover, Note that Criterion (2.18*b) implies Hence, applying in the above inequality, we get Combining (3.20) and (3.22), we have Consequently, applying the above inequality, (3.7), and (2.32) to yields and thus the assertion is proved.

Now, we are in a position to prove the main convergence theorem of this paper.

Theorem 3.5. For any , the sequence generated by the proposed method satisfies Thus we have and the proposed method will terminate after finitely many iterations.

Proof. First, it follows from (3.12) and (3.18) that Using (2.32), (3.7), and (3.19), we have Substituting (3.28) in (3.27), Assertion (3.25) is proved. Therefore, we have and Assertion (3.26) follows immediately.
Since we use as the stopping criterion, it follows from (3.26) that the method will terminate after finitely many iterations for any given .

4. Numerical Examples

This section describes experiments demonstrating the performance of the proposed method. The algorithms were written in Matlab (version 7.0) and run on a notebook with an Intel Core 2 Duo CPU (2.01 GHz) and 0.98 GB of RAM.

To evaluate the behavior of the proposed method, we construct examples about convex separable quadratic programming (CSQP) with linearly coupling constraints. The convex separable quadratic programming was generated as follows: where is a symmetric positive definite matrix, , , and . We construct matrices and in the test examples as follows. The elements of are randomly given in , and the matrices are defined by setting where In this way, is positive definite and has prescribed eigenvalues between . If is the solution of Problem (4.1), according to the KKT principle, there is a such that Let and be random vectors whose elements are between . We set where , are positive parameters. By setting we constructed a test problem of (4.1) which has the known solution point and the optimal Lagrangian multipliers . We tested such problems with , . Here, two example sets were considered. The problems in the first set have 3 separable operators (), and the second have 2 ().
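The defining formulas for the test matrices were lost in extraction. The sketch below shows one common way to build a symmetric positive definite matrix with prescribed eigenvalues from a random vector, using a Householder reflector. Whether this matches the paper's construction exactly is an assumption, and the function names and the eigenvalue list used in any call are hypothetical.

```python
import random

def householder(v):
    """Return the reflector V = I - 2 v v^T / (v^T v), which is symmetric and orthogonal."""
    n = len(v)
    vv = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Dense matrix product of two list-of-lists matrices."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def spd_with_eigenvalues(eigs, rng):
    """Build H = V D V with D = diag(eigs); H has exactly the eigenvalues in eigs."""
    n = len(eigs)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]  # random vector defining the reflector
    V = householder(v)
    D = [[eigs[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    return matmul(matmul(V, D), V), V
```

Because the reflector is its own inverse, the columns of V are eigenvectors of H, so choosing positive entries in `eigs` yields a positive definite matrix with a known spectrum.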

In the first experiment, we employ AsPDM with update to solve CSQP with 3 separable operators. (The reason why we choose here is that it usually performs better than .) The stopping criterion was chosen as ; the parameters were set as , , and . Table 1 reports the number of iterations (denoted as Its.) and the total number of function evaluations (denoted as ) for different problem sizes. Here, . As observed from Table 1, the solutions are obtained in a moderate number of iterations; thus the proposed method is effectively applicable. In addition, the number of evaluations of per iteration is approximately equal to 2. AsPDM is well suited to solving separable problems.

Next, we compared the computational efficiency of AsPDM against the method in [7] (denoted as PCM), regarded as a highly efficient PPA-based method that is well suited to solving VI. Iterations were terminated when the criterion was met. Table 2 reports the iterations and the total number of function evaluations for both methods. We observe that both methods find a solution acceptably. Concerning computational efficiency, AsPDM is clearly faster than PCM; moreover, its function evaluations are also fewer, except in the case of . In some cases, AsPDM reduces the computation cost by about relative to PCM. For , we plot the error versus iteration number for both AsPDM and PCM in Figure 1. We have found that both methods converge quickly for the first hundred iterations but slow down as the exact solution is approached. AsPDM converges faster than PCM.

In addition to being fast, AsPDM can solve the problem block by block, which is its most significant advantage over the other methods. Hence, AsPDM is better suited to solving real-life separable problems.

5. Conclusions

We have proposed AsPDM for solving separable problems. It decomposes the original problem into independent low-dimensional subproblems and solves those subproblems in parallel. Only function values are required in the process, and the total computational cost is small. AsPDM is easy to implement and does not appear to require application-specific tuning. The numerical results also support the efficiency of our method. Thus, the new method is applicable and recommended in practice.

Acknowledgment

The author was supported by the NSFC Grant 70901018.

References

[1] R. T. Rockafellar and R. J.-B. Wets, “Generalized linear-quadratic problems of deterministic and stochastic optimal control in discrete time,” SIAM Journal on Control and Optimization, vol. 28, no. 4, pp. 810–822, 1990.
[2] B. Martinet, “Régularisation d'inéquations variationnelles par approximations successives,” Revue Française d'Informatique et de Recherche Opérationnelle, vol. 4, pp. 154–158, 1970.
[3] R. T. Rockafellar, “Monotone operators and the proximal point algorithm,” SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.
[4] R. T. Rockafellar, “Augmented Lagrangians and applications of the proximal point algorithm in convex programming,” Mathematics of Operations Research, vol. 1, no. 2, pp. 97–116, 1976.
[5] A. Auslender, M. Teboulle, and S. Ben-Tiba, “A logarithmic-quadratic proximal method for variational inequalities,” Computational Optimization and Applications, vol. 12, no. 1–3, pp. 31–40, 1999.
[6] Y. Censor and S. A. Zenios, “Proximal minimization algorithm with D-functions,” Journal of Optimization Theory and Applications, vol. 73, no. 3, pp. 451–464, 1992.
[7] B. He, X. Yuan, and J. J. Z. Zhang, “Comparison of two kinds of prediction-correction methods for monotone variational inequalities,” Computational Optimization and Applications, vol. 27, no. 3, pp. 247–267, 2004.
[8] A. Nemirovsky, “Prox-method with rate of convergence O(1/k) for smooth variational inequalities and saddle point problem,” Draft of 30/10/2003.
[9] M. Teboulle, “Convergence of proximal-like algorithms,” SIAM Journal on Optimization, vol. 7, no. 4, pp. 1069–1083, 1997.
[10] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation, Numerical Methods, Prentice Hall, Englewood Cliffs, NJ, USA, 1989.
[11] G. Chen and M. Teboulle, “A proximal-based decomposition method for convex minimization problems,” Mathematical Programming, vol. 64, no. 1, Ser. A, pp. 81–101, 1994.
[12] B. He, L.-Z. Liao, and S. Wang, “Self-adaptive operator splitting methods for monotone variational inequalities,” Numerische Mathematik, vol. 94, no. 4, pp. 715–737, 2003.
[13] P. Mahey, S. Oualibouch, and D. T. Pham, “Proximal decomposition on the graph of a maximal monotone operator,” SIAM Journal on Optimization, vol. 5, no. 2, pp. 454–466, 1995.
[14] P. Tseng, “Applications of a splitting algorithm to decomposition in convex programming and variational inequalities,” SIAM Journal on Control and Optimization, vol. 29, no. 1, pp. 119–138, 1991.
[15] R. Glowinski and P. Le Tallec, Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics, vol. 9 of SIAM Studies in Applied Mathematics, SIAM, Philadelphia, Pa, USA, 1989.
[16] B.-S. He, H. Yang, and C.-S. Zhang, “A modified augmented Lagrangian method for a class of monotone variational inequalities,” European Journal of Operational Research, vol. 159, no. 1, pp. 35–51, 2004.
[17] M. Fukushima, “Application of the alternating direction method of multipliers to separable convex programming problems,” Computational Optimization and Applications, vol. 1, no. 1, pp. 93–111, 1992.
[18] J. Nocedal and S. J. Wright, Numerical Optimization, Springer Series in Operations Research, Springer, New York, NY, USA, 1999.
[19] A. Hamdi and P. Mahey, “Separable diagonalized multiplier method for decomposing nonlinear programs,” Computational & Applied Mathematics, vol. 19, no. 1, pp. 1–29, 125, 2000.
[20] A. Hamdi, P. Mahey, and J. P. Dussault, “A new decomposition method in nonconvex programming via a separable augmented Lagrangian,” in Recent Advances in Optimization, vol. 452 of Lecture Notes in Economics and Mathematical Systems, pp. 90–104, Springer, Berlin, Germany, 1997.
[21] P. Tseng, “Alternating projection-proximal methods for convex programming and variational inequalities,” SIAM Journal on Optimization, vol. 7, no. 4, pp. 951–965, 1997.
[22] B. S. He, X. L. Fu, and Z. K. Jiang, “Proximal-point algorithm using a linear proximal term,” Journal of Optimization Theory and Applications, vol. 141, no. 2, pp. 299–319, 2009.
[23] E. H. Zarantonello, “Projections on convex sets in Hilbert space and spectral theory,” in Contributions to Nonlinear Functional Analysis, E. H. Zarantonello, Ed., Academic Press, New York, NY, USA, 1971.

Copyright

Copyright © 2012 Xiaoling Fu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
