Research Article | Open Access
Congying Han, Tingting Feng, Guoping He, Tiande Guo, "Parallel Variable Distribution Algorithm for Constrained Optimization with Nonmonotone Technique", Journal of Applied Mathematics, vol. 2013, Article ID 295147, 7 pages, 2013. https://doi.org/10.1155/2013/295147
Parallel Variable Distribution Algorithm for Constrained Optimization with Nonmonotone Technique
A modified parallel variable distribution (PVD) algorithm for solving large-scale constrained optimization problems is developed. It modifies the quadratic subproblem at each iteration of the SQP-type PVD algorithm proposed by C. A. Sagastizábal and M. V. Solodov in 2002, and thus circumvents the difficulties associated with the possible inconsistency of the subproblems of the original SQP method. Moreover, we introduce a nonmonotone technique in place of the penalty function to carry out the line search procedure more flexibly. Under appropriate conditions, the global convergence of the method is established. In the final part, parallel numerical experiments are implemented on CUDA, based on the GPU (Graphics Processing Unit).
In this paper, we consider the following nonlinear programming problem, in which the objective function and the constraint functions are all continuously differentiable. Suppose the feasible set in (1) is described by a system of inequality constraints:
In this paper, we give a new algorithm based on the method in [1], which partitions the problem variables into blocks such that
We further assume that the constraint function has a block-separable structure. Specifically,
The parallel variable distribution (PVD) algorithm for solving optimization problems was first proposed by Ferris and Mangasarian [2] in 1994; in it, the variables are distributed among processors. Each processor has the primary responsibility for updating its block of variables while allowing the remaining secondary variables to change in a restricted fashion along some easily computable directions. The distinctive novel feature of this algorithm is the presence of the “forget-me-not” term, which allows for a change in the “secondary” variables. This makes PVD fundamentally different from the block Jacobi [3] and coordinate descent [4] methods. The forget-me-not approach improves robustness, accelerates convergence of the algorithm, and is the key to its success. In 1997, Solodov [5] proposed useful generalizations that consist, for the general unconstrained case, of replacing the exact global solution of the subproblems by a certain natural sufficient descent condition and, for the convex case, of inexact subproblem solution in the PVD algorithm. Solodov [6] proposed an algorithm that applies the PVD approach directly to problems with general convex constraints, using the projected gradient residual function as the direction, and showed that the algorithm converges, provided that certain conditions are imposed on the change of secondary variables. In 2002, Sagastizábal and Solodov [1] proposed two new variants of PVD for the constrained case. Without assuming convexity of the constraints, but assuming a block-separable structure, they showed that the PVD subproblems can be solved inexactly by solving their quadratic programming approximations. This extends PVD to nonconvex (separable) feasible sets and provides a constructive, practical way of solving the parallel subproblems. For inseparable constraints, but assuming convexity, they developed a PVD method based on suitable approximate projected gradient directions.
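As an illustration only, the forget-me-not mechanism can be sketched with a small serial simulation of one synchronous PVD iteration. Everything here is hypothetical: the smooth objective, the choice of d as the negative gradient, and the plain gradient-descent inner solver; the real method runs the block subproblems on separate processors and solves them far more carefully.

```python
def grad(func, x, h=1e-6):
    """Central-difference gradient of func at x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((func(xp) - func(xm)) / (2 * h))
    return g

def f(x):
    # Hypothetical smooth objective with a mild coupling term.
    return sum((xi - 1.0) ** 2 for xi in x) + 0.1 * (x[0] - x[-1]) ** 2

def pvd_iteration(x, blocks, inner_steps=60, lr=0.05):
    """One synchronous PVD iteration: each 'processor' p updates its own
    block freely, while every secondary block j may move only along the
    fixed forget-me-not direction d, scaled by a single step length lam[j]."""
    d = [-gi for gi in grad(f, x)]          # forget-me-not directions
    candidates = []
    for p, blk in enumerate(blocks):        # run in parallel in the real method
        y = list(x)                         # free copy of the primary block
        lam = [0.0] * len(blocks)           # step length per secondary block
        def trial():
            z = list(x)
            for i in blk:
                z[i] = y[i]
            for q, other in enumerate(blocks):
                if q != p:
                    for i in other:
                        z[i] = x[i] + lam[q] * d[i]
            return z
        for _ in range(inner_steps):        # crude gradient-descent inner solver
            gz = grad(f, trial())
            for i in blk:
                y[i] -= lr * gz[i]          # primary variables move freely
            for q, other in enumerate(blocks):
                if q != p:                  # chain rule: df/dlam_q = g . d_q
                    lam[q] -= lr * sum(gz[i] * d[i] for i in other)
        candidates.append(trial())
    return min(candidates, key=f)           # synchronization step

x = [0.0, 0.0, 0.0, 0.0]
blocks = [[0, 1], [2, 3]]
for _ in range(5):
    x = pvd_iteration(x, blocks)
```

Even with this crude inner solver, the forget-me-not step lengths let each processor account for movement of the other blocks, so the synchronized iterate approaches the minimizer quickly.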
The approximation criterion was based on a certain error bound result, and it was readily implementable. In 2011, Zheng et al. [7] gave a parallel SSLE algorithm, in which the PVD subproblems are solved inexactly by serial sequential linear equations, for solving large-scale constrained optimization with block-separable structure. Without assuming convexity of the constraints, the algorithm is proved to be globally convergent to a KKT point. In 2009, Han et al. [8] proposed an asynchronous PVT algorithm for solving large-scale linearly constrained convex minimization problems, following the idea of [9]; it is based on the fact that a constrained optimization problem is equivalent to a differentiable unconstrained optimization problem obtained by introducing the Fischer function. In particular, unlike [9], its linear rate of convergence does not depend on the number of processors.
In this paper, we use [1] as our main reference on SQP-type PVD methods for problems (1) and (4). First, we briefly review the algorithm of [1]. The original problem is distributed into parallel subproblems, which are treated among parallel processors. The algorithm of [1] may fail because the linear constraints in the quadratic programming subproblems may be inconsistent, or the solution of the quadratic programming subproblems may be unbounded. This drawback has been overcome by many researchers, such as [10]. In [1], an exact penalty function is used as the merit function to carry out the line search. For the penalty function, as pointed out by Fletcher and Leyffer [11], the biggest drawback is that the penalty parameter estimates could be problematic to obtain. To overcome this drawback, we use a nonmonotone technique to carry out the line search instead of a penalty function.
Recent research [12–14] indicates that the monotone line search technique may have some drawbacks. In particular, enforcing monotonicity may considerably reduce the rate of convergence when the iteration is trapped near a narrow curved valley, which can result in very short steps or zigzagging. Therefore, it might be advantageous to allow the iterative sequence to occasionally generate points with nonmonotone objective values. Grippo et al. [12] generalized the Armijo rule and proposed a nonmonotone line search technique for Newton’s method which permits increase in function value, while retaining global convergence of the minimization algorithm. In [15], Sun et al. give several nonmonotone line search techniques, such as the nonmonotone Armijo rule, the nonmonotone Wolfe rule, the nonmonotone F-rule, and so on. Several numerical tests show that the nonmonotone line search technique for unconstrained and constrained optimization is efficient and competitive [12–14, 16]. Recently, [17] gave a method that overcomes the drawback of zigzagging by using a nonmonotone line search technique to determine the step length, which makes the algorithm more flexible.
The paper is organized as follows. The algorithm is presented in Section 2. In Section 3, under mild assumptions, some global convergence results are proved. The numerical results are shown in Section 4. And conclusions are given in the last section.
2. A Modified SQP-Type PVD Method with Nonmonotone Technique
To describe our algorithm, we first give the modified quadratic subproblems of the SQP method. Given the current iterate, we distribute it into blocks, and let denote an approximate Hessian.
We use to perturb the subproblem of [1] and give a new subproblem in its place. Consider
For convenience, given , the block of will be denoted by
In (5), is in step . Note that (5) is always feasible for . To test whether the constraints are satisfied or not, we define the violation function as follows: where , , and denotes the Euclidean norm on . It is easy to see that the violation function vanishes if and only if is a feasible point (and is positive if and only if is infeasible).
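With the Euclidean norm, the violation function can be computed as in the following sketch; the two-constraint example g below is hypothetical and serves only to show the feasible/infeasible dichotomy.

```python
import math

def violation(g_vals):
    """Constraint violation h(x) = || max(0, g(x)) ||: the Euclidean norm of
    the positive parts of the inequality constraint values g_i(x) <= 0."""
    return math.sqrt(sum(max(0.0, gi) ** 2 for gi in g_vals))

def g(x):
    # Hypothetical constraints: feasible iff x0 + x1 <= 1 and x0 >= 0.
    return [x[0] + x[1] - 1.0, -x[0]]

feasible = violation(g([0.2, 0.5]))     # both constraints hold, so h = 0
infeasible = violation(g([2.0, 0.0]))   # x0 + x1 - 1 = 1 is violated, so h = 1
```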
We describe our modified PVD algorithm as follows.
Algorithm A. Consider the following.
Step 0. Start with any . Choose parameters , , , positive integer , and positive definite matrices , . Set , .
Step 1. Having , check a stopping criterion. If it is not satisfied, compute as follows.
Step 3. Choose as the largest element of the sequence satisfying:
Remark 1. In Step 3 of Algorithm A, the nonmonotone parameter satisfies
For the convenience, we denote , where .
In a restoration algorithm, we aim to decrease the value of the violation function; more precisely, we will use a trust region type method, with the help of the nonmonotone technique, to obtain . Let
Algorithm B is similar to the restoration phase given by Su and Yu [17]; we describe the Restoration Algorithm as follows.
Algorithm B. Consider the following.
Step 0. Assume , , , , .
Step 1. If , then stop.
Step 2. Compute
to get . Calculate .
Step 3. If , then let , , , and go to Step 2.
Step 4. If , then let , , , and go to Step 1.
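A generic sketch of such a restoration loop is given below. It only illustrates the idea of decreasing the constraint violation under a trust-region-style radius update: the acceptance test here is a plain decrease condition rather than the refined nonmonotone tests of Algorithm B, the gradient is approximated by finite differences, and the example constraints g(x) = (x0 + x1 - 1, -x0) are hypothetical.

```python
import math

def num_grad(h, x, eps=1e-6):
    """Central-difference gradient of h at x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((h(xp) - h(xm)) / (2 * eps))
    return g

def violation(x):
    # h(x) = ||max(0, g(x))|| for the hypothetical g(x) = (x0 + x1 - 1, -x0).
    g_vals = [x[0] + x[1] - 1.0, -x[0]]
    return math.sqrt(sum(max(0.0, v) ** 2 for v in g_vals))

def restore(h, x, delta=1.0, tol=1e-8, shrink=0.5, expand=2.0, max_iter=100):
    """Trust-region-flavored restoration: take a steepest-descent step on the
    violation h, scaled to the radius delta; accept it if h decreases and
    expand the radius, otherwise reject it and shrink the radius."""
    for _ in range(max_iter):
        hx = h(x)
        if hx <= tol:
            break                                   # (approximately) feasible
        g = num_grad(h, x)
        gnorm = math.sqrt(sum(v * v for v in g))
        if gnorm == 0.0:
            break                                   # no descent information
        trial = [xi - (delta / gnorm) * gi for xi, gi in zip(x, g)]
        if h(trial) < hx:
            x, delta = trial, delta * expand        # accept, be more ambitious
        else:
            delta *= shrink                         # reject, shrink the radius
    return x

x_restored = restore(violation, [2.0, 1.0])         # start infeasible, h = 2
```

Starting from the infeasible point (2, 1), the loop drives the violation to zero in a handful of accepted steps, shrinking the radius only when an overambitious trial overshoots into violating the other constraint.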
3. The Convergence Properties
To prove the global convergence of Algorithm A, we make the following assumptions.
(A1) The iterates remain in a compact subset .
(A2) The objective function and the constraint functions are twice continuously differentiable on .
(A3) For all , the set is linearly independent, where .
(A4) There exist two constants such that , for all iterations and .
(A5) The solution of problem (12) satisfies where is a constant.
Remark 2. Assumptions (A1) and (A2) are the standard assumptions. (A3) is the LICQ constraint qualification. (A4) plays an important role in obtaining the convergence results. (A5) is the sufficient reduction condition which guarantees the global convergence in a trust region method. Under the assumptions, is bounded below and the gradient function is uniformly continuous in , where .
Similar to Lemma 1 in [1], we can obtain the following conclusions.
Lemma 4. Suppose that (A1)–(A4) hold and that is symmetric positive definite; then is well defined and is the unique KKT point of (5). Furthermore, is bounded over compact subsets of , where is the set of symmetric positive definite matrices.
Proof. First note that the feasible set for (5) is nonempty, since is always feasible. It is clear that is a solution to (5) if and only if it solves the unconstrained problem Since the function being minimized in (14) is strictly convex and radially unbounded, it follows that is well defined and unique as a global minimizer for the convex problem (5), and thus unique as a KKT point for that problem. So we have due to (17) and , we have is bounded. Since is twice continuously differentiable and assumption (A1) holds, is bounded for all ; denote its maximum by . For (16), combining with Assumption (A4), we have , so that ; therefore, is bounded on .
Lemma 5. Suppose that is positive definite matrix, and is the solution of . Then we have the results (B1) and (B2).(B1) The following inequality holds: where .(B2) If then , and is a KKT point of problem (1).
(B1) Since , is a feasible point of , from the optimality of , we have
Together with the definition of , this implies that (18) holds.
(B2) Since is the solution of (5), there exist and such that the KKT conditions of (5) hold; while , we have By the definition of and (22), . Then, from (18) and , . From (24), it follows that Hence, it follows from Assumption (A3), together with (20) and (21), that . From (23) we have .
From (22) and (24), together with (20), and setting , we have
Taking into account separability of constraints , and let , then
that is, is a KKT point of (1).
Lemma 7 (see [17, Lemma 4]). Under Assumption (A5), the Restoration Algorithm (Algorithm B) terminates finitely.
The following lemma implies that every cluster point of the sequence generated by Algorithm A is a feasible point of problem (1).
Lemma 8. Let be an infinite sequence generated by Algorithm A, then .
Theorem 9. Suppose is an infinite sequence generated by Algorithm A, is the solution of (5), , then , .
Proof. By Assumption (A1), there exists a point such that for , where is an infinite index set. By Algorithm A and Lemma 8, we consider the following two possible cases.
Case I. is an infinite index set. In this case, we have
Since , we obtain
Hence for , it holds
Since is bounded below, converges. Due to (31), we can obtain
By Lemma 6, there exists such that . By Assumption (A4), we obtain
From the uniform continuity of , this implies that
Let , for any given , we can have
Using the same arguments above, we obtain
Since admits a limit, by the uniform continuity of on , it holds
Then by the relation (29), we have
Using the same arguments for deriving (33), we obtain that
Case II. is a finite index set, which implies is an infinite index set.
If does not hold, then there exist a positive number and an infinite index set such that , for all . Since is the solution of (5), by the KKT condition of (5) we have
Since and (44), we obtain ; therefore there exists satisfying .
By Lemma 8, implies . Hence there exists , such that for , it holds
Consequently, from (43), we deduce
Taking into account separability of constraints , then which contradicts the definition of . The proof is complete.
Proof. By Assumption (A2), there exists , such that . From Lemma 8, we have , which means that is a feasible point. From Theorem 9, we have . By Lemma 5, implies that , that is, .
Comparing the KKT condition of (5) with the KKT condition of problem (1), let .
Taking into account separability of constraints and , we conclude
Therefore is a KKT point of problem (1).
4. Numerical Results
In our tests, we only verify our convergence theory, without comparing with serial algorithms. In the future, we will study the convergence rate of the parallel algorithm.
Not having access to a parallel computer, we have carried out the experiments on a Graphics Processing Unit (GPU), implementing the algorithm on the Compute Unified Device Architecture (CUDA) [20]. The input and output, however, are handled by the CPU, and all the subproblems are solved on blocks, each consisting of many threads. A GPU with a given number of blocks is equivalent to a parallel computer with the same number of processors. The many threads of a block can carry out a large number of vector calculations within that block.
We have implemented Algorithm A in the C language. All codes were run under Visual Studio 2008 and CUDA 4.0 on a DELL Precision T7400. In Table 1, we report the number of iterations and the running times for Algorithm A on Problem (52). The results in Table 1 confirm that the proposed approach certainly makes sense. We solve the subproblem (5), which is a positive semidefinite quadratic program, by a modified interior-point method, and the Restoration Algorithm of [17] is solved by a trust-region approach combined with local pattern search methods. All algorithms are coded in C without using any functions from an optimization toolbox; hence the results are not the best possible. In Table 1, denotes the dimension of the test problem and denotes the number of variables in one block. For high-dimensional problems, the number of subproblem iterations decreases as the number of blocks increases, which shows that the algorithm is reliable.
The PVD algorithm, proposed in 1994, is used to solve unconstrained optimization problems or the special case of convex block-separable constraints. In 1997, M. V. Solodov proposed useful generalizations that consist, for the general unconstrained case, of replacing the exact global solution of the subproblems by a certain natural sufficient descent condition and, for the convex case, of inexact subproblem solution in the PVD algorithm. In 1998, M. V. Solodov proposed a PVD approach that applies directly to problems with general convex constraints and showed that the algorithm converges. In 2002, C. A. Sagastizábal and M. V. Solodov proposed two variants of PVD for the constrained case: one without assuming convexity of the constraints, but assuming a block-separable structure, and one for inseparable constraints, but assuming convexity. In this paper, we propose a modification of the algorithm of [1] that applies to general constraints with a block-separable structure.
In a word, the algorithms above all require a special structure of the objective function or constraints. In the future, we will study parallel algorithms with general inseparable constraints.
This paper was supported by National Natural Science Foundation of China (11101420, 71271204) and other foundations (2010BSE06047). The authors would like to thank Dr. Jiang Zhipeng for valuable work on numerical experiments.
- C. A. Sagastizábal and M. V. Solodov, “Parallel variable distribution for constrained optimization,” Computational Optimization and Applications, vol. 22, no. 1, pp. 111–131, 2002.
- M. C. Ferris and O. L. Mangasarian, “Parallel variable distribution,” SIAM Journal on Optimization, vol. 4, no. 4, pp. 815–832, 1994.
- D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation, Prentice Hall, Englewood Cliffs, NJ, USA, 1989.
- P. Tseng, “Dual coordinate ascent methods for non-strictly convex minimization,” Mathematical Programming, vol. 59, no. 2, pp. 231–247, 1993.
- M. V. Solodov, “New inexact parallel variable distribution algorithms,” Computational Optimization and Applications, vol. 7, no. 2, pp. 165–182, 1997.
- M. V. Solodov, “On the convergence of constrained parallel variable distribution algorithms,” SIAM Journal on Optimization, vol. 8, no. 1, pp. 187–196, 1998.
- F. Zheng, C. Han, and Y. Wang, “Parallel SSLE algorithm for large scale constrained optimization,” Applied Mathematics and Computation, vol. 217, no. 12, pp. 5377–5384, 2011.
- C. Han, Y. Wang, and G. He, “On the convergence of asynchronous parallel algorithm for large-scale linearly constrained minimization problem,” Applied Mathematics and Computation, vol. 211, no. 2, pp. 434–441, 2009.
- M. Fukushima, “Parallel variable transformation in unconstrained optimization,” SIAM Journal on Optimization, vol. 8, no. 3, pp. 658–672, 1998.
- J. Mo, K. Zhang, and Z. Wei, “A variant of SQP method for inequality constrained optimization and its global convergence,” Journal of Computational and Applied Mathematics, vol. 197, no. 1, pp. 270–281, 2006.
- R. Fletcher and S. Leyffer, “Nonlinear programming without a penalty function,” Mathematical Programming, vol. 91, pp. 239–269, 2002.
- L. Grippo, F. Lampariello, and S. Lucidi, “A nonmonotone line search technique for Newton's method,” SIAM Journal on Numerical Analysis, vol. 23, no. 4, pp. 707–716, 1986.
- L. Grippo, F. Lampariello, and S. Lucidi, “A truncated Newton method with nonmonotone line search for unconstrained optimization,” Journal of Optimization Theory and Applications, vol. 60, no. 3, pp. 401–419, 1989.
- G. Liu, J. Han, and D. Sun, “Global convergence of the BFGS algorithm with nonmonotone linesearch,” Optimization, vol. 34, no. 2, pp. 147–159, 1995.
- W. Sun, J. Han, and J. Sun, “Global convergence of nonmonotone descent methods for unconstrained optimization problems,” Journal of Computational and Applied Mathematics, vol. 146, no. 1, pp. 89–98, 2002.
- P. L. Toint, “An assessment of nonmonotone linesearch techniques for unconstrained optimization,” SIAM Journal on Scientific Computing, vol. 17, no. 3, pp. 725–739, 1996.
- K. Su and Z. Yu, “A modified SQP method with nonmonotone technique and its global convergence,” Computers & Mathematics with Applications, vol. 57, no. 2, pp. 240–247, 2009.
- Y.-X. Yuan and W. Sun, Optimization Theory and Methods, Springer, New York, NY, USA, 2006.
- C. T. Lawrence and A. L. Tits, “A computationally efficient feasible sequential quadratic programming algorithm,” SIAM Journal on Optimization, vol. 11, no. 4, pp. 1092–1118, 2001.
- I. Buck, Parallel Programming with CUDA, 2008, http://gpgpu.org/sc2008.
Copyright © 2013 Congying Han et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.