Abstract

A modified parallel variable distribution (PVD) algorithm for solving large-scale constrained optimization problems is developed, which solves a modified quadratic subproblem at each iteration in place of that of the SQP-type PVD algorithm proposed by C. A. Sagastizábal and M. V. Solodov in 2002. The algorithm circumvents the difficulties associated with the possible inconsistency of the subproblems of the original SQP method. Moreover, we introduce a nonmonotone technique in place of the penalty function to carry out the line search procedure more flexibly. Under appropriate conditions, the global convergence of the method is established. In the final part, parallel numerical experiments are implemented on CUDA based on the GPU (Graphics Processing Unit).

1. Introduction

In this paper, we consider the following nonlinear programming problem:
$$\min\ f(x) \quad \text{s.t.}\quad x \in X, \tag{1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ and the constraint functions $c_j : \mathbb{R}^n \to \mathbb{R}$, $j \in J = \{1, \dots, m\}$, are all continuously differentiable. Suppose the feasible set $X$ in (1) is described by a system of inequality constraints:
$$X = \{x \in \mathbb{R}^n : c_j(x) \le 0,\ j \in J\}. \tag{2}$$

In this paper, we give a new algorithm based on the method in [1], which partitions the problem variables $x \in \mathbb{R}^n$ into $p$ blocks $x_1, \dots, x_p$, such that
$$x = (x_1, \dots, x_p) \in \mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_p}, \qquad \sum_{\ell=1}^{p} n_\ell = n. \tag{3}$$

We further assume that $X$ has block-separable structure. Specifically,
$$X = X_1 \times \cdots \times X_p, \qquad X_\ell = \{x_\ell \in \mathbb{R}^{n_\ell} : c_j(x_\ell) \le 0,\ j \in J_\ell\}, \tag{4}$$
where the index sets $J_1, \dots, J_p$ partition $J$.
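For instance (an illustrative example of our own, not taken from [1]), the following set is block-separable with $p = 2$, since each constraint involves only one block of variables:
$$X = X_1 \times X_2, \qquad X_1 = \{u \in \mathbb{R}^2 : u_1^2 + u_2^2 - 1 \le 0\}, \qquad X_2 = \{v \in \mathbb{R}^2 : v_1 + v_2^2 - 2 \le 0\},$$
so that $x = (u, v) \in \mathbb{R}^4$ and each processor can treat the constraints of its own block independently.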

The parallel variable distribution (PVD) algorithm for solving optimization problems was first proposed by Ferris and Mangasarian [2] in 1994, in which the variables are distributed among $p$ processors. Each processor has the primary responsibility for updating its block of variables while allowing the remaining secondary variables to change in a restricted fashion along some easily computable directions. The distinctive novel feature of this algorithm is the presence of the “forget-me-not” term which allows for a change in the “secondary” variables. This makes PVD fundamentally different from the block Jacobi [3] and coordinate descent [4] methods. The forget-me-not approach improves robustness, accelerates convergence of the algorithm, and is the key to its success. In 1997, Solodov [5] proposed useful generalizations that consist, for the general unconstrained case, of replacing the exact global solution of the subproblems by a certain natural sufficient descent condition and, for the convex case, of inexact subproblem solutions in the PVD algorithm. Solodov [6] proposed an algorithm that applies the PVD approach directly to problems with general convex constraints, using the projected gradient residual function as the direction, and showed that the algorithm converges provided that certain conditions are imposed on the change of secondary variables. In 2002, Sagastizábal and Solodov [1] proposed two new variants of PVD for the constrained case. Without assuming convexity of constraints, but assuming block-separable structure, they showed that the PVD subproblems can be solved inexactly by solving their quadratic programming approximations. This extends PVD to nonconvex (separable) feasible sets and provides a constructive, practical way of solving the parallel subproblems. For inseparable constraints, but assuming convexity, they developed a PVD method based on suitable approximate projected gradient directions. The approximation criterion was based on a certain error bound result and is readily implementable. In 2011, Zheng et al. [7] gave a parallel SSLE algorithm, in which the PVD subproblems are solved inexactly by serial sequential linear equations, for solving large-scale constrained optimization problems with block-separable structure. Without assuming convexity of the constraints, the algorithm is proved to be globally convergent to a KKT point. In 2009, Han et al. [8] proposed an asynchronous PVT algorithm for solving large-scale linearly constrained convex minimization problems, based on the idea of [9] that a constrained optimization problem is equivalent to a differentiable unconstrained optimization problem obtained by introducing the Fischer function. In particular, unlike [9], its linear rate of convergence does not depend on the number of processors.

In this paper, we use [1] as our main reference on SQP-type PVD methods for problems (1) and (4). First, we briefly review the algorithm of [1]. The original problem is distributed into $p$ parallel subproblems that are treated among $p$ parallel processors. The algorithm of [1] may fail because the linear constraints in the quadratic programming subproblems can be inconsistent or the solution of a quadratic programming subproblem can be unbounded. This drawback has been overcome by many researchers, such as [10]. In [1], an exact penalty function is used as the merit function to carry out the line search. For the penalty function, as pointed out by Fletcher and Leyffer [11], the biggest drawback is that suitable penalty parameter estimates can be problematic to obtain. To overcome this drawback, we use a nonmonotone technique instead of a penalty function to carry out the line search.

Recent research [12–14] indicates that the monotone line search technique may have some drawbacks. In particular, enforcing monotonicity may considerably reduce the rate of convergence when the iteration is trapped near a narrow curved valley, which can result in very short steps or zigzagging. Therefore, it might be advantageous to allow the iterative sequence to occasionally generate points with nonmonotone objective values. Grippo et al. [12] generalized the Armijo rule and proposed a nonmonotone line search technique for Newton’s method which permits increase in function value while retaining global convergence of the minimization algorithm. In [15], Sun et al. give several nonmonotone line search techniques, such as the nonmonotone Armijo rule, the nonmonotone Wolfe rule, the nonmonotone F-rule, and so on. Several numerical tests show that the nonmonotone line search technique for unconstrained and constrained optimization is efficient and competitive [12–14, 16]. Recently, [17] gave a method that overcomes the drawback of zigzagging by using a nonmonotone line search technique to determine the step length, which makes the algorithm more flexible.

In this paper, we combine [1] with the ideas of [10, 17] and propose an infeasible SQP-type Parallel Variable Distribution algorithm for constrained optimization with nonmonotone technique.

The paper is organized as follows. The algorithm is presented in Section 2. In Section 3, under mild assumptions, some global convergence results are proved. The numerical results are shown in Section 4. Conclusions are given in the last section.

2. A Modified SQP-Type PVD Method with Nonmonotone Technique

To describe our algorithm, we first give the modified quadratic subproblems of the SQP method. Given the iterate $x^k \in \mathbb{R}^n$, we distribute it into $p$ blocks $x_\ell^k$, $\ell = 1, \dots, p$, and let $H_\ell^k$ denote an approximate Hessian for the $\ell$th block.

We use the constraint violation $h(x^k)$ (defined in (6) below) to perturb the subproblem of [1] and solve a new subproblem in its place. Consider
$$\min_{d_\ell \in \mathbb{R}^{n_\ell}}\ \nabla_\ell f(x^k)^T d_\ell + \frac{1}{2}\, d_\ell^T H_\ell^k d_\ell \quad \text{s.t.}\quad c_j(x_\ell^k) + \nabla c_j(x_\ell^k)^T d_\ell \le h(x^k),\quad j \in J_\ell. \tag{5}$$

For convenience, given $x \in \mathbb{R}^n$, the $\ell$th block of $x$ will be denoted by $x_\ell \in \mathbb{R}^{n_\ell}$.

In (5), $H_\ell^k$ is the approximate Hessian used in step $k$. Note that (5) is always feasible, since $d_\ell = 0$ satisfies its constraints ($c_j(x_\ell^k) \le h(x^k)$ for every $j \in J_\ell$). To test whether the constraints are satisfied or not, we define the violation function as follows:
$$h(x) = \|c^+(x)\|, \qquad c_j^+(x) = \max\{0,\ c_j(x)\},\quad j \in J, \tag{6}$$
where $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^m$. It is easy to see that $h(x) = 0$ if and only if $x$ is a feasible point ($h(x) > 0$ if and only if $x$ is infeasible).
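For illustration, a direct implementation of the violation function (6) might look as follows (a minimal sketch; the function name and calling convention are ours, assuming the constraint values $c_j(x)$ have already been evaluated into an array):

#include <math.h>

/* Violation function h(x) = ||c^+(x)|| from (6), where
   c_j^+(x) = max{0, c_j(x)}. The array c holds c_j(x), j = 0, ..., m-1. */
double violation(const double *c, int m)
{
    double sum = 0.0;
    for (int j = 0; j < m; ++j) {
        double cplus = (c[j] > 0.0) ? c[j] : 0.0;  /* c_j^+(x) */
        sum += cplus * cplus;
    }
    return sqrt(sum);  /* Euclidean norm of c^+(x) */
}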

We describe our modified PVD algorithm as follows.

Algorithm A. Consider the following.

Step 0. Start with any $x^0 \in \mathbb{R}^n$. Choose parameters $\sigma \in (0, 1)$, $\beta \in (0, 1)$, $\epsilon > 0$, a positive integer $M$, and positive definite matrices $H_\ell^0$, $\ell = 1, \dots, p$. Set $k := 0$, $m(0) := 0$.

Having $x^k$, check a stopping criterion. If it is not satisfied, compute $x^{k+1}$ as follows.

Step 1. Parallelization. For each processor $\ell = 1, \dots, p$, solve subproblem (5) to get $d_\ell^k$. If $d_\ell^k = 0$ for all $\ell$, stop; otherwise go to Step 2.

Step 2. Synchronization. Set
$$d^k = (d_1^k, \dots, d_p^k). \tag{7}$$
If $h(x^k) = 0$, go to Step 3; otherwise, set $\lambda_k := 1$ and go to Step 4.

Step 3. Choose $\lambda_k$ as the largest value in the sequence $\{1, \beta, \beta^2, \dots\}$ satisfying
$$f(x^k + \lambda_k d^k) \le f_{l(k)} + \sigma \lambda_k \nabla f(x^k)^T d^k. \tag{8}$$

Step 4. Set $\tilde{x}^k = x^k + \lambda_k d^k$. Let $\tilde{h}_k = h(\tilde{x}^k)$. If $\tilde{h}_k \le \epsilon$, then set $x^{k+1} = \tilde{x}^k$, update $H_\ell^k$ to $H_\ell^{k+1}$, set $k := k + 1$, and go to Step 1. Otherwise, let $z^0 = \tilde{x}^k$, call the Restoration Algorithm of [17] to obtain a point $z^*$ with $h(z^*) \le \epsilon$, let $x^{k+1} = z^*$, $k := k + 1$, and go to Step 1.
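The overall flow of Algorithm A can be summarized by the following host-side C skeleton (a sketch under our reading of Steps 0–4; solve_qp_block, violation, nonmonotone_step, and restoration are hypothetical names standing for subproblem (5), the violation function (6), the line search of Step 3, and Algorithm B, respectively):

/* Hypothetical prototypes for the building blocks of Algorithm A. */
int    solve_qp_block(const double *x, double *d, int l);  /* solves (5); returns 1 if d_l = 0 */
double nonmonotone_step(const double *x, const double *d); /* Step 3 line search */
void   restoration(double *x);                             /* Algorithm B of [17] */
double h_at(const double *x);                              /* evaluates c(x), then h(x) via (6) */

void algorithm_A(double *x, double *d, int n, int p, double eps, int kmax)
{
    for (int k = 0; k < kmax; ++k) {
        /* Step 1 (parallelization): one processor per variable block. */
        int stationary = 1;
        for (int l = 0; l < p; ++l)       /* executed in parallel in practice */
            stationary &= solve_qp_block(x, d, l);
        if (stationary) return;           /* d^k = 0: stop (Lemma 5 (B2)) */

        /* Step 2 (synchronization) and Step 3: choose the step length. */
        double lambda = (h_at(x) == 0.0) ? nonmonotone_step(x, d) : 1.0;

        /* Step 4: move to the trial point; restore feasibility if needed. */
        for (int i = 0; i < n; ++i)
            x[i] += lambda * d[i];
        if (h_at(x) > eps)
            restoration(x);
    }
}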

Remark 1. In Step 3 of Algorithm A, the nonmonotone parameter $m(k)$ satisfies
$$m(0) = 0, \qquad 0 \le m(k) \le \min\{m(k-1) + 1,\ M\} \quad \text{for } k \ge 1. \tag{9}$$
For convenience, we denote
$$f_{l(k)} = \max_{0 \le j \le m(k)} f(x^{k-j}), \tag{10}$$
where $k - m(k) \le l(k) \le k$.
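In an implementation, the reference value $f_{l(k)}$ in (10) is just the maximum of the last $m(k) + 1$ objective values, which can be kept in a short history array (a small sketch of ours):

/* f_{l(k)} = max_{0 <= j <= m(k)} f(x^{k-j}); hist[j] stores f(x^{k-j}). */
double nonmonotone_ref(const double *hist, int mk)
{
    double fmax = hist[0];
    for (int j = 1; j <= mk; ++j)
        if (hist[j] > fmax)
            fmax = hist[j];
    return fmax;
}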

In the restoration algorithm, we aim to decrease the value of the constraint violation. More precisely, we will use a trust-region-type method, with the help of the nonmonotone technique, to obtain a point whose violation is sufficiently small. Let
$$\phi(z) = \tfrac{1}{2}\,\|c^+(z)\|^2, \tag{11}$$
and let $z^0$ be the point at which the restoration phase is entered.

Algorithm B is similar to the restoration phase given by Su and Yu [17]; we describe the Restoration Algorithm as follows.

Algorithm B. Consider the following.

Step 0. Assume that $z^0$ is given. Choose $\Delta_0 > 0$, $\eta \in (0, 1)$, and $0 < \gamma_1 < 1 < \gamma_2$. Set $i := 0$.

Step 1. If $h(z^i) \le \epsilon$, then stop.

Step 2. Compute
$$\min_{\|s\| \le \Delta_i}\ q_i(s) = \phi(z^i) + \nabla \phi(z^i)^T s + \tfrac{1}{2}\, s^T B_i s, \tag{12}$$
where $B_i$ is a symmetric approximation to the Hessian of $\phi$ at $z^i$, to get $s^i$. Calculate
$$\rho_i = \frac{\phi(z^i) - \phi(z^i + s^i)}{\phi(z^i) - q_i(s^i)}. \tag{13}$$

Step 3. If $\rho_i < \eta$, then let $z^{i+1} = z^i$, $\Delta_{i+1} = \gamma_1 \Delta_i$, $i := i + 1$, and go to Step 2.

Step 4. If $\rho_i \ge \eta$, then let $z^{i+1} = z^i + s^i$, $\Delta_{i+1} = \gamma_2 \Delta_i$, $i := i + 1$, and go to Step 1.
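Steps 2–4 of Algorithm B follow the standard trust-region pattern; in code, the acceptance test and radius update could be sketched as follows (names are ours; phi_z, phi_trial, and model_trial stand for $\phi(z^i)$, $\phi(z^i + s^i)$, and $q_i(s^i)$):

#include <string.h>

/* Trust-region acceptance test of Algorithm B (Steps 3-4).
   Returns the updated radius; on acceptance, copies the trial point into z. */
double tr_update(double *z, const double *z_trial, int n,
                 double phi_z, double phi_trial, double model_trial,
                 double delta, double eta, double gamma1, double gamma2)
{
    double rho = (phi_z - phi_trial) / (phi_z - model_trial); /* ratio (13) */
    if (rho < eta)
        return gamma1 * delta;              /* Step 3: reject the step, shrink radius */
    memcpy(z, z_trial, n * sizeof(double)); /* Step 4: accept z^{i+1} = z^i + s^i */
    return gamma2 * delta;                  /* enlarge the radius */
}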

3. The Convergence Properties

To prove the global convergence of Algorithm A, we make the following assumptions.

Assumptions

(A1) The iterates $\{x^k\}$ remain in a compact subset $\Omega \subset \mathbb{R}^n$.
(A2) The objective function $f$ and the constraint functions $c_j$, $j \in J$, are twice continuously differentiable on $\Omega$.
(A3) For all $x \in \Omega$, the set $\{\nabla c_j(x) : j \in J(x)\}$ is linearly independent, where $J(x) = \{j \in J : c_j(x) = 0\}$.
(A4) There exist two constants $b \ge a > 0$ such that $a\|d\|^2 \le d^T H_\ell^k d \le b\|d\|^2$ for all iterations $k$ and all $d \in \mathbb{R}^{n_\ell}$, $\ell = 1, \dots, p$.
(A5) The solution $s^i$ of problem (12) satisfies
$$\phi(z^i) - q_i(s^i) \ge \tau\,\|\nabla \phi(z^i)\|\,\min\{\Delta_i,\ \|\nabla \phi(z^i)\|\},$$
where $\tau > 0$ is a constant.

Remark 2. Assumptions (A1) and (A2) are standard assumptions. (A3) is the LICQ constraint qualification. (A4) plays an important role in obtaining the convergence results. (A5) is the sufficient reduction condition which guarantees global convergence in a trust-region method. Under these assumptions, $f$ is bounded below on $\Omega$ and its gradient is uniformly continuous on $\Omega$.

Remark 3. We can use quasi-Newton BFGS methods to update $H_\ell^k$. Different from [18], we can use a small modification of the update to preserve positive definiteness, according to formula (33) of [17].

Similar to Lemma 1 in [19], we can obtain the following conclusions.

Lemma 4. Suppose (A1)–(A4) hold and $H_\ell$ is a symmetric positive definite matrix; then the solution $d_\ell(x, H_\ell)$ of (5) is well defined and is the unique KKT point of (5). Furthermore, $d_\ell(\cdot, \cdot)$ is bounded over compact subsets of $\mathbb{R}^n \times S_{++}^{n_\ell}$, where $S_{++}^{n_\ell}$ is the set of symmetric positive definite $n_\ell \times n_\ell$ matrices.

Proof. First note that the feasible set of (5) is nonempty, since $d_\ell = 0$ is always feasible. It is clear that $d_\ell$ is a solution to (5) if and only if it solves the unconstrained problem (14). Since the function being minimized in (14) is strictly convex and radially unbounded, it follows that $d_\ell(x, H_\ell)$ is well defined and unique as a global minimizer for the convex problem (5), and thus unique as a KKT point for that problem. Then, due to (17) and the feasibility of $d_\ell = 0$, the solution $d_\ell(x, H_\ell)$ is bounded. Since $f$ is twice continuously differentiable and Assumption (A1) holds, $\nabla_\ell f(x)$ is bounded for all $x \in \Omega$; denote its maximum by $M_0$. Combining (16) with Assumption (A4), we obtain a bound on $\|d_\ell(x, H_\ell)\|$; therefore, $d_\ell(\cdot, \cdot)$ is bounded on compact subsets of $\mathbb{R}^n \times S_{++}^{n_\ell}$.

Lemma 5. Suppose that $H_\ell^k$ is a positive definite matrix and $d_\ell^k$ is the solution of (5). Then we have the results (B1) and (B2).
(B1) The following inequality holds:
$$\nabla f(x^k)^T d^k \le -\tfrac{1}{2}\,(d^k)^T H^k d^k, \tag{18}$$
where $H^k = \operatorname{diag}(H_1^k, \dots, H_p^k)$.
(B2) If $d^k = 0$, then $h(x^k) = 0$, and $x^k$ is a KKT point of problem (1).

Proof. (B1) Since $d_\ell = 0$ is a feasible point of (5), from the optimality of $d_\ell^k$ we have
$$\nabla_\ell f(x^k)^T d_\ell^k + \tfrac{1}{2}\,(d_\ell^k)^T H_\ell^k d_\ell^k \le 0, \quad \ell = 1, \dots, p. \tag{19}$$
Summing over $\ell$ and using the definition of $H^k$ implies that (18) holds.
(B2) Since $d_\ell^k$ is the solution of (5), there exist multipliers such that the KKT conditions (20)–(24) of (5) hold. When $d^k = 0$, by the definition of $h$ and (22), we obtain $h(x^k) = 0$. Then, from (18) and $d^k = 0$, the objective of (5) vanishes at the solution. From (24), it follows that the linearized constraints reduce to $c_j(x_\ell^k) \le 0$. Hence, it follows from Assumption (A3), together with (20) and (21), that the multipliers are uniquely determined, and from (23) we obtain the complementarity condition.
From (22) and (24), together with (20), and assembling the multiplier vector blockwise, we obtain the stationarity condition at $x^k$.
Taking into account the separability of the constraints in (4), and setting the multipliers of the inactive constraints to zero, $x^k$ together with these multipliers satisfies the KKT system of (1);
that is, $x^k$ is a KKT point of (1).

Lemma 6 (see [17, Lemma 3]). In Step 3, the line search procedure is well defined.

Lemma 7 (see [17, Lemma 4]). Under Assumption (A5), the Restoration Algorithm (Algorithm B) terminates finitely.

From Lemmas 6 and 7, we know that Algorithms A and B are well defined.

The following lemma implies that every cluster point of the sequence $\{x^k\}$ generated by Algorithm A is a feasible point of problem (1).

Lemma 8. Let $\{x^k\}$ be an infinite sequence generated by Algorithm A; then $\lim_{k \to \infty} h(x^k) = 0$.

Theorem 9. Suppose $\{x^k\}$ is an infinite sequence generated by Algorithm A and $d^k$ is the solution of (5); then $\lim_{k \to \infty} d^k = 0$.

Proof. By Assumption (A1), there exist a point $\bar{x}$ and an infinite index set $K$ such that $x^k \to \bar{x}$ for $k \in K$. By Algorithm A and Lemma 8, we consider the following two possible cases.

Case I. The set $K_1$ of iterations at which the line search of Step 3 is executed is an infinite index set. In this case, we have

Since the line search criterion (8) is satisfied at every such iteration, we obtain

Hence, for all $k$ sufficiently large, it holds that

Since $f$ is bounded below, $\{f_{l(k)}\}$ converges. Due to (31), we can obtain

By Lemma 6, there exists $\bar{\lambda} > 0$ such that $\lambda_k \ge \bar{\lambda}$ for all $k$. By Assumption (A4), we obtain

From the uniform continuity of $f$, this implies that

By induction, for any given $j \ge 1$, we have

If $j = 1$, (35) and (36) follow from (33) and (34). For a given number $j$, we have

Using the same arguments as above, we obtain

Therefore, (35) and (36) hold for any given $j$.

Now, for any $k$, from Remark 1 we obtain $k - l(k) \le M + 1$; that is, the index $l(k)$ lags behind $k$ by a bounded amount. Together with (35), we can obtain the same limits along the whole sequence. Therefore,

Since $\{f(x^{l(k)})\}$ admits a limit, by the uniform continuity of $f$ on $\Omega$, it holds that

Then, by relation (29), we have

Using the same arguments for deriving (33), we obtain that

Case II. $K_1$ is a finite index set, which implies that the set of iterations at which the restoration phase is invoked is an infinite index set.

If the conclusion does not hold, then there exist a positive number $\varepsilon_0$ and an infinite index set $K'$ such that $\|d^k\| \ge \varepsilon_0$ for all $k \in K'$. Since $d_\ell^k$ is the solution of (5), by the KKT conditions of (5) we have

Since $\{x^k\}_{k \in K'}$ is bounded, together with (44), we obtain that the multiplier sequence is bounded; therefore there exists a convergent subsequence.

By Lemma 8, $\lim_{k \to \infty} h(x^k) = 0$. Hence there exists $k_0$ such that, for all $k \ge k_0$ with $k \in K'$, it holds that

Consequently, from (43), we deduce

Together with (44) and (47), after transposing, we obtain

Taking into account the separability of the constraints, we reach a contradiction with the definition of $d^k$. The proof is complete.

Theorem 10. Suppose $\{x^k\}$ is an infinite sequence generated by Algorithm A and the assumptions of Theorem 9 hold. Then every cluster point of $\{x^k\}$ is a KKT point of problem (1).

Proof. By Assumption (A2), there exists an infinite index set $K$ such that $x^k \to \bar{x}$ for $k \in K$. From Lemma 8, we have $h(\bar{x}) = 0$, which means that $\bar{x}$ is a feasible point. From Theorem 9, we have $d^k \to 0$. By Lemma 5, this implies that the blockwise stationarity conditions hold in the limit at $\bar{x}$.
Comparing the KKT conditions of (5) with the KKT conditions of problem (1), and letting $k \to \infty$, $k \in K$, we identify the limiting multipliers.
Taking into account the separability of the constraints in (4), we conclude that the full KKT system of (1) is satisfied at $\bar{x}$.
Therefore $\bar{x}$ is a KKT point of problem (1).

4. Numerical Results

To get some insight into the computational properties of our approach in Section 2, we consider the same test problems as in [1]. We choose Problem 1 of [1] as follows:

In our tests, we only verify our convergence theory, without comparing with serial algorithms. In the future, we will study the convergence rate of the parallel algorithm.

Not having access to a parallel computer, we carried out the experiments on a Graphics Processing Unit (GPU) and implemented the algorithm with the Compute Unified Device Architecture (CUDA) [20]. The input and output, however, are handled by the CPU, and all the subproblems are solved on GPU blocks, each consisting of many threads. If there are $p$ blocks on the GPU, this is equivalent to $p$ processors in a parallel computer. The many threads of each block can perform a large number of vector calculations within the current block.
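To make this mapping concrete, a kernel of the following shape (our illustration, not the authors' code, assuming for simplicity that $n$ is a multiple of $p$) assigns one GPU block to one variable block, with the threads of that block striding over its components for the vector operations:

/* One CUDA block per PVD variable block: block l updates x_l, and its
   threads cooperate on the vector operation x_l += lambda * d_l. */
__global__ void pvd_block_update(double *x, const double *d,
                                 double lambda, int n_per_block)
{
    int l = blockIdx.x;                /* index of the variable block */
    int base = l * n_per_block;        /* first component of block l */
    for (int i = threadIdx.x; i < n_per_block; i += blockDim.x)
        x[base + i] += lambda * d[base + i];
}

/* Launch with one CUDA block per variable block, for example:
   pvd_block_update<<<p, 256>>>(dev_x, dev_d, lambda, n / p);       */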

We have implemented Algorithm A in the C language. All codes were run under Visual Studio 2008 and CUDA 4.0 on a DELL Precision T7400. In Table 1, we report the number of iterations and the running times for Algorithm A on Problem (52). The results in Table 1 confirm that the proposed approach makes sense. We solve subproblem (5), which is a positive semidefinite quadratic program, by a modified interior-point method, and the Restoration Algorithm of [17] is carried out by a trust-region approach combined with local pattern search methods. All algorithms are coded in C without using any function from an optimization toolbox; hence the results are not the best possible. In Table 1, $n$ denotes the dimension of the test problem and $n_\ell$ denotes the number of variables in one block. For the high-dimensional problems, the number of subproblem iterations decreases as the number of blocks increases, which shows that the algorithm is reliable.

5. Conclusion

The PVD algorithm, which was proposed in 1994, is used to solve unconstrained optimization problems or problems with the special case of convex block-separable constraints. In 1997, M. V. Solodov proposed useful generalizations that consist, for the general unconstrained case, of replacing the exact global solution of the subproblems by a certain natural sufficient descent condition and, for the convex case, of inexact subproblem solutions in the PVD algorithm. In 1998, M. V. Solodov proposed a PVD approach that applies directly to problems with general convex constraints and showed that the algorithm converges. In 2002, C. A. Sagastizábal and M. V. Solodov proposed two variants of PVD for the constrained case: one without assuming convexity of constraints but assuming block-separable structure, and one for inseparable constraints but assuming convexity. In this paper, we have proposed a modification of the algorithm of [1] applied to general constraints with block-separable structure.

In a word, the algorithm above relies on a special structure of the objective function or of the constraints. In the future, we will study parallel algorithms for general inseparable constraints.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (11101420, 71271204) and other foundations (2010BSE06047). The authors would like to thank Dr. Jiang Zhipeng for his valuable work on the numerical experiments.