Special Issue: Variational Analysis, Optimization, and Fixed Point Theory 2014
A QP-Free Algorithm for Finite Minimax Problems
This paper discusses unconstrained nonlinear minimax problems. Because solving the inequality-constrained QP subproblems of SQP algorithms is computationally expensive, we present a QP-free algorithm, also called a sequential systems of linear equations algorithm. At each iteration, only two systems of linear equations with the same coefficient matrix need to be solved, and each subproblem is of less than full dimension. The proposed algorithm needs neither penalty parameters nor barrier parameters, and its computational cost is small. In addition, it has few parameters and is numerically stable. Its convergence property is described, and some numerical results are provided.
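We consider nonlinear minimax problems of the form
$$\min_{x \in \mathbb{R}^n} F(x), \qquad (1)$$
where $F(x) = \max\{f_j(x) : j \in I\}$, $I = \{1, \dots, m\}$, and the functions $f_j : \mathbb{R}^n \to \mathbb{R}$ $(j \in I)$ are continuously differentiable. Denote
$$I(x) = \{j \in I : f_j(x) = F(x)\}. \qquad (2)$$
By introducing an auxiliary variable $z$, problem (1) can be represented as the following standard nonlinear program (see ):
$$\min_{(x, z)} \; z \quad \text{s.t.} \quad f_j(x) \le z, \; j \in I. \qquad (3)$$
It is obvious that the KKT conditions of (3) are equivalent to
$$\sum_{j \in I} \lambda_j \nabla f_j(x) = 0, \quad \sum_{j \in I} \lambda_j = 1, \quad \lambda_j \ge 0, \quad \lambda_j \bigl(f_j(x) - F(x)\bigr) = 0, \; j \in I. \qquad (4)$$
So a point $x$ is called a stationary point (see ) of (1) if there exists a vector $\lambda = (\lambda_j, j \in I)$ such that (4) holds, where $\lambda$ is called the multiplier vector.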
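The definitions above can be made concrete with a small sketch. The two quadratic functions below are an illustrative assumption (they do not come from the paper), as are the helper names `active_set` and `kkt_residual`; indices are 0-based in the code.

```python
# Toy instance (an assumption for illustration): F(x) = max(x^2, (x - 2)^2).

def f(x):
    """Values f_j(x) of the component functions."""
    return [x * x, (x - 2.0) ** 2]

def grad_f(x):
    """Derivatives of the component functions (scalars, since n = 1)."""
    return [2.0 * x, 2.0 * (x - 2.0)]

def F(x):
    """Max-function of the minimax problem."""
    return max(f(x))

def active_set(x, tol=1e-12):
    """Indices j with f_j(x) = F(x), up to a small tolerance."""
    values = f(x)
    top = max(values)
    return [j for j, v in enumerate(values) if top - v <= tol]

def kkt_residual(x, lam):
    """Largest violation of the four stationarity conditions at (x, lam)."""
    values, grads, top = f(x), grad_f(x), F(x)
    stationarity = abs(sum(l * g for l, g in zip(lam, grads)))
    simplex = abs(sum(lam) - 1.0)
    negativity = max(0.0, max(-l for l in lam))
    complementarity = max(abs(l * (v - top)) for l, v in zip(lam, values))
    return max(stationarity, simplex, negativity, complementarity)

# At x = 1 both functions are active and lam = (1/2, 1/2) cancels the gradients.
print(F(1.0), active_set(1.0), kkt_residual(1.0, [0.5, 0.5]))
```

A zero residual certifies stationarity only together with the multiplier vector that was supplied; finding such a vector is what the algorithm below is designed to do.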
1.1. Related Work
Some algorithms have been proposed to solve the minimax problems (1) and can be grouped into three classes. The first one is the direct nonsmooth method. The problem (1) is viewed as an unconstrained nonsmooth optimization problem, which can be solved by some nonsmooth methods, such as bundle methods, gradient sampling methods, and cutting plane methods; see, for example, [3–5].
The second class consists of regularization approaches, which have been used to obtain smooth approximations to problem (1); see, for example, [6–10]. The main advantage of the smoothing techniques is that the minimax problems are converted into smooth unconstrained optimization problems that can be solved by a standard unconstrained minimization solver. However, when the approximation accuracy is high, the smooth approximating problems become significantly ill-conditioned. Hence, the unconstrained optimization solver may experience numerical difficulty and slow convergence. Consequently, the simple use of smoothing techniques is complicated by the need to trade off approximation accuracy against problem ill-conditioning.
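The trade-off described above can be seen on a toy problem. The sketch below uses the log-sum-exp (exponential) smoothing, which is one well-known choice and is assumed here for illustration; it is not necessarily the smoothing used in the cited works. The test functions and the names `smooth_max` and `curvature` are likewise assumptions.

```python
import math

def f(x):
    """Toy component functions (an assumption for illustration)."""
    return [x * x, (x - 2.0) ** 2]

def F(x):
    return max(f(x))

def smooth_max(x, p):
    """Log-sum-exp approximation of F with smoothing parameter p > 0.

    The maximum is subtracted before exponentiating to avoid overflow.
    """
    values = f(x)
    top = max(values)
    return top + math.log(sum(math.exp(p * (v - top)) for v in values)) / p

def curvature(x, p):
    """Second derivative of smooth_max with respect to x (closed form)."""
    values = f(x)
    top = max(values)
    weights = [math.exp(p * (v - top)) for v in values]
    total = sum(weights)
    weights = [w / total for w in weights]
    grads = [2.0 * x, 2.0 * (x - 2.0)]
    hessians = [2.0, 2.0]
    mean_grad = sum(w * g for w, g in zip(weights, grads))
    spread = sum(w * g * g for w, g in zip(weights, grads)) - mean_grad ** 2
    return sum(w * h for w, h in zip(weights, hessians)) + p * spread

for p in (1.0, 10.0, 100.0):
    gap = smooth_max(1.0, p) - F(1.0)   # approximation error at the kink x = 1
    print(p, gap, curvature(1.0, p))    # error shrinks, curvature grows
```

For m functions the approximation error is at most log(m)/p, so accuracy improves as p grows, while the curvature at the kink grows linearly in p. In one dimension this is the visible symptom of the ill-conditioning mentioned above.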
The third class is based on solving the equivalent smooth nonlinear programming problem (3), using, for example, sequential quadratic programming (SQP) methods (see [11–16]), trust-region strategies (see [17, 18]), the sequential quadratically constrained quadratic programming (SQCQP) method (see ), the gradient projection method (see ), and interior point (IP) methods (see [21–23]). The advantage of this class of methods is that the nondifferentiable optimization problem (1) is transformed into an equivalent smooth constrained nonlinear programming problem, which can be solved by well-established methods.
It is well known that the SQP method is one of the efficient methods for solving smooth nonlinear programs owing to its fast convergence rate. Jian et al. , Hu et al. , and Zhu et al.  each present new SQP-type algorithms for unconstrained minimax problems. For an iterate , a new quadratic program (QP) subproblem is given by where is a symmetric positive definite matrix and is the active constraint set. A descent direction of can be obtained by solving (5), and the algorithms achieve global and superlinear convergence by introducing another correction direction.
These SQP algorithms and the SQCQP method must solve QP subproblems or quadratically constrained quadratic programming (QCQP) subproblems with inequality constraints, which is computationally expensive compared with solving systems of linear equations. The IP methods and the smoothing techniques of the second class avoid QP and QCQP subproblems, but they need a penalty parameter or barrier parameter, which is difficult to choose and causes numerical difficulty when it becomes too large. Therefore, it is worthwhile to construct a new algorithm that neither solves QP or QCQP subproblems nor uses penalty or barrier parameters.
1.2. Division of the Systems of Linear Equations
In this paper, we replace the QP subproblem (5) used in [14–16] by two systems of linear equations with the same coefficient matrix, so that the computational effort per iteration is much lower, and we propose a QP-free algorithm that needs neither penalty parameters nor barrier parameters.
For (5), in order to speed up the rate of convergence and construct the systems of linear equations conveniently, we select an index and consider the following QP subproblem by introducing more parameters , associated with the iterate :
Since (6) is a convex program with linear constraints, its optimal solution is the KKT point; that is, (6) is equivalent to the following KKT system with the multiplier vector : Motivated by the KKT conditions above, we present the following system of linear equations (SLE): The equation “” comes from . Obviously, the system above is equivalent to Let be the solution of (9), and is taken as the first direction of our algorithm, which is not entirely suitable as the main search direction although it is a descent direction of . In fact, the global convergence cannot be guaranteed, since the properties of the nonnegativity and complementarity of multiplier vector cannot be guaranteed at the point with and at the accumulation points of the infinite sequence generated by the proposed algorithm. Thus, we consider computing the second direction by another linear system: where the right-hand parameters , are yielded by and , as follows: Considering that the linear systems (10) and (9) have the same decomposed coefficient matrix, the computational cost is typically low. Lemma 8 shows that is still a descent direction of , so can be taken as the main search direction to design the algorithm.
The parameters , , must be designed carefully to guarantee the nonsingularity of the coefficient matrix and global convergence. This design is the main difficulty of the work.
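The cost argument for two systems with the same coefficient matrix can be sketched as follows: the matrix is factored once, and each right-hand side then costs only two triangular solves. The routines `lu_factor` and `lu_solve`, the 3-by-3 matrix, and the way the second right-hand side is built from the first solution are all illustrative assumptions, not the systems (9)-(10) of the paper.

```python
def lu_factor(a):
    """In-place style LU factorization with partial pivoting.

    Returns (lu, perm): lu stores the unit lower factor below the diagonal
    and the upper factor on and above it; row k of the permuted matrix is
    row perm[k] of the input.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        pivot = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        if pivot != k:
            lu[k], lu[pivot] = lu[pivot], lu[k]
            perm[k], perm[pivot] = perm[pivot], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm

def lu_solve(factors, b):
    """Solve the system for one right-hand side using stored factors."""
    lu, perm = factors
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):                      # forward substitution
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):            # back substitution
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

# Hypothetical KKT-like matrix: a 2x2 block, one constraint column, zero corner.
M = [[2.0, 0.0, 1.0], [0.0, 2.0, 1.0], [1.0, 1.0, 0.0]]
factors = lu_factor(M)                      # factor once
u0 = lu_solve(factors, [-1.0, -3.0, 0.0])   # first direction
# Hypothetical correction: the second right-hand side depends on u0.
rhs1 = [-1.0, -3.0, -0.1 * (u0[0] ** 2 + u0[1] ** 2)]
u1 = lu_solve(factors, rhs1)                # second direction, no refactoring
print(u0, u1)
```

Because the second right-hand side depends on the first solution, the two systems cannot simply be stacked into one solve with two columns, which is why storing the factorization is the natural design.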
1.3. Properties of Our Algorithm
The proposed algorithm in this paper possesses the following properties.
(i) Only the constraints indexed by some subset of are considered, which reduces the scale and computational cost of the subproblems to some extent.
(ii) At each iteration, only the solutions of two linear systems with the same coefficient matrix are required; that is, the new algorithm is completely QP-free.
(iii) It needs neither penalty parameters nor barrier parameters, so the difficulty of choosing suitable values for them is avoided.
(iv) It needs few parameters, which are easily adjusted, and the algorithm is robust.
(v) It is weakly globally convergent under some suitable assumptions.
We conclude this section by giving some notation used throughout this paper. The symbol $\|\cdot\|$ refers to the Euclidean norm. In addition, we denote by $\emptyset$ the empty set, the cardinality of any finite set by , and by det the determinant of the matrix . Furthermore, the directional derivative of $F$ at the point $x$ along the direction $d$ is denoted by $F'(x; d)$. It is easy to see that
$$F'(x; d) = \max\{\nabla f_j(x)^T d : j \in I(x)\}. \qquad (12)$$
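For a max-function of smooth pieces, the directional derivative is the largest slope among the active pieces, a standard fact. The sketch below compares that formula with a one-sided finite difference; the two-variable test functions and the helper names are assumptions chosen for illustration.

```python
def f(x):
    """Toy component functions of two variables (an assumption)."""
    return [x[0] ** 2 + x[1] ** 2, (x[0] - 2.0) ** 2 + x[1] ** 2]

def grad_f(x):
    return [[2.0 * x[0], 2.0 * x[1]], [2.0 * (x[0] - 2.0), 2.0 * x[1]]]

def F(x):
    return max(f(x))

def directional_derivative(x, d, tol=1e-10):
    """Largest gradient-direction product over the active functions at x."""
    values = f(x)
    top = max(values)
    return max(
        sum(gi * di for gi, di in zip(g, d))
        for v, g in zip(values, grad_f(x))
        if top - v <= tol
    )

def finite_difference(x, d, t=1e-6):
    """One-sided difference quotient of F along d."""
    shifted = [xi + t * di for xi, di in zip(x, d)]
    return (F(shifted) - F(x)) / t

x = [1.0, 0.0]            # both functions are active here
for d in ([1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]):
    print(d, directional_derivative(x, d), finite_difference(x, d))
```

At this point the derivative is positive along both the first and the second direction, which reflects the kink of the max-function: no single gradient describes its behaviour there.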
2. Description of Algorithm
The new algorithm is based on the following assumption.
(A1) The vectors are linearly independent for any and each point .
Lemma 1. Suppose that the vectors are linearly independent for and index set . Then, for any given , there exists a constant such that are linearly independent for each .
For a point , an index , and a given index set such that are linearly independent, we introduce the following technique similar to  to generate the parameter in Lemma 1. Define where is Napierian base and the parameter .
Lemma 2 (see ). Suppose that the vectors are linearly independent. Then the vectors are also linearly independent for each .
Let be a given iteration point. We denote and use the following pivoting operation to generate the index set such that has full column rank, so vectors are linearly independent.
Pivoting Operation (POP)
Step (i). Select an initial parameter .
Step (ii). Generate active constraint subset and matrix by
Step (iii). If or det, then set , , and stop; otherwise, set and repeat Step (ii).
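The loop structure of the POP can be sketched as follows. The formulas of Steps (ii) and (iii) are not reproduced in the text above, so the concrete rules here are assumptions in the spirit of ε-active-set pivoting: the working set collects functions within ε of the maximum, the set is accepted when the Gram determinant of the associated vectors is at least ε, and ε is halved otherwise. The plain gradients used as columns, the function names, and the data are hypothetical as well.

```python
def gram_det(columns):
    """Determinant of A^T A, where the given vectors are the columns of A."""
    k = len(columns)
    g = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    det = 1.0
    for c in range(k):                       # Gaussian elimination with pivoting
        pivot = max(range(c, k), key=lambda r: abs(g[r][c]))
        if g[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            g[c], g[pivot] = g[pivot], g[c]
            det = -det
        det *= g[c][c]
        for r in range(c + 1, k):
            factor = g[r][c] / g[c][c]
            for j in range(c, k):
                g[r][j] -= factor * g[c][j]
    return det

def pivoting_operation(values, vectors, eps0=1.0, shrink=0.5, max_halvings=60):
    """Return (working_set, eps) under the assumed acceptance rule."""
    top = max(values)
    eps = eps0
    for _ in range(max_halvings):
        working = [j for j, v in enumerate(values) if top - v <= eps]
        if gram_det([vectors[j] for j in working]) >= eps:
            return working, eps
        eps *= shrink                        # tighten the activity threshold
    raise RuntimeError("no acceptable working set found")

# Hypothetical data: two nearly parallel vectors and one clearly inactive function.
values = [1.0, 0.9, -5.0]
vectors = [[1.0, 0.0], [1.0, 0.001], [0.0, 1.0]]
print(pivoting_operation(values, vectors))
```

In this example the nearly parallel pair is rejected until ε falls below the gap between the two largest function values, after which the working set shrinks to a single well-conditioned column.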
For simplicity, denote From the POP and Lemma 2 above, we know that = has full column rank for each . Furthermore, it is easy to get the following lemma.
Lemma 3. Suppose that (A1) holds. Then the matrix has full column rank for each and .
To describe some beneficial properties of the POP above, which is helpful for discussing the convergence of our algorithm, we have the following results.
Lemma 4. Suppose that (A1) holds, and let .
(i) The parameter can be obtained in a finite number of steps in the POP.
(ii) If a sequence is bounded, then there exist two constants such that
Proof. Based on assumption (A1), it is easy to see that (i) and (18) hold, so the details are omitted here.
Now, we will prove that (19) holds. In view of and being the subsets of the fixed and finite set and the boundedness of , we assume by contradiction without loss of generality that there exist an infinite index set and such that According to (14) and (A1), we can get . So, is bounded from ; we may assume that there exists an infinite index set such that . On the other hand, from (18) one knows that vectors are linearly independent. Therefore, by Lemma 2, we know that are also linearly independent for each , which implies that vectors are linearly independent for each . So, are linearly independent for each . Denote that , has full column rank since . However, which contradicts the fact that has full column rank. So (19) holds.
A detailed description of the algorithm for solving (1) is given below.
Algorithm A. Parameters: , , .
Step 0 (initialization). Consider , an initial symmetric positive definite matrix . Set .
Step 1 (generating active set). Set parameter , generate the set by the POP, and let be the corresponding termination parameter.
Step 2. Compute according to (14), and adjust the parameter by
Step 3. Compute the unique solution of the following linear system: with
Step 4. Compute the unique solution of the following linear system: where is yielded by If and then is a stationary point of (1), and stop.
Step 5 (doing line search). Compute the step size , the first number of the sequence satisfying
Step 6. Set , and compute a new symmetric positive definite matrix . Set , and go back to Step 1.
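The step-size rule in Step 5 picks the first member of a geometric sequence that passes a sufficient-decrease test. The exact inequality (29) is not reproduced above, so the sketch below uses a generic Armijo-type test as a stand-in; the parameter names `alpha` and `beta`, the test function, and the choice of direction are assumptions.

```python
def backtracking_step(F, x, d, slope, alpha=0.1, beta=0.5, max_trials=50):
    """First t in {1, beta, beta^2, ...} with F(x + t d) <= F(x) + alpha t slope.

    `slope` is a negative estimate of the directional derivative of F at x
    along d; the acceptance test is a generic Armijo rule (an assumption).
    """
    fx = F(x)
    t = 1.0
    for _ in range(max_trials):
        trial = [xi + t * di for xi, di in zip(x, d)]
        if F(trial) <= fx + alpha * t * slope:
            return t
        t *= beta
    raise RuntimeError("line search failed; d may not be a descent direction")

def F(x):
    """Toy max-function (an assumption for illustration)."""
    return max(x[0] ** 2 + x[1] ** 2, (x[0] - 2.0) ** 2 + x[1] ** 2)

x = [3.0, 1.0]                 # only the first function is active here
d = [-6.0, -2.0]               # negative gradient of the active function
slope = -40.0                  # gradient-direction product: (6, 2) . (-6, -2)
t = backtracking_step(F, x, d, slope)
print(t, F([xi + t * di for xi, di in zip(x, d)]))
```

The full step overshoots because the second function becomes the maximum at the trial point, so the search halves the step once before the decrease test passes.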
Remark 5. If the case , arises, the algorithm would stop at the iterate , which is not a stationary point of (1). To avoid this pitfall, we reset and solve (LS1) and (LS2) again.
For convenience of analysis in the rest of this paper, we give the equivalent forms of (LS1)-(LS2): So, the parameters , in (9)-(10) are selected as . From Algorithm A and (A1) as well as Lemma 3, we can get the following lemmas immediately.
Lemma 6. The matrix is nonsingular for each . Therefore, the coefficient matrix in systems (LS1)-(LS2) is nonsingular.
Taking into account the inverse matrix of that can be expressed as , with from (LS1)-(LS2) and (32), we have the following relations:
Lemma 7. If Algorithm A stops at an iterate with and , then is a stationary point of the problem (1).
Proof. If Algorithm A stops at an iterate with and , it follows from (31) that By the definition of , we have , , , and hence, by (34), Since has full column rank, from (35) it is easy to get and . If one denotes , , then from (35)-(36) we can conclude that is a stationary point of (1) with the multiplier vector .
Lemma 8. If , then(i);(ii);(iii) is a descent direction of at the iterate , so the proposed algorithm is well defined.
Proof. (i) From (34), (30), (33), and (27), one gets
Furthermore, since , from (34), we can get . This implies that (i) holds.
(ii) If , from and (27), we have . From Remark 5, we can assume without loss of generality that . Furthermore, it follows from (34) and (23) that . Therefore, from the second formula of (31) and assumption (A1), it is easy to get
(iii) From (12) and the above results (i)-(ii), one knows that , which implies that is a descent direction of at . So the line search can be performed and the proposed algorithm is well defined.
3. Global Convergence
In this part, under mild assumptions, we show that Algorithm A is weakly globally convergent; that is, at least one accumulation point of the iterates generated by Algorithm A is a stationary point of (1). To this end, in addition to (A1), the following two assumptions are necessary.
(A2) The sequence generated by Algorithm A is bounded.
(A3) There exist two constants such that
The following lemma establishes the boundedness of the associated sequences generated by the algorithm, and its proof is similar to Lemma 3.1 in , so it is omitted here.
Lemma 9. Suppose that (A1)–(A3) hold. Then, the sequences , , , , , , , , and are all bounded.
Lemma 10. Suppose that (A1)–(A3) hold. If an infinite index set satisfies then is a stationary point of (1).
Proof. From Lemma 9, we know there exists an infinite subset such that Again, follows since is monotone and bounded. In view of Lemma 8 (i) and (A3), we have This along with (40)-(41) shows that So, passing to the limit for , , in (30), one gets Furthermore, it follows from (23) and (27) that , . So, and . If one denotes , , and , then (44) implies that is a stationary point with the multiplier vector .
Theorem 11. Suppose that (A1)–(A3) hold. Then Algorithm A either stops at a stationary point of (1) in a finite number of iterations or generates an infinite sequence , of which at least one accumulation point is a stationary point of (1). In this sense, Algorithm A is said to be weakly globally convergent.
Proof. The cases and are discussed, respectively.
Case A. If , from (14), Lemma 4 (ii), and (A2), we know that there exists a constant such that holds for all large enough. Therefore, in view of (A2) and (23), there exists an infinite index such that that is, , , . So, from (34) and the boundedness of , we have . It further follows that , . Thus, . So, according to Lemma 10 and , it is easy to conclude that is a stationary point of (1).
Case B. If , then we have for all large enough. Suppose that is any given accumulation point of . Since is a fixed and finite set, from Lemma 9, there always exists an infinite index set such that Suppose by contradiction that is not a stationary point. We first show by contradiction that the given infinite index set satisfies , . Otherwise, since and , , there exists a constant such that
Consequently, we prove that the line search inequality (29) holds for all large enough and small enough. We denote It follows from the boundedness of and Taylor expansion that Denote . For , according to (18) in Lemma 4 (ii), it is easy to know that there exists a constant such that for all . From Lemma 8 (ii), one has for large enough and small enough For , one has for large enough and small enough .
For , from the POP and Lemma 4 (ii), we can obtain . So, holds for large enough and small enough.
Summarizing the analysis above, we know that there exists a constant such that the step size . Furthermore, for , . So, . Combining (29), we have Since the whole sequence is decreasing and , we know that . So, passing to the limit for and in inequality (51), one has , which contradicts the fact that , , , and . The contradiction shows that (40) holds. Furthermore, one can conclude from Lemma 10 that is a stationary point of (1), which contradicts the assumption. The whole proof is completed.
4. Numerical Results
In this section, some preliminary numerical tests on 5 typical problems from  are reported, and the computational results show that Algorithm A is efficient. All the numerical experiments were implemented in MATLAB 7.0 under Windows XP on a 2.2 GHz CPU. The BFGS formula with Powell’s modification  is adopted in the algorithm, and is the identity matrix. The parameters were selected as , , and . In addition, execution is terminated if one of the following termination criteria is satisfied:(a),(b).
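Powell's modification of the BFGS formula is commonly stated as a damped update that keeps the matrix positive definite even when the curvature condition fails. The sketch below implements that standard form with Powell's usual constants 0.2 and 0.8. How the paper forms the vectors `s` (step) and `y` (gradient difference) is not shown in the text above, so both are simply taken as inputs, and the function name is an assumption.

```python
def damped_bfgs_update(H, s, y):
    """Damped BFGS update of a symmetric positive definite matrix H.

    Requires s != 0. If s'y is too small relative to s'Hs, y is replaced
    by a convex combination r of y and Hs so that s'r = 0.2 * s'Hs > 0.
    """
    n = len(s)
    Hs = [sum(H[i][j] * s[j] for j in range(n)) for i in range(n)]
    sHs = sum(si * hi for si, hi in zip(s, Hs))
    sy = sum(si * yi for si, yi in zip(s, y))
    theta = 1.0 if sy >= 0.2 * sHs else 0.8 * sHs / (sHs - sy)
    r = [theta * yi + (1.0 - theta) * hi for yi, hi in zip(y, Hs)]
    sr = sum(si * ri for si, ri in zip(s, r))
    return [
        [H[i][j] - Hs[i] * Hs[j] / sHs + r[i] * r[j] / sr for j in range(n)]
        for i in range(n)
    ]

# Start from the identity and use a pair (s, y) with negative curvature s'y < 0,
# for which the undamped BFGS update would lose positive definiteness.
H0 = [[1.0, 0.0], [0.0, 1.0]]
H1 = damped_bfgs_update(H0, s=[1.0, 0.0], y=[-1.0, 0.5])
print(H1)
```

When the curvature condition already holds, `theta` equals 1 and the routine reduces to the ordinary BFGS update.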
The computational results are reported in Table 1, and the columns of Table 1 have the following meanings: IP: the initial point; : the number of variables; : the number of functions ; ALG: the type of algorithm; NI: the number of iterations. “Algo A” represents Algorithm A in this paper, “J2006-1” and “J2006-2” represent the algorithms in , and “Hu2009” represents the algorithm in .
From Table 1, we can see that our algorithm finds the solutions of the test problems in a small number of iterations, and the computational results illustrate that it performs well on these problems. The numerical results are comparable to those of the algorithms in [13, 15]. Furthermore, only two systems of linear equations with the same coefficient matrix need to be solved per iteration; since the factorization of that matrix can be shared, the computational cost per iteration of Algorithm A is typically low. This shows the potential advantage of our algorithm when applied to problems with large numbers of constraints. In addition, the proposed algorithm has few parameters and is numerically stable.
5. Concluding Remarks
In this paper, a QP-free algorithm that solves no QP or QCQP subproblems is presented for unconstrained nonlinear finite minimax problems. At each iteration, only two systems of linear equations with the same coefficient matrix need to be solved. The proposed algorithm needs neither penalty parameters nor barrier parameters, which are difficult to deal with. Furthermore, under some mild assumptions, global convergence is attained. Some questions remain for further work. For example, assumption (A1) differs from the linear independence assumption in common use, and neither implies the other. Whether assumption (A1) is a constraint qualification needs further consideration. In addition, the algorithm should be improved to achieve superlinear convergence and generalized to minimax problems with inequality constraints.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This research was supported by the National Natural Science Foundation of China (11271086), the Guangxi Natural Science Foundation (2013GXNSFBA019017, 2012GXNSFAA053007), and Science Foundation of Guangxi Education Department (2013YB080, ZD2014107).