A Simple Alternating Direction Method for the Conic Trust Region Subproblem
A simple alternating direction method is used to solve the conic trust region subproblem of unconstrained optimization. With the new method, the subproblem is solved in two steps, first along a descent direction and then along its orthogonal direction. This splits the original conic trust region subproblem into a one-dimensional subproblem and a low-dimensional quadratic model subproblem, both of which are very easy to solve. The global convergence of the method is then established under some reasonable conditions. Numerical experiments indicate that the new method is simple and effective.
In this paper, we consider the unconstrained optimization problem
$$\min_{x \in \mathbb{R}^n} f(x), \tag{1}$$
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. The trust region method is a very effective method for the unconstrained optimization problem (1) (see [1–6]). Traditional trust region methods are based on a quadratic model, and the corresponding quadratic programming subproblem at the $k$th iteration is
$$\min_{s \in \mathbb{R}^n} \; q_k(s) = g_k^T s + \tfrac{1}{2} s^T B_k s, \tag{2}$$
$$\text{s.t.} \; \|s\| \le \Delta_k, \tag{3}$$
where $x_k$ is the current iterate, $g_k = \nabla f(x_k)$, $B_k$ is symmetric and an approximation to the Hessian of $f$, $\|\cdot\|$ refers to the Euclidean norm, and $\Delta_k$ is the trust region radius at the $k$th iteration. Many methods can be used to solve the subproblem (2)-(3). Simple, low-cost, and effective ones are the dogleg methods (see [7–11]). We now recall the simple dogleg algorithm (Algorithm 1) for solving the trust region subproblem with the quadratic model.
Step 0. Input the data of the $k$th iteration, i.e., , and .
Step 1. Compute . If , then , and stop.
Step 2. Compute . If , then , and stop. Otherwise, go to Step 3.
Step 3. Compute then , where .
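The formulas of the steps above did not survive in this text, so the following is only a minimal sketch of the textbook dogleg step for the quadratic subproblem, in the spirit of Nocedal and Wright. It assumes that the matrix `B` is symmetric positive definite and that the gradient `g` is nonzero. The helper names (`dot`, `matvec`, `solve`, `dogleg`) are illustrative and are not taken from the paper.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(B, v):
    return [dot(row, v) for row in B]

def solve(B, rhs):
    """Solve B x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(B[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def dogleg(g, B, delta):
    """Approximate minimizer of g^T s + 0.5 s^T B s subject to ||s|| <= delta.

    Assumes B is symmetric positive definite and g != 0.
    """
    gnorm = math.sqrt(dot(g, g))
    # Cauchy point: minimizer of the model along the steepest descent direction.
    t = dot(g, g) / dot(g, matvec(B, g))
    s_c = [-t * gi for gi in g]
    if math.sqrt(dot(s_c, s_c)) >= delta:
        return [-delta / gnorm * gi for gi in g]
    # Newton point: unconstrained minimizer of the model.
    s_n = solve(B, [-gi for gi in g])
    if math.sqrt(dot(s_n, s_n)) <= delta:
        return s_n
    # Otherwise move from the Cauchy point toward the Newton point
    # until the trust region boundary is reached.
    d = [ni - ci for ni, ci in zip(s_n, s_c)]
    qa, qb, qc = dot(d, d), 2.0 * dot(s_c, d), dot(s_c, s_c) - delta ** 2
    lam = (-qb + math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
    return [ci + lam * di for ci, di in zip(s_c, d)]
```

For example, with `g = [1.0, 1.0]` and `B = [[2.0, 0.0], [0.0, 1.0]]`, a large radius returns the Newton point, while a small radius returns a scaled steepest descent step on the boundary.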
In 1980, Davidon first proposed the conic model as an alternative to the quadratic model. For optimization problems in which the objective function is strongly nonquadratic or its curvature changes severely, the conic model is better than the quadratic model in both data fitting and numerical results. In addition, the conic model supplies enough freedom to make the best use of both the gradient and the function value information at the iterates. Because of these good properties, the conic model has attracted wide attention from many scholars [14–28]. Ni proposed a new trust region subproblem and gave the optimality conditions for the trust region subproblems of a conic model. That is, at the $k$th iteration, the trial step is computed by solving the conic model trust region subproblem (5)-(7), where the horizon vector , is symmetric and positive semidefinite, and () is a sufficiently small positive number. We note that the conic model has a denominator and the shape of the trust region is irregular; therefore it is not easy to find a descent point for the conic trust region subproblem (5)-(7), and the subproblem is difficult to solve. The trust region method often does not require the exact solution of the trust region subproblem but only an approximate solution. The dogleg method for solving the trust region subproblem based on the conic model is an approximate solution method; however, its calculation is relatively complicated.
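To make the shape of the model concrete, the sketch below evaluates the conic model in the form most commonly used in the literature, $\phi(s) = \dfrac{g^T s}{1 - a^T s} + \dfrac{s^T B s}{2(1 - a^T s)^2}$, and checks feasibility under the assumed constraints $\|s\| \le \Delta$ and $|1 - a^T s| \ge \epsilon_0$. Both the exact form of the constraints and the names `conic_model`, `is_feasible`, and `eps0` are assumptions made for illustration; the subproblem's equations are not reproduced in this text.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conic_model(s, g, B, a):
    """Evaluate g^T s / (1 - a^T s) + s^T B s / (2 (1 - a^T s)^2)."""
    gamma = 1.0 - dot(a, s)
    if gamma == 0.0:
        raise ZeroDivisionError("s lies on the horizon 1 - a^T s = 0")
    Bs = [dot(row, s) for row in B]
    return dot(g, s) / gamma + 0.5 * dot(s, Bs) / gamma ** 2

def is_feasible(s, a, delta, eps0):
    """Assumed feasibility test: ||s|| <= delta and |1 - a^T s| >= eps0."""
    return math.sqrt(dot(s, s)) <= delta and abs(1.0 - dot(a, s)) >= eps0
```

With a zero horizon vector the denominator equals one and the function reduces to the quadratic model, which is a convenient sanity check.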
In this paper, we continue to study the subproblems (5)-(7). To find an easy way to solve them, and inspired by the alternating direction method of multipliers (ADMM), we consider obtaining an approximate solution by a two-step search along two orthogonal directions in the trust region . ADMM is an algorithm that solves convex optimization problems by breaking them into smaller pieces, each of which is then easier to handle. Because of its efficiency and easy implementation, it has recently found wide application in a number of areas (see [29–45]).
In the following, we use the alternating orthogonal direction search method to find the approximate solution of the subproblems (5)-(7). The rest of this paper is organized as follows. In the next section, the motivation and description of the simple alternating direction search method are presented. In Section 3, we give the quasi-Newton method based on the conic model for solving unconstrained optimization problems and prove its global convergence properties. The numerical results are provided in Section 4.
2. Range of and the Approximate Solution of the Subproblem
In this section, we will modify the range of and give the motivation and description of the algorithm. We note that the conic model has one more parameter than . Therefore, can make full use of the existing function information to satisfy more interpolation conditions by using the function values and the gradient values. All of this improves the effectiveness of the algorithm. In general, is chosen as a descent direction, such as , , or (see [13–17]). For convenience, we omit the index of , , and in this section. Therefore, in this paper we assume that and is positive definite (abbreviated as ).
Let From (8), we have and
Although in principle we are seeking the optimal solution of the subproblems (5)-(7), it is enough to find an approximate solution in the feasible region and guarantee a sufficient reduction in the model and the global convergence. Therefore, in order to simplify this algorithm we choose in (7) such that it satisfies
In the following, we consider the alternating direction search method to solve the subproblems (5)-(7) by making full use of the parameters . The new method is divided into two steps. First, we search in the direction of and then search in the direction which is perpendicular to .
By direct computation, the derivative of is where From (17), we know that and then . Since , has only one stationary point
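The uniqueness of the stationary point can be checked on a generic one-dimensional restriction of the conic model. The following is a sketch under stated assumptions, not the paper's own formulas: the step is parametrized as $s = \tau d$ for a fixed direction $d$, and the model is taken as $\phi(s) = g^T s/(1 - a^T s) + s^T B s/(2(1 - a^T s)^2)$. Writing $u = \tau/(1 - \tau a^T d)$ gives $\phi = (g^T d)\,u + \tfrac{1}{2}(d^T B d)\,u^2$, and since $du/d\tau = 1/(1 - \tau a^T d)^2 > 0$, the only stationary point is $\tau^* = -\,g^T d / (d^T B d - (a^T d)(g^T d))$ whenever the denominator is nonzero. The function names are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conic_along(tau, g, B, a, d):
    """Conic model restricted to the line s = tau * d (assumed parametrization)."""
    gd, ad = dot(g, d), dot(a, d)
    dBd = dot(d, [dot(row, d) for row in B])
    gamma = 1.0 - tau * ad
    return tau * gd / gamma + 0.5 * tau * tau * dBd / gamma ** 2

def stationary_tau(g, B, a, d):
    """Unique stationary point of tau -> conic_along(tau, ...), or None."""
    gd, ad = dot(g, d), dot(a, d)
    dBd = dot(d, [dot(row, d) for row in B])
    denom = dBd - ad * gd
    if denom == 0.0:
        return None  # no finite stationary point in this degenerate case
    return -gd / denom
```

When $d^T B d > 0$ the stationary point is a minimizer of the restricted model, with value $-(g^T d)^2 / (2\, d^T B d)$. Whether it lies inside the trust region has to be checked separately.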
By simple calculation, the following lemmas can be easily obtained.
Lemma 2. Suppose and . Then .
Lemma 3. Under the same conditions as Lemma 2, is monotonically increasing in the trust region ; is monotonically decreasing for and .
Proof. (1) If then from (16) we know that . From Lemmas 2 and 3, we can obtain that if then and if then . Therefore, .
(2) If , then . From Lemmas 2 and 3, we can similarly get .
(3) If , then . From Lemmas 2 and 3, we know that Because , from (17) we have . Since , then Then . The theorem is proved.
It is worth noting that if then from (17) we have . Therefore, in this case we set and exit the calculation of the subproblem. Otherwise, we know that is inside the trust region . Then, we carry out the calculation of the second stage below.
We set and substitute it into . Then the subproblems (5)-(7) become where In order to remove the equality constraint in (28), we use the null space technique. That is, for , there exist mutually orthogonal unit vectors orthogonal to the parameter vector . Set and , where . Then (27)-(28) can be simplified as the following subproblem: where Set , , and . By Algorithm 1, we can obtain the solution of the subproblems (30)-(31). Then and . Thus, the subproblems (5)-(7) are solved approximately.
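The null space technique needs a set of mutually orthogonal unit vectors that are all orthogonal to the parameter vector. One way to build them, shown here as a sketch and not necessarily the construction used by the authors, is a Householder reflection. The reflection maps the first coordinate axis onto the direction of the vector, so its remaining columns span the orthogonal complement. The function name `null_space_basis` is hypothetical, and the input vector is assumed to be nonzero.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def null_space_basis(a):
    """Return n-1 orthonormal vectors, each orthogonal to the nonzero vector a.

    Uses the Householder matrix H = I - 2 v v^T / (v^T v) with
    v = a/||a|| + sign(a_1) e_1. The first column of H is parallel to a,
    so columns 2..n of H form the required basis.
    """
    n = len(a)
    norm_a = math.sqrt(dot(a, a))
    u = [ai / norm_a for ai in a]
    sign = 1.0 if u[0] >= 0.0 else -1.0
    v = list(u)
    v[0] += sign  # this sign choice avoids cancellation
    vv = dot(v, v)
    basis = []
    for j in range(1, n):
        col = [(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv
               for i in range(n)]
        basis.append(col)
    return basis
```

Any step orthogonal to the parameter vector can then be written as a linear combination of the returned vectors, which turns the equality-constrained problem into an unconstrained-direction problem in one fewer dimension.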
Algorithm 5. Given , and ,
Step 1. If , then set . Then solve the subproblems (5)-(7) by Algorithm 1 to get , and stop.
Step 2. Compute and by (9) and (10). Let .
Step 3. Compute , and by (17) and (20).
Step 4. Compute by (24).
Step 5. If , then , and stop; otherwise, compute , , , and by (29) and (32).
Step 6. Set , and . Then solve the subproblems (30)-(31) by Algorithm 1 to get .
Step 7. Set and , and stop.
In order to discuss the lower bound of predicted reduction in each iteration, we define the following predicted reduction:
Now we should prove the following theorem to guarantee the global convergence of the algorithm proposed in the next section.
Proof. (1) If is obtained by Step 1 of Algorithm 5, then from Nocedal and Wright we have where .
(2) If is obtained by Step 5 of Algorithm 5, then and , where is as defined in (24). By computation, we have From (24), we know that if then and . Then from (17) and (20), we can obtain and Combining with (37)-(39), we know that where For , then holds obviously.
(3) If is obtained by Step 7 of Algorithm 5, then , where . From (24), we know that . Combining with (33) and (34), we have Because is obtained by Algorithm 1, we have where and , , and are defined by (29) and (32). Thus, where the second equality is from (20) and the last inequality is from (41).
Therefore, the theorem follows from (36), (40), and (44) with
3. The Algorithm and Its Convergence
In this section, we propose a quasi-Newton method with a conic model for unconstrained minimization and prove its convergence under some reasonable conditions. In order to solve the problem (1), we approximate with a conic model of the form where , , , and are parameter vectors.
We now give the simple alternating direction trust region algorithm (Algorithm 7) based on the conic model (46).
Step 0. Choose parameters , , , and ; give a starting point , , , and an initial trust region radius ; set .
Step 1. Compute and . If , and then stop with as the approximate optimal solution; otherwise go to Step 2.
Step 2. Set , , , and . Then solve the subproblem (5)-(7) by Algorithm 5 to get an approximate solution .
Step 3. Compute the ratio of the actual reduction to the predicted reduction where
Step 4. If , then set and go to Step 2. If , then set and choose the new trust region bound satisfying
Step 5. Generate and ; set , and go to Step 1.
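The acceptance test and radius update of a trust region iteration can be sketched as follows. The thresholds `eta1` and `eta2` and the factors `shrink` and `grow` are illustrative defaults, not the parameter values used by the authors, and the update rule is the generic textbook one, not the specific bound required in Step 4.

```python
def update_trust_region(f_old, f_new, pred, delta,
                        eta1=0.1, eta2=0.75, shrink=0.5, grow=2.0):
    """Generic trust region acceptance test and radius update.

    f_old, f_new: objective values at the current and trial points.
    pred: predicted reduction of the model (assumed positive).
    Returns (accepted, new_delta, rho).
    """
    if pred <= 0.0:
        raise ValueError("predicted reduction must be positive")
    rho = (f_old - f_new) / pred  # actual reduction over predicted reduction
    if rho <= eta1:
        # Poor agreement: reject the step and shrink the region.
        return False, shrink * delta, rho
    if rho >= eta2:
        # Very good agreement: accept the step and enlarge the region.
        return True, grow * delta, rho
    # Acceptable agreement: accept the step and keep the radius.
    return True, delta, rho
```

A rejected step corresponds to returning to the subproblem with a smaller radius, while an accepted step moves the iterate and updates the model parameters for the next iteration.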
The choice of the parameters in the conic model method is crucial. In general, and are chosen to satisfy certain interpolation conditions, which means that the conic model function interpolates both the function values and the gradient values of the objective function at and . For the choice of the parameters and , we refer to [13–17] and [47–49], respectively. In this paper, we do not study the specific iterative formulas of and in depth and directly adopt the choice of in and the choice of in .
Theorem 8. Under the same conditions as Lemma 2, suppose that the level set and the sequences , , and are all uniformly bounded, is symmetric and positive definite, and is twice continuously differentiable in . Then for any , Algorithm 7 terminates in a finite number of iterations, that is,
Proof. We give the proof by contradiction. Suppose that there is such that From the hypothesis, we have Combining with (51)-(55), we have where the first inequality follows from and the second inequality is from and From Step 4 of Algorithm 7 and (56), we obtain that for all Since is bounded from below and , then we have which implies that And then On the other hand, from (55) and (62) we can get Then from (55) to (63), we have where and . Combining with (56) and (65), we can get that From (62) we have . Hence, there is a sufficiently large positive number such that and holds. From Step 4 of Algorithm 7, it follows that which is a contradiction to (62). The theorem is proved.
4. Numerical Tests
In this section, Algorithm 7 is tested with some standard test problems from [16, 50]. The names of the 16 test problems are listed in Table 1. All the computations are carried out in Matlab R2016b on a microcomputer in double precision arithmetic. These tests use the same stopping criterion . The columns in the Tables have the following meanings: No. denotes the numbers of the test problems; is the dimension of the test problems; Iter is the number of iterations; is the number of function evaluations performed; is the number of gradient evaluations; is the final objective function value; is the Euclidean norm of the final gradient; CPU(s) denotes the total iteration time of the algorithm in seconds.
The parameters in these algorithms are
In order to analyze the effectiveness of our new algorithm, we compare Algorithm 7 with the alternating direction trust region method based on the conic model (abbreviated as ADCTR). The numerical results of ADCTR and Algorithm 7 are listed in Table 2. We note that the optimal value of these test problems is . From Table 2, we can see that Algorithm 7 is feasible and effective. For the above 16 problems, Algorithm 7 is better than ADCTR for 13 tests and somewhat worse for 4 tests, and the two algorithms have the same efficiency for the other 1 test. Therefore, it seems that Algorithm 7 is better than ADCTR.
ADCTR and Algorithm 7 are similar in that both use the idea of the alternating direction method to solve the conic trust region subproblem. However, Algorithm 7 takes into account the special property that the parameter vector is generally taken as a descent direction. Thus, under the assumption of , Algorithm 7 is simpler to compute, has shorter CPU time and better numerical results, and is also globally convergent.
However, many aspects remain worthy of further study, for example, weakening the positive definite condition of , applying the algorithm to large-scale problems, and analyzing the convergence rate.
Data Availability
All data generated or analysed during this study are included within the article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (11071117, 11771210) and the Natural Science Foundation of Jiangsu Province (BK20141409, BK20150420).
R. Schnabel, “Conic methods for unconstrained minimization and tensor methods for nonlinear equations,” Mathematical Programming the State of the Art, vol. 21, no. 1, pp. 417–438, 1982.
L. Zhao and W. Sun, “A conic affine scaling method for nonlinear optimization with bound constraints,” Asia-Pacific Journal of Operational Research, vol. 30, no. 3, pp. 1–30, 2013.
J. Eckstein and M. Fukushima, “Some reformulations and applications of the alternating direction method of multipliers,” in Large Scale Optimization, pp. 115–134, Springer, 1994.
C. H. Chen, B. S. He, Y. Ye et al., “The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent,” Mathematical Programming, vol. 155, no. 1-2, pp. 57–79, 2016.
M. L. Goncalves, J. G. Melo, and R. D. Monteiro, “Improved pointwise iteration-complexity of a regularized ADMM and of a regularized non-Euclidean HPE framework,” SIAM Journal on Optimization, 2016.
B. He and X. Yuan, “A class of ADMM-based algorithms for three-block separable convex programming,” Computational Optimization and Applications, vol. 70, no. 3, pp. 1–36, 2018.
Y. T. Sun and J. L. Zhao, “An alternating directions method for structured split feasibility problems,” Journal on Numerical Methods and Computer Applications, vol. 39, no. 1, pp. 20–27, 2018 (Chinese).
J. Nocedal and S. J. Wright, Numerical Optimization, Science Press, Beijing, China, 2006.
M. Al-Baali, “Damped techniques for enforcing convergence of quasi-Newton methods,” Optimization Methods and Software, vol. 29, no. 5, pp. 919–936, 2014.
Q. Ni, Optimization Method and Program Design, Science Press, Beijing, China, 2009 (Chinese).