## Machine Learning and its Applications in Image Restoration

Research Article | Open Access

Shengwei Yao, Yuping Wu, Jielan Yang, Jieqiong Xu, "A Three-Term Gradient Descent Method with Subspace Techniques", Mathematical Problems in Engineering, vol. 2021, Article ID 8867309, 7 pages, 2021. https://doi.org/10.1155/2021/8867309

# A Three-Term Gradient Descent Method with Subspace Techniques

Revised: 21 Sep 2020
Accepted: 18 Dec 2020
Published: 07 Jan 2021

#### Abstract

We propose a three-term gradient descent method that is well suited to unconstrained optimization problems. The search direction of the method is generated in a specific subspace: a quadratic approximation model is minimized in the process of generating the search direction. In order to reduce the amount of calculation and make the best use of existing information, the subspace is spanned by the gradients at the current and prior iteration points together with the previous search direction. By using this subspace-based optimization technique, the global convergence result is established under the Wolfe line search. The results of numerical experiments show that the new method is effective and robust.

#### 1. Introduction

Gradient descent and conjugate gradient (CG) methods have profound significance for unconstrained optimization problems:

$$\min_{x\in\mathbb{R}^{n}} f(x), \qquad (1)$$

where $f:\mathbb{R}^{n}\to\mathbb{R}$ is a continuously differentiable function. They are widely used because of their low storage requirements and strong convergence properties. Starting from an initial iteration point $x_0$, in each iteration the method produces an approximate solution sequence for (1):

$$x_{k+1}=x_k+\alpha_k d_k, \qquad (2)$$

in which $x_k$ is the current iteration point, $\alpha_k>0$ is the step length, and $d_k$ is the search direction, which for typical CG methods has the form

$$d_k=\begin{cases}-g_k, & k=0,\\ -g_k+\beta_k d_{k-1}, & k\ge 1,\end{cases} \qquad (3)$$

where $g_k=\nabla f(x_k)$ is the gradient of $f$ at $x_k$ and $\beta_k$ is the CG parameter. The step size $\alpha_k$ is usually determined by performing a specific line search, among which the Wolfe line search [1, 2] is one of the most commonly used:

$$f(x_k+\alpha_k d_k)\le f(x_k)+\delta\alpha_k g_k^{T}d_k \qquad (4)$$

and

$$g(x_k+\alpha_k d_k)^{T}d_k\ge\sigma g_k^{T}d_k, \qquad (5)$$

where $0<\delta<\sigma<1$. Sometimes, the strong Wolfe line search, given by (4) and

$$\left|g(x_k+\alpha_k d_k)^{T}d_k\right|\le-\sigma g_k^{T}d_k, \qquad (6)$$

is also widely used in CG methods for the establishment of convergence results. Different choices of $\beta_k$ yield different versions of the CG method. The research results on conjugate gradient algorithms are very rich, including the PRP [3], HS [4], LS [5], and DY [6] methods. For a detailed survey of conjugate gradient methods, refer to [7–11].
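The CG iteration (2)–(3) with a Wolfe line search can be sketched as follows. This is an illustrative toy implementation, not the paper's code: it uses the PRP parameter, a bisection-based weak Wolfe search, a steepest-descent restart safeguard, and a simple 2-D convex quadratic test function; all names here are hypothetical.

```python
# Illustrative sketch (not the paper's code): a PRP conjugate gradient
# iteration (2)-(3) with a bisection-based weak Wolfe line search, applied
# to a simple 2-D convex quadratic.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def axpy(a, x, y):
    """Return a*x + y for list-based vectors."""
    return [a * xi + yi for xi, yi in zip(x, y)]

def f(x):  # f(x) = 2*x0^2 + x1^2, a strictly convex quadratic
    return 2 * x[0] ** 2 + x[1] ** 2

def grad(x):
    return [4 * x[0], 2 * x[1]]

def wolfe_step(x, d, delta=1e-4, sigma=0.9):
    """Bisection search for a step satisfying the weak Wolfe conditions (4)-(5)."""
    lo, hi, alpha = 0.0, None, 1.0
    fx, gd = f(x), dot(grad(x), d)
    for _ in range(60):
        xn = axpy(alpha, d, x)
        if f(xn) > fx + delta * alpha * gd:        # sufficient decrease (4) fails
            hi = alpha
        elif dot(grad(xn), d) < sigma * gd:        # curvature condition (5) fails
            lo = alpha
        else:
            return alpha
        alpha = (lo + hi) / 2 if hi is not None else 2 * alpha
    return alpha

def prp_cg(x, tol=1e-8, max_iter=200):
    g = grad(x)
    d = [-gi for gi in g]
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 <= tol:
            break
        alpha = wolfe_step(x, d)
        x = axpy(alpha, d, x)
        g_new = grad(x)
        y = [gn - gi for gn, gi in zip(g_new, g)]
        beta = dot(g_new, y) / dot(g, g)           # PRP parameter
        d = axpy(beta, d, [-gi for gi in g_new])   # d = -g_new + beta*d
        if dot(g_new, d) >= 0:                     # safeguard: restart if not descent
            d = [-gi for gi in g_new]
        g = g_new
    return x

print(prp_cg([3.0, -2.0]))  # converges to the minimizer (0, 0)
```

The restart safeguard illustrates why descent properties such as those proved in Section 4 matter: without it, the raw PRP direction under a weak Wolfe step can fail to be a descent direction even on a quadratic.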

Recently, subspace techniques have attracted more and more researchers' attention. Various subspace techniques have been applied to construct different approaches to various optimization problems; for a detailed description of subspace techniques, refer to [12–14]. In [15], Yuan and Stoer came up with the SMCG method by using a subspace technique in the conjugate gradient method. Specifically, a two-dimensional subspace $\Omega_k=\operatorname{Span}\{g_k, s_{k-1}\}$ is used in [15] to determine the search direction, that is,

$$d_k=\mu_k g_k+\nu_k s_{k-1}, \qquad (7)$$

where $\mu_k$ and $\nu_k$ are parameters and $s_{k-1}=x_k-x_{k-1}$. Li et al. [16] and Wang et al. [17] obtained the values of $\mu_k$ and $\nu_k$ by minimizing a dynamically selected conic approximation model and a tensor model, respectively. In order to solve nonlinear optimization problems, Conn et al. [18] brought up a class of iterated-subspace minimization methods. Following Yuan's idea [15], Yang et al. [19] raised a new CG method on the same subspace $\Omega_k$.
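The subspace idea behind (7) can be illustrated concretely: restricting the quadratic model $g^{T}d+\frac{1}{2}d^{T}Bd$ to $\operatorname{Span}\{g_k, s_{k-1}\}$ and minimizing over the two coefficients reduces the problem to a 2×2 linear system. The explicit SPD matrix $B$ below is chosen purely for illustration; SMCG-type methods estimate products such as $g^{T}Bg$ rather than forming $B$.

```python
# Sketch of the two-dimensional subspace idea in (7): minimize the quadratic
# model m(d) = g^T d + (1/2) d^T B d over Span{g, s}, which reduces to a
# 2x2 linear system in the coefficients (mu, nu).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(B, v):
    return [dot(row, v) for row in B]

def subspace_direction(g, s, B):
    Bg, Bs = matvec(B, g), matvec(B, s)
    # Reduced system: [[g^T B g, g^T B s], [s^T B g, s^T B s]] [mu, nu]^T = -[g^T g, s^T g]^T
    a11, a12, a22 = dot(g, Bg), dot(g, Bs), dot(s, Bs)
    b1, b2 = -dot(g, g), -dot(s, g)
    det = a11 * a22 - a12 * a12
    mu = (b1 * a22 - a12 * b2) / det   # Cramer's rule
    nu = (a11 * b2 - a12 * b1) / det
    return [mu * gi + nu * si for gi, si in zip(g, s)]

B = [[4.0, 0.0], [0.0, 2.0]]  # Hessian of f(x) = 2*x0^2 + x1^2
g = [12.0, -4.0]              # gradient of f at x = (3, -2)
s = [1.0, 1.0]                # a hypothetical previous step
d = subspace_direction(g, s, B)
print(d, dot(g, d) < 0)       # → [-3.0, 2.0] True: a descent direction
```

Note that with the exact Hessian, the subspace minimizer here coincides with the Newton step $-B^{-1}g=(-3,2)$, which happens to lie in the chosen subspace.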

For better convergence and robust numerical performance, some authors have paid attention to conjugate gradient methods with different forms. Zhang et al. [20] presented a three-term conjugate gradient method that possesses the sufficient descent property without any line search; they further extended it to two variant methods and established global convergence results for general functions under the standard Wolfe line search. In [21], Narushima et al. proposed a specific three-term CG method, drawing on the idea of the multistep quasi-Newton method. Deng and Wan [22] put forward a three-term CG algorithm in which the direction combines the current gradient with information from the previous iteration. For some other three-term conjugate gradient methods, refer to [23–27].

The outline of the article is as follows. In Section 2, we use the technique of subspace minimization to derive the search direction on the subspace spanned by $g_k$, $g_{k-1}$, and $d_{k-1}$. We elaborate the proposed algorithm in Section 3. Section 4 provides the convergence analysis of the given algorithm under suitable conditions. In Section 5, numerical experiments comparing the method with other methods are presented.

#### 2. Derivation of the Search Direction

Different from the abovementioned two-dimensional subspace, in this paper the subspace

$$\mathcal{S}_k=\operatorname{Span}\{g_k,\, g_{k-1},\, d_{k-1}\} \qquad (8)$$

is used to construct the search direction. In order to simplify the calculation, at each iteration point $x_k$ the following quadratic model is used to approximate the function $f$:

$$\min_{d\in\mathcal{S}_k}\; m_k(d)=g_k^{T}d+\frac{1}{2}d^{T}B_k d, \qquad (9)$$

where $B_k$ can be viewed as an approximation of the Hessian matrix. Moreover, $B_k$ is supposed to be positive definite and to satisfy the quasi-Newton equation $B_k s_{k-1}=y_{k-1}$, where $s_{k-1}=x_k-x_{k-1}$ and $y_{k-1}=g_k-g_{k-1}$. Model (9) has two advantages: it is not only a good approximation of $f$ near $x_k$ but also convenient to minimize over the subspace $\mathcal{S}_k$. Since the generating vectors are usually, but not always, linearly independent, we discuss the cases of dimension 3 and dimension 2.

Case I: $\dim(\mathcal{S}_k)=3$. In this case, the direction has the form

$$d_k=\mu_k g_k+\nu_k g_{k-1}+\omega_k d_{k-1}, \qquad (10)$$

where $\mu_k$, $\nu_k$, and $\omega_k$ are parameters to be calculated. Substituting (10) into the minimizing problem (9) turns it into a minimization over the coefficients, problem (11); setting the gradient of (11) to zero shows that the coefficients solve the linear system (12). To solve system (12), curvature quantities of the form $u^{T}B_k v$ must be estimated. Inspired by BBCG [28], we adopt the estimate (13). Next, we choose the BFGS update [29–32] initialized with the identity matrix to compute $B_k$, as in (14). This choice not only retains useful information but also gives numerical results that perform better than a scaling matrix. Substituting (13) and (14) into the linear system (12), we get system (15). Combining (13), the determinant of system (15) is calculated as in (16), and the coefficients are then obtained as in (17).

Case II: $\dim(\mathcal{S}_k)=2$. Numerical experiments indicate that the two-dimensional subspace $\operatorname{Span}\{g_k, d_{k-1}\}$ produces better numerical results in this case, so we choose it and express the direction as

$$d_k=\mu_k g_k+\nu_k d_{k-1}. \qquad (18)$$

Substituting (18) into (9) gives the two-variable problem (19), whose solution is expressed in (20); the resulting direction is given in (21). Under an exact line search, i.e., $g_k^{T}d_{k-1}=0$, it follows from (17) and (21) that the search direction reduces to the form (22), which means that the obtained method degrades to the HS method of Hestenes and Stiefel [4].
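As a hedged sketch of Case I: with basis $V=[g_k \mid g_{k-1} \mid d_{k-1}]$, minimizing the quadratic model over $\operatorname{Span}(V)$ reduces to the 3×3 system $(V^{T}B V)z=-V^{T}g_k$ with $d_k=Vz$. Here an explicit SPD matrix stands in for $B_k$, whereas the paper estimates the curvature products as in (13)–(14); all names are hypothetical.

```python
# Hedged sketch of Case I (dim = 3): with basis V = [g_k, g_{k-1}, d_{k-1}],
# minimizing m(d) = g^T d + (1/2) d^T B d over Span(V) reduces to the 3x3
# system (V^T B V) z = -V^T g, and d = V z.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(B, v):
    return [dot(row, v) for row in B]

def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, 3):
            t = M[r][c] / M[c][c]
            M[r] = [mr - t * mc for mr, mc in zip(M[r], M[c])]
    z = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        z[r] = (M[r][3] - sum(M[r][c] * z[c] for c in range(r + 1, 3))) / M[r][r]
    return z

def three_term_direction(g, g_prev, d_prev, B):
    V = [g, g_prev, d_prev]                           # subspace basis
    BV = [matvec(B, v) for v in V]
    A = [[dot(vi, Bvj) for Bvj in BV] for vi in V]    # V^T B V
    rhs = [-dot(v, g) for v in V]                     # -V^T g
    z = solve3(A, rhs)
    return [sum(zi * vi[j] for zi, vi in zip(z, V)) for j in range(len(g))]

B = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 5.0]]  # illustrative SPD "Hessian"
g = [3.0, -1.0, 2.0]
g_prev = [1.0, 2.0, 0.0]
d_prev = [0.0, 1.0, 1.0]
d = three_term_direction(g, g_prev, d_prev, B)
print(dot(g, d) < 0)  # the subspace minimizer is a descent direction: True
```

In this toy example the three basis vectors happen to span the whole space, so the subspace minimizer coincides with the Newton step $-B^{-1}g$; in general the subspace step is only the best direction within $\mathcal{S}_k$.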

#### 3. The Proposed TCGS Algorithm

This section elaborates the three-term CG method with subspace techniques (TCGS), in which the acceleration scheme of [33] is used (Algorithm 1).

Step 1: Given $x_0$ and $\varepsilon>0$, set $d_0=-g_0$ and $k=0$.
Step 2: If $\|g_k\|\le\varepsilon$ holds, then stop; else, go to Step 3.
Step 3: Determine the step size $\alpha_k$ by the Wolfe line search, which means that conditions (4) and (5) hold.
Step 4: Generate $x_{k+1}$ by using the acceleration scheme [33], and compute $g_{k+1}$, $s_k$, and $y_k$.
Step 5: If the subspace dimension is 2, compute $d_{k+1}$ by (21), and go to Step 6. If the subspace dimension is 3, compute $d_{k+1}$ by (10) and (17), and go to Step 6.
Step 6: Set $k=k+1$, and go to Step 2.

#### 4. Convergence Analysis

The main content of this section is to study the convergence properties of the TCGS algorithm. Some necessary assumptions for the objective function are given as follows.

Assumption 1. The level set $\mathcal{L}_0=\{x\in\mathbb{R}^{n}: f(x)\le f(x_0)\}$ is bounded.

Assumption 2. Suppose that $f$ is continuously differentiable and its gradient is Lipschitz continuous with constant $L>0$; that is,

$$\|g(x)-g(y)\|\le L\|x-y\|,\quad \forall x,y\in\mathbb{R}^{n}. \qquad (23)$$

Based on the above assumptions, it can be shown that there exists a constant $\Gamma>0$ such that

$$\|g(x)\|\le\Gamma \qquad (24)$$

for all $x$ in the level set. It is well known that, for an optimization algorithm, the properties of the search direction are very important for the convergence of the algorithm. First, we study some properties of the search direction generated by the TCGS algorithm. The following lemmas show that the direction is a descent direction and that the Dai–Liao conjugacy condition is satisfied.

Lemma 1. Suppose that $g_k\ne 0$. Then, the direction $d_k$ generated by the TCGS algorithm is a descent direction.

Proof. Since $d_k$ minimizes the model (9) over a subspace that contains $-tg_k$ for all $t>0$ and $g_k\ne 0$, we get $m_k(d_k)<m_k(0)=0$. Since $B_k$ is positive definite and $d_k$ is generated by the TCGS algorithm, $d_k^{T}B_k d_k>0$. Furthermore, $g_k^{T}d_k=m_k(d_k)-\frac{1}{2}d_k^{T}B_k d_k<0$, so $d_k$ is a descent direction.

Lemma 2. The search direction $d_k$ satisfies the Dai–Liao conjugacy condition with a suitable scalar $t$; namely, $d_k^{T}y_{k-1}=-t\,g_k^{T}s_{k-1}$.

Lemma 3. Suppose that $d_k$ is generated by the proposed TCGS algorithm and the step size $\alpha_k$ satisfies conditions (4) and (5); then

$$\alpha_k\ge\frac{1-\sigma}{L}\cdot\frac{\left|g_k^{T}d_k\right|}{\|d_k\|^{2}}.$$

Proof. Based on condition (5) and the Lipschitz condition (23), we get

$$(\sigma-1)\,g_k^{T}d_k\le\left(g_{k+1}-g_k\right)^{T}d_k\le L\alpha_k\|d_k\|^{2}.$$

Since $g_k^{T}d_k<0$, dividing by $L\|d_k\|^{2}$ proves the claim.

Lemma 4. Assume that Assumptions 1 and 2 hold. Consider the algorithm TCGS in which the Wolfe line search is used to compute the step size $\alpha_k$. Then, the Zoutendijk condition [34] holds:

$$\sum_{k\ge 0}\frac{\left(g_k^{T}d_k\right)^{2}}{\|d_k\|^{2}}<\infty. \qquad (28)$$

Proof. Combining condition (4) and Lemma 3, we have

$$f(x_k)-f(x_{k+1})\ge-\delta\alpha_k g_k^{T}d_k\ge\frac{\delta(1-\sigma)}{L}\cdot\frac{\left(g_k^{T}d_k\right)^{2}}{\|d_k\|^{2}}. \qquad (29)$$

Summing (29) over $k$ and using Assumption 1, which implies that $f$ is bounded below on the level set, deduces (28) directly.

Lemma 5. Suppose that the objective function satisfies Assumptions 1 and 2, and consider the algorithm TCGS in which the strong Wolfe line search (4) and (6) is used to compute the step size $\alpha_k$. If

$$\sum_{k\ge 0}\frac{1}{\|d_k\|^{2}}=\infty \qquad (30)$$

holds, then

$$\liminf_{k\to\infty}\|g_k\|=0. \qquad (31)$$

Theorem 1. Suppose that Assumptions 1 and 2 hold, and consider the sequence $\{x_k\}$ generated by the algorithm TCGS, where the step size $\alpha_k$ satisfies conditions (4) and (6). Let $f$ be uniformly convex; i.e., there exists a constant $\lambda>0$ such that

$$\left(g(x)-g(y)\right)^{T}(x-y)\ge\lambda\|x-y\|^{2},\quad \forall x,y\in\mathbb{R}^{n}. \qquad (32)$$

Then, (31) holds.

Proof. From (23) and (32), we get, respectively,

$$y_{k-1}^{T}s_{k-1}\le L\|s_{k-1}\|^{2}\quad\text{and}\quad y_{k-1}^{T}s_{k-1}\ge\lambda\|s_{k-1}\|^{2}. \qquad (33)$$

On the basis of (33), combined with the Cauchy–Schwarz inequality, it follows that (34) holds. Furthermore, applying the triangle inequality and (24), we obtain (35) and (36). Next, we analyze the convergence in two cases.

Case I: the subspace dimension is 3. The coefficients are computed by (16) and (17); using (24), (33), (35), and (36), we get the bound (37). Similarly, we obtain (38). Therefore, $\|d_k\|$ is bounded above, as in (39).

Case II: the subspace dimension is 2. Using (20), (24), and (33), we obtain (40); therefore, $\|d_k\|$ is bounded, as in (41).

In both cases, the boundedness of $\|d_k\|$ implies that condition (30) holds, and hence (31) follows from Lemma 5.

#### 5. Numerical Results

This section aims to assess the performance of the TCGS algorithm through numerical experiments and to verify its effectiveness on unconstrained problems. For this purpose, we compared the numerical performance of TCGS with the PRP [3] and MTHREECG [22] methods, in which MTHREECG has a structure similar to TCGS, and PRP is a classic and effective CG method.

In [22], Deng and Wan presented a similar three-term conjugate gradient method (MTHREECG).

We chose a total of 75 test functions, all of which come from [35]. For the numerical experiments, the dimension of each function ranges over 1000, 2000, …, 10000. The code is written in Fortran and is available at https://camo.ici.ro/neculai/THREECG/threecg.for. The default parameter values of the algorithm are consistent with [36].

The performance profile introduced by Dolan and Moré [37] is one of the most widely used tools for evaluating the performance of different methods. In this paper, we use it to investigate the numerical performance of the proposed TCGS algorithm.
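The profile can be sketched in a few lines: given a cost $t_{s,p}$ for each solver $s$ on problem $p$ (e.g., iteration counts), $\rho_s(\tau)$ is the fraction of problems on which solver $s$ is within a factor $\tau$ of the best solver. The data below are made up purely for illustration.

```python
# Minimal sketch of the Dolan-More performance profile: costs[s][p] holds the
# cost (e.g., iteration count) of solver s on problem p, and rho_s(tau) is the
# fraction of problems on which solver s is within a factor tau of the best.

def performance_profile(costs, taus):
    n_solvers, n_problems = len(costs), len(costs[0])
    best = [min(costs[s][p] for s in range(n_solvers)) for p in range(n_problems)]
    ratios = [[costs[s][p] / best[p] for p in range(n_problems)]
              for s in range(n_solvers)]
    return [[sum(r <= tau for r in ratios[s]) / n_problems for tau in taus]
            for s in range(n_solvers)]

# iteration counts of three hypothetical solvers on four problems
costs = [[10, 20, 15, 30],   # solver A
         [12, 18, 15, 60],   # solver B
         [25, 40, 45, 28]]   # solver C
print(performance_profile(costs, taus=[1.0, 2.0, 4.0]))
# → [[0.5, 1.0, 1.0], [0.5, 0.75, 1.0], [0.25, 0.25, 1.0]]
```

Reading the output: $\rho_s(1)$ is the share of problems on which solver $s$ is strictly best (or tied), and $\rho_s(\tau)\to 1$ indicates robustness as the tolerance factor grows.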

The performance profile for the number of iterations can be seen in Figure 1. It shows that the TCGS algorithm solves the largest share of the test problems with the fewest iterations, and as the factor $\tau$ increases, the TCGS method continues to outperform both the MTHREECG and PRP methods. Figure 2 presents the performance profile with respect to the number of function evaluations. The result is similar to that in Figure 1: TCGS again outperforms both the MTHREECG and PRP methods. From the numerical experiments, it can be seen that the proposed TCGS method is efficient and robust in dealing with a set of unconstrained test problems.

#### 6. Conclusion

By using the idea of subspace minimization, we propose a new type of subspace gradient descent method in this paper. In our method, the subspace is spanned by $g_k$, $g_{k-1}$, and $d_{k-1}$, and a quadratic approximation of the objective function is minimized over it to obtain the search direction; the direction is thus a linear combination of these three vectors, with coefficients computed by distinguishing two cases. In addition, we prove the descent property of the direction and show that the Dai–Liao conjugacy condition is satisfied. Under the Wolfe line search, the convergence of the proposed TCGS algorithm is established. The numerical results show that the performance of the TCGS algorithm on a set of unconstrained problems is competitive.

#### Data Availability

The data used to support the findings of this study are included within the article.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest.

#### Acknowledgments

This research was supported by the Guangxi Natural Science Foundation (nos. 2018GXNSFAA281340 and 2020GXNSFAA159014) and Program for Innovative Team of Guangxi University of Finance and Economics.

1. P. Wolfe, "Convergence conditions for ascent methods," SIAM Review, vol. 11, no. 2, pp. 226–235, 1969.
2. P. Wolfe, "Convergence conditions for ascent methods. II: some corrections," SIAM Review, vol. 13, no. 2, pp. 185–188, 1971.
3. B. T. Polyak, "The conjugate gradient method in extremal problems," USSR Computational Mathematics and Mathematical Physics, vol. 9, no. 4, pp. 94–112, 1969.
4. M. R. Hestenes and E. Stiefel, Methods of Conjugate Gradients for Solving Linear Systems, NBS, Washington, DC, USA, 1952.
5. Y. Liu and C. Storey, "Efficient generalized conjugate gradient algorithms, part 1: theory," Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129–137, 1991.
6. Y. H. Dai and Y. Yuan, "A nonlinear conjugate gradient method with a strong global convergence property," SIAM Journal on Optimization, vol. 10, no. 1, pp. 177–182, 1999.
7. Z. Wei, S. Yao, and L. Liu, "The convergence properties of some new conjugate gradient methods," Applied Mathematics and Computation, vol. 183, no. 2, pp. 1341–1350, 2006.
8. G. Yuan, Z. Meng, and Y. Li, "A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations," Journal of Optimization Theory and Applications, vol. 168, no. 1, pp. 129–152, 2016.
9. G. Yuan, Z. Wei, and X. Lu, "Global convergence of BFGS and PRP methods under a modified weak Wolfe-Powell line search," Applied Mathematical Modelling, vol. 47, pp. 811–825, 2017.
10. G. Yuan, Z. Wei, and Y. Yang, "The global convergence of the Polak-Ribière-Polyak conjugate gradient algorithm under inexact line search for nonconvex functions," Journal of Computational and Applied Mathematics, vol. 362, pp. 262–275, 2019.
11. W. W. Hager and H. Zhang, "A survey of nonlinear conjugate gradient methods," Pacific Journal of Optimization, vol. 2, no. 1, pp. 35–58, 2006.
12. Y.-X. Yuan, "Subspace techniques for nonlinear optimization," Series in Contemporary Applied Mathematics, Chinese Academy of Sciences, Beijing, China, 2007.
13. Y.-X. Yuan, "Subspace methods for large scale nonlinear equations and nonlinear least squares," Optimization and Engineering, vol. 10, no. 2, pp. 207–218, 2009.
14. Y. Yuan, "A review on subspace methods for nonlinear optimization," in Proceedings of the International Congress of Mathematics, pp. 807–827, Seoul, South Korea, August 2014.
15. Y.-X. Yuan and J. Stoer, "A subspace study on conjugate gradient algorithms," ZAMM-Journal of Applied Mathematics and Mechanics/Zeitschrift für Angewandte Mathematik und Mechanik, vol. 75, no. 1, pp. 69–77, 1995.
16. Y. Li, Z. Liu, and H. Liu, "A subspace minimization conjugate gradient method based on conic model for unconstrained optimization," Computational and Applied Mathematics, vol. 38, no. 1, p. 16, 2019.
17. T. Wang, Z. Liu, and H. Liu, "A new subspace minimization conjugate gradient method based on tensor model for unconstrained optimization," International Journal of Computer Mathematics, vol. 96, no. 10, pp. 1924–1942, 2019.
18. A. Conn, N. Gould, A. Sartenaer, and P. Toint, "On iterated-subspace minimization methods for nonlinear optimization," in Proceedings of the Linear and Nonlinear Conjugate Gradient-Related Methods, Seattle, WA, USA, January 1996.
19. Y. Yang, Y. Chen, and Y. Lu, "A subspace conjugate gradient algorithm for large-scale unconstrained optimization," Numerical Algorithms, vol. 76, no. 3, pp. 813–828, 2017.
20. L. Zhang, W. Zhou, and D. Li, "Some descent three-term conjugate gradient methods and their global convergence," Optimization Methods and Software, vol. 22, no. 4, pp. 697–711, 2007.
21. Y. Narushima, H. Yabe, and J. A. Ford, "A three-term conjugate gradient method with sufficient descent property for unconstrained optimization," SIAM Journal on Optimization, vol. 21, no. 1, pp. 212–230, 2011.
22. S. Deng and Z. Wan, "A three-term conjugate gradient algorithm for large-scale unconstrained optimization problems," Applied Numerical Mathematics, vol. 92, pp. 70–81, 2015.
23. N. Andrei, "A new three-term conjugate gradient algorithm for unconstrained optimization," Numerical Algorithms, vol. 68, no. 2, pp. 305–321, 2015.
24. S. Yao and L. Ning, "An adaptive three-term conjugate gradient method based on self-scaling memoryless BFGS matrix," Journal of Computational and Applied Mathematics, vol. 332, pp. 72–85, 2018.
25. H. Kobayashi, Y. Narushima, and H. Yabe, "Descent three-term conjugate gradient methods based on secant conditions for unconstrained optimization," Optimization Methods and Software, vol. 32, no. 6, pp. 1313–1329, 2017.
26. X. Y. Wang, S. J. Li, and X. P. Kou, "A self-adaptive three-term conjugate gradient method for monotone nonlinear equations with convex constraints," Calcolo, vol. 53, no. 2, pp. 133–145, 2016.
27. P. Gao, C. He, and Y. Liu, "An adaptive family of projection methods for constrained monotone nonlinear equations with applications," Applied Mathematics and Computation, vol. 359, pp. 1–16, 2019.
28. Y. Dai and C. Kou, "A Barzilai-Borwein conjugate gradient method," Science China Mathematics, vol. 59, no. 8, pp. 1511–1524, 2016.
29. C. G. Broyden, "The convergence of a class of double-rank minimization algorithms 1. General considerations," IMA Journal of Applied Mathematics, vol. 6, no. 1, pp. 76–90, 1970.
30. R. Fletcher, "A new approach to variable metric algorithms," The Computer Journal, vol. 13, no. 3, pp. 317–322, 1970.
31. D. Goldfarb, "A family of variable-metric methods derived by variational means," Mathematics of Computation, vol. 24, no. 109, p. 23, 1970.
32. D. F. Shanno, "Conditioning of quasi-Newton methods for function minimization," Mathematics of Computation, vol. 24, no. 111, p. 647, 1970.
33. N. Andrei, "Acceleration of conjugate gradient algorithms for unconstrained optimization," Applied Mathematics and Computation, vol. 213, no. 2, pp. 361–369, 2009.
34. G. Zoutendijk, "Nonlinear programming, computational methods," Integer and Nonlinear Programming, pp. 37–86, 1970.
35. N. Andrei, "An unconstrained optimization test functions collection," Advanced Modeling and Optimization, vol. 10, no. 1, pp. 147–161, 2008.
36. N. Andrei, "A simple three-term conjugate gradient algorithm for unconstrained optimization," Journal of Computational and Applied Mathematics, vol. 241, pp. 19–29, 2013.
37. E. D. Dolan and J. J. Moré, "Benchmarking optimization software with performance profiles," Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.
