Research Article  Open Access
A Derivative-Free Conjugate Gradient Method and Its Global Convergence for Solving Symmetric Nonlinear Equations
Abstract
We suggest a conjugate gradient (CG) method for solving symmetric systems of nonlinear equations without computing the Jacobian or the gradient, by exploiting the special structure of the underlying function. This derivative-free feature gives the proposed method an advantage in solving relatively large-scale problems (500,000 variables) with lower storage requirements than some existing methods. Under appropriate conditions, the global convergence of the method is established. Numerical results on some benchmark test problems show that the proposed method is practically effective.
1. Introduction
Let us consider the system of nonlinear equations
$$F(x) = 0, \qquad (1)$$
where $F : \mathbb{R}^n \to \mathbb{R}^n$ is a nonlinear mapping. Often, the mapping $F$ is assumed to satisfy the following assumptions:
(A1) There exists $x^* \in \mathbb{R}^n$ s.t. $F(x^*) = 0$.
(A2) $F$ is a continuously differentiable mapping in a neighborhood of $x^*$.
(A3) $F'(x^*)$ is invertible.
(A4) The Jacobian $F'(x)$ is symmetric.
The prominent method for finding the solution of (1) is the classical Newton's method, which generates a sequence of iterates $\{x_k\}$ from a given initial point $x_0$ via
$$x_{k+1} = x_k - (F'(x_k))^{-1} F(x_k), \qquad (2)$$
where $k = 0, 1, 2, \ldots$. The attractive features of this method are rapid convergence and easy implementation. Nevertheless, Newton's method requires the computation of the Jacobian matrix, which in turn requires the first-order derivatives of the system. In practice, computing the derivatives of some functions is quite costly, and sometimes they are not available or cannot be computed precisely. In this case Newton's method cannot be applied directly.
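To make the role of the Jacobian concrete, the following is a minimal sketch of the classical Newton iteration on a small system with a symmetric Jacobian. The test mapping `F`, its hand-coded `jacobian`, the starting point, and the tolerances are illustrative assumptions and are not taken from the experiments reported in this paper.

```python
import math

def F(x):
    """Toy symmetric system F(x) = A x + sin(x), A = [[3, 1], [1, 3]]; its root is x = 0."""
    return [3.0 * x[0] + x[1] + math.sin(x[0]),
            x[0] + 3.0 * x[1] + math.sin(x[1])]

def jacobian(x):
    """Exact Jacobian of F; it is symmetric, in the spirit of assumption (A4)."""
    return [[3.0 + math.cos(x[0]), 1.0],
            [1.0, 3.0 + math.cos(x[1])]]

def solve_2x2(J, b):
    """Solve J s = b for a 2-by-2 matrix J by Cramer's rule."""
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return [(b[0] * J[1][1] - J[0][1] * b[1]) / det,
            (J[0][0] * b[1] - J[1][0] * b[0]) / det]

def newton(x0, tol=1e-10, max_iter=20):
    """Newton iteration: solve J(x_k) s_k = -F(x_k), then set x_{k+1} = x_k + s_k."""
    x = list(x0)
    for k in range(max_iter):
        Fx = F(x)
        if math.hypot(Fx[0], Fx[1]) <= tol:
            return x, k
        # Each step needs the full Jacobian and a linear solve.
        s = solve_2x2(jacobian(x), [-Fx[0], -Fx[1]])
        x = [x[0] + s[0], x[1] + s[1]]
    return x, max_iter
```

Calling `newton([1.0, 0.5])` returns the approximate root together with the number of iterations used. Every step forms the Jacobian and solves a linear system, which is the cost a derivative-free method is designed to avoid.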
In this work, we are interested in handling large-scale problems for which the Jacobian either is not available or requires a large amount of storage; for such problems, the CG approach is a suitable choice. Conjugate gradient methods are among the most popular methods for unconstrained optimization problems. They are particularly efficient for handling large-scale problems due to their convergence properties, simple implementation, and low storage [1]. Nevertheless, studies of conjugate gradient methods for large-scale symmetric nonlinear systems of equations are scarce, and this is what motivated the present paper.
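In general, CG methods for solving nonlinear systems of equations generate iterates $\{x_k\}$ from a given initial point $x_0$ using
$$x_{k+1} = x_k + \alpha_k d_k, \qquad (3)$$
where $\alpha_k > 0$ is attained via line search and the direction $d_k$ is obtained using
$$d_k = \begin{cases} -F(x_k), & k = 0, \\ -F(x_k) + \beta_k d_{k-1}, & k \ge 1, \end{cases} \qquad (4)$$
where $\beta_k$ is termed the conjugate gradient parameter.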
The problems under study may arise from an unconstrained optimization problem, a saddle point problem, the Karush-Kuhn-Tucker (KKT) system of an equality constrained optimization problem, a discretized two-point boundary value problem, a discretized elliptic boundary value problem, and so forth.
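Equation (1) is the first-order necessary condition for the unconstrained optimization problem when $F$ is the gradient mapping of some function $f : \mathbb{R}^n \to \mathbb{R}$,
$$\min f(x), \quad x \in \mathbb{R}^n. \qquad (5)$$
For the equality constrained problem
$$\min f(z) \quad \text{s.t. } h(z) = 0, \qquad (6)$$
where $h$ is a vector-valued function, the KKT conditions can be represented as system (1) with $x = (z, \lambda)$, and
$$F(z, \lambda) = \begin{pmatrix} \nabla f(z) + \nabla h(z)\lambda \\ h(z) \end{pmatrix}, \qquad (7)$$
where $\lambda$ is the vector of Lagrange multipliers. Notice that the Jacobian $\nabla F(z, \lambda)$ is symmetric for all $(z, \lambda)$ (see, e.g., [2]).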
Problem (1) can be converted to the global optimization problem (5) with the function $f$ defined by
$$f(x) = \frac{1}{2}\|F(x)\|^2. \qquad (8)$$
A large number of efficient solvers for large-scale symmetric nonlinear equations have been proposed, analyzed, and tested in the last decade. Among them, the most classic one is due to Li and Fukushima [3], who developed a Gauss-Newton-based BFGS method and established its global and superlinear convergence. Its performance was further improved by Gu et al. [4], who designed norm descent BFGS methods. Since then, norm descent type BFGS methods, especially in combination with a trust region strategy, have been presented in the literature and have shown moderate effectiveness experimentally [5]. Still, the BFGS type methods presented in the literature require matrix storage and the solution of linear systems of equations. The recently designed nonmonotone spectral gradient algorithm [6] falls within the framework of matrix-free methods.
Conjugate gradient methods for symmetric nonlinear equations have received considerable attention and made good progress. Li and Wang [7] proposed a modified Fletcher-Reeves conjugate gradient method based on the work of Zhang et al. [8], and their results illustrate that the method is promising. In line with this development, further studies on conjugate gradient methods for solving large-scale symmetric nonlinear equations were inspired. Zhou and Shen [9] extended the descent three-term Polak-Ribière-Polyak method of Zhang et al. [10] for solving (1) by combining it with the work of Li and Fukushima [3]. Meanwhile, the classic Polak-Ribière-Polyak method was successfully used to solve the symmetric equation (1) by Zhou and Shen [1].
Subsequently, Xiao et al. [11] proposed a method based on the well-known conjugate gradient method of Hager and Zhang [12], and the proposed method converges globally. Extensive numerical experiments showed that each of the above-mentioned methods performs quite well. The combination of conjugate gradient algorithms and the Newton method was first presented by Andrei [13, 14]. Hence, in this paper, we present a new enhanced CG parameter which is both matrix-free and derivative-free. This is made possible by combining the Birgin and Martínez direction with the classical Newton direction.
We organized the paper as follows. In the next section, we present the details of the proposed method. Convergence results are presented in Section 3. Some numerical results are reported in Section 4. Finally, conclusions are made in Section 5.
2. Derivation of the Method
In this section we present a new CG parameter $\beta_k$, obtained by combining the Birgin and Martínez direction with the classical Newton direction. Recall that the Birgin and Martínez direction in [15] is defined by
$$d_{k+1} = -\theta_k g_{k+1} + \beta_k s_k, \qquad (9)$$
where $\theta_k = s_k^T s_k / s_k^T y_k$; see Raydan [16] for details.
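In [2] Ortega and Rheinboldt used the term
$$g_k = \frac{F(x_k + \alpha_k F_k) - F_k}{\alpha_k} \qquad (10)$$
to approximate the gradient $\nabla f(x_k)$, which avoids computing the exact gradient; here $F_k = F(x_k)$ and $\alpha_k$ is updated via the line search method. It is clear that when $\alpha_k \|F_k\|$ is small, then
$$g_k \approx J_k F_k,$$
where $J_k$ denotes the Jacobian of $F$ at $x_k$.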
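The quality of this kind of gradient approximation can be checked numerically. The sketch below assumes the directional-difference form $(F(x + \alpha F(x)) - F(x))/\alpha$ and compares it with the exact gradient $J(x)F(x)$ of $f(x) = \frac{1}{2}\|F(x)\|^2$ on a toy symmetric system. The mapping `F`, the evaluation point, and the values of `alpha` are assumptions chosen only for illustration.

```python
import math

def F(x):
    """Toy symmetric system F(x) = A x + sin(x), A = [[3, 1], [1, 3]] (an assumption)."""
    return [3.0 * x[0] + x[1] + math.sin(x[0]),
            x[0] + 3.0 * x[1] + math.sin(x[1])]

def exact_gradient(x):
    """grad f(x) = J(x)^T F(x) = J(x) F(x), since the Jacobian J is symmetric."""
    Fx = F(x)
    return [(3.0 + math.cos(x[0])) * Fx[0] + Fx[1],
            Fx[0] + (3.0 + math.cos(x[1])) * Fx[1]]

def directional_difference(x, alpha):
    """Jacobian-free approximation (F(x + alpha*F(x)) - F(x)) / alpha."""
    Fx = F(x)
    shifted = [xi + alpha * fi for xi, fi in zip(x, Fx)]
    return [(a - b) / alpha for a, b in zip(F(shifted), Fx)]

def max_error(x, alpha):
    """Largest componentwise gap between the approximation and the exact gradient."""
    approx = directional_difference(x, alpha)
    exact = exact_gradient(x)
    return max(abs(a - e) for a, e in zip(approx, exact))
```

Evaluating `max_error([1.0, 0.5], alpha)` for decreasing `alpha` shows the gap shrinking roughly in proportion to `alpha`, at the cost of one extra evaluation of `F` and no Jacobian storage.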
Recall, from Newton’s direction,Combining (9) and (11), we haveMultiplying both sides of (12) by leads toAfter little linear algebra, (13) transforms toTo ensure good approximation, we multiply both sides of (14) by to obtainEquation (15) can be rewritten asFrom secant condition, we haveIt is vital to note that, for this work, we claim that is symmetric matrix . Hence, (18) can also be written asSubstituting (19) into (16) yieldsAfter simplification, we obtained our CG parameter asMotivated by [3, 7] and using (10) we derive our CG parameterHaving derived the CG parameter in (22) and by using (9), we then present our direction aswhere .
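Finally, we present our scheme as
$$x_{k+1} = x_k + \alpha_k d_k. \qquad (24)$$
The direction $d_k$ given by (23) may not be a descent direction of (8), so the standard Wolfe and Armijo line searches cannot be used to compute the step size directly. Zhang et al. [8] proved the global convergence of the PRP method for general nonconvex optimization using a nondescent line search. Motivated by this, we use the nonmonotone line search proposed by Li and Fukushima in [3] to compute our step size $\alpha_k$. Let $\omega_1 > 0$, $\omega_2 > 0$, and $r \in (0, 1)$ be constants and let $\{\eta_k\}$ be a given positive sequence such that
$$\sum_{k=0}^{\infty} \eta_k < \infty. \qquad (25)$$
Let $\alpha_k = \max\{1, r, r^2, \ldots\}$ satisfy
$$f(x_k + \alpha_k d_k) - f(x_k) \le -\omega_1 \|\alpha_k F(x_k)\|^2 - \omega_2 \|\alpha_k d_k\|^2 + \eta_k f(x_k). \qquad (26)$$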
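A backtracking search of this nonmonotone type can be sketched as follows. The acceptance test used here, $f(x + \alpha d) - f(x) \le -\omega_1\|\alpha F(x)\|^2 - \omega_2\|\alpha d\|^2 + \eta f(x)$, is an assumption based on the Li and Fukushima line search [3]. The constants, the initial trial step of 1, the backtracking cap, and the toy mapping `F` are illustrative choices as well.

```python
import math

def F(x):
    """Toy symmetric system F(x) = A x + sin(x), A = [[3, 1], [1, 3]] (an assumption)."""
    return [3.0 * x[0] + x[1] + math.sin(x[0]),
            x[0] + 3.0 * x[1] + math.sin(x[1])]

def sq_norm(v):
    return sum(t * t for t in v)

def merit(x):
    """Merit function f(x) = 0.5 * ||F(x)||^2."""
    return 0.5 * sq_norm(F(x))

def accepts(x, d, alpha, eta, omega1=1e-4, omega2=1e-4):
    """Assumed acceptance test of the nonmonotone line search for the trial step alpha."""
    trial = [xi + alpha * di for xi, di in zip(x, d)]
    rhs = (-omega1 * alpha ** 2 * sq_norm(F(x))
           - omega2 * alpha ** 2 * sq_norm(d)
           + eta * merit(x))
    return merit(trial) - merit(x) <= rhs

def nonmonotone_step(x, d, eta, r=0.5, max_backtracks=60):
    """Return the first alpha in {1, r, r^2, ...} that passes the acceptance test."""
    alpha = 1.0
    for _ in range(max_backtracks):
        if accepts(x, d, alpha, eta):
            return alpha
        alpha *= r
    return alpha  # cap reached; the caller should treat this as a failed search
```

Because the term `eta * merit(x)` is positive whenever `F(x)` is nonzero, the search also terminates for a direction that is not a descent direction, for example `nonmonotone_step(x, F(x), eta=0.25)`. This is the reason a nonmonotone rule is preferred over a plain Armijo rule in this setting.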
Now, we can describe the algorithm for our proposed method as follows.
Algorithm 1 (derivative-free CG method (DFCG)).
Consider the following steps.
Step 1. Given , , , and compute , and set .
Step 2. Compute using (10) and test the stopping criterion. If yes, then stop; otherwise continue with Step 3.
Step 3. Compute by the line search (26).
Step 4. Compute .
Step 5. Compute the search direction as .
Step 6. Set $k = k + 1$ and go to Step 2.
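The control flow of the steps above can be sketched in a few lines. This is not a reference implementation of DFCG: the CG parameter is passed in by the caller as `beta_fn`, and the `prp_beta` helper is a PRP-type stand-in rather than the parameter derived in this section. The initial direction, the direction update `-g + beta * d`, the use of the latest step size in the gradient approximation, the line search condition, and all default constants are assumptions made only so that the sketch runs.

```python
import math

def sq_norm(v):
    return sum(t * t for t in v)

def grad_approx(F, x, Fx, alpha):
    """Directional difference (F(x + alpha*F(x)) - F(x)) / alpha (assumed form)."""
    shifted = [xi + alpha * fi for xi, fi in zip(x, Fx)]
    return [(a - b) / alpha for a, b in zip(F(shifted), Fx)]

def line_search(F, x, Fx, d, eta, omega1=1e-4, omega2=1e-4, r=0.5, max_backtracks=60):
    """Backtrack from alpha = 1 until the assumed nonmonotone condition holds."""
    fx = 0.5 * sq_norm(Fx)
    alpha = 1.0
    for _ in range(max_backtracks):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        rhs = -alpha ** 2 * (omega1 * sq_norm(Fx) + omega2 * sq_norm(d)) + eta * fx
        if 0.5 * sq_norm(F(trial)) - fx <= rhs:
            break
        alpha *= r
    return alpha

def prp_beta(g_new, g_old, d_old):
    """PRP-type stand-in for the CG parameter; NOT the formula derived in the paper."""
    denom = sq_norm(g_old)
    if denom == 0.0:
        return 0.0
    return sum(a * (a - b) for a, b in zip(g_new, g_old)) / denom

def dfcg_sketch(F, x0, beta_fn, alpha0=0.01, tol=1e-6, max_iter=200):
    """Derivative-free CG loop; returns the last iterate and the merit history."""
    x = list(x0)
    Fx = F(x)
    alpha = alpha0
    g = grad_approx(F, x, Fx, alpha)
    d = [-gi for gi in g]                      # assumed initial direction
    merit_history = [0.5 * sq_norm(Fx)]
    for k in range(max_iter):
        if math.sqrt(sq_norm(Fx)) <= tol:      # stopping test on the residual
            break
        eta = 1.0 / (k + 1) ** 2               # summable positive sequence (assumed)
        alpha = line_search(F, x, Fx, d, eta)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        Fx = F(x)
        g_new = grad_approx(F, x, Fx, alpha)
        beta = beta_fn(g_new, g, d)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
        merit_history.append(0.5 * sq_norm(Fx))
    return x, merit_history

def toy_system(x):
    """Toy symmetric system A x + sin(x), A = [[3, 1], [1, 3]] (an assumption)."""
    return [3.0 * x[0] + x[1] + math.sin(x[0]),
            x[0] + 3.0 * x[1] + math.sin(x[1])]
```

A call such as `dfcg_sketch(toy_system, [1.0, 0.5], prp_beta)` runs the loop with the stand-in parameter. The intended use is to replace `prp_beta` with the CG parameter of this section. Only function values of `F` and a handful of vectors are stored, which reflects the low storage requirement of the approach.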
3. Convergence Result
This section presents global convergence results of the derivativefree conjugate gradient methods. To begin with, let us define the level setIn order to analyze the convergence of our method, we will make the following assumptions on nonlinear systems .
Assumption 2.
Consider the following:(i)The level set defined by (27) is bounded.(ii)There exists such that and is continuous for all .(iii)In some neighborhood of , the Jacobian is Lipschitz continuous; that is, there exists a positive constant such that for all .
Properties (i) and (ii) imply that there exist positive constants , , and such that
Lemma 3 (see [3]). Let the sequence be generated by the algorithms above. Then the sequence converges and for all .
Lemma 4. Let the properties of (1) above hold. Then one has
Proof. By (25) and (26) we have, for all ,by summing the above inequality; then we obtainso from (29) and the fact that satisfies (25) the series is convergent. This implies (31). By a similar way, we can prove that (32) holds.
The following result shows that our derivative-free conjugate gradient algorithm is globally convergent.
Theorem 5. Let the properties of (1) above hold. Then the sequence $\{x_k\}$ which is generated by the derivative-free conjugate gradient algorithm converges globally; that is,
Proof. We prove this theorem by contradiction. Suppose that (35) is not true, and then there exists a positive constant such thatSince , (36) implies that there exists a positive constant satisfyingCase (i) is as follows: consider Then by (32), we have . This and Lemma 3 show that , which contradicts with (36).
Case (ii) is as follows: consider . Since , this case implies thatand by definition of in (10) and the symmetry of the Jacobian, we havewhere we use (29) and (30) in the last inequality. Inequalities (25), (26), and (36) show that there exists a constant such thatBy (10) and (29), we getFrom (41) and (30), we obtainThis together with (38) and (32) shows that . Hence from (41), (42), and (40), we havemeaning that there exists a constant such that for sufficiently large Again from the definition of our we obtainwhich implies that there exists a constant such that for sufficiently large Without loss of generality, we assume that the above inequalities hold for all . Then we getwhich shows that the sequence is bounded. Since , then does not satisfy (26); namely,which implies thatBy the meanvalue theorem, there exists such thatSince is bounded, without loss of generality, we assume . By (10) and (23), we havewhere we use (45) and (26) and the fact that the sequence is bounded.
On the other hand, we haveHence, from (49)–(52), we obtain , which means . This contradicts with (36). The proof is then completed.
4. Numerical Results
In this section, we compare the performance of our method for solving the nonlinear equation (1) with that of an inexact PRP conjugate gradient method for symmetric nonlinear equations [1]:
(i) A derivative-free CG method (DFCG): we set , , , and .
(ii) An inexact PRP (IPRP): we set , , , and .
The code for the DFCG method was written in Matlab 7.4 R2010a and run on a personal computer with a 1.8 GHz CPU and 4 GB of RAM. We stopped the iteration if the total number of iterations exceeded 1000 or . We tested the two methods on nine test problems with different initial points and values. Problems 6–9 are from [9]. Problem 1: Problem 2: Problem 3: Problem 4: Problem 5: Problem 6: Problem 7: Problem 8: Problem 9:
In Table 1, we list the numerical results, where “Iter” and “Time” stand for the total number of iterations and the CPU time in seconds, respectively, and $\|F(x_k)\|$ is the norm of the residual at the stopping point. The numerical results indicate that the proposed DFCG method requires fewer iterations and less CPU time than IPRP. Figures 1 and 2 show performance profiles in the sense of Dolan and Moré [17], which support this claim, that is, less CPU time and fewer iterations for each test problem.
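For readers unfamiliar with performance profiles, the following sketch computes them in the sense of Dolan and Moré [17]: for each solver, the fraction of problems whose cost is within a factor $\tau$ of the best cost over all solvers. The function name, the solver labels, and the timings in `toy_costs` are made up for illustration and are not the values of Table 1.

```python
import math

def performance_profile(costs, taus):
    """costs maps a solver name to its per-problem costs (math.inf marks a failure).

    Returns, for each solver, the fraction of problems whose performance
    ratio cost / best_cost is at most tau, for every tau in taus.
    """
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    # Best (smallest) cost achieved on each problem by any solver.
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_problems)]
        profile[s] = [sum(1 for q in ratios if q <= tau) / n_problems
                      for tau in taus]
    return profile

# Made-up costs for illustration only; these are NOT results from the paper.
toy_costs = {"solver_a": [1.0, 2.0, 4.0], "solver_b": [2.0, 2.0, 1.0]}
toy_profile = performance_profile(toy_costs, taus=[1.0, 2.0, 4.0])
```

The value of a profile at $\tau = 1$ is the fraction of problems on which a solver is the fastest (ties included), and a curve that lies higher over the whole range of $\tau$ indicates the more efficient and more robust solver.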

Moreover, on average, our $\|F(x_k)\|$ is very small, which indicates that the solution obtained is a closer approximation of the exact solution than that obtained by IPRP.
5. Conclusions
Many practical problems possess a symmetric structure. In this paper we present a derivative-free conjugate gradient (DFCG) method for symmetric nonlinear equations and compare its performance with that of an inexact PRP conjugate gradient method [1] through numerical experiments. We also prove the global convergence of the proposed method using a backtracking type line search, and the numerical results show that the method is efficient.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References
[1] W. Zhou and D. Shen, “An inexact PRP conjugate gradient method for symmetric nonlinear equations,” Numerical Functional Analysis and Optimization, vol. 35, no. 3, pp. 370–388, 2014.
[2] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, USA, 1970.
[3] D.-H. Li and M. Fukushima, “A globally and superlinearly convergent Gauss-Newton-based BFGS method for symmetric nonlinear equations,” SIAM Journal on Numerical Analysis, vol. 37, no. 1, pp. 152–172, 2000.
[4] G.-Z. Gu, D.-H. Li, L. Qi, and S.-Z. Zhou, “Descent directions of quasi-Newton methods for symmetric nonlinear equations,” SIAM Journal on Numerical Analysis, vol. 40, no. 5, pp. 1763–1774, 2002.
[5] G. Yuan, X. Lu, and Z. Wei, “BFGS trust-region method for symmetric nonlinear equations,” Journal of Computational and Applied Mathematics, vol. 230, no. 1, pp. 44–58, 2009.
[6] W. Cheng and Z. Chen, “Nonmonotone spectral method for large-scale symmetric nonlinear equations,” Numerical Algorithms, vol. 62, no. 1, pp. 149–162, 2013.
[7] D.-H. Li and X.-L. Wang, “A modified Fletcher-Reeves-type derivative-free method for symmetric nonlinear equations,” Numerical Algebra, Control and Optimization, vol. 1, no. 1, pp. 71–82, 2011.
[8] L. Zhang, W. Zhou, and D.-H. Li, “Global convergence of a modified Fletcher–Reeves conjugate gradient method with Armijo-type line search,” Numerische Mathematik, vol. 104, no. 4, pp. 561–572, 2006.
[9] W. Zhou and D. Shen, “Convergence properties of an iterative method for solving symmetric nonlinear equations,” Journal of Optimization Theory and Applications, vol. 164, no. 1, pp. 277–289, 2015.
[10] L. Zhang, W. Zhou, and D.-H. Li, “A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence,” IMA Journal of Numerical Analysis, vol. 26, no. 4, pp. 629–640, 2006.
[11] Y. Xiao, C. Wu, and S.-Y. Wu, “Norm descent conjugate gradient methods for solving symmetric nonlinear equations,” Journal of Global Optimization, vol. 62, no. 4, pp. 751–762, 2015.
[12] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[13] N. Andrei, “A scaled BFGS preconditioned conjugate gradient algorithm for unconstrained optimization,” Applied Mathematics Letters, vol. 20, no. 6, pp. 645–650, 2007.
[14] N. Andrei, “Scaled memoryless BFGS preconditioned conjugate gradient algorithm for unconstrained optimization,” Optimization Methods and Software, vol. 22, no. 4, pp. 561–571, 2007.
[15] E. G. Birgin and J. M. Martínez, “A spectral conjugate gradient method for unconstrained optimization,” Applied Mathematics and Optimization, vol. 43, no. 2, pp. 117–128, 2001.
[16] M. Raydan, “The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem,” SIAM Journal on Optimization, vol. 7, no. 1, pp. 26–33, 1997.
[17] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.
Copyright
Copyright © 2015 Mohammed Yusuf Waziri and Jamilu Sabi’u. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.