International Journal of Mathematics and Mathematical Sciences

Volume 2015, Article ID 961487, 8 pages

http://dx.doi.org/10.1155/2015/961487

## A Derivative-Free Conjugate Gradient Method and Its Global Convergence for Solving Symmetric Nonlinear Equations

^{1}Department of Mathematical Sciences, Faculty of Science, Bayero University Kano, Kano, Nigeria

^{2}Department of Mathematics, Faculty of Science, Northwest University Kano, Kano, Nigeria

Received 15 June 2015; Accepted 11 August 2015

Academic Editor: Naseer Shahzad

Copyright © 2015 Mohammed Yusuf Waziri and Jamilu Sabi’u. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We suggest a conjugate gradient (CG) method for solving symmetric systems of nonlinear equations that, by exploiting the special structure of the underlying function, requires neither the Jacobian nor the gradient. This derivative-free feature lets the proposed method solve relatively large-scale problems (500,000 variables) with lower storage requirements than some existing methods. Under appropriate conditions, the global convergence of the method is established. Numerical results on some benchmark test problems show that the proposed method is practically effective.

#### 1. Introduction

Let us consider the system of nonlinear equations

$$F(x) = 0, \tag{1}$$

where $F:\mathbb{R}^n \to \mathbb{R}^n$ is a nonlinear mapping. Often, the mapping $F$ is assumed to satisfy the following assumptions:

(A1) There exists $x^* \in \mathbb{R}^n$ s.t. $F(x^*) = 0$.

(A2) $F$ is a continuously differentiable mapping in a neighborhood of $x^*$.

(A3) $F'(x^*)$ is invertible.

(A4) The Jacobian $F'(x)$ is symmetric.

The prominent method for finding the solution of (1) is the classical Newton's method, which generates a sequence of iterates $\{x_k\}$ from a given initial point $x_0$ via

$$x_{k+1} = x_k - F'(x_k)^{-1} F(x_k), \tag{2}$$

where $k = 0, 1, 2, \ldots$. The attractive features of this method are rapid convergence and easy implementation. Nevertheless, Newton's method requires the computation of the Jacobian matrix, which requires the first-order derivatives of the system. In practice, computing the derivatives of some functions is quite costly, and sometimes they are not available or cannot be computed precisely. In this case Newton's method cannot be applied directly.
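To make the cost of Newton's method concrete, the following is a minimal sketch of the iteration $x_{k+1} = x_k - F'(x_k)^{-1}F(x_k)$. The test mapping $F_i(x) = e^{x_i} - 1$, its diagonal (hence symmetric) Jacobian, the tolerance, and the function names are illustrative assumptions and are not taken from the paper. Note that every iteration needs the full Jacobian and a linear solve, which is exactly what a derivative-free method tries to avoid.

```python
import numpy as np

def newton(F, J, x0, tol=1e-10, max_iter=50):
    """Classical Newton iteration for F(x) = 0 with an explicit Jacobian J."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:
            return x, k
        # Solve J(x) s = F(x) instead of forming the inverse explicitly.
        x = x - np.linalg.solve(J(x), Fx)
    return x, max_iter

# Hypothetical test problem (assumption): F_i(x) = exp(x_i) - 1, solution x* = 0.
def F(x):
    return np.exp(x) - 1.0

def J(x):
    # Diagonal Jacobian, therefore symmetric.
    return np.diag(np.exp(x))

x_star, iters = newton(F, J, np.full(5, 0.5))
```

For an $n$-dimensional problem the dense Jacobian in this sketch costs $O(n^2)$ storage, which is prohibitive at the problem sizes the paper targets.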

In this work, we are interested in large-scale problems for which the Jacobian is either unavailable or too expensive to store, so that a method with low storage requirements is needed; the CG approach is well suited to this setting. Conjugate gradient methods are among the most popular methods for unconstrained optimization problems. They are particularly efficient for large-scale problems because of their convergence properties, simple implementation, and low storage [1]. Nevertheless, studies of conjugate gradient methods for large-scale symmetric nonlinear systems of equations are scarce, and this is what motivated the present paper.

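In general, CG methods for solving nonlinear systems of equations generate iterates $\{x_k\}$ from a given initial point $x_0$ using

$$x_{k+1} = x_k + \alpha_k d_k, \tag{3}$$

where the step length $\alpha_k$ is attained via line search and the direction $d_k$ is obtained using

$$d_k = \begin{cases} -F(x_k), & k = 0, \\ -F(x_k) + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{4}$$

where $\beta_k$ is termed the conjugate gradient parameter.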

The problems under study may arise from an unconstrained optimization problem, a saddle point problem, the Karush-Kuhn-Tucker (KKT) conditions of an equality constrained optimization problem, the discretized two-point boundary value problem, the discretized elliptic boundary value problem, and so forth.

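Equation (1) is the first-order necessary condition for the unconstrained optimization problem when $F$ is the gradient mapping of some function $f:\mathbb{R}^n \to \mathbb{R}$,

$$\min f(x), \quad x \in \mathbb{R}^n. \tag{5}$$

For the equality constrained problem

$$\min f(z) \quad \text{s.t.} \quad h(z) = 0, \tag{6}$$

where $h$ is a vector-valued function, the KKT conditions can be represented as system (1) with $x = (z, v)$ and

$$F(z, v) = \big(\nabla f(z) + \nabla h(z)\,v,\; h(z)\big), \tag{7}$$

where $v$ is the vector of Lagrange multipliers. Notice that the Jacobian $F'(z, v)$ is symmetric for all $(z, v)$ (see, e.g., [2]).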

Problem (1) can be converted to the global optimization problem (5) with our function $f$ defined by

$$f(x) = \frac{1}{2}\|F(x)\|^2. \tag{8}$$

A large number of efficient solvers for large-scale symmetric nonlinear equations have been proposed, analyzed, and tested in the last decade. Among them, the most classic one is due to Li and Fukushima [3], in which a Gauss-Newton-based BFGS method is developed and its global and superlinear convergence are established. Subsequently, its performance was further improved by Gu et al. [4], where norm descent BFGS methods are designed. Since then, norm descent type BFGS methods, especially those combined with a trust region strategy, have been presented in the literature and have shown moderate effectiveness experimentally [5]. Still, the BFGS type methods presented in the literature require matrix storage and the solution of linear systems of equations. The recently designed nonmonotone spectral gradient algorithm [6] falls within the framework of matrix-free methods.

Conjugate gradient methods for symmetric nonlinear equations have received considerable attention and have made good progress. Li and Wang [7] proposed a modified Fletcher-Reeves conjugate gradient method based on the work of Zhang et al. [8], and their results illustrate that the proposed conjugate gradient method is promising. This development inspired further studies on conjugate gradient methods for solving large-scale symmetric nonlinear equations. Zhou and Shen [9] extended the descent three-term Polak-Ribière-Polyak method of Zhang et al. [10] to solve (1) by combining it with the work of Li and Fukushima [3]. Meanwhile, the classic Polak-Ribière-Polyak method was successfully used to solve the symmetric equation (1) by Zhou and Shen [1].

Subsequently, Xiao et al. [11] proposed a method based on the well-known conjugate gradient method of Hager and Zhang [12], and the proposed method converges globally. Extensive numerical experiments showed that each of the above-mentioned methods performs quite well. The combination of conjugate gradient algorithms and the Newton method was first presented by Andrei [13, 14]. Hence, in this paper, we present a new enhanced CG parameter that is both matrix-free and derivative-free. This is made possible by combining the Birgin and Martínez direction with the classical Newton direction.

We organized the paper as follows. In the next section, we present the details of the proposed method. Convergence results are presented in Section 3. Some numerical results are reported in Section 4. Finally, conclusions are made in Section 5.

#### 2. Derivation of the Method

In this section we present a new CG parameter $\beta_k$, obtained by combining the Birgin and Martínez direction with the classical Newton direction. Recall that the Birgin and Martínez direction in [15] is defined by where ; see Raydan [16] for details.

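In [2] Ortega and Rheinboldt used the term

$$g_k = \frac{F(x_k + \alpha_k F_k) - F_k}{\alpha_k}, \qquad F_k = F(x_k), \tag{10}$$

to approximate the gradient $\nabla f(x_k)$, which avoids computing the exact gradient; here $\alpha_k$ is updated via a line search method. It is clear that when $\|F_k\|$ is small, then $g_k \approx \nabla f(x_k)$.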
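The approximation in (10) can be checked numerically. For $f(x) = \tfrac{1}{2}\|F(x)\|^2$ with a symmetric Jacobian $J$, the exact gradient is $\nabla f(x) = J(x)F(x)$, and the difference quotient $(F(x + \alpha F(x)) - F(x))/\alpha$ is a directional finite difference of $F$ along $F(x)$, so it approximates $J(x)F(x)$ without ever forming $J$. The sketch below assumes a hypothetical test mapping $F_i(x) = e^{x_i} - 1$ and a fixed small $\alpha$; neither choice comes from the paper.

```python
import numpy as np

def approx_gradient(F, x, alpha):
    """Derivative-free approximation of grad f(x) = J(x) F(x), f = 0.5*||F||^2.

    Uses only two evaluations of F and no Jacobian.
    """
    Fx = F(x)
    return (F(x + alpha * Fx) - Fx) / alpha

# Hypothetical test mapping (assumption): diagonal, hence symmetric, Jacobian.
def F(x):
    return np.exp(x) - 1.0

x = np.array([0.1, -0.2, 0.3])
g_approx = approx_gradient(F, x, alpha=1e-6)
g_exact = np.exp(x) * F(x)  # J(x) F(x) with J(x) = diag(exp(x))
error = np.linalg.norm(g_approx - g_exact)
```

Only vectors of length $n$ are stored, which is the source of the low storage requirement claimed for the method.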

Recall, from Newton’s direction, Combining (9) and (11), we have Multiplying both sides of (12) by leads to After a little linear algebra, (13) transforms to To ensure a good approximation, we multiply both sides of (14) by to obtain Equation (15) can be rewritten as From the secant condition, we have It is vital to note that, for this work, we claim that is a symmetric matrix. Hence, (18) can also be written as Substituting (19) into (16) yields After simplification, we obtain our CG parameter as Motivated by [3, 7] and using (10), we derive our CG parameter Having derived the CG parameter in (22) and by using (9), we then present our direction as where .

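Finally, we present our scheme as

$$x_{k+1} = x_k + \alpha_k d_k. \tag{24}$$

The direction $d_k$ given by (23) may not be a descent direction of (8), so the standard Wolfe and Armijo line searches cannot be used to compute the step size directly. Zhang et al. [8] proved the global convergence of the PRP method for general nonconvex optimization using a nondescent line search. Motivated by this, we use the nonmonotone line search proposed by Li and Fukushima in [3] to compute our step size $\alpha_k$. Let $\omega_1 > 0$, $\omega_2 > 0$, and $r \in (0, 1)$ be constants and let $\{\eta_k\}$ be a given positive sequence such that

$$\sum_{k=0}^{\infty} \eta_k < \infty. \tag{25}$$

Let $\alpha_k = \max\{1, r, r^2, \ldots\}$ satisfy

$$f(x_k + \alpha_k d_k) - f(x_k) \le -\omega_1 \|\alpha_k F(x_k)\|^2 - \omega_2 \|\alpha_k d_k\|^2 + \eta_k f(x_k). \tag{26}$$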
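A backtracking implementation of a Li-Fukushima-type nonmonotone line search can be sketched as follows. It tries $\alpha = 1, r, r^2, \ldots$ and accepts the first $\alpha$ with $f(x + \alpha d) - f(x) \le -\omega_1\|\alpha F(x)\|^2 - \omega_2\|\alpha d\|^2 + \eta_k f(x)$, where $f = \tfrac{1}{2}\|F\|^2$. The default values of `omega1`, `omega2`, and `r`, the backtracking cap, and the test mapping are assumptions chosen for illustration, not the settings used in the paper.

```python
import numpy as np

def nonmonotone_line_search(F, x, d, eta_k,
                            omega1=1e-4, omega2=1e-4, r=0.2,
                            max_backtracks=60):
    """Return the first alpha in {1, r, r^2, ...} passing the nonmonotone test.

    The term eta_k * f(x) allows f to increase slightly, so d need not be
    a descent direction. Parameter defaults are illustrative assumptions.
    """
    Fx = F(x)
    fx = 0.5 * (Fx @ Fx)
    alpha = 1.0
    for _ in range(max_backtracks):
        Fn = F(x + alpha * d)
        lhs = 0.5 * (Fn @ Fn) - fx
        rhs = (-omega1 * np.linalg.norm(alpha * Fx) ** 2
               - omega2 * np.linalg.norm(alpha * d) ** 2
               + eta_k * fx)
        if lhs <= rhs:
            return alpha
        alpha *= r
    return alpha  # fallback after the cap; callers may want to treat this as failure

# Hypothetical usage (assumption): an overly long step that must be shortened.
def F(x):
    return np.exp(x) - 1.0

x = np.full(4, 0.5)
d = -10.0 * F(x)
alpha = nonmonotone_line_search(F, x, d, eta_k=0.0)
```

With a summable sequence such as $\eta_k = 1/(k+1)^2$, the total allowed increase of $f$ over all iterations stays bounded, which is what the convergence analysis relies on.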

Now, we can describe the algorithm for our proposed method as follows.

*Algorithm 1 (derivative-free CG method (DFCG)).*

*Step 1*. Given an initial point $x_0$ and the algorithm parameters, compute $d_0$ and set $k = 0$.

*Step 2*. Compute $g_k$ using (10) and test the stopping criterion. If it is satisfied, then stop; otherwise continue with Step 3.

*Step 3*. Compute $\alpha_k$ by the line search (26).

*Step 4*. Compute $x_{k+1} = x_k + \alpha_k d_k$.

*Step 5*. Compute the search direction as in (23).

*Step 6*. Set $k = k + 1$ and go to Step 2.

#### 3. Convergence Result

This section presents the global convergence results for the derivative-free conjugate gradient method. To begin with, let us define the level set In order to analyze the convergence of our method, we make the following assumptions on the nonlinear system $F$.

*Assumption 2. *
Consider the following:

(i) The level set defined by (27) is bounded.

(ii) There exists $x^*$ such that $F(x^*) = 0$, and $F'(x)$ is continuous for all $x$.

(iii) In some neighborhood $N$ of the level set, the Jacobian is Lipschitz continuous; that is, there exists a positive constant $L$ such that

$$\|F'(x) - F'(y)\| \le L \|x - y\| \tag{28}$$

for all $x, y \in N$.

Properties (i) and (ii) imply that there exist positive constants , , and such that

Lemma 3 (see [3]). *Let the sequence be generated by the algorithms above. Then the sequence converges and for all .*

Lemma 4. *Let the properties of (1) above hold. Then one has*

*Proof. *By (25) and (26) we have, for all , By summing the above inequality, we obtain So from (29) and the fact that $\{\eta_k\}$ satisfies (25), the series is convergent. This implies (31). In a similar way, we can prove that (32) holds.

The following result shows that our derivative-free conjugate gradient algorithm is globally convergent.

Theorem 5. *Let the properties of (1) above hold. Then the sequence generated by the derivative-free conjugate gradient algorithm converges globally; that is,*

*Proof. *We prove this theorem by contradiction. Suppose that (35) is not true; then there exists a positive constant such that Since , (36) implies that there exists a positive constant satisfying Case (i) is as follows: consider Then by (32), we have . This and Lemma 3 show that , which contradicts (36).

Case (ii) is as follows: consider . Since , this case implies that and by the definition of in (10) and the symmetry of the Jacobian, we have where we use (29) and (30) in the last inequality. Inequalities (25), (26), and (36) show that there exists a constant such that By (10) and (29), we get From (41) and (30), we obtain This together with (38) and (32) shows that . Hence from (41), (42), and (40), we have meaning that there exists a constant such that for sufficiently large Again from the definition of our we obtain which implies that there exists a constant such that for sufficiently large Without loss of generality, we assume that the above inequalities hold for all . Then we get which shows that the sequence is bounded. Since , then does not satisfy (26); namely, which implies that By the mean-value theorem, there exists such that Since is bounded, without loss of generality, we assume . By (10) and (23), we have where we use (45) and (26) and the fact that the sequence is bounded.

On the other hand, we have Hence, from (49)–(52), we obtain , which means . This contradicts (36). The proof is then completed.

#### 4. Numerical Results

In this section, we compare the performance of our method for solving the nonlinear equation (1) with that of an inexact PRP conjugate gradient method for solving symmetric nonlinear equations [1]. The two methods are as follows:

(i) A derivative-free CG method (DFCG): we set , , , and .

(ii) An inexact PRP (IPRP): we set , , , and .

The code for the DFCG method was written in Matlab 7.4 R2010a and run on a personal computer with a 1.8 GHz CPU and 4 GB of RAM. We stopped the iteration if the total number of iterations exceeded 1000 or . We tested the two methods on nine test problems with different initial points and values. Problems 6–9 are from [9].

In Table 1, we list the numerical results, where “Iter” and “Time” stand for the total number of iterations and the CPU time in seconds, respectively, and $\|F(x_k)\|$ is the norm of the residual at the stopping point. The numerical results indicate that the proposed method DFCG requires fewer iterations and less CPU time than IPRP. Figures 1 and 2 show performance profiles in the sense of Dolan and Moré [17], which support this claim, that is, less CPU time and fewer iterations on the test problems.
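For readers unfamiliar with performance profiles, the following sketch shows how a Dolan-Moré profile is computed from a table of costs (iterations or CPU time). For each problem, the cost of each solver is divided by the best cost on that problem, and the profile value at $\tau$ is the fraction of problems on which a solver's ratio is at most $\tau$. The numbers in the example are made up for illustration and are not the values of Table 1; the function name is also an assumption.

```python
import numpy as np

def performance_profile(costs, taus):
    """Dolan-More performance profile.

    costs: array of shape (n_problems, n_solvers); np.inf marks a failure.
           Assumes at least one solver succeeds on every problem.
    taus:  iterable of ratio thresholds (each >= 1).
    Returns an array of shape (len(taus), n_solvers) with the fraction of
    problems each solver handles within a factor tau of the best solver.
    """
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)
    ratios = costs / best
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# Made-up costs for two hypothetical solvers on four problems (not from Table 1).
example_costs = [[10.0, 12.0],
                 [5.0, 20.0],
                 [8.0, 8.0],
                 [30.0, np.inf]]
profile = performance_profile(example_costs, taus=[1.0, 2.0, 4.0])
```

The value at $\tau = 1$ is the fraction of problems on which a solver is the best (ties included), and the limit for large $\tau$ is the fraction of problems it solves at all.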