On Some Efficient Techniques for Solving Systems of Nonlinear Equations
We present iterative methods of convergence order three, five, and six for solving systems of nonlinear equations. The third-order method is composed of two steps, namely, a Newton iteration as the first step and a weighted-Newton iteration as the second step. The fifth- and sixth-order methods are composed of three steps, of which the first two are the same as those of the third-order method, whereas the third is again a weighted-Newton step. Computational efficiency in its general form is discussed, and the efficiencies of the proposed techniques are compared with those of existing ones. The performance is tested through numerical examples. Moreover, theoretical results concerning order of convergence and computational efficiency are verified in the examples. It is shown that the present methods have an edge over similar existing methods, particularly when applied to large systems of equations.
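Solving systems of nonlinear equations is a common and important problem in various disciplines of science and engineering [1–4]. This problem is precisely stated as follows: for a given nonlinear function $F: D \subseteq \mathbb{R}^n \to \mathbb{R}^n$, where $F(x) = (f_1(x), f_2(x), \ldots, f_n(x))^T$ and $x = (x_1, x_2, \ldots, x_n)^T$, find a vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)^T$ such that $F(\alpha) = 0$. The solution vector $\alpha$ can be obtained as a fixed point of some function $G: \mathbb{R}^n \to \mathbb{R}^n$ by means of the fixed point iteration
$$x^{(k+1)} = G(x^{(k)}), \quad k = 0, 1, 2, \ldots. \tag{1}$$
One of the basic procedures for solving systems of nonlinear equations is the classical Newton’s method [4, 5], which converges quadratically provided that the function $F$ is continuously differentiable and a good initial approximation $x^{(0)}$ is given. It is defined by
$$x^{(k+1)} = x^{(k)} - F'(x^{(k)})^{-1} F(x^{(k)}), \tag{2}$$
where $F'(x^{(k)})^{-1}$ is the inverse of the first Fréchet derivative $F'(x^{(k)})$ of the function $F$. Note that this method uses one function evaluation, one first-derivative evaluation, and one matrix inversion per iteration.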
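As an illustration of Newton’s iteration for systems, the following is a minimal Python sketch. It is not taken from the paper: the helper names (`solve_linear`, `newton`), the test system, and the tolerance are assumptions made for this example, and the matrix inverse is never formed because each step solves a linear system instead.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def solve_linear(A: Matrix, b: Vector) -> Vector:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("singular Jacobian")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def newton(F: Callable[[Vector], Vector], J: Callable[[Vector], Matrix],
           x0: Vector, tol: float = 1e-12, max_iter: int = 50) -> Vector:
    """Newton's method: x <- x - J(x)^{-1} F(x), stopped when max|F(x)| < tol."""
    x = list(x0)
    for _ in range(max_iter):
        Fx = F(x)
        if max(abs(v) for v in Fx) < tol:
            break
        step = solve_linear(J(x), Fx)  # solves J(x) * step = F(x)
        x = [xi - si for xi, si in zip(x, step)]
    return x


# Hypothetical test system: x^2 + y^2 = 1 and x = y.
def example_F(v: Vector) -> Vector:
    x, y = v
    return [x * x + y * y - 1.0, x - y]


def example_J(v: Vector) -> Matrix:
    x, y = v
    return [[2.0 * x, 2.0 * y], [1.0, -1.0]]


root = newton(example_F, example_J, [1.0, 0.5])
```

For this test system the iterates approach the root with both components equal to $1/\sqrt{2}$.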
In order to improve the order of convergence of Newton’s method, many modifications have been proposed in the literature; see, for example, [6–18] and references therein. For a system of $n$ equations in $n$ unknowns, the first Fréchet derivative is a matrix with $n^2$ evaluations, whereas the second Fréchet derivative has $n^3$ evaluations. Thus, methods that use the second derivative, such as those developed in [6–8], are considered less efficient from a computational point of view.
In the quest for efficient methods that avoid the second Fréchet derivative, a variety of third and higher order methods have been proposed in recent years. For example, Frontini and Sormani, and also Homeier, developed third-order methods, each requiring the evaluation of one function, two first-order derivatives, and two matrix inversions per iteration. Darvishi and Barati presented a fourth-order method which uses two functions, three first derivatives, and two matrix inversions. Cordero and Torregrosa developed two variants of Newton’s method with third-order convergence; one of the variants requires the evaluation of one function, three first derivatives, and two matrix inversions, whereas the other requires one function, two first derivatives, and two matrix inversions. Noor and Waseem have developed two third-order methods, of which one requires one function, two first derivatives, and two matrix inversions, and the other requires one function, three first derivatives, and two matrix inversions. Cordero et al. presented fourth- and fifth-order methods requiring two functions, two first derivatives, and one matrix inversion, and three functions, two first derivatives, and one matrix inversion, respectively. Cordero et al. developed a sixth-order method which requires two functions, two first derivatives, and two matrix inversions. Grau-Sánchez et al. proposed methods with third-, fourth-, and fifth-order convergence: the third-order method uses one function, two first derivatives, and two matrix inversions; the fourth-order method uses three functions, one first derivative, and one matrix inversion; and the fifth-order method uses two functions, two first derivatives, and two matrix inversions. Grau-Sánchez et al. have also developed fourth- and sixth-order methods requiring the evaluation of one function, one divided difference, one first derivative, and two matrix inversions for the fourth-order method, and two functions, one divided difference, one first derivative, and two matrix inversions for the sixth-order method. Sharma et al. proposed a fourth-order method which requires one function, two first derivatives, and two matrix inversions per iteration.
In this paper, our aim is to develop second-derivative-free iterative methods that satisfy the basic requirements of quality numerical algorithms, that is, algorithms which have (i) high convergence speed, (ii) minimum computational cost, and (iii) simple structure. Taking these considerations into account, we devise methods of third, fifth, and sixth order of convergence. The third-order method is composed of two steps, namely, Newton’s and weighted-Newton steps, and per iteration it requires the evaluation of one function, two first derivatives, and one matrix inversion. The fifth- and sixth-order methods are composed of three steps, of which the first is Newton’s step and the last two are weighted-Newton steps. Per iteration, both of these schemes require the evaluation of two functions, two first derivatives, and one matrix inversion.
We summarize the contents of the paper as follows. In Section 2, the third-order scheme is developed and its convergence analysis is studied. In Section 3, the fifth-order scheme is developed and its convergence analysis is studied. The sixth-order scheme with its convergence analysis is presented in Section 4. In Section 5, the computational efficiency of the new methods is discussed and compared with that of methods in the same category. Various numerical examples are considered in Section 6 to show the consistent convergence behavior of the methods and to verify the theoretical results. Section 7 contains the concluding remarks.
2. Development of Third-Order Method
Our objective is to develop a method which accelerates the convergence of Newton’s method with a minimum number of evaluations. We therefore consider a two-step iterative scheme (3), consisting of a Newton step followed by a weighted-Newton step that contains two free parameters, in which $I$ denotes the identity matrix. In order to explore the convergence property of (3), we recall the following result on the Taylor expansion of vector functions.
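Lemma 1. Let $F: D \subseteq \mathbb{R}^n \to \mathbb{R}^n$ be $r$-times Fréchet differentiable in a convex set $D \subseteq \mathbb{R}^n$; then for any $x, h \in \mathbb{R}^n$, the following expression holds:
$$F(x+h) = F(x) + F'(x)h + \frac{1}{2!}F''(x)h^2 + \cdots + \frac{1}{(r-1)!}F^{(r-1)}(x)h^{r-1} + R_r,$$
where
$$\|R_r\| \le \frac{1}{r!}\sup_{0 \le t \le 1}\|F^{(r)}(x+th)\|\,\|h\|^r, \qquad h^r = (h, h, \ldots, h).$$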
Also, Then, where .
Inverse of is given by where ( to ) are such that they satisfy the definition Solving the above system, we have From the above expressions, we find that From (7) and (13), it follows that Letting , then using (14) in the first step of (3), we have Expanding about , we obtain Then, Substituting (14) and (17) in the second step of (3), we obtain the error equation as Our aim is to find values of the parameters and in such a way that the proposed iterative scheme (3) may produce order of convergence as high as possible. This happens when and . Then, the error equation (18) on using (15) becomes
According to the above analysis the following theorem can be formulated.
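Theorem 2. Let the function $F: D \subseteq \mathbb{R}^n \to \mathbb{R}^n$ be sufficiently differentiable in a neighborhood of its zero $\alpha$. If an initial approximation $x^{(0)}$ is sufficiently close to $\alpha$, then the local order of convergence of the iterative scheme (3) is at least 3, provided the two parameters take the values found in the preceding analysis.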
Thus, the proposed scheme (3) can finally be written in the form (20). Clearly this formula uses one function evaluation, two derivative evaluations, and one matrix inversion per iteration.
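The displayed formula of the final two-step scheme is not reproduced in this text, so the sketch below should be read as an assumption rather than as the authors’ exact method. It uses one concrete weighting that matches the description (a Newton step followed by a weighted-Newton step, with one function evaluation, two Jacobian evaluations, and a single factorization): $y = x - F'(x)^{-1}F(x)$, then $x_{\text{new}} = y - \tfrac{1}{2}\,(I - F'(x)^{-1}F'(y))\,F'(x)^{-1}F(x)$. A Taylor expansion about the root shows that the second-order error terms cancel for this choice. The function names and the test system are hypothetical.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def lu_factor(A: Matrix) -> Tuple[Matrix, List[int]]:
    """LU factorization with partial pivoting; L (unit diagonal) and U share one matrix."""
    n = len(A)
    LU = [row[:] for row in A]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(LU[r][k]))
        if LU[p][k] == 0.0:
            raise ValueError("singular matrix")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU, perm


def lu_solve(LU: Matrix, perm: List[int], b: Vector) -> Vector:
    """Solve A x = b using the factors returned by lu_factor."""
    n = len(b)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):                      # forward substitution (unit lower triangle)
        y[i] -= sum(LU[i][j] * y[j] for j in range(i))
    for i in range(n - 1, -1, -1):          # back substitution (upper triangle)
        y[i] = (y[i] - sum(LU[i][j] * y[j] for j in range(i + 1, n))) / LU[i][i]
    return y


def third_order_step(F: Callable[[Vector], Vector],
                     J: Callable[[Vector], Matrix], x: Vector) -> Vector:
    """One Newton step plus one weighted-Newton step, reusing a single LU of J(x)."""
    n = len(x)
    LU, perm = lu_factor(J(x))              # the only factorization in this step
    u = lu_solve(LU, perm, F(x))            # u = J(x)^{-1} F(x)
    y = [xi - ui for xi, ui in zip(x, u)]   # Newton step
    Jy = J(y)                               # second Jacobian evaluation
    w = [sum(Jy[i][j] * u[j] for j in range(n)) for i in range(n)]
    v = lu_solve(LU, perm, w)               # v = J(x)^{-1} J(y) u
    return [yi - 0.5 * (ui - vi) for yi, ui, vi in zip(y, u, v)]


# Hypothetical test system: x^2 + y^2 = 1 and x = y.
def example_F(p: Vector) -> Vector:
    return [p[0] ** 2 + p[1] ** 2 - 1.0, p[0] - p[1]]


def example_J(p: Vector) -> Matrix:
    return [[2.0 * p[0], 2.0 * p[1]], [1.0, -1.0]]


approx = [1.0, 0.5]
for _ in range(6):
    approx = third_order_step(example_F, example_J, approx)
```

Reusing the factors of $F'(x)$ for both solves is what keeps the cost at one factorization per iteration; the second solve only adds two triangular substitutions and a matrix-vector product.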
3. The Fifth-Order Method
Based on the two-step scheme (20), we propose the following three-step scheme: where and are some parameters that are to be determined.
Now to calculate the local order of convergence, let . In view of (19), error in the second step of (21) is given by Expanding about , Using (13), (17), (22), and (23) in the third step of (21), it follows that It is clear that for the parameter values and , the terms containing and vanish. Thus, from the expressions of and the error equation (24) yields which shows fifth-order of convergence.
From the above analysis, we can state the following theorem.
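Theorem 3. Let the function $F: D \subseteq \mathbb{R}^n \to \mathbb{R}^n$ be sufficiently differentiable in an open neighborhood of its zero $\alpha$. If an initial approximation $x^{(0)}$ is sufficiently close to $\alpha$, then the local order of convergence of method (21) is at least 5, provided the two parameters take the values found in the preceding analysis.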
The proposed algorithm (21) now can be written as It is clear that this formula uses the evaluations of two functions, two derivatives and only one matrix inversion in all.
4. The Sixth-Order Method
Here, with the first two steps of the scheme (20) we consider the following three-step scheme: where , , and are some arbitrary constants.
In the following lines we obtain the local order of convergence of the above proposed scheme. Using (13), (17), (22), and (23) in the third step of (27) gives the error equation (28). Combining (15), (22), and (28), it is easy to prove that a particular choice of the three constants makes the error equation (28) attain the maximum order. For this set of values, (28) reduces to (29), which shows sixth-order convergence. Thus, based on the above discussion, we can formulate the following theorem.
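Theorem 4. Let the function $F: D \subseteq \mathbb{R}^n \to \mathbb{R}^n$ be sufficiently differentiable in an open neighborhood of its zero $\alpha$. If an initial approximation $x^{(0)}$ is sufficiently close to $\alpha$, then the local order of convergence of method (27) is at least 6, provided the three constants take the values found in the preceding analysis.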
Finally, the proposed scheme (27) takes the form (30). Like the fifth-order scheme, this formula also uses two function evaluations, two derivative evaluations, and only one matrix inversion in all.
5. Computational Efficiency
To assess the efficiency of the proposed methods we make use of the efficiency index, according to which the efficiency of an iterative method is given by $E = p^{1/C}$, where $p$ is the order of convergence and $C$ is the computational cost per iteration. To do this, we must consider all factors which contribute to the total cost of computation. For a system of $n$ nonlinear equations in $n$ unknowns, the computational cost per iteration is given by
$$C(\mu_0, \mu_1, n, l) = P_0(n)\mu_0 + P_1(n)\mu_1 + P(n, l). \tag{31}$$
Here $P_0(n)$ represents the number of evaluations of scalar functions used in the evaluation of $F$, $P_1(n)$ is the number of evaluations of scalar functions of $F'$, say $\partial f_i/\partial x_j$, $1 \le i, j \le n$, $P(n, l)$ represents the number of products or quotients needed per iteration, and $\mu_0$ and $\mu_1$ are ratios between products and evaluations required to express the value of $C(\mu_0, \mu_1, n, l)$ in terms of products. We suppose that a quotient is equivalent to $l$ products.
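The efficiency index can be evaluated directly once the order and the cost are known. The sketch below assumes the index has the form $E = p^{1/C}$ with the cost expressed in product units as a weighted sum of function evaluations, derivative evaluations, and arithmetic operations; the function names and the sample numbers are hypothetical and are not taken from the paper’s tables.

```python
import math


def computational_cost(p0: float, p1: float, p: float,
                       mu0: float, mu1: float) -> float:
    """Cost per iteration in product units: C = P0*mu0 + P1*mu1 + P."""
    return p0 * mu0 + p1 * mu1 + p


def efficiency_index(order: float, cost: float) -> float:
    """Efficiency index E = order ** (1 / cost)."""
    return order ** (1.0 / cost)


def is_more_efficient(order_a: float, cost_a: float,
                      order_b: float, cost_b: float) -> bool:
    """E_a > E_b is equivalent to log(p_a)/C_a > log(p_b)/C_b.

    Comparing logarithms avoids working with indices that are all very close to 1.
    """
    return math.log(order_a) / cost_a > math.log(order_b) / cost_b


# Hypothetical numbers, chosen only for illustration.
cost_a = computational_cost(p0=10, p1=200, p=500, mu0=2.0, mu1=1.0)  # 720 product units
cost_b = computational_cost(p0=10, p1=100, p=450, mu0=2.0, mu1=1.0)  # 570 product units
third_beats_second = is_more_efficient(3, cost_a, 2, cost_b)
```

With these sample numbers the third-order method comes out ahead despite its higher cost, because $\ln 3 / 720 > \ln 2 / 570$.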
To compute $F$ in any iterative method, we evaluate $n$ scalar functions, whereas the number of scalar evaluations is $n^2$ for any new derivative $F'$. In addition, we must include the amount of computational work required to evaluate the inverse of a matrix. Instead of computing the inverse operator, we solve a linear system, where we have $n(n-1)(2n-1)/6$ products and $n(n-1)/2$ quotients in the LU decomposition and $n(n-1)$ products and $n$ quotients in the resolution of two triangular linear systems. Moreover, we must add $n$ products for the multiplication of a vector by a scalar and $n^2$ products for the same operation in the case of a matrix.
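The operation counts for the linear solve can be checked by tallying the loops of an LU factorization without pivoting and of the two triangular substitutions. The sketch below does this and compares the tallies with the closed forms $n(n-1)(2n-1)/6$ products and $n(n-1)/2$ quotients for the factorization, and $n(n-1)$ products and $n$ quotients for the two triangular systems. The function `newton_cost` then assembles a per-iteration cost for Newton’s method from these ingredients; it is an illustration built on the stated assumptions ($n$ scalar function evaluations, $n^2$ derivative evaluations, one quotient worth `l` products), not one of the paper’s cost expressions.

```python
from typing import Tuple


def lu_counts(n: int) -> Tuple[int, int]:
    """Products and quotients in an LU decomposition of an n x n matrix."""
    products = quotients = 0
    for k in range(n - 1):
        rows_below = n - 1 - k
        quotients += rows_below               # one multiplier per row below the pivot
        products += rows_below * rows_below   # update of the trailing block
    return products, quotients


def triangular_counts(n: int) -> Tuple[int, int]:
    """Products and quotients in one forward and one back substitution."""
    products = quotients = 0
    for i in range(n):
        products += i             # forward substitution, unit lower triangle
        products += n - 1 - i     # back substitution
        quotients += 1            # division by the diagonal entry of U
    return products, quotients


def newton_cost(n: int, mu0: float, mu1: float, l: float) -> float:
    """Illustrative cost of one Newton iteration in product units."""
    lu_p, lu_q = lu_counts(n)
    tri_p, tri_q = triangular_counts(n)
    return n * mu0 + n * n * mu1 + (lu_p + tri_p) + l * (lu_q + tri_q)


# Compare the tallies with the closed-form expressions for a few sizes.
counts_match = all(
    lu_counts(n) == (n * (n - 1) * (2 * n - 1) // 6, n * (n - 1) // 2)
    and triangular_counts(n) == (n * (n - 1), n)
    for n in range(2, 12)
)
```

The cubic term comes only from the factorization, which is why schemes that reuse one factorization across several steps gain the most on large systems.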
We compare the computational efficiency of the present third-order method with the existing third-order methods by Potra-Pták, Homeier, Noor-Waseem, and Cordero-Torregrosa; the fifth-order method with the fifth-order method given by Grau-Sánchez et al.; and the sixth-order method with the sixth-order method developed by Cordero et al. These methods are expressed as follows.
Method by Potra-Pták ():
Homeier’s method ():
Method by Noor and Waseem ():
Method by Cordero and Torregrosa ():
Method by Grau-Sánchez et al. ():
Method by Cordero et al. ( ): In addition, we also compare present methods with each other. Let us denote efficiency indices of by and computational cost by . Then taking into account the above and previous considerations, we have
5.1. Comparison between the Efficiencies
To compare the computational efficiencies of the iterative methods , we consider the ratio It is clear that if , the iterative method is more efficient than . Taking into account that the border between two computational efficiencies is given by , this boundary is given by the equation written as a function of , , and , , is a positive integer , and .
versus Case. Here we study the boundary expressed by as a function of and . This boundary, on using (38) and (39) in (47), is given by In order to compare the efficiencies and in the -plane, we present in Figure 1 some boundary lines () corresponding to the cases . These boundaries are the lines with positive slopes, where on the above and on the below of each line.
versus Case. Using the corresponding values of and from (38) and (40) in (47), it follows that for all and . Thus, we have for all and . This is shown graphically in Figure 2 for a particular set . The reason for taking will be made clear in the next section.
versus and versus Cases. From (47) on using (38), (41) and then (38), (42), it is easy to prove that and for all and . Thus, we conclude that is higher than and for all and . This situation is shown graphically in Figure 3 taking the same set of values of as in the previous case.
versus Case. Here we study the boundary expressed by as a function of , , and . This boundary is given by where and . In order to compare the efficiencies and in the -plane, we present in Figure 4 some boundary lines () corresponding to the cases taking in each case. These boundaries are the straight lines with positive slopes, where on the above and on the below of each line.
versus Case. In this case the boundary is expressed by where and . Similar to the case above, here we also present the boundaries () using the same set of and in -plane to compare the efficiencies and . These boundaries are the lines with positive slopes, where on the above and on the below of each line (see Figure 5).
versus Case. For this case the boundary is given by where and . Here we also draw boundaries in -plane with the same set of values of and . The boundaries are the lines with negative slopes, where on the above and on the below of each line (see Figure 6).
versus Case. Substituting the corresponding values of and from (43) and (44) in (47), we see that for all and , which further implies that for all , . This result is verified graphically in Figure 7 by using the values .
versus Case. In this case substitution of the corresponding values of and from (45) and (46) in (47) results in for all and . Thus, we analyze that for all and . This is shown graphically in Figure 8 by using the same values of that are used in the previous case.
We summarize the above results in following theorem.
Theorem 5. (i) For all , , and we have:
(b) and ,
(c) and .
Otherwise, the comparison depends on , , , and .
(ii) For all , we have:
(a) for ,
(b) for ,
(c) for ,
(d) for ,
where , , , and .
6. Numerical Results
In this section, some numerical problems are considered to illustrate the convergence behavior and computational efficiency of the proposed methods. The performance is compared with the six existing methods introduced in the previous section. All computations are performed in the programming package Mathematica using multiple-precision arithmetic with 4096 digits. For every method, we record the number of iterations needed to converge to the solution within the prescribed stopping tolerance. In the numerical results, we also include the CPU time used in the execution of the program, which is computed by the Mathematica command “TimeUsed”. To verify the theoretical order of convergence, we calculate the computational order of convergence, taking into consideration the last three approximations in the iterative process.
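The formula for the computational order of convergence is not reproduced in this text. One commonly used estimate that relies on the last three approximations $x^{(k-2)}$, $x^{(k-1)}$, $x^{(k)}$ is $\rho \approx \ln\bigl(\|F(x^{(k)})\| / \|F(x^{(k-1)})\|\bigr) / \ln\bigl(\|F(x^{(k-1)})\| / \|F(x^{(k-2)})\|\bigr)$; the Python sketch below implements this residual-based variant as an assumption, and it may differ from the formula the authors used. The function name and the sample residuals are hypothetical.

```python
import math
from typing import Sequence


def computational_order(residual_norms: Sequence[float]) -> float:
    """Estimate the order of convergence from the last three residual norms.

    residual_norms holds ||F(x)|| at successive iterates; only the final
    three entries are used.
    """
    if len(residual_norms) < 3:
        raise ValueError("need at least three residual norms")
    r0, r1, r2 = residual_norms[-3:]
    return math.log(r2 / r1) / math.log(r1 / r0)


# Synthetic residuals that shrink as r_next = r**3, i.e. third-order behavior.
sample_order = computational_order([1e-2, 1e-6, 1e-18])
```

In double precision only a few iterates of a high-order method are usable before the residual reaches rounding level, which is consistent with the use of multiple-precision arithmetic in the experiments.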
To connect the analysis of computational efficiency with the numerical examples, we apply the definition of the computational cost (31), for which an estimate of the factors $\mu_0$ and $\mu_1$ is required. To this end, we express the cost of evaluating the elementary functions in terms of products, which depends on the computer, the software, and the arithmetic used (see [21, 22]). Table 1 shows an estimate of the cost of the elementary functions in product units, where the running time of one product is measured in milliseconds. For the hardware and software used in the present numerical work, the computational cost of a quotient with respect to a product is given in Table 1.
We consider the following problems for numerical tests.
Problem 1. Consider the system of two equations: with initial approximation . The solution is . The concrete values of parameters in this case are , which we use in (38)–(46) for calculating computational costs and efficiency indices of different methods.
Problem 2. Consider the mixed Hammerstein integral equation (see [2, 17]): where ; and the kernel is We transform the above equation into a finite-dimensional problem by using Gauss-Legendre quadrature formula given as where the abscissas and the weights are determined for by Gauss-Legendre quadrature formula. Denoting the approximation of by , we obtain the system of nonlinear equations: where The initial approximation assumed is and the solution of this problem is , . Here the values of are .
Problem 3. Consider the system of three equations: with initial approximation . The solution correct to 16 places of decimal is , and the concrete values of parameters are .
Problem 4. Consider the system of eight equations: The initial approximation chosen is for the solution , . Here the values of are .
Problem 5. Consider the system of thirteen equations: with the initial value towards the solution ,, . For this problem the corresponding values of parameters are .
Problem 6. Consider the system of fifteen equations: with initial value . The solution of this problem is ,, and .
Problem 7. Now consider the system of twenty equations: with initial approximation and the solution . The concrete values of parameters in this case are .
The numbers of iterations , the computational order of convergence , the computational costs , in terms of products, the computational efficiencies , and the mean CPU time (CPUtime) for each method are displayed in Table 2. Computational cost and efficiency are calculated according to the corresponding expressions given by (38)–(46) by using the values of parameters , , and as shown at the end of each problem, while taking in each case. The mean CPU time is calculated by taking the mean of 50 performances of the program, where we use as the stopping criterion in single performance of the program.
From numerical results, we can observe that like the existing methods the present methods show consistent convergence behavior. It is also clear that the computational order of convergence overwhelmingly supports the theoretical order of convergence. As far as the verification of the results of Theorem 5 is concerned, it is simple to check the theoretical results of statement (i) using the numerical values of the efficiency indices displayed in the second last column of Table 2. However, the results of statement (ii) are not quite obvious to verify. In order to do this, we first find , , , and using the values of , , and obtained for each numerical problem and then we compare the efficiencies as per the rules (a)–(d) of statement (ii). The results are displayed in Table 3. From Table 2 we can see that the numerical values of , , , and also confirm the results as shown in Table 3.
Comparison of the numerical results shows that one of the existing methods is the most efficient among the third-order methods. However, the present third-order method is the most efficient among the remaining third-order methods in the majority of the problems. The present fifth-order and sixth-order methods are more efficient than the existing methods of the same and lower order for larger systems. This behavior can be observed in the numerical results of Problems 5–7. From the last two columns of Table 2, one can conclude that the higher the efficiency of a method, the lower its computing time. This shows that the efficiency results are in agreement with the CPU time used in the execution of the program.
7. Concluding Remarks
In the foregoing study, we have developed iterative methods of third, fifth and sixth-order of convergence for solving systems of nonlinear equations. The computational efficiency in its general form is discussed. Then a comparison between the efficiencies of proposed methods with existing methods is made. It is proved that the present methods are at least competitive with existing methods of similar nature; in particular, these are especially efficient for larger systems. To illustrate the new techniques, seven numerical examples are presented and completely solved. The performance is compared with some known methods of similar character. The theoretical order of convergence and the analysis of computational efficiency are verified in the considered examples. The numerical results have confirmed the robust and efficient nature of the proposed techniques.
A. M. Ostrowski, Solution of Equations and Systems of Equations, Academic Press, New York, NY, USA, 1966.
J. M. Ortega and W. C. Rheinboldt, Iterative Solutions of Nonlinear Equations in Several Variables, Academic Press, New York, NY, USA, 1970.
F. A. Potra and V. Pták, Nondiscrete Induction and Iterative Processes, Pitman, Boston, Mass, USA, 1984.
C. T. Kelley, Solving Nonlinear Equations with Newton's Method, SIAM, Philadelphia, Pa, USA, 2003.
J. F. Traub, Iterative Methods for the Solution of Equations, Prentice-Hall, Englewood Cliffs, NJ, USA, 1964.
M. Grau-Sánchez, Á. Grau, and M. Noguera, “On the computational efficiency index and some iterative methods for solving systems of nonlinear equations,” Journal of Computational and Applied Mathematics, vol. 236, no. 6, pp. 1259–1266, 2011.
S. Wolfram, The Mathematica Book, Wolfram Media, Champaign, Ill, USA, 5th edition, 2003.