Abstract

It is known that the steepest-descent method converges normally at the first few iterations and then slows down. We modify the original steplength and descent direction by an optimization argument, with the new steplength serving as a merit function to be maximized. An optimal iterative algorithm with a multi-vector descent direction in a Krylov subspace is constructed, whose optimal weighting parameters are solved in closed form to accelerate the convergence speed in solving ill-posed linear problems. The optimally generalized steepest-descent algorithm (OGSDA) is proven to be convergent, with a very fast convergence speed, and to be accurate and robust against noisy disturbances, which is confirmed by numerical tests on some well-known ill-posed linear problems and linear inverse problems.

1. Introduction

The steepest-descent method (SDM), which can be traced back to Cauchy (1847), is the simplest gradient method for solving systems of positive definite linear equations. The SDM is effective for well-posed and low-dimensional linear problems; however, for large-scale and ill-posed linear systems it converges very slowly.

Several modifications to the SDM have been proposed. These modifications have renewed interest in the SDM by showing that the gradient vector itself is not a bad choice of descent direction; rather, it is the original steplength used in the SDM that leads to the slow convergence behavior. Barzilai and Borwein [1] presented a new choice of steplength through a two-point stepsize. Although their method does not guarantee a monotonic descent of the residual norms, Barzilai and Borwein [1] were able to produce a substantial improvement of the convergence speed for certain tests of positive definite linear systems. The results of Barzilai and Borwein [1] have encouraged much research on the SDM, for example, Raydan [2, 3], Friedlander et al. [4], Raydan and Svaiter [5], Dai et al. [6], Dai and Liao [7], Dai and Yuan [8], Fletcher [9], and Yuan [10].

Iterative methods for solving systems of algebraic equations can be derived from the discretization of certain systems of ordinary differential equations (ODEs) [11]. In particular, some descent methods can be interpreted as discretizations of gradient flows [12]. Indeed, such continuous algorithms have been investigated in the literature for a long time, for example, by Gavurin [13], Alber [14], Hirsch and Smale [15], and Chu [16]. Lyapunov methods have been used in the analysis of iterative methods by Ortega and Rheinboldt [17] and Bhaya and Kaszkurewicz [11, 18, 19]. Liu [20] has developed a state feedback controlling method together with a Lie-group differential algebraic equations method to solve ill-posed linear systems.

In this paper we solve the linear system (1), in which the unknown vector is determined from a given coefficient matrix, which might be unsymmetric, and from the input data, which might be polluted by random noise. A measure of the ill-posedness of (1) is the condition number of the coefficient matrix, defined in terms of its Frobenius norm.

Instead of (1), we can solve the normal linear system (3). Throughout this paper the superscript “T” signifies the transpose.

We consider iterative methods for solving (3) and define, for any vector, the steepest-descent vector. Ascher et al. [21] have viewed the gradient-descent method (6) as a forward Euler scheme of a time-dependent ODE. An absolute stability bound must be obeyed if a uniform stepsize is employed, where the bound is determined by the eigenvalues of the system matrix.
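Since the displayed equations are not reproduced in this extract, the correspondence can only be sketched in standard notation; the symbols C, b1, and r below are assumed labels for the normal-system matrix, its right-hand side, and the steepest-descent vector.

```latex
% Gradient flow of the quadratic functional f(x) = (1/2) x^T C x - b_1^T x,
% whose gradient is the steepest-descent vector r(x) = C x - b_1:
\[ \dot{\mathbf{x}}(t) = -\mathbf{r}(\mathbf{x}(t)) = -\bigl(C\mathbf{x}(t)-\mathbf{b}_1\bigr). \]
% A forward Euler discretization with uniform stepsize \(\Delta t\) recovers gradient descent,
\[ \mathbf{x}_{k+1} = \mathbf{x}_k - \Delta t\,\bigl(C\mathbf{x}_k-\mathbf{b}_1\bigr), \]
% which is absolutely stable only when \(\Delta t < 2/\lambda_{\max}(C)\).
```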

Specifically, (6) becomes a steepest-descent method (SDM) if the steplength is chosen as the exact line-search value. When the residual is rather small, the calculated descent vector may deviate from the true steepest-descent direction to a great extent due to round-off errors of the computing machine, which usually leads to numerical instability of the SDM. An improvement of the SDM is the conjugate gradient method (CGM), which enhances the search direction toward the minimum by imposing an orthogonality with respect to the residual vectors at each iterative step.

The relaxed SDM (RSDM) for solving (3) is summarized as follows [22, 23].
(i) Give a relaxation parameter and an initial value of the unknown vector.
(ii) For each iteration, repeat the following computations. If the residual satisfies a given stopping criterion, then stop; otherwise, go to step (ii).
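Because the update formulas themselves are not reproduced in this extract, the following is only a minimal sketch of the RSDM structure, assuming the standard relaxed update x ← x − γαr with r = Cx − b1 and the exact line-search steplength α = rᵀr/(rᵀCr); the names A, b, gamma, and tol are illustrative.

```python
import numpy as np

def rsdm(A, b, gamma=0.5, tol=1e-10, max_iter=100000):
    """Relaxed steepest descent on the normal system C x = b1, with
    C = A^T A and b1 = A^T b.  A structural sketch only; gamma and tol
    are illustrative choices."""
    C, b1 = A.T @ A, A.T @ b
    x = np.zeros_like(b1, dtype=float)
    for k in range(max_iter):
        r = C @ x - b1                    # steepest-descent vector
        rr = r @ r
        if np.sqrt(rr) < tol:             # stopping criterion on the residual norm
            break
        alpha = rr / (r @ (C @ r))        # exact line-search steplength
        x -= gamma * alpha * r            # relaxed update
    return x, k
```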

The steepest-descent method is the basis of several gradient-based methods [22, 23], and it is one of the most prominent iterative methods for solving systems of positive definite linear equations. Although the SDM works very well for most linear systems, it loses some of its luster for ill-posed problems such as inverse problems, image processing, and box-constrained optimization. Both the efficiency and the robustness of SDM-type iterative techniques can be improved by using a suitable preconditioning technique.

However, we do not go into the details of the preconditioning techniques for the SDM; instead, we propose an alternative approach that modifies the search direction and the steplength used in the SDM by an optimality argument on a suitably defined merit function. In this paper we explore a generalization of the SDM by introducing the concept of an optimal descent vector to solve linear equations, which is an optimal combination of the steepest-descent vector and the search vectors in a Krylov subspace. Here we modify both the direction and the steplength through a theoretical optimization performed on the derived steplength in the Krylov subspace.

The remaining parts of this paper are arranged as follows. A numerical iterative scheme with a general descent direction is proposed, and its convergence to the minimal point of a quadratic functional is proven in Section 2. In Section 3 we propose a generalization of the steepest-descent method to solve positive definite linear systems by using a new steplength as a merit function to be optimized in a Krylov subspace. Then, a concrete algorithm is constructed in Section 4, resulting in an optimally generalized steepest-descent algorithm (OGSDA) in terms of optimal weighting parameters, which are derived explicitly to maximize the introduced steplength. Numerical examples are given in Section 5 to display some advantages of the OGSDA, which is compared with the CGM, the BBM, the GMRES, and other algorithms. Finally, conclusions are drawn in Section 6.

2. The Convergence of New Algorithm

Solving (3) by the steepest-descent method (SDM) is equivalent to solving the following minimum problem. In the SDM, we search for the next iterate from the current state by minimizing the functional along the steepest-descent direction.

Through some calculations we can obtain the optimal steplength, and thus we have the following iterative algorithm of the SDM. A straightforward calculation reveals that the functional is decreasing, which means that the iterative sequence of the SDM converges to the minimal point.
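The calculation referred to here is standard; a sketch, under the assumption that the functional is the usual quadratic form of the normal system (f, C, b1, and r_k are assumed labels), reads as follows.

```latex
% Assuming f(x) = (1/2) x^T C x - b_1^T x and r_k = C x_k - b_1, expansion along -r_k gives
\[ f(\mathbf{x}_k - \alpha\mathbf{r}_k)
   = f(\mathbf{x}_k) - \alpha\,\mathbf{r}_k^{T}\mathbf{r}_k
     + \tfrac{1}{2}\alpha^{2}\,\mathbf{r}_k^{T}C\mathbf{r}_k , \]
% so setting the derivative with respect to \(\alpha\) to zero yields the steplength and update
\[ \alpha_k = \frac{\mathbf{r}_k^{T}\mathbf{r}_k}{\mathbf{r}_k^{T}C\mathbf{r}_k},
   \qquad
   \mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_k\mathbf{r}_k , \]
% and hence the stepwise decrease
\[ f(\mathbf{x}_{k+1}) - f(\mathbf{x}_k)
   = -\,\frac{\bigl(\mathbf{r}_k^{T}\mathbf{r}_k\bigr)^{2}}{2\,\mathbf{r}_k^{T}C\mathbf{r}_k} \le 0 . \]
```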

Then, we consider a more general scheme (16) with a general descent direction. Similarly, we have the following result.

Lemma 1. Equation (16) results from the following minimization along the descent direction. The functional is decreasing stepwise, and the iterative scheme (16) converges to the minimal point.

Proof. Inserting the update into the functional, we have (19), where we simplify the notation and omit certain terms. Taking the derivative of (19) with respect to the steplength and equating it to zero, we can derive (20). Because it is a steplength, we demand it to be positive. By feeding the above into (18) we can derive the iterative scheme (16).
Inserting (20) into (19) and subtracting the result, we can obtain an inequality which implies that the functional is strictly decreasing. Because the minimal point of the functional is just the exact solution of (3), when the iterative sequence is generated from scheme (16), the functional decreases stepwise to its minimal value and the iterate tends to the minimal point gradually. Thus we conclude that the new scheme is absolutely convergent.

3. A Generalized SDM and Its Optimization

In Lemma 1 we have proven that the new scheme (16) is convergent, but it does not tell us which descent direction is best. Although the SDM uses the steepest-descent direction, it is well known that the steplength given in (13) quickly tends to a small value, causing the slowdown of the SDM scheme.

In order to overcome the slowness of the SDM, Barzilai and Borwein [1] presented a new choice of the steplength through a two-point stepsize. The algorithm of the Barzilai-Borwein method (BBM) is given by the two-point stepsize iteration, with suitable initial guesses. Nowadays, improvements of the original BBM have been developed in the literature [24–30] to treat different ill-posed and inverse problems.
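Since the stepsize formula is not reproduced in this extract, the sketch below assumes the classic two-point BB stepsize α = sᵀs/(sᵀy) with s = x_k − x_{k−1} and y = g_k − g_{k−1}; the variant actually used in the paper may differ, and the names C, b1, and tol are illustrative.

```python
import numpy as np

def bbm(C, b1, x0, tol=1e-10, max_iter=10000):
    """Barzilai-Borwein iteration for an SPD system C x = b1, using the
    classic two-point stepsize alpha = (s^T s)/(s^T y).  A sketch only;
    the first step uses a small fixed gradient step to seed the method."""
    x_old = np.asarray(x0, dtype=float)
    g_old = C @ x_old - b1
    x = x_old - 1e-3 * g_old                # illustrative seeding step
    for k in range(max_iter):
        g = C @ x - b1
        if np.linalg.norm(g) < tol:
            break
        s, y = x - x_old, g - g_old         # two-point differences
        alpha = (s @ s) / (s @ y)           # BB steplength
        x_old, g_old = x, g
        x = x - alpha * g
    return x, k
```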

In the numerical solution of linear systems, the Krylov subspace methods form one of the most important classes of projection-type numerical methods [31–35]. The iterative algorithms that are applied to solve large-scale linear systems are mostly preconditioned Krylov subspace methods [36].

The GMRES used to solve (1) can be summarized as follows [37, 38].
(i) Select the dimension of the Krylov subspace and give an initial guess.
(ii) For each iteration, repeat the following computations. If the residual satisfies a given stopping criterion, then stop; otherwise, go to step (ii).
In the above, the first factor is a rectangular matrix of Arnoldi vectors, the second is an augmented upper Hessenberg matrix with one more row than columns, and the last vector is the first column of the identity matrix.
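As a point of reference, restarted GMRES is available in SciPy; the small run below is purely illustrative (the test matrix, the restart dimension of 30, and the tolerances are not those of the paper's experiments).

```python
import numpy as np
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(0)
n = 50
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # mildly nonsymmetric, well-conditioned test matrix
b = rng.standard_normal(n)

x, info = gmres(A, b, restart=30, maxiter=1000)     # info == 0 signals convergence
print(info, np.linalg.norm(A @ x - b))              # residual norm of the computed solution
```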

3.1. A Generalized SDM

We suppose that the original steepest-descent direction in the SDM is replaced by a new descent vector, defined in (26), which is an optimal combination of the steepest-descent vector and additional search vectors. The coefficients will be optimized in Section 3.2.

Now we describe how to set up these search vectors by the Krylov subspace method. Suppose that we have a Krylov subspace generated by the coefficient matrix from the steepest-descent vector. Then the Arnoldi process is used to orthonormalize the Krylov vectors, such that the resultant vectors satisfy the orthonormality conditions expressed with the Kronecker delta.
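A minimal sketch of this construction (the function name and the breakdown tolerance are illustrative; C and r stand for the normal-system matrix and the steepest-descent vector):

```python
import numpy as np

def arnoldi_basis(C, r, m):
    """Orthonormal basis of the Krylov subspace K_m(C, r) =
    span{r, C r, ..., C^{m-1} r}, built by the Arnoldi process with
    modified Gram-Schmidt.  Breakdown handling is minimal."""
    n = r.shape[0]
    U = np.zeros((n, m))
    U[:, 0] = r / np.linalg.norm(r)
    for j in range(1, m):
        w = C @ U[:, j - 1]
        for i in range(j):                    # orthogonalize against the previous vectors
            w -= (U[:, i] @ w) * U[:, i]
        h = np.linalg.norm(w)
        if h < 1e-14:                         # breakdown: the subspace is exhausted
            return U[:, :j]
        U[:, j] = w / h
    return U
```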

The other way to obtain the search vectors is simply to use the unit vectors of the Euclidean space, where each search vector is a column of the identity matrix. The resultant subspace is named the unit subspace.

3.2. Optimization of Algorithm

Let the expansion matrix be formed so that its columns are the search vectors. Because the search vectors are linearly independent, the expansion matrix has full rank. Then, (26) can be written compactly in terms of the expansion matrix.

Now, in view of (16), the new steplength becomes a function of the weighting coefficients. When the additional weighting coefficients vanish, the above recovers the steplength defined in (13). In order to speed up the convergence we can seek to maximize the steplength, which leads to the following maximization problem. Here, we use the steplength as a merit function to be optimized. We can prove the following result.

Theorem 2. Under the stated assumptions, the descent direction which maximizes (31) is given in closed form; moreover, the steplength defined in (30) is positive.

Proof. We divide this proof into three parts. (A) A Minimization Problem. Inserting (29) into (31), we can obtain the following minimization problem, in which the matrix involved is positive definite. Because the expansion matrix has full rank and the normal matrix is positive definite, it follows from the above equation that their product has full rank. Then, we confirm by (41) that the resulting matrix is positive definite.
From (39) and (40) we can derive the following, where the gradient is taken with respect to the parameter vector.
By using this, we can derive the following governing equation in vector form to solve for the optimal parameters.
Now we prove that the parameter vector solved from the above equation is a minimal point of the functional in (38). We can compute the Jacobian matrix of (45): it is a positive definite matrix, because for any nonzero vector the associated quadratic form is positive, where we have used the positive definite property of the matrices involved. Because the quantity defined by (20) is a steplength, we demand it to be positive (see the proof given below). (B) A Quadratic Equation. Equations (40) and (43) can be written compactly, with the quantities defined accordingly.
From (45) we can observe a proportionality, with a multiplier to be determined, as supposed in (51). From (51), two equations follow. Then, by (42), (49), and (52) we can solve for the multiplier.
Inserting (54) into (39) and (48), we have the following, where the matrix involved is positive semidefinite.
Now, from (53) and (55) we can derive a quadratic equation to solve for the multiplier. (C) A Closed-Form Solution of the Optimal Parameter. Now we prove the required positivity. By using (50), it holds if we can prove (60). From (56) and (41) it follows that the matrix acts as a Penrose pseudoinverse in the Krylov subspace; hence, (60) follows.
According to (58), the relevant coefficient is nonnegative; hence, by (59), (57) has a closed-form positive real solution. Inserting it into (54), we can obtain the closed-form solution of the optimal parameters.
Upon comparing (53) and (30) we can derive (37), with the steplength being positive. Moreover, the final claim follows directly from (30) and (37).

Remark 3. Because of the signs of the coefficients in (57), it has a positive solution as shown in (65) and in addition a negative solution, in view of (58), (50), and (62). Hence, by (30) and (37), the negative root would give a negative steplength, which contradicts (20), where the steplength is required to be positive. Below we give a numerical example to demonstrate this fact.

Corollary 4. If , then one has

Proof. Under this condition, the matrix defined by (28) is nonsingular. Simultaneously, the quantities defined by (56) simplify accordingly, and meanwhile (66) reduces to (70), where we have made the choice stated in the above condition.

Remark 5. Under this condition, by (29) and (70), inserting the result into (31) proves the claim. In any case, under this condition we have the fastest convergence behavior, because the solution is obtained immediately. Usually, we cannot make this choice, because it would lead to the quite difficult task of finding the inverse of a large matrix.

4. An Optimal Generalized Steepest-Descent Algorithm

Thus, we arrive at the following optimally generalized steepest-descent algorithm (OGSDA).
(i) Select the subspace dimension and the convergence tolerance, and give an initial value of the unknown vector.
(ii) For each iteration, repeat the following computations. If the residual satisfies the given stopping criterion, then stop; otherwise, go to step (ii).
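Since the closed-form weighting formulas are not reproduced in this extract, the sketch below only mirrors the structure of an OGSDA-type step: it builds a Krylov basis from the steepest-descent vector and chooses the combination of basis vectors by minimizing the quadratic functional over the subspace, a Galerkin stand-in for the paper's steplength-maximizing weights. All names and default values are illustrative.

```python
import numpy as np

def ogsda_like(A, b, m=10, tol=1e-10, max_iter=1000):
    """Structural sketch of an OGSDA-type iteration; NOT the paper's exact
    closed-form weights.  Each step builds an m-dimensional Krylov basis
    from the steepest-descent vector r and descends along the combination
    of basis vectors that minimizes the quadratic functional over the
    subspace (assumes C = A^T A is positive definite there)."""
    C, b1 = A.T @ A, A.T @ b
    x = np.zeros_like(b1, dtype=float)
    for k in range(max_iter):
        r = C @ x - b1
        if np.linalg.norm(r) < tol:
            break
        # Arnoldi basis of K_m(C, r); r/||r|| is its first column, so the
        # span already contains the steepest-descent direction.
        basis = [r / np.linalg.norm(r)]
        for _ in range(m - 1):
            w = C @ basis[-1]
            for q in basis:
                w -= (q @ w) * q
            h = np.linalg.norm(w)
            if h < 1e-14:
                break
            basis.append(w / h)
        U = np.column_stack(basis)
        M = U.T @ C @ U                       # small SPD matrix
        coeff = np.linalg.solve(M, U.T @ r)   # optimal combination weights
        u = U @ coeff                         # generalized descent vector
        alpha = (u @ r) / (u @ (C @ u))       # line-search steplength (Lemma 1)
        x -= alpha * u
    return x, k
```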

Remark 6. Depending on the choice of the search vectors, the optimal algorithm is further classified into two types. When the search vectors are constructed from the Krylov subspace, followed by the Arnoldi process to orthonormalize the Krylov vectors, the resultant optimal algorithm is named the OGSDA with the Krylov subspace method. On the other hand, if each search vector is simply a column of the identity matrix, then the resultant optimal algorithm is named the OGSDA with the unit subspace method. Both methods have their own advantages. In the Krylov subspace method, the expansion matrix is not fixed, and it can adjust its configuration at each iterative step to accelerate the convergence, which however leads to a consumption of computational time in the calculation of the required matrix inverse. On the other hand, in the unit subspace method, the expansion matrix is fixed, and we only need to calculate the associated quantities one time, which saves much more time than the Krylov subspace method; however, it cannot adjust the configuration of the descent direction at each iterative step, which might lead to ill-conditioning in the course of the computation.
We can evaluate the cost of computing the approximate solution by the OGSDA with a given number of steps. We first estimate the cost of computing the scalar weights, which at each iterative step requires the inversion of a small matrix, three matrix-vector multiplications, and five inner products of vectors. This portion is very time saving, because the subspace dimension is much smaller than the problem dimension. The computation of the Arnoldi vectors requires a further number of multiplications, as does the computation of the descent vector. Dividing the total count by the number of steps, we obtain the average number of multiplications per step. The computational cost of the OGSDA is slightly larger than that of the GMRES and the BBM per step. However, because the convergence speed of the OGSDA is very fast, the total computational time is less than that of the GMRES and the BBM.

5. Numerical Examples

In order to evaluate the performance of the OGSDA, we test it on some well-known ill-posed linear problems: the Hilbert problem, the backward heat conduction problem, the heat source identification problem, and the inverse Cauchy problem.

5.1. Example 1

First, to test the performance of the OGSDA on the solution of a linear system, we consider the following convex quadratic programming problem with equality constraints, where (74) provides the linear constraints. According to the Lagrange theory, we need to solve (1) with the corresponding coefficient matrix and right-hand side.
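Although the explicit matrices are not reproduced in this extract, the Lagrange-theory construction referred to here has the standard saddle-point form sketched below; the symbols Q, c, B, and d are assumed labels for the quadratic cost, the linear cost, the constraint matrix, and the constraint right-hand side.

```latex
% For  min_x (1/2) x^T Q x + c^T x  subject to  B x = d,  stationarity of the Lagrangian
% L = (1/2) x^T Q x + c^T x + \lambda^T (B x - d)  gives a linear system of the form (1):
\[ \begin{pmatrix} Q & B^{T} \\ B & 0 \end{pmatrix}
   \begin{pmatrix} \mathbf{x} \\ \boldsymbol{\lambda} \end{pmatrix}
   =
   \begin{pmatrix} -\mathbf{c} \\ \mathbf{d} \end{pmatrix}, \]
% whose coefficient matrix is symmetric but indefinite, so the normal system (3) applies.
```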

To be definite, we take specific data for the problem. Under the chosen parameters and starting from the given initial values of the unknowns and the two Lagrange multipliers, we apply the OGSDA to solve the resultant five-dimensional linear system, which converges within 38 steps, as shown in Figure 1(a) for the residual. In Figures 1(b) and 1(c) we show, respectively, the optimal parameters and the associated values, from which the property proven by (59) can be observed. Now we raise the subspace dimension, tighten the convergence criterion, and keep the other parameter fixed. The OGSDA with the Krylov subspace method then converges within only three steps. The final solution attains a minimal value at the computed point.

In order to demonstrate that the solution given by (67) is incorrect, we use the corresponding algorithm to solve the above problem under the stated parameters. First, the algorithm does not converge within 500 steps even under a weak convergence criterion, as shown in Figure 2(a). Moreover, the algorithm gives a negative steplength, as shown in Figure 2(b).

5.2. Example 2

Finding a polynomial of a given order to best match a continuous function on a given interval leads to a problem governed by (3), where the coefficient matrix is the Hilbert matrix defined by (78), the unknown vector is composed of the polynomial coefficients, and the right-hand side is uniquely determined by the function to be matched.

The Hilbert matrix is a notorious example of a highly ill-conditioned matrix. When the matrix in (3) has a large condition number, an arbitrarily small perturbation of the data on the right-hand side may lead to an arbitrarily large perturbation of the solution on the left-hand side.
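A small illustration of this conditioning behavior (the dimensions shown are illustrative, not those used in the paper's tests):

```python
import numpy as np
from scipy.linalg import hilbert

# The condition number of the Hilbert matrix H_ij = 1/(i + j - 1) grows
# exponentially with the dimension; beyond n of roughly 13 it exceeds
# what double precision can resolve.
for n in (6, 10, 14):
    H = hilbert(n)
    print(n, np.linalg.cond(H))
```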

In this example we consider a highly ill-conditioned linear system (3) with the matrix given by (78). The ill-posedness of (3) increases rapidly with the dimension. We consider an exact solution whose components are given in terms of random numbers within a prescribed range.

Liu [22] has applied the relaxed steepest-descent method (RSDM) to solve the Hilbert problem with a stopping criterion on the relative residual. Through 50000 iterations the numerical solution converges to the exact solution very accurately, as shown in Table 1. When we apply the OGSDA with the unit subspace method to solve this problem, it converges in only three steps. It is very time saving because we only need to calculate the required inverse one time; using the matrix conjugate gradient method (MCGM) developed by Liu et al. [39], we spend only 103 steps under the given convergence criterion to find it. For the purpose of comparison, the values obtained by other numerical methods are also listed in Table 1.

As shown in Corollary 4, when the subspace dimension is taken to be the full dimension, we have the fastest convergence. Usually, we can take a smaller dimension so that the associated small system is easy to solve. For example, if we take a modest dimension in the OGSDA with the Krylov subspace method, we can obtain the solution through only four steps. Under the same conditions, the CGM converges in six steps.

Next we consider a more difficult case with a larger dimension. In the computation a random noise is imposed on the right-hand side, and the convergence criteria used in the CGM and the OGSDA are both set on the relative residual. The CGM converges very fast, within 4 iterations, as shown in Figure 3(a); however, the maximum error of the CGM is as large as 0.25. For this noisy case the OGSDA is applicable and converges within 4 iterations. The maximum error of the OGSDA is 0.0113. In Figure 3 we compare (a) the relative residuals, (b) the steplength of the OGSDA, and (c) the numerical errors of the OGSDA, CGM, GMRES, and BBM. It is obvious that for this highly ill-posed problem the OGSDA converges fast and is accurate. The solutions obtained by the CGM, GMRES, and BBM are not good, although the GMRES converges in only two steps, the CGM in four steps, and the BBM in ten steps.

5.3. Example 3

When we apply a central difference scheme to the following two-point boundary value problem, we can derive a linear system (82), where the unknowns are the values of the solution at the interior grid points, the grid spacing is determined by the spatial length, and the boundary values are given. The above matrix is known as a central difference matrix.

The eigenvalues of the central difference matrix are known in closed form; together with the symmetry of the matrix, they indicate that it is positive definite and that its condition number becomes large when the grid number is very large.
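A quick check of this conditioning (the choice of 199 interior nodes is an assumption made here; it reproduces a two-norm condition number close to the value 16210.7 quoted below):

```python
import numpy as np

# Tridiagonal central-difference matrix (2 on the diagonal, -1 off the
# diagonal); the condition number is independent of the 1/dx^2 scaling.
n = 199                                        # assumed number of interior nodes
D = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
print(np.linalg.cond(D))                       # about 1.62e4
# The same value from the known eigenvalues 2 - 2*cos(k*pi/(n+1)):
print((1 - np.cos(n * np.pi / (n + 1))) / (1 - np.cos(np.pi / (n + 1))))
```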

In this numerical test we fix the grid number such that the condition number of the matrix is 16210.7. Let us consider the boundary value problem in (81). The exact solution is prescribed, with the parameters in it fixed.

A relative random noise with a prescribed intensity is added to the right-hand-side data of (82). Under a moderate convergence criterion on the relative residual, we compare (a) the relative residuals and (c) the numerical errors of the OGSDA, CGM, GMRES, and BBM in Figure 4. The CGM requires 1325 iterations, the OGSDA requires 66, the GMRES requires 811, and the BBM requires 1708; the corresponding maximum errors are compared in Figure 4(c).

It can be seen that the OGSDA converges faster and is more accurate than the CGM and the GMRES, and it is much faster than the BBM; however, the accuracy of the BBM is slightly better than that obtained by the OGSDA.

In order to compare the efficiency of the present OGSDA with that of the CGM, we can compare the convergence rates of these two algorithms. According to Liu [40, 41], the convergence rate is defined in terms of the residual and the supplemental vector used in the CGM. In Figure 4(b) we compare the convergence rates of the CGM and the OGSDA, from which we can see that the convergence rate of the OGSDA is much larger than that of the CGM.

5.4. Example 4

We consider the backward heat conduction problem (BHCP) in a spatial interval, subject to boundary conditions at the two ends of a slab, and solve it under a final time condition.

The fundamental solution of (87) is expressed in terms of the Heaviside function.

The method of fundamental solutions (MFS) has a serious drawback: the resulting linear system is always highly ill-conditioned when the number of source points is increased or when the distances of the source points are increased.

In the MFS the solution at a field point can be expressed as a linear combination of the fundamental solutions, as in (91), where the unknown coefficients multiply basis functions centered at source points located in the complement of the solution domain. For the heat conduction equation the corresponding basis functions are given accordingly.

It is known that the location of the source points in the MFS has a great influence on the accuracy and stability. In a practical application of the MFS to solve the BHCP, the source points are uniformly located on two vertical straight lines parallel to the time axis, not over the final time; this arrangement was adopted by Hon and Li [42] and Liu [43] and shows a large improvement over locating the source points on a line below the initial time. After imposing the boundary conditions and the final time condition on (91), we can obtain a linear system (93) for the expansion coefficients.

Since the BHCP is highly ill-posed, the ill-conditioning of the coefficient matrix in (93) is serious. To overcome the ill-posedness of (93) we can use the OGSDA to solve this problem. Here we compare the numerical solution with an exact solution. For the case considered, the value of the final time data is small compared with the value of the initial temperature to be retrieved. First we impose a relative random noise on the final time data. Under the chosen parameters, we solve this problem by the OGSDA with the Krylov subspace method. With 28 iterations the OGSDA can obtain a very accurate solution, which is better than that calculated by Liu [35] using the optimal multivector iterative algorithm (OMVIA). In Figures 5(a) and 5(b) we plot the relative residual and the steplength, from which we can see that the minimal steplength is 1. The numerical result is compared with the exact initial value in Figure 5(c), and its numerical error, as shown in Figure 5(d), indicates that the present OGSDA is very robust against noise. Under the same parameters we also apply the BBM and the GMRES to solve the BHCP; both algorithms cannot converge to satisfy the convergence criterion, so we let the BBM run 100 steps and the GMRES run 30 steps, with the residuals shown in Figure 5(a). The maximum error obtained by the BBM is 0.151, while that of the GMRES is 0.809, as shown in Figure 5(d). For this problem the OGSDA quickly provides a very accurate numerical result, which is much better than those of the other three algorithms.

5.5. Example 5

Let us consider the inverse Cauchy problem for the Laplace equation, where the prescribed boundary data are given functions. The inverse Cauchy problem is specified as follows.

We seek an unknown boundary function on a part of the boundary under (96)–(98), with the overspecified data being given on the other part.

It is well known that the method of fundamental solutions (MFS) can be used to solve the Laplace equation when a fundamental solution is known. In the MFS the solution at a field point can be expressed as a linear combination of fundamental solutions, as in (99). For the Laplace equation (96) the fundamental solutions are given accordingly.

In the practical application of the MFS, by imposing the boundary conditions (97) and (98) at collocation points in (99), we can obtain a linear system (101), in which the source points are located along a contour obtained from the boundary by a radial offset. Once the linear system (101) is established, we can apply the OGSDA to solve it.
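A minimal sketch of this MFS assembly for the two-dimensional Laplace equation, assuming the standard logarithmic fundamental solution ln|x − s|; the circular geometry, the offset factor, and the least-squares solve (used here as a stand-in for the OGSDA) are illustrative choices, not the paper's amoeba-like domain or solver.

```python
import numpy as np

def mfs_laplace_demo(n=60, offset=1.5):
    """Collocate a 2D Laplace solution on the unit circle with MFS sources
    on a concentric circle of radius `offset`; verify at an interior point."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    bx, by = np.cos(theta), np.sin(theta)                     # collocation points on the boundary
    sx, sy = offset * np.cos(theta), offset * np.sin(theta)   # source points outside the domain
    # Collocation matrix K[i, j] = ln |x_i - s_j|  (highly ill-conditioned)
    K = np.log(np.hypot(bx[:, None] - sx[None, :], by[:, None] - sy[None, :]))
    u_exact = lambda x, y: x**2 - y**2                        # a harmonic test solution
    coeff, *_ = np.linalg.lstsq(K, u_exact(bx, by), rcond=None)
    xt, yt = 0.3, 0.2                                         # interior check point
    u_num = np.log(np.hypot(xt - sx, yt - sy)) @ coeff
    return u_num, u_exact(xt, yt)

print(mfs_laplace_demo())
```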

For the purpose of comparison we consider an exact solution defined in a domain with a complex amoeba-like irregular boundary. After imposing the boundary conditions (97) and (98) at collocation points in (99), we obtain a linear system. A random noise is imposed on the measured data.

We solve this problem by the OGSDA. Through 15 iterations the residual becomes smaller than the convergence criterion, as shown in Figure 6(a), where the steplength is also shown. We compare the boundary data recovered by the OGSDA with the exact ones in Figure 6(b). The numerical error, as shown in Figure 6(c), is smaller than 0.062. It can be seen that the OGSDA with the Krylov subspace method can accurately recover the unknown boundary condition.

We also apply the GMRES and the BBM to this inverse Cauchy problem; their residuals are shown in Figure 7. They quickly tend to a plateau, after which the residuals are no longer reduced. The numerical solutions are very poor, with the maximum error of the BBM being 3.79 and that of the GMRES being 2.81. Compared with those of the GMRES and the BBM, the accuracy and convergence speed of the OGSDA are much better.

5.6. Example 6

In this section we apply the OGSDA to identify an unknown space-dependent heat source function for a one-dimensional heat conduction equation. In order to identify the heat source we impose a Neumann-type boundary condition.

We propose a numerical differentiation method. Taking the derivatives of (106), (107), and (109) with respect to time and making the appropriate substitution, we can derive an inverse heat conduction problem (IHCP) for an auxiliary variable without using the initial condition.

Therefore, as a numerical method, we can first solve the above IHCP by using the MFS of Section 5.4 to obtain a linear system, and then use the method introduced in Section 4 to solve the resultant linear system. Thus, we can construct a solution by (114) which automatically satisfies the initial condition in (108).

From (114) we obtain an expression which, inserted into (106), leads to an equation for the heat source. Inserting (110) into that equation and integrating it, we can derive equation (117) to recover the heat source. This approach exhibits threefold ill-posedness: first, the IHCP is used to provide the data employed in (117); second, all the boundary conditions are obtained from the first-order derivatives of measured data, as shown in (111)–(113); and third, (117) involves the second-order derivative of the data.

We consider a prescribed exact heat source. In (117) we disregard the ill-posedness of the auxiliary data and suppose that those data are given exactly. We solve this problem by the OGSDA with the unit subspace method. The maximum number of iterations is set to 100. A random noise is added to the data. We obtain the numerical solution with the steplength shown in Figure 8(a), which is very small. We compare the heat source computed by the OGSDA with the exact one in Figure 8(b). The numerical error is smaller than 0.568, as shown in Figure 8(c).

6. Conclusions

In the present paper, we have derived an optimal algorithm, with a multi-vector optimal search direction in a Krylov subspace or a unit subspace, as an optimally generalized steepest-descent algorithm (OGSDA) to solve highly ill-posed linear systems. This algorithm shows good computational efficiency and accuracy in solving ill-posed linear equations for the Hilbert problem and for three famous linear inverse problems: the backward heat conduction problem, the inverse Cauchy problem, and the inverse heat source problem. In particular, the OGSDA has a good filtering effect against noise, such that for all the numerical examples of inverse problems the recovered numerical results are quite smooth. The robustness of the OGSDA was confirmed by imposing noisy disturbances on the given data, even with an intensity as large as 10%. In all the numerical examples, convergence takes between 4 and 100 steps, which is very time saving; no case runs for more than one second on a PC. For highly ill-posed problems, the OGSDA performs better than the other algorithms investigated in this paper, without needing a large Krylov subspace dimension.

Conflict of Interests

This paper is purely academic research, and the author declares that there is no conflict of interests regarding its publication.

Acknowledgments

The project NSC-102-2221-E-002-125-MY3, the 2011 Outstanding Research Award from the National Science Council of Taiwan, and the 2011 Taiwan Research Front Award from Thomson Reuters are highly appreciated. The author thanks the anonymous referee for the comments which improved the quality of this paper.