Abstract

In this paper, we mainly focus on the development and study of a new global GCRO-DR method that allows both the flexible preconditioning and the subspace recycling for sequences of shifted linear systems. The novel method presented here has two main advantages: firstly, it does not require the right-hand sides to be related, and, secondly, it can also be compatible with the general preconditioning. Meanwhile, we apply the new algorithm to solve the general coupled matrix equations. Moreover, by performing an error analysis, we deduce that a much looser tolerance can be applied to save computation by limiting the flexible preconditioned work without sacrificing the closeness of the computed and the true residuals. Finally, numerical experiments demonstrate that the proposed method illustrated can be more competitive than some other global GMRES-type methods.

1. Introduction

In the current study, we focus on the solution of sequences of shifted linear systems of the formwhere are large nonsingular matrices of order , the right-hand sides are vectors of the length , and are called shifts. Such shifted linear systems (1) arise in many realistic applications, for example, electromagnetic scattering [1], QCD [25], dynamics of structures [6], complex network [79], and model reduction [10].

Note that problem (1) is reduced to the solution of a sequence of linear systems with the formwhen , where the coefficient matrix and the right-hand sides change from one system to the next. One approach to more efficient solution of such systems is to exploit Krylov subspace recycling; that is, recycling a judiciously selected subspace of the search space typically reduces the number of iterations substantially. There are several algorithms that take advantage of recycling for subsequent systems as well as for restarts of generalized minimal residual (GMRES) type methods, such as the generalized conjugate residual method with inner orthogonalization (GCRO) [11], GCRO with deflated restarting (GCRO-DR) [12], GMRES with deflated restarting (GMRES-DR) [13] (only for a single system), and GCRO with optimal truncation (GCROT) [14].

It is also noted that, for each , we substantially solve a family of shifted linear systems with the form

Krylov subspace solvers [15] are particularly attractive for the solutions of (3) because of its shift-invariance property [16]. That is, for any shift , the -th Krylov subspace satisfiesas long as . This property has led to many approaches for solving such linear systems (3) simultaneously by generating only one subspace. Some solvers are built upon the (Bi-)Lanczos recurrences [1721] and some others use the Arnoldi-like iteration [16, 2228]. However, the shift-invariance property will be lost when using the general preconditioning, such as Jacobi preconditioning, SSOR preconditioning, ILU preconditioning, AMG preconditioning, and AINV preconditioning. Furthermore, when are unrelated, we cannot employ the shift-invariance property. It means that all residuals are not simultaneously minimized whiling maintaining collinearity, for example, the GMRES-type methods [16, 23, 29].

Therefore, we propose an alternative without exploiting the shift-invariance. In fact, the solution of (3) is equivalent to solve a Sylvester equation [30],where and . Therefore, (1) can also be reformulated as a family of Sylvester equations with the slowly changing coefficient matrices

For simplicity of the presentation, a linear operator is defined as follows:

Applying the linear operator , we can rewrite (1) as

Moreover, we also assume that the linear operator is invertible; that is,

Obviously, the linear systems (8) can be seen as generalized version of (2). Therefore, based on the special structure of (1), we propose a new generalized global version of the GCRO-DR method that not only combines with any type of the preconditioning but also allows one to retain important spectral information generated during the solution of the -th linear system and then to exploit such information to accelerate convergence of the iterations when solving the subsequent -th system. Moreover, the proposed method can be combined with any form of preconditioning and does not require the right-hand sides to maintain collinearity.

For simplicity, we denote by . In the following, we utilize the operator to present our flexible global GCRO-DR algorithms for solving (1).

The paper is organized as follows. A global version of GCRO-DR to solve (1) is derived in detail in Section 2. We also apply the flexible preconditioning operator at each iteration to improve the global GCRO-DR algorithm. By performing an error analysis, we deduce that a much looser tolerance can be applied to save computation by limiting the flexible preconditioned work without sacrificing the closeness of the computed and the true residuals. Meanwhile, we also propose a global version of the FGMRES-DR method and then analyze the relationship between two flexible global methods. In Section 2.5, the flexible global GCRO-DR method is applied to solve the general coupled matrix equations. Section 3 numerically demonstrates the effectiveness of the novel method. Finally, some conclusions and remarks are summarized in Section 4.

2. Flexible Global GCRO-DR Method

GCRO-DR is a well-established Krylov subspace solver that paired with a harmonic Ritz vector recycling strategy [12]. GCRO [11] is a nested Krylov method, which uses GCR as the outer solver and the restarted GMRES as an inner method. In [12], Parks et al. propose GCRO-DR that exploits deflated restarting within the framework of GCRO for a family of slowly changing linear equations. Because of its superior numerical behavior, we introduce a global version of GCRO-DR to solve (1). Here, we not only employ the subspace recycling at each cycle but also apply the flexible preconditioning operator at each iteration. The proposed method is referred to as flexible global GCRO-DR (FGL-GCRO-DR).

2.1. Flexible Global Arnoldi Process

We first present a global Arnoldi process used in the flexible setting to create an F-orthonormal basis for the global Krylov subspace in Algorithm 1. We define as the preconditioning operator used at the -th iteration, and the action on rectangular matrix is denoted by .

The flexible global Arnoldi process, recursively, generates an F-orthonormal basis of the global Krylov subspace spanned by the rectangular matrices . At the end of the -th iteration, we obtain a matrix relation (flexible global Arnoldi relation); that is,where , , andis an upper Hessenberg matrix.

2.2. The FGL-GCRO-DR Algorithm

Suppose that and are given matrices satisfyingwhere is a search vector, recycled from either the previous linear system or the previous cycle, and . By minimizing the residual F-norm,over the search space , and satisfy

Based upon what is proven in [11, 14] and using the framework discussed in [12, 32], the steps of the flexible global Arnoldi process are carried out, with the starting rectangular matrix , while holding the orthogonality to , to eventually obtain

Since , that is, , the following flexible global Arnoldi-like relation holds:

At the end, we find the approximate solution by minimizing the norm , where satisfieswith .

To improve the convergence of iterative solvers, the smallest magnitude eigenvalues will be deflated. Following the idea of [12, 32], we use harmonic Ritz vectors with those eigenvalues to construct approximate invariant subspace and then the subspace for next cycles is retained. Firstly, we compute harmonic Ritz vectors , such thatwhere , and each harmonic Ritz pair satisfies

It is noted that the harmonic residual vectors satisfy

Observe that the residual . So, the residual and the harmonic residuals are in the same -dimensional space . Then we can characterize the relationship by the following formula:where satisfies , , and .

Due to the relations (10) and (21) can be also expressed as

Next, we compute the reduced QR-factorization of with and and then define , assuch that . Since is -orthogonal to the subspace and is -orthonormal, we have

Therefore, at the end of the loop, an update is generated, scaled such that has -orthonormal columns; the previous global Krylov subspace basis is dropped, and then we restart. Details of the FGL-GCRO-DR method are summarized in Algorithm 2.

Input:, . Choose , , and initial guesses .
Output: such that .
(1)
(2)Whiledo
(3)If
(4),
(5)Apply Algorithm 2 to generate , , such that , and set
(6)Else
(7)Apply steps of the Algorithm 2 with the starting rectangular matrix , such that with
(8)Endif
(9) with
(10)Set ,
(11)Compute the smallest eigenpairs of ,
(12)
(13)
(14)
(15)
(16)
(17)Endwhile
and set ;
for, do
for, do
  , ;
  ;
end for
and set
end for

In addition, the major advantage of FGL-GCRO-DR is that it can retain important spectral information generated during the solution of the -th linear system and then exploit such information to accelerate convergence of the iterations when solving the subsequent -th system. Then we detail how to recycle subspace when solving sequences of slowly changing linear systems (8) as follows (Algorithm 3).

(1)Suppose that , and are defined from solving -th linear system linear and satisfy with .
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)Apply steps of Algorithm 2 with the starting rectangular matrix , such that with
(10) with
(11)
(12)
(13)
2.3. Comments on the FGL-GCRO-DR Method

To achieve better efficiency of the linear system solvers, we not only employ the subspace recycling at each cycle but also apply the flexible preconditioning operator at each iteration. As we know, the variable preconditioning can be construed as an inexact subspace method. If the matrix is difficult to invert, we will resort to several steps of iterative method as flexible preconditioning, so we obtain in some way an approximate result,at -th (inner) iteration for the equation . In such a situation, what are the properties of the true residual obtained by using the flexible preconditioning? We will address this question in this section. Following [33, 34], we setto be the residual that results after the iterative solver has been terminated; then

Therefore, the perturbation in linear operation with can be written explicitly as

The perturbation implies that the error computed has the bound

In fact, the solution is obtained directly from the approximation subspace ; that is,indicating that the perturbation does not affect the computation of the true residual. In other words, in exact arithmeticand is orthogonal to . As a consequence, the perturbation of the linear operations only influences the goodness of and the accuracy of the approximate solution. Therefore, we deduce that, in the flexible setting, a dynamic stopping criterion or a much looser tolerance can be employed to save computation by limiting the flexible preconditioning work without sacrificing the closeness of the computed and the true residuals.

2.4. The Relationship between Two Flexible Global Methods with Subspace Recycling

As we know, FGCRO-DR [32] and FGMRES-DR [35] are two important methods in the general category of augmented (deflated) GMRES-type methods, which choose harmonic Ritz vectors as augmented (deflated) subspace. They emphasize on the solution of linear system which has only one equation. The relationship between FGCRO-DR and FGMRES-DR has been well described by [32]. It is shown that they are algebraically equivalent as long as a collinearity condition is satisfied.In fact, FGL-GCRO-DR is a global generalization of FGCRO-DR. For the sake of descriptive integrality, we also present a global generalization of FGMRES-DR algorithm, referred to as FGL-GMRES-DR, in Algorithm 4. As for global versions, the FGL-GCRO-DR method is also closely related to the FGL-GMRES-DR approach in algebraical analysis, but their numerical behavior will be slightly different. Details will be shown in Section 3.

Input:, . Choose , , and initial guesses
Output: such that
(1), ,
(2)Apply Algorithm 2 to generate , , such that
(3)Solve for . Set ,
(4)whiledo
(5)Compute the smallest eigenpairs of .
(6)
(7)
(8)
(9)
(10)Apply steps of Algorithm 2 with the starting rectangular matrix , such that with
(11) with
(12)
(13)
(14)endwhile

Furthermore, we compare the main calculation cost and storage capacity for each cycle of FGL-GCRO-DR and FGL-GMRES-DR in Table 1. In practical applications, . Here, and denote floating-point calculation counts for the preconditioning application and the linear operations, respectively.

The computational cost of harmonic Ritz vectors and “” have not been induced in Table 1 since the cost is approximately , much more less than . is referred to as the total cost of FGL-GCRO-DR. One sees that the time complexities of two global algorithms are the same, that is, , but FGL-GMRES-DR requires slightly more computation per cycle than FGL-GCRO-DR in terms of computational cost.

2.5. Application to the General Coupled Matrix Equations

FGL-GCRO-DR substantially belongs to the family of the global Krylov subspace method, which can be also applied to solve the general coupled matrix equations of the formincluding the (coupled) Sylvester and the (coupled) Lyapunov matrix equations. The literature on solving (32) is particularly rich and still growing [3653], due to the prominent role of such matrix equations in stability theory [54], perturbation analysis [55], control theory [56, 57], and so forth. In particular, the global subspace solvers perform better than other approaches for solving (32), especially for the coefficient matrices which are large and sparse; for more details, see [4453, 58]. Therefore, we are interested in exploiting the FGL-GCRO-DR method for solving (32). To do so, the linear operator can be defined as follows:where . Exploiting the operator , we can rewrite (32) aswhere . The effectiveness of FGL-GCRO-DR for the solution of (32) is shown in Section 3.

3. Numerical Experiments

This section presents some numerical examples to illustrate the effectiveness of our algorithm for solving shifted linear system (1) and matrix equations (8) and compare the performance with the global GMRES approach (referred to as GL-GMRES ) [31, 50, 59] and GL-GMRES-DR [60], with or without preconditioning. In each example, the flexible preconditioning is dominated by 5 steps of the global full GMRES [50].

We consider three numerical illustrations here. The first two experiments are sequences of linear systems arising from lattice QCD [24, 61] and the finite difference scheme of the convection-diffusion-reaction equation [62], respectively. The last one is the general coupled Sylvester matrix equations. The comparisons are mainly made in two aspects: the execution time in seconds (T (s), for short) and number of matrix-vector products (mvps, for short).

All the numerical experiments are carried out on MATLAB 2017b. We choose , , and . The stopping criteria are for all the approaches, or when mvps exceed the set maximum of matrix-vector multiplications (, for short). Moreover, the symbol represents that the solver does not achieve the stopping criteria within and .

Example 1. We first consider two groups of complex Lattice QCD matrices from Matrix Market (please refer to http://math.nist.gov/Matrix-Market for details). One group has seven matrices and the other has seven matrices. For each matrix , a critical parameter is provided such that with is positive-real. For each set, we take as the sequence matrix, and . The numerical computation is carried out with . The corresponding numerical results are reported in Tables 2 and 3. In addition, we also depict the residual histories for linear systems solved using iterative algorithms with or without flexible preconditioning.
As seen from Tables 2 and 3 and Figures 1 and 2 , it is found that using deflation and flexible preconditioning is conducive to improving the convergence rate. FGL-GCRO-DR has a significant decrease in terms of mvps and T(s), which are similar to FGL-GMRES-DR, but performs slightly better. The GL-GMRES method is less effective and does not converge in some test problems. The above results show that FGL-GCRO-DR has the potential to improve the convergence, compared with other global Krylov subspace solvers.

Example 2. For the second case, we assess the performance of global solvers on three-dimensional separable model convection-diffusion problem in the unit cube with the homogeneous Dirichlet boundary conditions [62]:where and . For obtaining the sequence matrices, we set with and and then use the central finite difference scheme to discrete the convection-diffusion equation with the mesh size or . Thus, two groups of complex matrices are obtained. One group has five matrices and the other has five matrices. As the definition of , we choose .
The numerical results reported in Tables 4 and 5 and Figures 3 and 4 show that FGL-GMRES is slightly superior to FGL-GCRO-DR when the mesh size . However, as the number of grid points increases, FGL-GCRO-DR enjoys the significant decrease in the number of mvps and T(s). Meanwhile, according to Tables 4 and 5, we can see that the convergence is speeded up remarkably by considering the flexible preconditioning. Moreover, even the size of grid points increases, and the average number of mvps for FGL-GCRO-DR does not increase.

Example 3. For the last test case, we are interested in solving the general coupled Sylvester matrix equations [45, 47]whereHere, we consider different orders. As the definition of , we choose . The corresponding numerical results of mvps and T(s) are reported in Table 6 and the residual histories are depicted in Figures 5 and 6.
As seen from Table 5 and Figures 5 and 6, it is noted that the flexible deflated GMRES-type methods (FGL-GCRO-DR and FGL-GMRES-DR) are found to be efficient in terms of performance test problems of different orders, while GL-GMRES and FGL-GMRES fail to converge in the given steps with or without variable preconditioning, while the parameter is greater than 80. The convergence is improved remarkably by considering the eigenvalue deflation technique and the flexible preconditioning. One should note that FGL-GCRO-DR enjoys the significant decrease in the number of mvps and T(s) and performs slightly better than FGL-GMRES-DR method. In addition, when increases, the average number of mvps and T(s) for FGL-GCRO-DR do not increase.

4. Conclusions

In this study, a new flexible global GCRO-DR method is proposed for addressing sequences of shifted linear systems and general coupled matrix equations. The proposed method does not require the right-hand sides to be related and can also be compatible with general preconditioning. By performing an error analysis, we deduce that a much looser tolerance can be applied to save computation by limiting the flexible preconditioned work without sacrificing the closeness of the computed and the true residuals. Meanwhile, we also propose a global version of the FGMRES-DR method and then analyze the relationship between two flexible global methods. Finally, numerical experiments show that FGL-GCRO-DR significantly reduces the mvps, saves the runtime, and enjoys faster convergence than some other global GMRES-type approaches on tough problems.

As future work, we plan to investigate the numerical properties of FGL-GCRO-DR on large-scale realistic problems. Of interest are applications related to, for example, model adaptation in Gaussian process models [63] or stochastic finite element methods [64] in three dimensions where variable preconditioning using inexact solvers has to be usually considered. Another extension of this work is a distributed-memory parallel implementation and testing using MPI + OpenMP and MPI + CUDA. Investigation will focus on implementing, testing, and comparing the runtime of the introduced FGL-GCRO-DR on CPUs and GPUs with respect to other existent similar methods.

Data Availability

The data will be made available upon reasonable request via e-mail to the corresponding author.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

All authors collaborated in carrying out the study and all of them read and approved the final manuscript.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 11601365 and 11801463), Project of Shandong Science and Technology Department Foundation (no. J16LI06), Foundation of National Key Laboratory of Computational Physics (no. 6142A05180203), Project of Natural Science Foundation of Shandong Province (no. ZR2016AM06), and the Scientific Research Fund of Hunan Provincial Education Department (no. 19B381).