Stochastic Systems: Modeling, Optimization, and ApplicationsView this Special Issue
A New GMRES() Method for Markov Chains
This paper presents a class of new accelerated restarted GMRES method for calculating the stationary probability vector of an irreducible Markov chain. We focus on the mechanism of this new hybrid method by showing how to periodically combine the GMRES and vector extrapolation method into a much efficient one for improving the convergence rate in Markov chain problems. Numerical experiments are carried out to demonstrate the efficiency of our new algorithm on several typical Markov chain problems.
The Markov chain is a very robust tool for studying stochastic systems overtime and is in a wide range of applications including queueing systems [1, 2], computer and communication systems , information retrieval, and Web ranking [4–7]. For both discrete and continuous Markov chains, it has always been one of the central work to get their numerical solutions by adopting appropriate methods.
In this paper, we consider a class of Krylov subspace methods, restarted GMRES (GMRES()) method, for the numerical solutions of the stationary probability vector of an irreducible Markov chain. Let be a column stochastic matrix; that is, and , with being the column vector of all ones. We seek the stationary vector that satisfies By the Perron-Frobenius theorem [8, 9], if is irreducible, then there exists a unique solution to (1), which is strictly positive ().
Equation (1) is equivalent to the following singular linear problem: where , with being the identity matrix, and is a singular -matrix with the diagonal elements being negative column sums of its off-diagonal elements.
Now there are many numerical methods for solving (2), such as the traditional iterative methods, the power method or weighted Jacobi method [5, 10–12], the aggregation/disaggregation algorithm [13–16], and the Krylov subspace methods like the Arnoldi method and the GMRES [3, 17–21]. However, these iteration methods for calculating may converge very slowly when the subdominant eigenvalue of satisfies [3, 11]. Thus, it becomes quite required to accelerate the calculation in Markov chain problems.
In this paper, we consider the restarted GMRES method and propose a new way to accelerate its numerical calculation by use of the polynomial-type vector extrapolation methods. In fact, seeking the vector extrapolation method as the accelerator is very popular; see [5, 12, 22] for details. In discussing this issue, we start from the mechanisms of the GMRES() method and vector extrapolation method and then illustrate how to periodically knit these two methods. In numerical experiments, the proposed extrapolation-accelerated GMRES() methods are tested on several Markov chain problems, and the experimental results show its effectiveness.
This paper is organized as follows. Section 2 briefly provides the mechanism of the GMRES methods and some acceleration ones. In Section 3, we first review the vector extrapolation method and then consider how to periodically combine the extrapolation method with the GMRES method for numerical calculation of Markov chains. Section 4 provides experimental evidence of the effectiveness of our approach. Section 5 summarizes the thesis and points out the direction of our future research.
In this section, we briefly introduce the mechanism of the GMRES methods and describe some existing modifications to the standard GMRES aimed at accelerating its convergence.
The GMRES method, proposed by Saad and Schultz in , is a popular method for the iterative solution of sparse linear systems with an unsymmetrical matrix, If is an initial guess for the true solution of (3) and is the initial residual, we have the equivalent system: where . Let be the Krylov subspace: GMRES is to find an approximate solution such that Here, denotes the Euclidean norm on , as well as the associated-induced matrix norm on .
GMRES is often referred to as an “optimal” method in the sense that it finds the approximate solution in the Krylov subspace that minimizes the residual . However, the amount of storage and computational work grow quadratically with the number of steps. So the restarted version of the algorithm is often used as suggested in . In restarted GMRES (GMRES()), the method is restarted after each cycle of iteration steps, and the current approximate solution becomes the new initial guess for the next iterates. The mechanism of the GMRES() for Markov chains as system (2) is described as follows. For more details, see [20, 23].
Algorithm 1. The Restarted GMRES. (1)Start: choose , , tol (2)For , do (3)Form the approximate solution: , where minimizes . (4)Restart: while , do and go to step 2.
In general, the convergence rate depends on the restart parameter [24, 25]. Even in the situation in which the appropriate restart parameter results in the satisfactory convergence, the convergence behavior may not be the optimal; since an iterate is restarted, the current approximation space is discarded at each restart, and the orthogonality between approximation spaces generated at successive restarts is not preserved at each restart. For that reason, slow convergence can occur. Therefore, it is necessary to develop more efficient algorithms for enhancing the robustness of restarted GMRES, and thereof now its improvements and modifications continue to receive considerable attentions. Augmented methods are a class of acceleration techniques which seek to avoid stalling by improving information in GMRES at the time of the restart. These acceleration methods are presented by Morgan [19, 26], Saad  and Chapman and Saad . The precondition technique is also often used to accelerate the numerical calculation of GMRES; see [17, 23, 29–32] for more details. Meanwhile, this paper focuses on undertaking the similar works, that is, speeding up the process and improving the robustness of the restarted GMRES.
3. The Main Algorithm and Practical Implementations
Now let us first present the rationale and theory behind vector extrapolation method. When considering the solutions of systems of linear or nonlinear equations by an iterative method, a sequence of vectors (approximation solutions) is yielded. As the classical iteration process may converge slowly, extrapolation strategy can often be applied to enhance its convergence rates. A detailed review of extrapolation methods can be found in the works of Smith et al. , Sidi , and Jbilou and Sadok . Now there are mainly two classes of vector extrapolation methods: (1) polynomial-type methods, namely, the minimal polynomial extrapolation (MPE) of Cabay and Jackson , the reduced rank extrapolation (RRE) of Eddy  and Mešina , and the modified minimal polynomial extrapolation (MMPE) method of Sidi et al. , Brezinski , and Pugachev ; and (2) epsilon algorithms, namely, the topological epsilon algorithm of Brezinski  and the scalar and vector epsilon algorithms of Wynn . Numerical experiences have illustrated that the polynomial type vector extrapolation methods are in general more economical than the epsilon ones, with respect to the computing time and storage requirements. And for the special touch, Kamvar et al. have recently proposed a new extrapolation, quadratic extrapolation, for computing the PageRank in , and Sidi  has generalized this method (the generalization one is marked by GQE) and proved that the resulting generalization is very closely related to MPE. Therefore, we focus on the two polynomial extrapolation methods, namely, RRE and GQE, in this paper. Now let us display the implementations of these two algorithms. For more details, we refer the reader to the paper  by Sidi.
Vector extrapolation algorithms are derived by considering vector sequences , generated from a fixed-point iterative method of the following form: where is a fixed matrix, is a fixed -dimensional vector, and is an initial vector.
Suppose that we have produced a sequence of iterates , where . Then at the th outer iteration, let be a matrix consisting of the last iterates with being the newest, where is called the window size in usual. It is evident that has the following properties:
The problem to be solved is transformed into obtaining a vector satisfying , and thus we have an updated probability vector a linear combination of the last iterates. Assume , which results in the unique solution of the linear system As illustrated in , the efficient implementations of the vector extrapolation methods (RRE and GQE) are presented as follows.
Algorithm 2. The Reduced Rank Extrapolation Method (RRE). (1)Input vectors . (2)Compute , set , and compute the -factorization of , namely, . (3)Solve the linear system . It amounts to solving two triangular systems: for and for . Set , and compute , with . (4)Compute by . (5)Compute . Compute the RRE approximation for the exact solution of system (15) by .
Algorithm 3. The Generalization of Quadratic Extrapolation (GQE).(1)Input vectors . (2)Compute , set , and compute the -factorization of , namely, . (3)Solve the linear system . (4)Set and compute by (thus, ). (5)Compute .
Clearly, Algorithms 2 and 3 share the common feature: they both contain a factorization in step 2 for , where is unitary and is an upper triangular matrix with positive diagonal elements. In addition, the -factorization is always carried out inexpensively by applying the modified Gram-Schmidt process (MGS) to the vectors ; see  for more details. The MGS algorithm is given as follows.
MGS Algorithm(1)Compute , and set .(2)For , set .(3)For , compute and .(4)Compute and .
As previously mentioned, the restarting of GMRES may entail the slow convergence since the current approximation space is discarded at each restart cycle. Meanwhile, the change of the restart parameter will affect the number of iterations and also the execution time in the run time of GMRES. Furthermore, the increasing will decrease the number of iteration needed to converge while increasing the computing work and storage amount required per iteration, especially for large systems of equations, like large-scale Markov chains, and PageRank computation in information retrieval. So it is very necessary to enhance the robustness of GMRES(). And our work falls into the category of accelerating techniques. Next, we will discuss the idea and implementation of the new method.
Recall that in restarted GMRES, once the Krylov subspace reaches dimension , the current approximate solution becomes the new initial guess for the next iterations. The motivation of our new method comes from the successive restarts every iterations. To be specific, we will implement the current approximation, after iterations in a restart cycle, as the initial vector for the extrapolation procedure. And furthermore, the resulting vector by extrapolation method, in turn, can be the new improved starting vector for the next iterations by GMRES(). In summary, the mechanism of our new restarted GMRES method can be characterized as follows.
Algorithm 4. The Extrapolation-Accelerated GMRES() (EGMRES) (1)Choose , , , , and set . (2)Compute using Algorithm 1 (GMRES()). (3)Set . (4)Compute from by applying Algorithm 2 (RRE) and Algorithm 3 (GQE), respectively. (5)If , then . (6)Check convergence. If , stop. Otherwise, set and go to step 2.
Note that, in step 1 tol is the user described tolerance for residual vector 1-norms by GMRES(), is the restart number in GMRES, and is the window size for extrapolation. From Algorithms 2 and 3, both the RRE and GQE methods require vectors as inputs. For instance, when , four approximate vectors are needed as input in the extrapolation procedure. Step 5 is to check the efficiency of extrapolation strategy in current iterations. Step 6 is devised to determine when to flip flop between the extrapolation procedure and GMRES procedure. The performances of our new algorithm and comparisons with the other original algorithms will be discussed in detail below.
4. Numerical Results in Markov Chain Problems
In this section, we will report the numerical experiments to examine the efficiency of our new accelerated algorithm in numerical solution of stationary probability vector for Markov chains. Three typical Markov chain problems have been used in our experiments. All the numerical results are obtained with a MATLAB R2010a implementation on Windows 8 with 2.5 GZ i5 processor and 4 GB memory.
For the sake of justice, the same starting vector , with being the column vector of all ones, is used for all the algorithms. All the iterations are terminated when , where are the approximations obtained by the current iteration. For convenience, in all the tables and figures below, we have abbreviated the RRE-GMRES algorithm and the GQE-GMRES algorithm as RGMRES and GGMRES, respectively. We denote by “rest” the restart number in GMRES, “CPU” the CPU time used in seconds, “it” the iteration counts, and “Res” the residual 1-norms.
Example 1. In this example, we compare the GMRES() algorithm, the RGMRES algorithm, and the GGMRES algorithm for the Markov chain problem. This test problem is a one-dimensional (1D) Markov chains often resulting from a queueing system. The simplest graph of this problem is displayed in Figure 1, where the transition rates are identical. Numerical results are presented in Table 1.
It is easy to see from Table 1 that the accelerated GMRES is more effective than the unaccelerated one, both in terms of CPU time and the number of iteration and regardless of the restart number size for GMRES. In particular, the GMRES accelerated by GQE (GGMRES) performs best in most cases. For instance, when the restart number is 20, in comparison with the results of unaccelerated GMRES, the CPU time has been saved by for FGMRES and for GGMRES, and the iteration count has been reduced by for RGMRES and for GGMRES. In Figure 2, we compare the converging speed among the three algorithms. It is clear that the convergence rate has been enhanced greatly when using the accelerating technique in GMRES, and especially that the GGMRES has faster convergence speed than the other algorithms for meeting the convergence precision quickly.
Example 2. This test problem is displayed in Figure 3, which is a 1D chain with uniform weights, except for two weak links with weight in the middle of the chain. This is a typical nearly completely decomposable (NCD) Markov chain problem, and it has been discussed in [41–43]. We run the GMRES algorithm, the accelerated GMRES algorithm by vector extrapolation method, namely, RGMRES and GGMRES, on this problem, when the window size in extrapolation takes different values. All the algorithms will be stopped as soon as the residual 2-norms are below . Let in the numerical simulation, and the experimental results are listed in Table 2.
It is seen from Table 2 that the accelerated GMRES methods by vector extrapolation method perform better than the unaccelerated GMRES method both in terms of iteration counts and CPU time, regardless of their window size. Obviously the GMRES method accelerated by GQE outperforms the other two methods. For instance, when the widow size is 4, the iteration number in GGMRES is only of that in RGMRES and less than of that in GMRES. Especially from the view point of CPU time, the GGMRES method has more obvious advantages than the GMRES method and the RGMRES method, for it reduces the convergence time by compared with RGMRES method and by nearly compared with GMRES method.
Example 3. This problem is a 2D lattice (grid) with uniform weights, which has been discussed in [43, 44]. Aggregates of size are used and the gauge variables are defined on the lattice edges and are scalars valued as one, as shown in Figure 4.
In this problem, we run the GMRES, RGMRES and GGMRES, and compare their convergence rates while changing the problem size, with the restart number being 10 and the window size being 2. Numerical results are presented in Table 3. The numerical simulation given in Table 3 for the 2D lattice problem clearly demonstrates the effectiveness of acceleration by vector extrapolation methods. It is seen that the RGMRES algorithm converges faster than the GMRES method, while the GGMRES algorithm performs the best.
In this paper, we have presented a new GMRES method accelerated by vector extrapolation techniques to get the numerical solutions of the stationary probability vector of an irreducible Markov chain, using vector extrapolation techniques. Experimental results on several typical Markov chain problems demonstrate that the extrapolation method is an attractive option for accelerating the convergence of calculation of Markov chain, especially with the GQE (proposed by Sidi in ) being the accelerator. We note that, as mentioned previously, preconditioning technique may also be an appropriate strategy for improving convergence rate in Markov chains and would be one of our research topics of this field in future.
This research is supported by Chinese Universities Specialized Research Fund for the Doctoral Program (20110185110020), NSFC (61170309), Sichuan Province Sci. and Tech. Research Project (2012GZX0080), and Scientific Research Fund of Sichuan Provincial Education Department (T11008).
W. J. Stewart, An Introduction to the Numerical Solution of Markov Chains, Princeton University Press, Princeton, NJ, USA, 1994.View at: MathSciNet
S. D. Kamvar, T. H. Haveliwala, C. D. Manning, and G. H. Golub, “Extrapolation methods for accelerating PageRank computations,” in Proceedings of the 12th International World Wide Web Conference, Budapest, Hungary, May 2003.View at: Google Scholar
L. Page, S. Brin, R. Motwani, and T. Winograd, “The PageRank citation ranking: bringing order to the web,” Tech. Rep. 1999-0120, Computer Science Department, Stanford, Calif, USA, 1999.View at: Google Scholar
A. Berman and R. J. Plemmons, Nonnegative Matrices in the Mathematics Science, SIAM, Philadelpha, Pa, USA, 1987.
H. De Sterck, T. A. Manteuffel, S. F. McCormick, Q. Nguyen, and J. Ruge, “Multilevel adaptive aggregation for Markov chains, with application to web ranking,” SIAM Journal on Scientific Computing, vol. 30, no. 5, pp. 2235–2262, 2008.View at: Publisher Site | Google Scholar | Zentralblatt MATH | MathSciNet
B. Philippe, Y. Saad, and W. J. Stewart, “Numerical methods in Markov Chain modeling,” Operations Research, vol. 40, pp. 1156–1179, 1992.View at: Google Scholar
I. Marek and P. Mayer, “Convergence theory of some classes of iterative aggregation/disaggregation methods for computing stationary probability vectors of stochastic matrices,” Linear Algebra and Its Applications, vol. 363, pp. 177–200, 2003.View at: Publisher Site | Google Scholar | Zentralblatt MATH | MathSciNet
M. Embree, “The tortoise and the hare restart GMRES,” Tech. Rep. 01/22, Oxford University Computing Laboratory Numerical Analysis, Oxford, UK, 2001.View at: Google Scholar
M. Benzi and B. Uçar, “Product preconditioning for Markov chain problems,” in Proceedings of the 2006 Markov Anniversary Meeting, Charleston, SC, USA, A. N. Langville and W. J. Stewart, Eds., pp. 239–256, Boson Books, Raleigh, NC, USA.View at: Google Scholar
B. P. Pugachev, “Acceleration of the convergence of iterative processes and a method of solving systems of non-linear equations,” USSR Computational Mathematics and Mathematical Physics, vol. 17, no. 5, pp. 199–207, 1978.View at: Google Scholar
C. Isensee and G. Horton, “A multi-level method for the steady state solution of Markov chains,” in Simulation and Visualization, SCS, Magdeburg, Germany, 2004.View at: Google Scholar
C. Isensee and G. Horton, “A multi-level method for the steady state solution of discretetime Markov chains,” in Proceedings of the 2nd Balkan conference in informatics, pp. 413–420, Ohrid, Macedonia, November 2005.View at: Google Scholar