A Fast Newton-Shamanskii Iteration for a Matrix Equation Arising from M/G/1-Type Markov Chains
For the nonlinear matrix equations arising in the analysis of M/G/1-type and GI/M/1-type Markov chains, the minimal nonnegative solution or can be found by Newton-like methods. We prove monotone convergence results for the Newton-Shamanskii iteration for this class of equations. Starting with zero initial guess or some other suitable initial guess, the Newton-Shamanskii iteration provides a monotonically increasing sequence of nonnegative matrices converging to the minimal nonnegative solution. A Schur decomposition method is used to accelerate the Newton-Shamanskii iteration. Numerical examples illustrate the effectiveness of the Newton-Shamanskii iteration.
Some necessary notation for this article is as follows. For any matrix , if for all ; for any matrices , if for all ; the vector with all entries equal to one is denoted by ; that is, ; and the identity matrix is denoted by . An M/G/1-type Markov Chain (MC) is defined by a transition probability matrix of the formwhile the transition probability matrix of a GI/M/1-type MC is as follows:where and , respectively. is the smallest index such that , for , is (numerically) zero. The steady-state probability vector of an M/G/1-type MC, if it exists, can be expressed in terms of a matrix that is the element-wise minimal nonnegative solution to the nonlinear matrix equation Similarly, for the GI/M/1-type MC, a matrix is of practical interest, which is the element-wise minimal nonnegative solution to the nonlinear matrix equation For a large number of stochastic models, most steady-state and certain transient characteristics of the process can be expressed in terms of the minimal nonnegative solution described above. This problem arises in the analysis of queues with phase type service times, as well as in queues that can be represented as quasi-birth-death processes . It is known that any M/G/1-type MC can be transformed into a GI/M/1-type MC and vice versa through either the Ramaswami  or Bright  dual, and the matrix can be obtained directly in terms of the matrix of the dual chain. The drift of the chain is defined bywhere is the stationary probability vector of the irreducible stochastic matrix and . The MC is positive and recurrent if , null and recurrent if , and transient if ; and throughout this article it is assumed that .
Available algorithms for finding the minimal nonnegative solution to (3) include functional iterations , pointwise cyclic reduction (CR) , the invariant subspace (IS) approach , the Ramaswami reduction (RR) , and the Newton iteration (NI) [3, 9–11]. For a detailed comparison of these algorithms, we refer the readers to  and the references therein. In , Newton’s iteration is revisited and accelerated. From numerical experience, the fast Newton’s iteration in  is a very competitive algorithm.
In this paper, we consider the Newton-Shamanskii iteration for (3). It is shown that, starting with a suitable initial guess, the sequence generated by the Newton-Shamanskii iteration is monotonically increasing and converges to the minimal nonnegative solution of (3). Similar to Newton’s iteration, equation involved in the Newton-Shamanskii iteration step is a linear equation of the form , which can be solved fast by a Schur decomposition method. The Newton-Shamanskii iteration differs from Newton’s iteration as the Fréchet derivative is not updated at each iteration; therefore the special coefficient matrix structure form can be reused.
The paper is organized as follows. The Newton-Shamanskii iteration and its accelerated iterative procedure using a Schur decomposition method are given in Section 2. Then M/G/1-type MCs with low-rank downward transitions and low-rank local and upward transitions are considered in Sections 3 and 4, respectively. Numerical results in Section 6 show that the fast Newton-Shamanskii iteration can be more efficient than the fast Newton’s iteration proposed in . Final conclusions are presented in Section 6.
2. Newton-Shamanskii Iteration
In this section, we present the Newton-Shamanskii iteration for (3). First we rewrite (3) asThe function is a mapping from into itself and the Fréchet derivative of at is a linear map given byThe second derivative at , , is given by
For a given initial guess , the Newton-Shamanskii iteration for the solution of is as follows.
For , is the solution towhich, after rearranging the terms, can be rewritten asFollowing the notation of , we define ; then the above equation iswhich is a linear equation of the same form as the Newton’s iteration step. It can be solved fast by applying a Schur decomposition on the matrix , which is the matrix here, and then solving linear systems with unknowns and equations. For the detailed description for solving , we refer the reader to [11, 12]. We stress that, for Newton-Shamanskii iteration, the coefficient matrices are updated once after every iteration step and the special coefficient structure can be reused, so the cost per iteration step is reduced significantly.
3. The Case of Low-Rank Downward Transitions
When the matrix is of rank , meaning that it can be decomposed as with and , we refer to the MC as having low-rank downward transitions. If Newton-Shamanskii iteration is applied to this case, all the matrices can be written as . This can be shown by making induction on the index . can be written as and we assume that it is true for all for and . Hence, can be written as , since . Then (12) can be rewritten astherefore can be decomposed as the product of an matrix and an matrix . The inverse on the right-hand side exists since and the spectral radius of is strictly less than one . Therefore we will concentrate on finding as the solution towhich can be rewritten asWe can use the Schur decomposition method in [11, 12] to solve the above equation. Different from Newton’s iteration in , the special coefficient structure can be reused here, thus saving the overall computational cost. We will report the numerical performance of the Newton-Shamanskii iteration in Section 6.
4. The Case of Low-Rank Local and Upward Transitions
In this section, the case of low-rank local and upward transitions is considered, where the matrices can be decomposed as with and . To exploit low-rank local and upward transitions, we introduce the matrix , which is the generator of the censored Markov chain on level , starting from level , before the first transition on level . The following equality holds based on a level crossing argument:For the case of low-rank local and upward transitions, we can rewrite aswhich means that is of rank , while is generally of rank .
Therefore, we find as the solution toand get from [11, 14]. The Newton-Shamanskii iteration step for (19) is as follows.
For , is the solution toIf we define and rearrange the terms, (21) can be rewritten aswhich is of the form . This iteration enables us to exploit low-rank local and upward transitions. The iterates , where solves (21), can be rewritten as . This can be shown by making induction on the index . It obviously holds for . Assuming that , from (21), we getwhich tells us that can be decomposed as , and the same holds for . Therefore, from (21), we will focus on finding as the solution to
Defining , we can rewrite the above equation aswhich is of the form .
5. Convergence Analysis
There is monotone convergence when the Newton-Shamanskii method is applied to (3).
Let us first recall that a real square matrix is a -matrix if all its off-diagonal elements are nonpositive and can be written as with . Moreover, a -matrix is called an -matrix if , where is the spectral radius; it is a singular -matrix if and a nonsingular -matrix if . The following result from  is to be exploited.
Lemma 1. For a -matrix , the following statements are equivalent:(a) is a nonsingular -matrix.(b).(c) for some vector .(d)All eigenvalues of have positive real parts.
The following result is also well known .
Lemma 2. Let be a nonsingular -matrix. If is a -matrix, then is a nonsingular -matrix. Moreover, .
The minimal nonnegative solution for (3) may also be recalled—see  for details.
Theorem 3. If the rate defined by (5) satisfies , then the matrixis a nonsingular -matrix.
5.2. Monotone Convergence
The following lemma displays the monotone convergence properties of the Newton iteration for (3).
Lemma 4. Consider a matrix such that(i),(ii),(iii) is a nonsingular -matrix.Then the matrixis well defined, and(a),(b),(c) is a nonsingular -matrix.
Proof. is invertible and the matrix is well defined from (iii) and Lemma 1. Sincefrom (iii) and Lemma 1 and , we get that and thus . From (27) and the Taylor formula, there exists a number , , such thatso (a) is proven, where the last inequality is obtained from and (8). (b) may be proven as follows. Fromwhere , we havewhere the last inequality is from by (ii). It is notable thatis a nonsingular -matrix, so from Lemma 1; that is, . Now , so (b) follows. Next we prove (c). From , we haveand we know that is a nonsingular -matrix. Consequently, from Lemma 2, is a nonsingular -matrix.
A generalization of Lemma 4 provides the theoretical basis for the monotone convergence of the Newton-Shamanskii method for (3).
Lemma 5. Consider a matrix such that (i),(ii),(iii) is a nonsingular -matrix.Then, for any matrix , where , the matrixexists, such that(a),(b),(c) is a nonsingular -matrix.
Proof. Since , we haveFrom (iii) and Lemma 2, is invertible and the matrix is well defined such that Letwe know that from Lemma 2. Because from Lemma 4, (b) follows. Nowis a nonsingular -matrix from Lemma 4 and ; therefore is a nonsingular -matrix from Lemma 2. Next we show that (a) is true. From the Taylor formula, there exists two numbers and , where , such thatwhere the latter inequality holds, since and .
The monotone convergence result for the Newton-Shamanskii method applied to (3) follows.
Theorem 6. Suppose that a matrix is such that(i),(ii),(iii) is a nonsingular -matrix.Then the Newton-Shamanskii algorithm (9)-(10) generates a sequence such that for all , and .
Proof. The proof is by mathematical induction. From Lemma 5,is a nonsingular -matrix. Assuming thatand that is a nonsingular -matrix, from Lemma 5,and is a nonsingular -matrix. By induction, the sequence is therefore monotonically increasing and bounded above by and so has a limit such that . Letting in , it follows that . Consequently, since and is the minimal nonnegative solution of (3).
6. Numerical Experiments
The Newton-Shamanskii iteration differs from Newton’s method in that the evaluation of the Fréchet derivative is not done at every iteration step. So, while more iterations will be needed than for Newton’s method, the overall cost of the modified Newton method could be less. Our numerical experiments confirm the feasibility of the Newton-Shamanskii iteration for (6).
About how to choose the optimal scalars in the Newton-like algorithm (see (9) and (10)), now we have no theoretical results. This is a goal for our future research. In our extensive numerical experiments, we update the Fréchet derivative every two iteration steps. That is, for , we choose in the Newton-like algorithm (9).
The elapsed CPU time in seconds (denoted as “time") is used to measure the feasibility of our new method. In our numerical experiments, we use zero matrix as the initial iteration value and choose the following stopping criterion:where and denote the iteration values of solution after and before one iteration step, respectively, and denotes the infinity-norm of a matrix. The numerical tests were performed on a laptop (2.4 GHz and 2 G Memory) with MATLAB R2013b. Numerical experiments show that the modified Newton method could be more efficient than Newton iteration proposed in . We present the numerical results for a random-generated problem in Figure 1. In this figure, we fix the number of coefficient matrices () and vary the problem size () and plot the CPU time of the two algorithms in seconds for different parameters .
The MATLAB code used for the problem construction is reported as follows, which generates the coefficient matrices for (6), that is, , : function [A,er,rho]=problem(n,N) rand(’state’,0); A=rand(n,n(N+1)); e=ones(n(N+1),1); s=Ae; e=ones(n,1); for i=1:n A(i,:)=A(i,:)/s(i); end B=zeros(n); beita=0; for i=1:(N+1) B=B+A(:,((i-1)n+1):(in)); beita=beita+(i-1)A(:,((i-1)n+1):(in))e; end [V,D]=eig(B’); pai=V(:,1); s=sum(pai); pai=pai/s er=max(B’pai-pai) rho=pai’beita endUp to now, we do not have theoretical results about when this modified Newton method can outperform the Newton iteration. This is not an easy question. We leave it as a goal for future research.
Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this paper.
This research is supported by the Fundamental Research Funds for the Central Universities in China University of Geosciences, Beijing (2652017140).
M. F. Neuts, Structured Stochastic Matrices of the M/G/1 Types and Their Applications, vol. 5 of Probability: Pure and Applied, Marcel Dekker, New York, NY, USA, 1989.View at: MathSciNet
M. Neuts, Matrix-Geometric Solutions in Stochastic Models—An Algorithmic Approach, The Johns Hopkins University Press, Baltimore, Md, USA, 1981.View at: MathSciNet
V. Ramaswami, “Nonlinear matrix equations in applied probability - solution techniques and open problems,” SIAM Review. A Publication of the Society for Industrial and Applied Mathematics, vol. 30, no. 2, pp. 256–263, 1988.View at: Publisher Site | Google Scholar | MathSciNet
V. Ramaswami, “A duality theorem for the matrix paradigms in queueing theory,” Communications in Statistics. Stochastic Models, vol. 6, no. 1, pp. 151–161, 1990.View at: Publisher Site | Google Scholar | MathSciNet
P. G. Taylor and B. Van Houdt, “On the dual relationship between Markov chains of GI/M/1 and M/G/1 type,” Advances in Applied Probability, vol. 42, no. 1, pp. 210–225, 2010.View at: Publisher Site | Google Scholar | MathSciNet
D. Bini and B. Meini, “On the solution of a nonlinear matrix equation arising in queueing problems,” SIAM Journal on Matrix Analysis and Applications, vol. 17, no. 4, pp. 906–926, 1996.View at: Publisher Site | Google Scholar | MathSciNet
N. Akar and K. Sohraby, “An invariant subspace approach in M/G/1 and G/M/1 type Markov chains,” Communications in Statistics. Stochastic Models, vol. 13, no. 3, pp. 381–416, 1997.View at: Publisher Site | Google Scholar | MathSciNet
D. Bini, B. Meini, and V. Ramaswami, “Analyzing M/G/1 paradigms through QBDs: the role of the block structure in computing the matrix G,” in Advances in Algorithmic Methods for Stochastic Models, G. Latouche and P. Taylor, Eds., pp. 73–86, Notable Publications: Neshanic, Station, NJ, 2000.View at: Google Scholar
G. Latouche, “Newton's iteration for non-linear equations in Markov chains,” IMA Journal of Numerical Analysis (IMAJNA), vol. 14, no. 4, pp. 583–598, 1994.View at: Publisher Site | Google Scholar | MathSciNet
M. F. Neuts, “Moment formulas for the Markov renewal branching process,” Advances in Applied Probability, vol. 8, no. 4, pp. 690–711, 1976.View at: Publisher Site | Google Scholar | MathSciNet
J. F. Pérez, M. Telek, and B. Van Houdt, “A Fast Newton’s Iteration for M/G/1-Type and GI/M/1-Type Markov Chains,” Stochastic Models, vol. 28, no. 4, pp. 557–583, 2012.View at: Publisher Site | Google Scholar | MathSciNet
J. F. Pérez and B. Van Houdt, “The M/G/1-type Markov chain with restricted transitions and its application to queues with batch arrivals,” Probability in the Engineering and Informational Sciences, vol. 25, no. 4, pp. 487–517, 2011.View at: Publisher Site | Google Scholar | MathSciNet
D. A. Bini, G. Latouche, and B. Meini, Numerical methods for structured Markov chains, Numerical Mathematics and Scientific Computation, Oxford University Press, New York, 2005.View at: Publisher Site | MathSciNet
G. Latouche and V. Ramaswami, Introduction to matrix analytic methods in stochastic modeling, ASASIAM Series on Statistics and Applied Probability; SIAM, Philadelphia, PA, USA, 1999.View at: Publisher Site | MathSciNet
R. S. Varga, Matrix Iterative Analysis, Prentice Hall, Englewood Cliffs, NJ, USA, 1962.View at: MathSciNet