Abstract

It is time consuming to numerically solve fractional differential equations. The fractional ordinary differential equations may produce Toeplitz-plus-band triangular systems. An efficient iteration method for Toeplitz-plus-band triangular systems is presented with computational complexity and memory complexity in this paper, compared with the regular solution with computational complexity and memory complexity. is the discrete grid points. Some methods such as matrix splitting, FFT, compress memory storage and adjustable matrix bandwidth are used in the presented solution. The experimental results show that the presented method compares well with the exact solution and is 4.25 times faster than the regular solution.

1. Introduction

Fractional differential equation (FDE) plays an important role in dynamical systems [1] and has more than 300 years of research history [2]. Many analytical solutions and numerical solutions [36] have been proposed for FDE, such as finite difference method [7, 8], finite element method [9], and spectral method [10, 11]. In recent times, interest in fractional ordinary differential equations (FODE) has increased [1215]. The derivatives in the FODE are approximated by linear combinations of function values at the discrete grid points. Compared with integer ordinary differential equations, the FODE has nonlocal effect, which means a grid point may rely on the grid points far away from its position. And a grid point of the classical integer equations may only rely on its several neighboring grid points.

For integer order equations, the coefficient matrices are often sparse. Because of the nonlocal property of fractional differential operators, the numerical methods for fractional diffusion equations often generate dense or even full coefficient matrices [16]. This nonlocal property makes the computation of FODE and FDE much heavier than that of the traditional integer equations. The short memory principle [17], parallel computing [1821], fast Fourier transformation (FFT) [22, 23], multigrid method [24], and preconditioner technologies [25, 26] are used to overcome this heavy computation. Gong et al. presented many parallel algorithms for different FDEs on both traditional and heterogeneous parallel platforms [16, 18]. Diethelm [19] proposed a parallel second-order Adams-Bashforth-Moulton method for a FODE. Wang and Du [26] proposed a superfast-preconditioned iterative method for steady-state two-side space-fractional diffusion equations.

The fractional ordinary differential equations may produce Toeplitz-plus-band triangular systems. Toeplitz-plus-band systems were studied by professors Chan and Ng  [27]. They considered the solutions of Hermitian Toeplitz-plus-band systems , where are -by- Toeplitz matrices and are -by- band matrices with bandwidth independent of . and are both Hermitian matrix. The authors proved that if is generated by a nonnegative piecewise continuous function and is positive semidefinite, then there exists a band matrix , with bandwidth independent of , such that the spectra of are uniformly bounded by a constant independent of . The band preconditioner was developed for Hermitian Toeplitz systems [28]. The recursive blocked algorithms were proposed for triangular systems and the recursive algorithms lead to an automatic variable blocking that has the potential of matching the memory hierarchies of today’s HPC (high performance computing) systems  [29, 30].

This paper focuses on the fractional ordinary differential equation [13]: where , , , and . The fractional derivative is in the Caputo form [31].

Define for , where is a positive integer, and are step size. Assume to be the numerical approximation to and the numerical approximation to . Using the Grünwald approximation, the finite difference scheme for (1) is shown as follows: where the normalized Grünwald weight is defined by

Equation (2) results in a linear system of equations where and . If , the term should be included in . is the coefficient matrix. is defined by

2. Method

2.1. Analysis

In a more explicit format, matrix can be represented as

The linear system (4) can be solved with computational complexity , shown in Algorithm 1. The output equals .

 input:
 output:
(1) for      to     by  1 do
(2) 
(3) for     to     by  1 do
(4)   

From (6), we can see that has some properties.(1)is a low triangular, diagonal dominant matrix.(2)One has for , . This property is determined by the normalized Grünwald weight and is the mathematical background of short memory principle. This property means that for grid point , if the distance of grid point is smaller than that of grid point , has more impact on than .(3)If is split into two matrices and , . is a banded matrix and the bandwidth (number of diagonals) . Matrix can be factorized into a product . is a diagonal matrix . is a Toeplitz matrix with nonzero diagonals on its left-bottom part.(4)The Toeplitz matrix can be stored with memory space compared with for a general low triangular matrix with order .

2.2. Efficient Iteration Method

Equation (4) evolves as follows:

So the linear algebra can be solved iteratively, shown in (10). Because is a diagonal matrix, keeps associative law and commutative law for matrix-matrix multiplication. The rate of convergence associated with (10) depends on the eigenvalues of the iteration matrix [32]:

Assuming error with satisfies , then with norm [32]:

So the spectral radius of () determines the asymptotic behavior of . From Theorem of [32], we can conclude that if and only if , (10) will converge to . Generally speaking, the iteration is expected to work well with small .

Assume the bandwidth of matrix is and . Solving needs about arithmetical operations. If is near , there are about arithmetical operations with forward substitution. So the computational complexity of is .

Assume , , and . The computation of and needs multiplications and additions, respectively. Because only has nonzero diagonals on its left-bottom part, only the front elements of are effective for the multiplication . The back elements of are zero. So can be regarded as a Toeplitz matrix vector multiplication with order . It is well known that Toeplitz matrix vector multiplication with order can be finished with operations [33]. The Toeplitz matrix vector multiplication can be computed by FFTs by first embedding into a -by- circulant matrix. The cost of circulant matrix vector multiplication is by using FFTs of length .

So the cost of each iteration of (10) is . If is a diagonal dominant matrix, we can expect (10) can converge with not too many iterations. The efficient iteration method is shown in Algorithm 2.

 input:
 output:  
(1)
(2) while     do
(3)    set with
(4)   
(5)   
(6)   
(7)   
(8)   for     to     by 1   do
(9)     
(10) for     to     by 1   do
(11)   
(12)   for     to     by 1   do
(13)    
(14) 
(15) 

In Algorithm 2, stands for the value of previous iteration and stands for the current iteration. stands for with . equals the reciprocal of with . stands for with . and are arrays. equals .

The value of can affect the performance of Algorithm 2 shown in Table 1.

Algorithm 2 has five features/advantages compared to Algorithm 1.(1)Split the coefficient matrix and solve the triangular system iteratively.(2)Use FFT to compute matrix vector multiplication.(3)Precompute .(4)Compress storage.(5)Adjust parameter .

3. Numerical Example

The experiment platform is a laptop with Intel(R) Core (TM) i3-3110M CPU, 2 GB main memory, and Windows 7 operating system. The CPU clock frequency is 2.40 GHz. The code is developed with MATLAB R2012a and runs on default double precision floating point operations.

The following fractional () ordinary differential equation [13] was considered: where .

The exact solution of (14) is

The efficient iteration method of Algorithm 2 compares well with the exact solution to the FODE in the test case of (14), shown in Figure 1. The is . The maximum absolute error is . The difference between the efficient iteration method and the forward substitution Algorithm 1 is only . The efficient iteration method and the regular forward substitution solution have no noticeable artifacts.

The performance comparison between regular forward substitution solution of Algorithm 1 and efficient iteration method of Algorithm 2 is shown in Table 2. Columns 2 and 3 of Table 2 are the runtime and the runtime is recorded in seconds. With , the maximum speedup is 4.25. Because the speedup increases with , the bigger is, the higher the speedup that can be expected is. Because of the 2 GB memory limitation, the compress memory usage is also used in Algorithm 1.

The impact of on the performance of Algorithm 2 is shown in Table 3. The runtime of the presented method varies with . So is a key parameter for the performance of Algorithm 2. In real fractional ordinary applications, the proper should be chosen.

The presented iteration method should be regarded as an iteration method to solve not only the system generated from FODE but also the more general Toeplitz-plus-band triangular systems. The technology of parallel computing is very useful, but with less mathematical background. Parallel computing is attractive for fractional differential equations [34]. As a part of future work, first, we would like to parallelize the presented solution on shared memory or distributed memory systems. Second, accelerating the presented efficient iteration method on heterogeneous architecture [3538] should also be interesting.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research work is supported in part by the National Natural Science Foundation of China under Grant nos. 11175253 and 61170083, in part by the Specialized Research Fund for the Doctoral Program of Higher Education under Grant no. 20114307110001, and in part by the 973 Program of China under Grant no. 61312701001. The authors would like to thank the anonymous reviewers for their helpful comments as well.