Journal of Applied Mathematics

Special Issue: Iterative Methods and Applications 2014

Research Article | Open Access

Volume 2014 |Article ID 467672 | 9 pages | https://doi.org/10.1155/2014/467672

MRILDU: An Improvement to ILUT Based on Incomplete LDU Factorization and Dropping in Multiple Rows

Academic Editor: D. R. Sahu
Received: 30 May 2014
Accepted: 06 Aug 2014
Published: 20 Aug 2014

Abstract

In this paper we propose MRILDU, an improvement to ILUT for general sparse linear systems. The improvement is based on the consideration that relatively large elements should be retained as far as possible. To this end, two schemes are used. First, incomplete LDU factorization is used instead of incomplete LU. Second, multiple rows are computed at a time, and dropping is then applied to these rows to extract the elements of relatively large magnitude. Incomplete LDU is not only fairer when there are large differences between the elements of the factors L and U, but also more natural for the subsequent dropping over multiple rows. Dropping over multiple rows is more profitable because there may be large differences between the elements of different rows within each factor. MRILDU is comparable to ILUT in storage requirement and computational complexity. Experiments on sparse linear systems from the UF Sparse Matrix Collection, inertial confinement fusion simulation, numerical weather prediction, and concrete sample simulation show that it is more effective than ILUT in most cases and is not as sensitive as ILUT to the parameter p, the maximum number of nonzeros allowed in each row of a factor.

1. Introduction

Solving sparse linear systems is a core issue in many scientific and engineering computations, and the time required to solve them generally accounts for a large proportion of the whole numerical simulation. More seriously, as simulation accuracy improves, the spatial resolution becomes higher, grids become denser, and the size of the linear systems grows accordingly. For large-scale sparse linear systems, especially those arising in the simulation of three-dimensional problems, traditional direct methods require very large amounts of storage and computation; besides, they cannot fully exploit the sparsity of the coefficient matrix [1] and are difficult to parallelize efficiently.

Iterative methods are easy to control and can make full use of the sparsity of the matrix, so both the storage requirement and the amount of computation in a single iteration are small. In addition, an iterative method can solve the linear system as long as the rule for computing coefficient matrix-vector products is known, without knowing the specific structure or the specific entries of the coefficient matrix, which is impossible for a direct method. For these reasons, iterative methods, especially Krylov subspace methods, have drawn increasing attention in recent years. Because Krylov subspace methods adopt a projection scheme, their convergence is relatively fast. However, the convergence rate of a Krylov subspace method depends on the eigenvalue distribution of the coefficient matrix: the more clustered the eigenvalues are, the faster the convergence will be. To accelerate convergence and reduce the total computation time, efficient preconditioning techniques should be used to convert the original sparse linear system into another with the same solution but with a narrower eigenvalue distribution of the coefficient matrix [1, 2].
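The transformation described above can be written compactly; the following is a standard sketch of left preconditioning (the notation M for the preconditioner is our convention, not spelled out in the extract):

```latex
% Left preconditioning with a nonsingular M \approx A:
A x = b \;\Longrightarrow\; M^{-1} A x = M^{-1} b, \qquad M \approx A .
% Both systems have the same solution, but the eigenvalues of M^{-1}A are
% (ideally) far more clustered than those of A, so a Krylov method applied
% to the preconditioned system converges in fewer iterations, provided
% applying M^{-1} costs about as much as one matrix-vector product.
```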

In general, the narrower the region in which the eigenvalues are located, the more effectively the number of iterations can be reduced. However, the amount of computation added to a single iteration must also remain small for the overall solution process to be efficient. When no preconditioning is used, the cost of a single iteration lies mainly in applying the coefficient matrix to a vector. Therefore, an efficient preconditioner should improve the eigenvalue distribution significantly while keeping the additional per-iteration cost close to that of one matrix-vector product. Incomplete factorization is one kind of such preconditioner, aimed at general sparse linear systems.

Many incomplete factorizations have been developed since Meijerink and van der Vorst introduced incomplete Cholesky factorization as a preconditioner for the conjugate gradient method [3]. They differ mainly in the dropping rules applied to the elements of the factors. Furthermore, many block forms, diagonal modifications, stabilized versions, and parallel implementations have been developed.

The simplest incomplete factorization is ILU(0), which requires the sparsity structure of the incomplete factors to be the same as that of the original coefficient matrix [3]. ILU(0) is very cheap to implement, and when the coefficient matrix is diagonally dominant or an M-matrix, it is effective and robust. However, for the more complex problems encountered in actual applications, ILU(0) is too rough, which has inspired more efficient variants that allow more fill-in. ILU(ℓ) is one of them; it can be seen as an extension of ILU(0) [1, 2, 4] in which the fill-ins with level of fill greater than ℓ are set to zero. Generally speaking, ILU(1) is quite effective; with higher level thresholds the improvement is usually very small. However, as ℓ increases, the amount of computation grows very fast, and thus ILU(ℓ) with ℓ greater than 1 is not considered in general.
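The level-of-fill rule behind ILU(ℓ) can be sketched symbolically. The helper below is our illustration, not the paper's code; it uses the standard update lev(i, j) = min(lev(i, j), lev(i, k) + lev(k, j) + 1), with level 0 for original nonzeros:

```python
import math

def ilu_level_pattern(pattern, n, level):
    """Return the set of (i, j) positions kept by ILU(level).

    pattern: set of (i, j) positions of the original nonzeros
             (diagonal assumed present). Dense loops for clarity only.
    """
    # Level 0 for original nonzeros, infinity elsewhere.
    lev = {(i, j): (0 if (i, j) in pattern else math.inf)
           for i in range(n) for j in range(n)}
    for k in range(n):
        for i in range(k + 1, n):
            if lev[(i, k)] > level:      # entry dropped; it creates no fill
                continue
            for j in range(k + 1, n):
                lev[(i, j)] = min(lev[(i, j)], lev[(i, k)] + lev[(k, j)] + 1)
    return {pos for pos, l in lev.items() if l <= level}
```

For a tridiagonal pattern elimination creates no fill, so ILU(0) and ILU(1) coincide with the original pattern; an "arrow" pattern (full first row and column) fills in at level 1.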

When the diagonal dominance of the coefficient matrix is poor, ILU(ℓ) may drop many relatively large elements while retaining many elements with small absolute values, which greatly degrades its effectiveness for general sparse matrices. This is because ILU(ℓ) is based entirely on the nonzero structure, while the magnitude of the elements in the factors is not considered. Based on this observation, researchers put forward another class of incomplete factorizations in which dropping is based on the magnitude of the nonzero elements of the factors. However, if the dropping rule is based only on magnitude, the total number of nonzero elements in the resulting factors is very difficult to control, and thus so are the storage requirement and the amount of computation. The double threshold incomplete factorization (ILUT), put forward by Saad, effectively deals with this problem. At the ith step of the factorization, the elements of the ith row with relatively small absolute value are first dropped according to the threshold σ, and then at most p off-diagonal elements with the largest absolute values are retained in each row of the incomplete factors [5].

The LU factorization is not the only way to construct an incomplete factorization preconditioner; many others can also be used. Based on A-orthogonal factorization, Benzi constructed a factorized sparse approximate inverse preconditioner, AINV; experimental results show that it has quality similar to that of ILU preconditioners but better potential parallelism [6]. Aiming at the sparse linear systems arising in the numerical simulation of Markov chains, an incomplete WZ factorization was introduced in reference [7], and the experiments show that it is quicker than the ILU preconditioner [7, 8]. For sparse linear systems with finite element structure, Vannieuwenhoven and Meerbergen provided an incomplete multifrontal LU factorization preconditioner, obtained through dropping during the multifrontal process; its performance and robustness are improved by fully exploiting the dense structure of the elemental stiffness matrices and by an effective local pivoting strategy [9].

Up to now, various incomplete factorizations have been provided, but ILUT remains one of the most widely used. Although the effectiveness of ILUT for general sparse linear systems is very high, there is still room for improvement. Recently, MacLachlan, Osei-Kuffuor, and Saad studied measures to improve its accuracy and stability through compensation [10]. This paper can also be seen as an improvement to ILUT. Having realized that the dropping rules in ILUT are not fair, we propose an improvement from the following two aspects. First, incomplete LDU factorization is used in place of incomplete LU factorization, so as to make the dropping in the incomplete factors L and U fairer. Second, multiple rows of the factors L and U are computed at a time, and the dropping rules are then applied to these rows of L and U, respectively; the elements with the largest magnitudes in the computed rows are retained.

2. Description of MRILDU

The ILUT preconditioner has been widely and effectively used to solve many difficult sparse linear systems. The double threshold incomplete factorization ILUT(p, σ) is described in Algorithm 1 [2].

(1)   For i = 1, ..., n Do
(2)    w := a_{i,*}
(3)    For k = 1, ..., i-1 and when w_k ≠ 0 Do
(4)     w_k := w_k / a_{kk}
(5)     If |w_k| < σ_i then w_k := 0
(6)     If w_k ≠ 0 then
(7)      w := w - w_k · u_{k,*}
(8)     Endif
(9)    Enddo
(10) If |w_j| < σ_i then w_j := 0, for j = 1, ..., n, j ≠ i
(11)  Extract the p maximum elements from w(1 : i-1)
(12) Extract the p maximum elements from w(i+1 : n)
(13)  l_{i,j} := w_j for j = 1, ..., i-1
(14)  u_{i,j} := w_j for j = i, ..., n
(15) w := 0
(16) Enddo

In Algorithm 1, three dropping strategies are used. First, in the fifth row of the algorithm, if the absolute value of an element is less than the threshold σ_i, the element is dropped and replaced by zero. Second, in the tenth row, the same rule is applied to the whole work row: elements whose absolute values are less than σ_i are dropped, where σ_i is taken as σ · ν_i and ν_i is the quotient of the 1-norm of the ith row of the original coefficient matrix and the number of nonzero elements in that row. Third, in the eleventh and twelfth rows, for the ith row of L, namely, the strictly lower triangular part, at most the p largest elements are retained, and at the same time, at most the p largest elements are retained in the strictly upper triangular part of the ith row of U. In addition, the diagonal elements are always retained.
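As an illustration of the two magnitude-based rules, the following sketch applies them to a single dense work row. This is a hypothetical helper of ours, not the paper's implementation; the function name and test values are assumptions:

```python
import numpy as np

def ilut_drop_row(w, i, p, sigma, row_1norm, row_nnz):
    """Apply ILUT's dropping rules to a dense work row w (diagonal at index i).

    sigma      : relative drop threshold
    row_1norm  : 1-norm of the ith row of the original matrix
    row_nnz    : number of nonzeros in that row
    """
    w = w.copy()
    tau_i = sigma * row_1norm / row_nnz       # sigma_i = sigma * nu_i
    small = np.abs(w) < tau_i
    small[i] = False                          # never drop the diagonal
    w[small] = 0.0
    # Keep at most the p largest off-diagonal entries in each part.
    for part in (np.arange(0, i), np.arange(i + 1, w.size)):
        vals = np.abs(w[part])
        if vals.size > p:
            cut = np.argsort(vals)[:-p]       # positions of all but p largest
            w[part[cut]] = 0.0
    return w
```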

The effectiveness of the ILUT preconditioner originates from the following two aspects. First, the number of nonzero elements in the incomplete factors is reduced by dropping the elements whose relative magnitude is less than σ and by retaining at most p nonzero elements in each row of each factor. This greatly reduces not only the time used in the incomplete factorization but also the preconditioned iteration time. Second, when the diagonal dominance of the matrix is poor, the quality of the incomplete factors can be improved by increasing the parameter p and decreasing the parameter σ; in particular, if σ equals zero and p equals the order of the matrix, the incomplete factorization is just the complete LU factorization.

Although ILUT is effective, it still has deficiencies in practical applications. First, it is very difficult to specify the best values of its parameters. If p is too small or σ is too large, the effectiveness of the preconditioner will be very poor. Conversely, when p is very large and σ is very small, although the obtained preconditioner is good, the cost of constructing it and the overhead of a single iteration are both very large, which is unbearable. Second, when the magnitudes of the elements differ greatly between rows of the matrix, the magnitudes of the elements may also differ greatly between rows of the incomplete factors L and U. However, ILUT retains roughly the same number of nonzero elements in every row, and this may drop some relatively large elements while retaining many relatively small ones. Although retaining relatively large elements is not always superior to retaining relatively small ones, existing experiments show that it is more advantageous in general, especially for diagonally dominant matrices.

Based on the above considerations, here we propose an improvement to ILUT, which is multirow ILDU (MRILDU). The idea can be outlined as follows: perform incomplete LDU factorization of the matrix, compute m rows of the factors L and U at a time, and then apply the dropping strategy to the obtained rows of L and U. Replacing the incomplete LU factorization with the incomplete LDU factorization has the following two advantages. First, under LDU factorization the elements of L and the elements of U have been scaled, respectively, by the diagonal elements of the same column or the same row, and thus it is more equitable to apply the same dropping strategy to L and U. Second, because the elements have already been scaled, the dropping strategy is more equitable to each row when it is applied to multiple rows. Furthermore, when multiple rows are computed at a time before dropping, a unified rule can be used to drop nonzero elements in these rows. Compared to the algorithm in which one row is computed at a time and the dropping strategy is applied immediately, MRILDU is more favorable for retaining the elements of relatively large magnitude in the incomplete factors. Assuming that m rows of the matrix are factorized each time before the dropping strategy is applied to them, MRILDU(p, σ, m) is described in Algorithm 2.

(1)   For i = 1, ..., n Do
(2)    w := a_{i,*}
(3)    For k = 1, ..., i-1 and when w_k ≠ 0 Do
(4)     w_k := w_k / d_{kk}
(5)     If |w_k| < σ_i then w_k := 0
(6)     If w_k ≠ 0 then
(7)      w := w - w_k · d_{kk} · u_{k,*}
(8)     Endif
(9)    Enddo
(10) For j = i+1, ..., n Do
(11)   w_j := w_j / w_i
(12)  If |w_j| < σ_i then w_j := 0
(13) Endfor
(14)  l_{i,j} := w_j for j = 1, ..., i-1
(15) d_{ii} := w_i
(16)  u_{i,j} := w_j for j = i+1, ..., n
(17) w := 0
(18) If mod(i, m) = 0 then
(19)  Extract the mp maximum elements from rows i-m+1, ..., i of L
(20) Extract the mp maximum elements from rows i-m+1, ..., i of U
(21) Endif
(22) Enddo

In its implementation, the algorithm MRILDU(p, σ, m) needs to solve the following three problems efficiently. The first is the linear combination of sparse vectors in step 7. The second is to select several elements with the largest magnitudes from given vectors in steps 19 and 20, and the third is that the elements of each row of U must be accessed in ascending order of column index in the loop of step 3. Comparing the steps of Algorithms 2 and 1, the first problem encountered in MRILDU(p, σ, m) is the same as in ILUT(p, σ); therefore, the same technique can be used. See references [2, 4] for specific implementation details.
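For the first problem, the standard technique keeps the current row in a full-length dense work array together with a list of occupied positions, so that each update w := w - α · u_{k,*} touches only the nonzeros of u_{k,*} and the row can later be reset in time proportional to its nonzeros. The helper below is our sketch of this idea, not the paper's code:

```python
def sparse_axpy(w, nonzero_cols, is_nonzero, alpha, cols, vals):
    """Perform w := w - alpha * u, where u is sparse with entries
    vals at column positions cols.

    w            : full-length dense work array (Python list of floats)
    nonzero_cols : list of positions currently occupied in w
    is_nonzero   : boolean flags mirroring nonzero_cols
    """
    for j, v in zip(cols, vals):
        if not is_nonzero[j]:
            is_nonzero[j] = True
            nonzero_cols.append(j)   # record the fill-in position
        w[j] -= alpha * v
```

After the row is finished, only the positions in nonzero_cols need to be zeroed, which keeps the cost per row proportional to the number of nonzeros actually touched.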

For the second problem, we adopt a quick sort method slightly different from that in ILUT(p, σ). When sorting is applied in steps 19 and 20 of MRILDU(p, σ, m), the exchange operation is not actually performed on the values; instead, an additional integer array is used to track the sorting process, and ultimately its first mp entries record the positions of the elements with the largest magnitudes. After the sorting ends, the entries at the other positions in the column-number array are set to zero, and those elements are dropped accordingly. In this way, the remaining elements can be preserved directly, without spending extra time determining the column numbers of the retained nonzero elements.
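A minimal sketch of this exchange-free selection is shown below (our illustration; a real implementation would track the permutation inside its own quick sort rather than call a library sort, but the effect on the data is the same):

```python
import numpy as np

def largest_p_positions(vals, p):
    """Return the positions of the p largest-magnitude entries of vals.

    The values themselves are never moved: only an index array is
    permuted by the sort, mirroring the scheme described in Section 2.
    """
    idx = np.argsort(-np.abs(vals), kind="quicksort")  # permute indices only
    return set(idx[:p].tolist())
```

Entries whose positions are not returned would then be zeroed in the column-number array and squeezed out.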

For the third problem, on the surface it should be completely identical in MRILDU and in ILUT. Note, however, that in MRILDU m rows are taken as a whole before the dropping strategy is applied. Suppose the dropping strategy is applied to rows i-m+1 through i together. When one of these rows is computed, the elements in rows 1 through i-m of the factors are, as in ILUT, elements of the final incomplete factors. But the elements in rows i-m+1 up to the current row are only temporary values to which the dropping strategy has not yet been applied. To facilitate the computation, additional storage for m rows is allocated in advance for each incomplete factor, and during the computation the elements of these consecutive rows are stored in the same format as the finished rows. When the dropping strategy is applied, the elements of these rows are processed directly: the dropped elements are set to zero, then removed completely, and their positions are filled by the subsequent nonzero elements.
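The joint dropping over the m buffered rows can be sketched as follows (our illustration; diagonal handling and the compressed storage format are simplified away, and the function name is an assumption):

```python
import numpy as np

def multirow_drop(rows, p):
    """Keep the m*p largest-magnitude entries across the m buffered rows
    jointly; all other entries are zeroed, mimicking steps (18)-(21)."""
    m = len(rows)
    flat = np.concatenate(rows)
    keep = m * p
    if flat.size <= keep:
        return rows
    top = np.argsort(-np.abs(flat))[:keep]    # positions of the m*p largest
    mask = np.zeros(flat.size, dtype=bool)
    mask[top] = True
    out, start = [], 0
    for r in rows:
        out.append(np.where(mask[start:start + r.size], r, 0.0))
        start += r.size
    return out
```

With p = 1 and the rows [5, 4, 0.1] and [0.2, 0.3, 1], per-row dropping (as in ILUT) would keep 5 and 1, discarding the 4; joint dropping keeps 5 and 4, the globally largest entries, which is exactly the advantage argued for above.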

In the following, we analyze how the storage requirement and the amount of computation of MRILDU differ from those of ILUT. In ILUT, one row is computed at a time, and the p nonzero elements of largest magnitude are selected and retained in the current row of each triangular factor. Before the dropping rule is applied, the work row may contain up to n nonzero elements; therefore, the temporary space that must be allocated additionally is O(n). In MRILDU, the nonzero elements of m rows must be stored before the dropping rule is applied, and thus the additional temporary space may be up to O(mn). As previously described, this storage is allocated in advance. For the final incomplete factors, since both MRILDU(p, σ, m) and ILUT(p, σ) retain at most p off-diagonal elements per row of each factor plus the diagonal, the storage requirements are almost the same, especially when m is small or p is large. Accordingly, the computing time of a single preconditioned iteration will also be almost the same. On the other hand, the quality of MRILDU is higher in general; that is to say, when MRILDU is used, the time spent in iteration will be less than when ILUT is used. Therefore, if the construction of the preconditioner is not considered, the time elapsed in the iteration itself will be less with MRILDU than with ILUT.

The time complexity of the construction of MRILDU differs from that of ILUT mainly in two respects. First, in ILUT the dropping is applied to one row at a time; after the dropping is completed, the retained elements of the row are stored into the incomplete factor, and when subsequent rows are computed, the elements above the current row that are used are those of the final incomplete factor. In MRILDU, m rows of the factors are computed at a time; we may assume that they are rows i-m+1 through i. Dropping is applied to these rows as a whole, and the calculation proceeds as described previously. Therefore, the additional temporary storage is greater than in ILUT in general, and at the same time the amount of computation will also be slightly larger than with ILUT.

The second origin is the different dimension of the vectors to be partitioned and the different number of elements to be extracted from them. For ILUT, if there are n_z nonzeros in each of the rows i-m+1 through i before dropping is applied, the time complexity per row is about O(n_z log n_z) when quick sort is used to extract the required elements, and therefore about O(m n_z log n_z) for the m rows. For MRILDU, because the dropping strategy is applied to m rows jointly, the operation is equivalent to extracting mp elements from m n_z elements; thus the time complexity should be about O(m n_z log(mp)), namely, O(m n_z (log m + log p)). This shows that, just from the perspective of extracting the elements of largest magnitude, MRILDU may be quicker when p is much smaller than n_z. However, note that when rows i-m+1 through i are computed, the temporarily stored elements in each row are likely more than those in ILUT. Therefore, even if MRILDU is slightly quicker here, the gain is not very significant, especially when m is large.

Furthermore, it can be proved that, similar to other ILU factorizations, when A is an M-matrix or diagonally dominant, MRILDU can always be carried out, and the lower-right submatrix remaining to be factorized is also an M-matrix or diagonally dominant, respectively. In either of these two cases, after A is split as A = M - N, with M the product of the computed incomplete factors, the iterative method with M^{-1}N as the iteration matrix is always convergent. Correspondingly, the condition of the preconditioned coefficient matrix will be better than that of the original coefficient matrix.
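The splitting argument invoked here can be sketched in standard notation (our sketch; the symbols are conventions assumed by us):

```latex
% Splitting induced by the incomplete factorization:
A = M - N, \qquad M = \tilde{L}\,\tilde{D}\,\tilde{U},
\qquad x^{(k+1)} = M^{-1} N\, x^{(k)} + M^{-1} b .
% When A is an M-matrix (or diagonally dominant), the incomplete
% factorization exists and \rho(M^{-1}N) < 1, so the stationary iteration
% converges and M^{-1}A is better conditioned than A itself.
```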

3. Numerical Experiments

In this section, the proposed MRILDU is compared to ILUT on several sparse linear systems from scientific and engineering applications. The experiments are run on a high performance server with an Intel Xeon E5-2670 CPU @ 2.60 GHz, 20480 KB cache, and 48 GB of main memory. The operating system is Red Hat Linux 4.4.5-6. The Intel Fortran compiler ifort 11.1.059 is used, with -O3 for optimization.

In this paper, for convenience, the number of iterations is denoted by IT, and the times used for the construction of the preconditioner and for the iteration process are denoted by CTME and ITME, respectively. The total solution time, the sum of CTME and ITME, is denoted by TTME. NNZ represents the number of nonzero elements in the preconditioner. All times are in seconds. In all iterations, the initial solution vector is all zeros.

3.1. Sparse Linear Systems from Some Test Matrices

The coefficient matrices of the sparse linear systems in this subsection come from the UF Sparse Matrix Collection (http://www.cise.ufl.edu/research/sparse/matrices/index.html); brief information on these matrices is given in Table 1. Among them, the last four matrices, starting from orsirr_1, can also be downloaded from the MatrixMarket website (http://math.nist.gov/MatrixMarket/browse.html). The right-hand side vectors are obtained by applying the matrix to a given true solution vector whose ith component is a simple function of i and the order n of the matrix. During the iteration process, BiCGSTAB is used, and the convergence criterion is that the 2-norm of the residual vector is reduced by 10 orders of magnitude.
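As a toy reproduction of this setup, the sketch below builds b from a chosen true solution and iterates until the residual 2-norm has dropped by 10 orders of magnitude. The matrix, the component formula for x_true, and the Jacobi preconditioner (standing in for ILUT/MRILDU inside a simple preconditioned Richardson loop rather than BiCGSTAB) are all our illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))
A += np.diag(np.abs(A).sum(axis=1))   # make A strictly diagonally dominant
x_true = np.arange(1, n + 1) / n      # assumed true solution (illustrative)
b = A @ x_true                        # right-hand side from the true solution

M_inv = np.diag(1.0 / np.diag(A))     # Jacobi preconditioner as a stand-in
x = np.zeros(n)                       # all-zero initial guess, as in the paper
r0 = np.linalg.norm(b - A @ x)
it = 0
while np.linalg.norm(b - A @ x) > 1e-10 * r0:   # 10 orders of magnitude
    x += M_inv @ (b - A @ x)                     # preconditioned Richardson step
    it += 1
```

For a strictly diagonally dominant matrix this simple iteration is guaranteed to converge, and the recovered x agrees with x_true to within the stopping tolerance.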


Matrix       n        NNZ       Application field
af_shell10   1508065  27090195  Sheet metal forming
bone010      986703   36326514  Model reduction
ldoor        952203   42493817  Structural problem
thermal2     1228045  4904179   Thermal problem
airfoil_2d   14214    259688    CFD problem
cvxbqp1      50000    349968    Optimization problem
sparsine     50000    1548988   Structural problem
orsirr_1     1030     6858      CFD problem
pores_2      1224     9613      CFD problem
saylr3       1000     3750      CFD problem
sherman3     5005     20033     CFD problem

The results with ILUT and MRILDU are listed in Tables 2 through 12. In all the tests of this subsection, the threshold value σ is kept at the same fixed value.


Method  p   m  IT  NNZ       CTME    ITME    TTME
ILUT    5   -  7   6813992   0.2050  1.2517  1.4566
MRILDU  5   1  6   8311004   0.2420  1.1074  1.3494
MRILDU  5   2  5   8529756   0.2464  0.9494  1.1958
MRILDU  5   5  5   8313907   0.2228  0.9362  1.1590
ILUT    10  -  6   9848598   0.2054  1.1158  1.3213
MRILDU  10  1  4   11282919  0.2162  0.8019  1.0181
MRILDU  10  2  3   11369132  0.2150  0.6045  0.8196
MRILDU  10  5  4   11579261  0.1928  0.7934  0.9863
ILUT    20  -  6   10464841  0.2027  1.1415  1.3442
MRILDU  20  1  2   11746902  0.1872  0.4282  0.6154
MRILDU  20  2  2   11749200  0.1876  0.4273  0.6149
MRILDU  20  5  2   11749244  0.1855  0.4318  0.6173

As can be seen from Tables 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12, when the same p and σ are used, the number of iterations with MRILDU is generally smaller than with ILUT; namely, the effectiveness of MRILDU is higher. Besides, when p is small, since the number of iterations with MRILDU is smaller than with ILUT, the time elapsed in iteration is also less. However, for some matrices, because MRILDU retains relatively more nonzero elements while the number of iterations is not reduced significantly, especially when p is large, the iteration time is slightly longer than with ILUT. From the perspective of total execution time, including preconditioner construction and the iteration process, although it is difficult to say that MRILDU is absolutely better than ILUT, MRILDU is superior in general. Furthermore, some of the linear systems are difficult to solve when ILUT is used, whereas with MRILDU the solution is more robust for slightly larger m. Moreover, the experiments also show that when m is large, MRILDU is not as sensitive as ILUT to the parameter p.


Method  p   m  IT  NNZ       CTME    ITME    TTME
ILUT    5   -  16  4900335   0.3119  2.5586  2.8704
MRILDU  5   1  12  5884440   0.4067  1.9592  2.3659
MRILDU  5   2  12  5894775   0.4346  1.9649  2.3995
MRILDU  5   5  11  5915403   0.4694  1.8480  2.3174
ILUT    10  -  9   9702708   0.3530  1.6085  1.9615
MRILDU  10  1  8   10686813  0.4344  1.4531  1.8875
MRILDU  10  2  8   10722972  0.4445  1.4701  1.9146
MRILDU  10  5  7   10795936  0.4915  1.2933  1.7848
ILUT    20  -  5   18784893  0.3951  1.0752  1.4703
MRILDU  20  1  5   19768998  0.4477  1.1030  1.5507
MRILDU  20  2  4   19817915  0.4725  0.8938  1.3663
MRILDU  20  5  4   19923484  0.5748  0.9048  1.4797


Method  p   m  IT  NNZ       CTME    ITME    TTME
ILUT    5   -  19  4165454   0.3516  2.0167  2.3683
MRILDU  5   1  17  5030085   0.4738  1.8135  2.2873
MRILDU  5   2  15  5220618   0.4855  1.6805  2.1660
MRILDU  5   5  13  5360138   0.4806  1.5297  2.0103
ILUT    10  -  10  7533577   0.3755  1.1645  1.5400
MRILDU  10  1  9   8365112   0.4880  1.0859  1.5740
MRILDU  10  2  7   8683459   0.4947  0.8785  1.3732
MRILDU  10  5  7   8979981   0.4938  0.9068  1.4007
ILUT    20  -  5   12443336  0.3878  0.6864  1.0742
MRILDU  20  1  5   13180555  0.4608  0.7038  1.1647
MRILDU  20  2  5   13565302  0.4595  0.7209  1.1804
MRILDU  20  5  5   13932275  0.4477  0.7354  1.1830


Method  p   m  IT  NNZ      CTME    ITME    TTME
ILUT    5   -  6   3589294  0.0808  0.6483  0.7291
MRILDU  5   1  3   4701667  0.0791  0.3485  0.4276
MRILDU  5   2  3   4836374  0.0781  0.3535  0.4316
MRILDU  5   5  3   4881973  0.0766  0.3549  0.4315
ILUT    10  -  6   3785538  0.0799  0.6612  0.7411
MRILDU  10  1  2   4896078  0.0680  0.2475  0.3154
MRILDU  10  2  2   4896078  0.0717  0.2476  0.3193
MRILDU  10  5  2   4896078  0.0706  0.2474  0.3180
ILUT    20  -  6   3785538  0.0799  0.6616  0.7416
MRILDU  20  1  2   4896078  0.0687  0.2477  0.3164
MRILDU  20  2  2   4896078  0.0723  0.2474  0.3198
MRILDU  20  5  2   4896078  0.0705  0.2474  0.3180


Method  p   m  IT  NNZ     CTME    ITME    TTME
ILUT    5   -  45  140933  0.0119  0.0583  0.0702
MRILDU  5   1  38  155280  0.0178  0.0486  0.0664
MRILDU  5   2  41  155863  0.0188  0.0563  0.0751
MRILDU  5   5  38  155992  0.0188  0.0561  0.0750
ILUT    10  -  22  276559  0.0286  0.0354  0.0640
MRILDU  10  1  21  289850  0.0326  0.0327  0.0653
MRILDU  10  2  22  294696  0.0334  0.0372  0.0706
MRILDU  10  5  21  297000  0.0335  0.0382  0.0718
ILUT    20  -  13  491044  0.0509  0.0291  0.0800
MRILDU  20  1  14  455303  0.0415  0.0302  0.0717
MRILDU  20  2  14  457858  0.0399  0.0306  0.0706
MRILDU  20  5  14  458851  0.0397  0.0306  0.0703


Method  p   m  IT  NNZ     CTME    ITME    TTME
ILUT    5   -  9   144410  0.0021  0.0206  0.0227
MRILDU  5   1  3   188828  0.0023  0.0077  0.0100
MRILDU  5   2  2   197127  0.0022  0.0055  0.0078
MRILDU  5   5  2   198783  0.0023  0.0055  0.0077
ILUT    10  -  9   155499  0.0020  0.0207  0.0227
MRILDU  10  1  1   199884  0.0021  0.0032  0.0053
MRILDU  10  2  1   199884  0.0021  0.0032  0.0053
MRILDU  10  5  1   199884  0.0022  0.0032  0.0054
ILUT    20  -  9   155499  0.0020  0.0207  0.0227
MRILDU  20  1  1   199884  0.0021  0.0032  0.0053
MRILDU  20  2  1   199884  0.0021  0.0032  0.0053
MRILDU  20  5  1   199884  0.0022  0.0032  0.0054


Method  p   m  IT  NNZ     CTME    ITME    TTME
ILUT    5   -  16  229231  0.0121  0.0741  0.0861
MRILDU  5   1  14  278049  0.0168  0.0658  0.0826
MRILDU  5   2  12  283896  0.0175  0.0593  0.0768
MRILDU  5   5  12  286417  0.0172  0.0611  0.0782
ILUT    10  -  13  410600  0.0130  0.0684  0.0814
MRILDU  10  1  9   459413  0.0169  0.0493  0.0662
MRILDU  10  2  9   469141  0.0175  0.0509  0.0685
MRILDU  10  5  9   473756  0.0165  0.0511  0.0677
ILUT    20  -  10  631734  0.0127  0.0712  0.0839
MRILDU  20  1  6   680387  0.0123  0.0456  0.0580
MRILDU  20  2  4   709044  0.0125  0.0324  0.0449
MRILDU  20  5  4   723248  0.0125  0.0322  0.0447


Method  p   m  IT   NNZ   CTME    ITME    TTME
ILUT    5   -  167  5333  0.0002  0.0094  0.0097
MRILDU  5   1  10   6712  0.0004  0.0008  0.0011
MRILDU  5   2  11   6782  0.0004  0.0008  0.0012
MRILDU  5   5  10   6932  0.0004  0.0008  0.0012
ILUT    10  -  165  5678  0.0003  0.0095  0.0098
MRILDU  10  1  10   7517  0.0004  0.0008  0.0012
MRILDU  10  2  11   7698  0.0004  0.0009  0.0013
MRILDU  10  5  10   7931  0.0004  0.0008  0.0012
ILUT    20  -  165  5961  0.0003  0.0097  0.0100
MRILDU  20  1  10   7839  0.0004  0.0008  0.0012
MRILDU  20  2  9    8100  0.0004  0.0007  0.0011
MRILDU  20  5  11   8183  0.0004  0.0009  0.0013


Method  p   m  IT  NNZ    CTME    ITME    TTME
ILUT    5   -  95  7339   0.0003  0.0071  0.0074
MRILDU  5   1  56  9587   0.0004  0.0045  0.0049
MRILDU  5   2  48  11435  0.0006  0.0041  0.0048
MRILDU  5   5  30  11315  0.0007  0.0028  0.0035
ILUT    10  -  58  9778   0.0005  0.0050  0.0055
MRILDU  10  1  28  14520  0.0008  0.0027  0.0035
MRILDU  10  2  24  17193  0.0011  0.0026  0.0037
MRILDU  10  5  20  17206  0.0012  0.0022  0.0034
ILUT    20  -  59  10002  0.0005  0.0050  0.0055
MRILDU  20  1  19  19384  0.0011  0.0022  0.0033
MRILDU  20  2  17  21137  0.0012  0.0021  0.0033
MRILDU  20  5  17  21307  0.0013  0.0021  0.0034


Method  p   m  IT  NNZ    CTME    ITME    TTME
ILUT    5   -  45  4940   0.0004  0.0028  0.0032
MRILDU  5   1  9   6104   0.0006  0.0006  0.0013
MRILDU  5   2  7   7602   0.0008  0.0006  0.0014
MRILDU  5   5  7   8560   0.0009  0.0007  0.0015
ILUT    10  -  43  7455   0.0006  0.0030  0.0037
MRILDU  10  1  6   9374   0.0010  0.0006  0.0015
MRILDU  10  2  5   11002  0.0010  0.0005  0.0016
MRILDU  10  5  5   11828  0.0010  0.0006  0.0016
ILUT    20  -  47  8968   0.0007  0.0036  0.0043
MRILDU  20  1  5   12024  0.0010  0.0006  0.0016
MRILDU  20  2  5   12430  0.0010  0.0006  0.0016
MRILDU  20  5  5   12466  0.0010  0.0006  0.0016


Method  p   m  IT   NNZ    CTME    ITME    TTME
ILUT    5   -  137  28459  0.0021  0.0365  0.0386
MRILDU  5   1  44   32097  0.0029  0.0121  0.0150
MRILDU  5   2  44   33180  0.0033  0.0124  0.0158
MRILDU  5   5  30   35526  0.0038  0.0092  0.0130
ILUT    10  -  128  50481  0.0051  0.0404  0.0456
MRILDU  10  1  28   55298  0.0060  0.0089  0.0150
MRILDU  10  2  27   56084  0.0064  0.0092  0.0156
MRILDU  10  5  23   58137  0.0066  0.0084  0.0150
ILUT    20  -  113  85790  0.0091  0.0462  0.0553
MRILDU  20  1  18   88174  0.0096  0.0073  0.0169
MRILDU  20  2  17   88770  0.0098  0.0072  0.0170
MRILDU  20  5  18   90454  0.0100  0.0080  0.0179

3.2. Energy Equation in Inertial Confinement Fusion Simulation

The energy equations in the numerical simulation of inertial confinement fusion are described in references [11, 12]. When the equations are discretized, the velocity and coordinates are given on grid points, while the temperature, density, pressure, and energy are given at the centers of the cells. When a uniform quadrilateral mesh is used to discretize the continuous equations, we obtain discrete nonlinear equations. When Newton iteration is used to solve the nonlinear system and natural ordering is used for the grid points, the problem is converted into the solution of block tridiagonal linear systems whose coefficient matrices are the Jacobian matrices [12].

In this subsection, we perform experiments on two typical sparse matrices (namely, MAT17 and MAT20) extracted from the discrete solution of the two-dimensional three-temperature energy equations. The size of the related discrete grid is 21 by 75. Since each grid point carries three temperature variables, namely the electron, ion, and photon temperatures, the order of the linear systems is 4,725. In the experiments, the stop criterion is that the Euclidean norm of the residual vector is reduced by 10 orders of magnitude.

For matrices MAT17 and MAT20, the results are listed in Tables 13 and 14, respectively. In these experiments, the threshold parameter σ is kept at the same fixed value.


Method  p   m   IT  CTME    ITME    TTME
ILUT    5   -   22  0.0005  0.0051  0.0057
MRILDU  5   1   12  0.0007  0.0028  0.0036
MRILDU  5   5   8   0.0012  0.0021  0.0033
MRILDU  5   10  7   0.0012  0.0019  0.0030
ILUT    10  -   17  0.0009  0.0043  0.0052
MRILDU  10  1   9   0.0011  0.0023  0.0034
MRILDU  10  5   6   0.0013  0.0016  0.0029
MRILDU  10  10  6   0.0012  0.0016  0.0028


Method  p   m   IT  CTME    ITME    TTME
ILUT    5   -   17  0.0007  0.0042  0.0049
MRILDU  5   1   12  0.0008  0.0029  0.0037
MRILDU  5   5   7   0.0014  0.0020  0.0034
MRILDU  5   10  7   0.0013  0.0020  0.0034
ILUT    10  -   14  0.0012  0.0038  0.0050
MRILDU  10  1   8   0.0012  0.0022  0.0034
MRILDU  10  5   7   0.0014  0.0020  0.0034
MRILDU  10  10  7   0.0014  0.0020  0.0034

As can be seen from Tables 13 and 14, although MRILDU with m = 1 roughly corresponds to ILUT, the results, similar to the previous experiments, show that MRILDU has clear advantages: both the number of iterations and the time used in iteration are reduced significantly. Furthermore, when the parameter p is given and the parameter m increases, the convergence rate improves gradually; however, once m has been increased to a certain extent, further increases bring only trivial improvement. Moreover, it can be noted that after m is increased to a certain extent, the quality of MRILDU is very close for different values of the parameter p.

3.3. Helmholtz Equation in Numerical Weather Prediction Model

When a finite-difference model is used for numerical weather forecasting, a sparse linear system for the pressure deviation in three-dimensional space needs to be solved at each time step; these are called discrete Helmholtz equations [13]. In this paper, we perform experiments on the grid-point model GRAPES [14] developed by the Chinese Academy of Meteorological Sciences. In this model, the fully compressible atmospheric equations of motion are used as the control equations, a semi-implicit semi-Lagrangian scheme is used to discretize the time derivatives, and the spatial discretization adopts an Arakawa-C grid in the horizontal direction and a Charney-Phillips grid in the vertical direction. See reference [14] for details.

Although the Helmholtz equations are diagonally dominant, the diagonal dominance is very weak. For this kind of sparse linear system, the GCR iteration is often used [2], but the solution is very time-consuming and occupies a large proportion of the whole numerical simulation time. For example, in GRAPES, when diagonal-scaling GCR is used and the grid size is 144 by 73 by 31, namely, the numbers of grid points in the latitude, longitude, and vertical directions are 144, 73, and 31, respectively, this proportion can reach about 70%.

In the experiments, the grid size used is 144 by 73 by 31, preconditioned GCR is used to perform the iterations, and the stopping criterion is that the 2-norm of the residual vector falls below a fixed tolerance. Because the coefficient matrix of the sparse linear equations remains unchanged throughout the simulation, the preconditioner only needs to be constructed once, in the setup stage, and its information can be referenced directly when solving the linear systems at subsequent time steps. During the simulation, the length of the time step is taken as 1,800 seconds and the experiment is integrated for 96 time steps. Thus, there are 96 linear systems to be solved in all.
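The reuse pattern described above can be sketched as follows: a minimal preconditioned GCR in the spirit of [2], not the GRAPES implementation. The preconditioner here is a simple diagonal stand-in for MRILDU, built once and then reused for every subsequent solve; the names `gcr` and `M_inv` are illustrative:

```python
import numpy as np

def gcr(A, b, M_inv, tol=1e-10, maxiter=200):
    """Preconditioned GCR (sketch): full orthogonalization, no restart.
    M_inv applies the preconditioner to a vector."""
    n = len(b)
    x = np.zeros(n)
    r = b - A @ x
    P, Q = [], []                    # search directions and their images under A
    it = 0
    while np.linalg.norm(r) >= tol and it < maxiter:
        p = M_inv(r)
        q = A @ p
        for pj, qj in zip(P, Q):     # orthogonalize q against earlier directions
            beta = q @ qj
            p = p - beta * pj
            q = q - beta * qj
        nrm = np.linalg.norm(q)
        p, q = p / nrm, q / nrm
        alpha = r @ q
        x = x + alpha * p
        r = r - alpha * q
        P.append(p); Q.append(q)
        it += 1
    return x, it

# A weakly nonsymmetric, diagonally dominant test system.
rng = np.random.default_rng(0)
n = 50
A = np.diag(rng.uniform(1.0, 2.0, n)) + 0.01 * rng.standard_normal((n, n))
d = 1.0 / np.diag(A)
M_inv = lambda v: d * v              # built once, reused at every "time step"
for step in range(3):                # stand-in for the 96 time steps
    b = rng.standard_normal(n)
    x, it = gcr(A, b, M_inv, tol=1e-10)
    assert np.linalg.norm(b - A @ x) < 1e-8
```

Only the right-hand side changes between time steps, so the cost of constructing the preconditioner is amortized over all 96 solves.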

The experimental results are given in Table 15, which lists the average number of iterations and the average iteration time for the solution of one linear system with preconditioned GCR. The drop tolerance is fixed at a single value in all tests.


Table 15.

Method   p    q    Avg. #iters   Avg. time
ILUT     5    —      104.5833      4.6204
MRILDU   5    1       42.1771      2.5964
MRILDU   5    5       39.8021      2.5092
MRILDU   5   10       39.3854      2.4978
ILUT    10    —       54.8333      3.3918
MRILDU  10    1       26.2604      2.2461
MRILDU  10    5       25.8125      2.2306
MRILDU  10   10       25.6354      2.2229

As can be seen from Table 15, MRILDU is significantly superior to ILUT. This is mainly because the incomplete LDU factorization is used in place of the incomplete LU factorization and because the dropping strategy applied to the factor elements in MRILDU is different from that in ILUT; it also benefits partly from the multirow strategy.
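Why the LDU form treats the two factors more equitably can be seen already in the complete factorization: in A = LDU both L and U are unit-triangular, so their off-diagonal entries are comparable multipliers that a single drop rule can judge fairly, whereas in A = LU the rows of U also carry the pivots. Below is a dense sketch of the exact LDU factorization (illustrative only; MRILDU applies the same elimination incompletely, with dropping):

```python
import numpy as np

def ldu(A):
    """Exact Doolittle-style LDU factorization (dense sketch): A = L @ diag(D) @ U
    with unit lower-triangular L, diagonal D, and unit upper-triangular U."""
    A = np.array(A, dtype=float)     # work on a copy (Schur complements)
    n = A.shape[0]
    L, U = np.eye(n), np.eye(n)
    D = np.zeros(n)
    for k in range(n):
        D[k] = A[k, k]
        L[k+1:, k] = A[k+1:, k] / D[k]
        U[k, k+1:] = A[k, k+1:] / D[k]
        # rank-1 Schur update: A22 -= (a21 / d) (a12 / d) * d = a21 a12 / d
        A[k+1:, k+1:] -= np.outer(L[k+1:, k], U[k, k+1:]) * D[k]
    return L, D, U

A = np.array([[4.0, 2.0, 1.0],
              [2.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])
L, D, U = ldu(A)
assert np.allclose(L @ np.diag(D) @ U, A)
# Unlike plain LU, where U's rows are scaled by the pivots, L and U here
# are both unit-triangular, so one drop threshold treats them symmetrically.
```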

3.4. Mesoscale Simulation of Concrete Sample

In the mesoscale simulation of concrete specimens, a specimen is treated as a three-phase composite material composed of aggregate, mortar, and the interface between them, and the finite element method is used. In this paper, we perform experiments for a wet-sieved cuboid specimen of size 550 mm by 150 mm by 150 mm; see literature [8] for specific descriptions. There are 71,013 discrete nodes and 78,800 finite elements in all. The two supporting columns are located 0.05 m away from the left and right boundaries, respectively, at the bottom of the specimen, while the two loading columns are located 0.2 m away from the left and right boundaries, respectively, at the top of the specimen.

In this subsection, we compare the efficiency of ILUT and MRILDU on the linear systems encountered in the static loading test. The load is increased step by step, by 0.25 kN per step. Damaged elements first appear in the concrete specimen at the 59th loading step, and the specimen is completely damaged at the 94th step. When damaged elements appear at a loading step, several sparse linear systems with the same coefficient matrix may need to be solved at that step to correct the displacements. Therefore, 178 sparse linear systems are solved in all during the whole simulation process. Although the sparse linear systems are symmetric positive definite, the ILUT and MRILDU factorizations are not symmetric, so BiCGSTAB is selected as the iterative method. The stopping criterion is that the 2-norm of the residual vector is reduced by six orders of magnitude.
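The solver choice above can be sketched as follows: a right-preconditioned BiCGSTAB in its textbook form (e.g., [2]; not the authors' code), using the same stopping rule of reducing the residual 2-norm by six orders of magnitude. The Laplacian test matrix and the diagonal `M_inv` are illustrative stand-ins for the real stiffness matrix and the MRILDU preconditioner:

```python
import numpy as np

def bicgstab(A, b, M_inv=lambda v: v, reduce=1e-6, maxiter=500):
    """Right-preconditioned BiCGSTAB (sketch). Stops once the residual
    2-norm has been reduced by the factor `reduce` (six orders here)."""
    x = np.zeros_like(b)
    r = b - A @ x
    tol = reduce * np.linalg.norm(r)
    rhat = r.copy()                     # shadow residual
    rho = alpha = omega = 1.0
    v = p = np.zeros_like(b)
    for it in range(1, maxiter + 1):
        rho_new = rhat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        phat = M_inv(p)
        v = A @ phat
        alpha = rho_new / (rhat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) < tol:     # early exit on the half-step
            return x + alpha * phat, it
        shat = M_inv(s)
        t = A @ shat
        omega = (t @ s) / (t @ t)
        x = x + alpha * phat + omega * shat
        r = s - omega * t
        rho = rho_new
        if np.linalg.norm(r) < tol:
            return x, it
    return x, maxiter

# SPD test matrix (1-D Laplacian) with a hypothetical diagonal preconditioner.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x, it = bicgstab(A, b, M_inv=lambda v: v / 2.0)
assert np.linalg.norm(b - A @ x) < 1e-5 * np.linalg.norm(b)
```

With right preconditioning the stopping test is applied to the true residual of the original system, which is why a nonsymmetric preconditioner such as MRILDU can be used even though A itself is symmetric positive definite.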

In Table 16, for the sparse linear systems occurring in the static loading simulation, the average number of iterations and the average iteration time per system are provided. In the tests, the parameter p is taken as 25 and the drop tolerance is fixed at a single value. When there are no damaged elements, the global stiffness matrix, namely, the coefficient matrix of the sparse linear equations, remains unchanged, and even when damaged elements appear, the coefficient matrix is unchanged during the corrections within a loading step. As long as the coefficient matrix is unchanged, the preconditioner does not need to be reconstructed. The time consumed constructing the preconditioner is therefore relatively trivial and is not listed in the table; only the iteration results are listed.


Table 16. (p = 25 in all runs.)

Method   q    Avg. #iters   Avg. time
ILUT     —      142.6517     13.7294
MRILDU   1      144.1124     12.9234
MRILDU  10      143.1742     12.2766
MRILDU  20      142.6517     10.9980

As can be seen from Table 16, for the sparse linear systems that need to be solved in the mesoscale numerical simulation of the concrete specimen, MRILDU has no advantage over ILUT in the number of iterations, but the average iteration time per linear system improves slightly. This may be because cache utilization is better when MRILDU is used.

4. Conclusion

In this paper, aiming at the disadvantage that ILUT may drop some relatively large elements while retaining relatively small ones, we propose an improved version, MRILDU, based on the following two techniques. First, the incomplete LDU factorization is used to replace the incomplete LU factorization, and the same dropping strategy is used for the elements of L and U, so as to make the dropping rules more equitable for the two factors. Second, multiple rows are factorized before each dropping, so as to deal effectively with large differences in diagonal dominance between rows and to retain the relatively large elements, thereby improving the quality of the incomplete factorization. The experimental results show that, with the same parameters, the number of iterations with MRILDU is significantly smaller than with ILUT, and in most cases the iteration time and total solution time are also reduced. Furthermore, MRILDU is not as sensitive as ILUT to the parameter p; thus, when p cannot be determined effectively in advance, the advantages of MRILDU are more significant.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This project is supported by the National Natural Science Foundation of China under Grants nos. 61379022 and 51079164, China Water Conservation Special no. 201201053, and Research Special of China Institute of Water Resources and Hydropower Research no. KJ1242.

References

  1. M. Benzi, “Preconditioning techniques for large linear systems: a survey,” Journal of Computational Physics, vol. 182, no. 2, pp. 418–477, 2002.
  2. Y. Saad, Iterative Methods for Sparse Linear Systems, PWS, Boston, Mass, USA, 1996.
  3. J. A. Meijerink and H. A. van der Vorst, “An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix,” Mathematics of Computation, vol. 31, no. 137, pp. 148–162, 1977.
  4. I. Gustafsson, “A class of first order factorization methods,” BIT, vol. 18, no. 2, pp. 142–156, 1978.
  5. Y. Saad, “ILUT: a dual threshold incomplete LU factorization,” Numerical Linear Algebra with Applications, vol. 1, no. 4, pp. 387–402, 1994.
  6. M. Benzi and M. Tůma, “A robust incomplete factorization preconditioner for positive definite matrices,” Numerical Linear Algebra with Applications, vol. 99, pp. 1–20, 2001.
  7. B. Bylina and J. Bylina, “Incomplete WZ factorization as an alternative method of preconditioning for solving Markov chains,” in Parallel Processing and Applied Mathematics: 7th International Conference, PPAM 2007, Gdansk, Poland, September 9–12, 2007, R. Wyrzykowski, J. Dongarra, K. Karczewski, and J. Wasniewski, Eds., vol. 4967 of Lecture Notes in Computer Science, pp. 99–107, Springer, Berlin, Germany, 2008.
  8. B. Bylina and J. Bylina, “The incomplete factorization preconditioners applied to the GMRES(m) method for solving Markov chains,” in Proceedings of the Federated Conference on Computer Science and Information Systems (FedCSIS '11), pp. 423–430, September 2011.
  9. N. Vannieuwenhoven and K. Meerbergen, “IMF: an incomplete multifrontal LU-factorization for element-structured sparse linear systems,” SIAM Journal on Scientific Computing, vol. 35, no. 1, pp. A270–A293, 2013.
  10. S. MacLachlan, D. Osei-Kuffuor, and Y. Saad, “Modification and compensation strategies for threshold-based incomplete factorizations,” SIAM Journal on Scientific Computing, vol. 34, no. 1, pp. A48–A75, 2012.
  11. S. Weber, P.-H. Maire, R. Loubère et al., “A transport simulation code for inertial confinement fusion relevant laser-plasma interaction,” Computer Physics Communications, vol. 168, no. 3, pp. 141–158, 2005.
  12. T. X. Gu, Z. H. Dai, X. D. Hang, S. Fu, and X. Liu, “Efficient algebraic methods for two-dimensional energy equations with three temperatures,” Chinese Journal of Computational Physics, vol. 22, no. 6, pp. 471–483, 2005.
  13. L. Li, W. Xue, R. Ranjan, and Z. Jin, “A scalable Helmholtz solver in GRAPES over large-scale multicore cluster,” Concurrency and Computation: Practice and Experience, vol. 25, no. 12, pp. 1722–1737, 2013.
  14. D. H. Chen, J. S. Xue, X. S. Yang et al., “New generation of multi-scale NWP system (GRAPES): general scientific design,” Chinese Science Bulletin, vol. 53, no. 22, pp. 3433–3445, 2008.

Copyright © 2014 Jian-Ping Wu and Huai-Fa Ma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
