Partial Refactorization in Sparse Matrix Solution: A New Possibility for Faster Nonlinear Finite Element Analysis

Song, Qi; Chen, Pu; Sun, Shuli

doi:https://doi.org/10.1155/2013/403912

Mathematical Problems in Engineering


Special Issue: Advances in Finite Element Method

Research Article | Open Access

Volume 2013 | Article ID 403912 | https://doi.org/10.1155/2013/403912

Qi Song,¹ Pu Chen,¹ and Shuli Sun¹

Academic Editor: Song Cen

Received 08 Jul 2013

Accepted 07 Aug 2013

Published 11 Sept 2013

Abstract

This paper proposes a partial refactorization for faster nonlinear analysis based on sparse matrix solution, which is nowadays the default solution choice in finite element analysis and can solve finite element models with up to millions of degrees of freedom. Among the various fill-in reducing strategies for sparse matrix solution, graph partition is in general the best in terms of resultant fill-ins and floating-point operations, and it furthermore produces a particular graph of the sparse matrix that prevents a local change of entries from spreading widely during factorization. Based on this feature, an explicit partial triangular refactorization with local change is efficiently constructed with limited additional storage requirement in the row-sparse storage scheme. The partial refactorization of the changed stiffness matrix inherits a large percentage of the original factor and is carried out only on part of the factor entries. The proposed method provides a new possibility for faster nonlinear analysis and is mainly suitable for materially nonlinear problems and optimization problems. Compared to full factorization, it can significantly reduce the factorization time and make nonlinear analysis more efficient.

1. Introduction

Nonlinearities [1, 2] occur in practical applications in many engineering fields. For example, in the design of steel structures, elastoplastic analyses are necessary to compute the limit loads of truss, frame, or shell structures. In concrete structural design and soil mechanics, complicated nonlinear constitutive relations have to be included for realistic descriptions. In cable design, geometrically nonlinear effects have to be considered for large displacements. In the numerical simulation of the effect of rainfall on slope stability, changes in pore-water pressure alter the elastoplastic behavior of the geologic media. In the analysis of seepage [3], the coefficient of permeability changes with capillary pressure, so seepage can also be treated as a nonlinear problem.

The nonlinear equilibrium equations are typified by the nonlinear system of equations

$$G(u) = 0. \tag{1}$$

Generally, this system is solved in finite element methods by the so-called Newton-Raphson method, which involves the linearization of the equilibrium equations, given by

$$G(u_i) + DG(u_i) \cdot \Delta u = G(u_i) + K_T(u_i)\,\Delta u = 0, \tag{2}$$

where $DG(u_i) \cdot \Delta u$ indicates the directional derivative of $G$ at $u_i$ in the direction of an increment $\Delta u$ and $K_T$ indicates the tangent matrix. In each iteration step $i$, a linear system of equations has to be solved. Nowadays, for up to about two million equations, direct methods may still be dominant for the solution, compared with iterative methods. As stated in [1], the advantage of direct solvers lies in their capability to solve even ill-conditioned and indefinite systems of equations, which is especially interesting for nonlinear applications. Moreover, direct methods are generally robust.

The convergence behavior of Newton-Raphson method is advantageous. Generally only a few iterations are needed to obtain the solution. However, in every iteration step, triangularization of the tangent matrix in (2) involves extensive computational effort for large finite element models.

To reduce the time spent on triangular factorization of $K_T$, the modified Newton method was proposed, in which $K_T$ is kept unchanged over several successive iteration steps. The number of refactorizations of $K_T$ therefore decreases, and total computing time might be saved. In practice, this saving may be counterbalanced by the fact that the modified Newton method converges much more slowly than the Newton-Raphson method, so that the repeated building of the residuum load vector, forward reduction, and back substitution in each iteration step become time consuming.
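The trade-off between the two schemes can be seen on a toy problem. The following is a minimal sketch, not taken from the paper: the two-equation residual `G(u) = K0 u + C u^3 - F`, its constants, and the helper names are all assumptions chosen for illustration. It counts the iterations needed when the tangent is rebuilt at every step (Newton-Raphson) and when the initial tangent is reused throughout (modified Newton).

```python
import numpy as np

# Hypothetical 2-DOF model problem (all values are illustrative assumptions).
K0 = np.array([[2.0, -1.0], [-1.0, 2.0]])   # linear part of the stiffness
C = 0.2                                      # strength of the cubic term
F = np.array([1.0, 1.0])                     # external load


def residual(u):
    """G(u) = K0 u + C u^3 - F, with the cubic term applied entry by entry."""
    return K0 @ u + C * u**3 - F


def tangent(u):
    """K_T(u) = dG/du = K0 + 3 C diag(u^2)."""
    return K0 + 3.0 * C * np.diag(u**2)


def iterate(update_tangent, tol=1e-10, max_iter=200):
    """Return the solution and the number of iterations used."""
    u = np.zeros(2)
    kt = tangent(u)                  # initial tangent, reused by modified Newton
    for it in range(1, max_iter + 1):
        if update_tangent:
            kt = tangent(u)          # Newton-Raphson: new tangent in every step
        u = u - np.linalg.solve(kt, residual(u))
        if np.linalg.norm(residual(u)) < tol:
            return u, it
    raise RuntimeError("no convergence")


u_nr, n_nr = iterate(update_tangent=True)    # Newton-Raphson
u_mn, n_mn = iterate(update_tangent=False)   # modified Newton
print(n_nr, n_mn)
```

Both runs reach the same solution, but the modified scheme needs noticeably more iterations, each of which still requires a residual evaluation and a pair of triangular solves.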

Fortunately, between two adjacent iteration steps, nonlinear phenomena usually appear locally. For example, stress concentration often occurs around one or several points; plastic deformation usually starts from local elements and then spreads; in the analysis of seepage, only elements around the liquid interface need a change in the coefficient of permeability, and so forth. All these features are equivalent to a local change of the tangent matrix $K_T$; that is to say, only a small or very small percentage of stiffness coefficients (although the absolute number might be large, such as five thousand rows) will be changed in each linearized step.

In light of this, a new algorithm is proposed in this study to refactorize a tangent matrix with local change. Our goal is to save time in triangular factorization, to make the Newton-Raphson method more efficient, and thus to provide a new possibility for faster nonlinear analysis. Generally speaking, local change occurs frequently in materially nonlinear problems and optimization problems; thus the proposed algorithm is mainly suitable for them. In the worst case, a global change makes the proposed algorithm degrade to full factorization.

An outline of this paper follows. In Section 2, background on sparse direct solution techniques is given. In Section 3, we discuss the effect propagation of local change in the matrix factor and conclude that only part of matrix factor needs to be recalculated. Based on all preparations, a new algorithm for refactorization is described in Section 4. The cost of computation is predictably reduced, and the precision is unaffected. In Section 5, numerical examples are given to illustrate performance of the proposed algorithm. Finally in Section 6, it is concluded that the algorithm proposed in this paper is significantly efficient and suitable for local change of structures. This method can be applied to a wide range of engineering problems and could be the foundation of faster nonlinear finite element analysis.

2. Sparse Direct Solution Techniques

Since the 1990s, direct solution techniques of finite element analysis [4] have developed from conventional variable bandwidth [5] or frontal solutions [6] to various sparse solutions [5, 7–11], such as backward-reference (left-looking), forward-reference (right-looking), and multifrontal [12, 13] methods. They have yielded a breakthrough in terms of solution speed and storage requirement in finite element analysis. It is quite safe to conclude that solvers based on various row pivoting sparse storage schemes are, in terms of solution time, memory requirement, and problem scale, much more efficient than those based on variable bandwidth or skyline storage schemes. With the advances in hardware, finite element calculations with 10,000 to 500,000 nodes on microcomputers have become popular. Since 1995, commercial FEA packages have turned to sparse direct solvers and gained a speedup of more than 10 times. The scale of problems that can be solved on a given machine has increased about 3 to 10 times.

Sparse solution has two essential features. One is the sparse index storage scheme, also called the compact storage scheme, which stores the nonzero entries of the matrix in a row or column compact form. It significantly improves efficiency in time and space because only nonzero entries are considered in triangular factorization. The other is fill-in reduction, an optimization procedure carried out before numerical factorization, which finds an optimized pivoting sequence by permuting rows and columns of the matrix so that the fill-ins as well as the floating-point operations of factorization are greatly decreased.
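As an illustration of a row compact form, the following minimal sketch stores a small matrix row by row, keeping only its nonzero entries. The three-array layout (`values`, `columns`, `row_start`) is the common compressed-row layout and is an assumption here; the paper's own storage scheme may differ in detail, and the function names are hypothetical.

```python
def to_row_sparse(dense):
    """Store only the nonzero entries of each row (compressed row storage)."""
    values, columns, row_start = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)      # numerical value of the entry
                columns.append(j)     # its column index
        row_start.append(len(values)) # where the next row begins
    return values, columns, row_start


def row_entries(values, columns, row_start, i):
    """Return the (column, value) pairs of row i."""
    lo, hi = row_start[i], row_start[i + 1]
    return list(zip(columns[lo:hi], values[lo:hi]))


dense = [
    [4.0, 0.0, -1.0, 0.0],
    [0.0, 3.0, 0.0, 0.0],
    [-1.0, 0.0, 5.0, -2.0],
    [0.0, 0.0, -2.0, 6.0],
]
values, columns, row_start = to_row_sparse(dense)
print(row_start)                                   # row boundaries
print(row_entries(values, columns, row_start, 2))  # nonzeros of the third row
```

Only 8 of the 16 entries are stored in this example; for a finite element stiffness matrix the saving is far larger.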

Generally, a stiffness matrix of finite element analysis can be considered as the adjacency list of a graph, in which a vertex represents an equation and a nonzero off-diagonal entry implies that the two corresponding vertices are adjacent. At present, one of the practical fill-reducing algorithms is the so-called multilevel graph partition [14–16], a special case of graph partition. It is a divide and conquer algorithm, which partitions the global optimization cost function into unrelated cost functions of subproblems and an additional cost function between these subproblems. Approximately, the global cost function is minimized if the additional cost function is minimized and the subproblems are of roughly equal size. The division guarantees that a local change on one side does not spread to the other side. In other words, graph partition prevents a local change from spreading widely.

Taking advantage of both row-sparse solution and fill-in reduction, a new algorithm to refactorize the tangent matrix for local change is designed. Only part of the rows of the matrix factor have to be recalculated, and therefore the computational cost decreases. Note that no approximation is assumed in the presented algorithm; its accuracy is therefore the same as that of full refactorization.

For simplicity, we assume in the following discussion that a fill-in reducing pivoting sequence for a positive definite global stiffness matrix has already been found through a graph partition algorithm, such as METIS [19].

3. Effect of Local Change in Sparse Solution

In each iteration step of the Newton-Raphson method, nonlinear finite element analysis leads to a system of linear equations as in (2) or, more compactly,

$$K\,\Delta u = r, \tag{3}$$

where $K$ denotes the tangent stiffness matrix, which is generally symmetric positive definite in the finite element method, $r$ the residuum vector, and $\Delta u$ the displacement increment vector. Compared with (2), $K = K_T(u_i)$ and $r = -G(u_i)$.

At present, the general solution procedure for the finite element equation (3) is the $LDL^{T}$ factorization, which decomposes $K$ into the product of a lower triangular matrix $L$, a diagonal matrix $D$, and an upper triangular matrix $L^{T}$, as

$$K = L D L^{T}. \tag{4}$$

With the result of the $LDL^{T}$ factorization, the displacement can be obtained easily by forward reduction and back substitution. In this process, factorization of the stiffness matrix is the most time consuming part. In nonlinear analysis, repeated factorizations are carried out, one at each iteration step. It should be pointed out that, in two adjacent iteration steps, the tangent stiffness matrices frequently differ from each other on only a small portion of elements. Besides, the difference is frequently local.
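Once the factors are available, the solution itself is cheap. The sketch below is an illustration rather than the authors' code: it applies forward reduction, diagonal scaling, and back substitution to a tiny system whose factors are written down by hand. The function name `solve_ldlt` and the 2-by-2 data are assumptions.

```python
import numpy as np


def solve_ldlt(l, d, r):
    """Solve (L D L^T) u = r for a unit lower triangular L and diagonal d."""
    n = len(d)
    y = np.array(r, dtype=float)
    for i in range(n):                  # forward reduction: L y = r
        y[i] -= l[i, :i] @ y[:i]
    y /= d                              # diagonal scaling: D z = y
    for i in range(n - 1, -1, -1):      # back substitution: L^T u = z
        y[i] -= l[i + 1:, i] @ y[i + 1:]
    return y


# Hand-made factors (assumed example): K = L D L^T = [[4, 2], [2, 4]].
l = np.array([[1.0, 0.0], [0.5, 1.0]])
d = np.array([4.0, 3.0])
k = l @ np.diag(d) @ l.T
u = solve_ldlt(l, d, [6.0, 6.0])
print(u)
```

Each of the three sweeps touches every factor entry at most once, whereas the factorization itself repeatedly updates rows, which is why the factorization dominates the cost.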

Suppose $K = (k_{ij})_{n \times n}$, with the factors $L = (l_{ij})_{n \times n}$ and $D = \operatorname{diag}(d_{1}, \ldots, d_{n})$, which are a lower triangular matrix with unit diagonal and a diagonal matrix, respectively, where $n$ is the number of equations. With regard to memory management, $K$ shares the same memory locations as $L$ and $D$ together, since the unit diagonal of $L$ does not need to occupy any storage space. In practice, we denote both $K$ before factorization and $L$ and $D$ after factorization as $A = (a_{ij})_{n \times n}$. The bulk of the work in the factorization occurs in a triple nested loop around a single statement [20]:

$$a_{ij} \leftarrow a_{ij} - \frac{a_{ik}\,a_{kj}}{a_{kk}}. \tag{5}$$

There are six different forms [17] to implement the factorization, which are the six permutations of the indices $i$, $j$, $k$. Row-sparse factorization [21] and its improvement [8, 18] are the foundations of the present sparse solution implementation, in which the $ikj$ form and its variations are used. The simple form can be expressed in terms of tasks [18, 22]:
(1) RowTask $(k, i)$: reduction of the target $i$th row by a multiple of the $k$th row, $k < i$, that is, elimination of $a_{ik}$;
(2) RowTask $(i)$: division of the off-diagonals of the target $i$th row by its diagonal.

RowTask $(k, i)$ itself involves a $j$-loop from $i$ to $n$. We can symbolically write this procedure as shown in Algorithm 1.

Algorithm 1

do row i = 1, n
    do k = all appropriate rows (k < i)
        RowTask (k, i)
    end do
    RowTask (i)
end do
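The two row tasks can be made concrete with a dense toy version. This is a sketch under simplifying assumptions: the matrix is stored as a full NumPy array holding only the upper triangle, so that after factorization row $i$ contains $d_i$ on the diagonal and the entries of $L^{T}$ to its right, and "all appropriate rows" is taken to mean every earlier row $k$ with a nonzero entry in column $i$. The function names are hypothetical, and a real sparse code would loop over stored nonzeros only.

```python
import numpy as np


def row_task_pair(a, k, i):
    """RowTask(k, i): reduce target row i by a multiple of finished row k."""
    # Row k is already normalised, so a[k, i] holds the factor entry and
    # a[k, k] holds d_k; their product is the multiplier for row k.
    multiplier = a[k, k] * a[k, i]
    a[i, i:] -= multiplier * a[k, i:]


def row_task_single(a, i):
    """RowTask(i): divide the off-diagonals of row i by its diagonal."""
    a[i, i + 1:] /= a[i, i]


def ldlt_rowwise(k_mat):
    """Row-oriented LDL^T factorization in compact upper storage."""
    a = np.triu(np.asarray(k_mat, dtype=float))
    n = a.shape[0]
    for i in range(n):
        for k in range(i):
            if a[k, i] != 0.0:          # only rows that actually touch row i
                row_task_pair(a, k, i)
        row_task_single(a, i)
    return a


k_mat = np.array([[4.0, 2.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]])
a = ldlt_rowwise(k_mat)
d = np.diag(np.diag(a))                 # D from the diagonal
lt = np.triu(a, 1) + np.eye(3)          # unit upper triangular L^T
print(np.allclose(lt.T @ d @ lt, k_mat))
```

The final line rebuilds $L D L^{T}$ from the compact storage and compares it with the input matrix.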

As the stiffness matrix $K$ is sparse, most entries in $K$ as well as in $L$ or $L^{T}$ vanish, and therefore only nonzero entries are actually involved in the calculation. Besides the nonzero entries of $K$, the factor $L$ or $L^{T}$ contains additional nonzero entries generated by factorization. Thus, it is necessary to predetermine symbolically, not numerically, which entries will become nonzero in $L$ or $L^{T}$. This step is called symbolic analysis. After this, it can be analyzed by means of (5) how a variation of entries of the stiffness matrix $K$ will affect the factors $L$ or $L^{T}$ and $D$ and how the effect will propagate in $L$ or $L^{T}$.
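A symbolic analysis can be sketched with sets of column indices alone. The routine below is a standard textbook-style symbolic factorization for symmetric matrices, given here as an assumption about how such a step may look rather than as the authors' implementation: the pattern of each finished row, minus its first off-diagonal column, is merged into the row indexed by that first column.

```python
def symbolic_factorization(upper_pattern):
    """Predict the nonzero pattern of the upper factor without any arithmetic.

    upper_pattern[i] is the set of columns j > i where K[i, j] is nonzero.
    """
    factor = [set(cols) for cols in upper_pattern]
    for k in range(len(factor)):
        if not factor[k]:
            continue
        parent = min(factor[k])                  # first off-diagonal nonzero of row k
        factor[parent] |= factor[k] - {parent}   # row k passes its pattern on
    return factor


# Hypothetical 4-equation pattern: entry (1, 3) is zero in K.
pattern = [{1, 3}, {2}, {3}, set()]
print(symbolic_factorization(pattern))
```

In this example the factor gains the entry (1, 3), a fill-in created when row 0 is eliminated, although the matrix itself has a zero there.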

Since the matrices differ on only a small portion of entries, consider a change of row $i$ of $A$ that is caused either by previous eliminations or directly by a change of entries of $K$. As Algorithm 1 shows, it will cause a change of every row $j$ for which $a_{ij} \neq 0$. Take (6), a sparse matrix with seven rows, for example. A change of row 1 generally causes a change of rows 3 and 7, since row 1 has the nonzero off-diagonal entries $a_{13}$ and $a_{17}$, whose change spreads to rows 3 and 7. A change of row 3 in turn spreads to rows 5 and 7.

In general, change of diagonal entry directly affects value of all nonzero entries of this row in the matrix, and therefore, for convenience, let a row be the smallest unit of change, corresponding to an equation or a general displacement of the structure. Through the previous analysis, the rule of direct effect of change is summarized as follows.

Rule. In factorization, a change of a row in the stiffness matrix directly affects the values of this row in the factorization results $L^{T}$ and $D$, and then affects the rows corresponding to the columns in which this row has nonzero entries.

Consider (6) again. The change of row 1 will affect rows 3 and 7. The effect will propagate in the matrix, as row 3 will affect rows 5 and 7, and row 5 will affect rows 6 and 7. Due to sparseness of stiffness matrix, the number of nonzero entries in each row is finite. As a result, change of one row is able to affect only part of the matrix. In this case, change of row 1 will eventually affect rows 3, 5, 6, and 7, whereas rows 2 and 4 remain unchanged.

In practice, fill-in reduction, which minimizes the number of fill-ins (additional nonzero entries produced by factorization) through equation reordering, is always required before numerical factorization. A typical optimized result delivered by the divide and conquer graph partition algorithm, (7), arranges the equations into seven groups.

In this case, if the stiffness of group 2 is changed, only groups 2, 3, and 7 need to be recalculated; similarly, if the stiffness of group 6 is changed, only groups 6 and 7 are affected.
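The group example can be mimicked with a small dependency tree. The tree below is an assumption: it is one two-level divide and conquer arrangement that is consistent with the two statements above (group 2 affects groups 3 and 7; group 6 affects group 7), with groups 3 and 6 acting as separators of the two halves and group 7 as the top-level separator. The name `PARENT` and the helper function are hypothetical.

```python
# Hypothetical dependency tree of the seven groups: PARENT[g] is the
# separator group that is eliminated after group g and depends on it.
PARENT = {1: 3, 2: 3, 3: 7, 4: 6, 5: 6, 6: 7, 7: None}


def affected_groups(changed):
    """Return every group that must be recalculated after a local change."""
    affected = set()
    for g in changed:
        # Walk towards the root; stop early if the path is already marked.
        while g is not None and g not in affected:
            affected.add(g)
            g = PARENT[g]
    return sorted(affected)


print(affected_groups([2]))
print(affected_groups([6]))
print(affected_groups([1, 5]))
```

Under this assumed tree, a change never crosses from one half of the partition to the other; it only climbs through the separators towards group 7.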

It can now be concluded that if the stiffness matrix changes locally, then, based on fill-in reduction, only part of the values in $L^{T}$ and $D$ need to be recalculated. The cost of computation is predictably reduced, and the precision is unaffected.

4. Algorithm Design for Local Change of Sparse Matrix

The refactorization algorithm relies on a sparse storage scheme, which takes full advantage of the sparseness of the numerical part. In addition, the index can be compressed further, and indirect addressing in data access can be reduced substantially [5]. Row-sparse factorization [21] and its improvement [18] are the foundation of the present sparse solution, which consists of four steps: symbolic assembly, symbolic factorization, numerical assembly, and numerical factorization. Instead of the elimination tree, an updated linked list is used in factorization.

In this refactorization algorithm, the structural topology is unchanged, so symbolic assembly and symbolic factorization inherit the original results. Thus, three steps need to be redesigned.
(a) Modification analysis: determines which rows will be changed in the upper triangular matrix $L^{T}$.
(b) Numerical assembly: reassembles the entries of the matrix in the rows that are changed.
(c) Numerical factorization: modifies rows of the upper triangular matrix and recalculates the values of the corresponding entries.

The detailed steps of this algorithm are shown as Algorithm 2.

Algorithm 2

Step 1. Input the original stiffness matrix $K$, the upper triangular matrix $L^{T}$, and the diagonal matrix $D$;
Step 2. Create an array CHANGE($n$) with the initial value zero, where $n$ is the number of equations. CHANGE($i$) = 0 indicates that row $i$ is not changed, while CHANGE($i$) = 1 indicates that row $i$ is changed;
Step 3. Input the change of each element stiffness matrix and assemble the changes into the original total stiffness matrix. Then let CHANGE($i$) be 1 if row $i$ is changed;
Step 4. According to the Rule, spread the effect of the changes over the whole matrix and mark all the changed rows;
Step 5. Assemble matrix $A$ into the original factor. If CHANGE($i$) = 0, fill in row $i$ with the original factorization result; if CHANGE($i$) = 1, fill in row $i$ with the entries of the new stiffness matrix;
Step 6. Numerically factorize the matrix, recalculating only the entries in changed rows (CHANGE($i$) = 1) according to (5). Only changed rows are added to the elimination tree, which is represented by the linked list in the implementation;
Step 7. The solution procedure after factorization stays the same.

For further illustration steps 4 and 6 are refined as follows in Algorithms 3 and 4.

Algorithm 3

do i = 1, n
    if (CHANGE(i) = 1) then
        do j = i + 1, n
            if (a_ij is nonzero) then CHANGE(j) = 1
        end do
    end if
end do
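The marking pass of step 4 can be tried on the seven-row example of Section 3. In the sketch below, the nonzero columns of rows 1, 3, and 5 follow the text (row 1 reaches rows 3 and 7, row 3 reaches rows 5 and 7, row 5 reaches rows 6 and 7); the patterns assumed for rows 2, 4, and 6 are hypothetical, since the matrix itself is not reproduced here. Indices are 0-based in the code.

```python
def mark_changed_rows(upper_pattern, directly_changed):
    """Spread the change marks in a single ascending pass over the rows.

    upper_pattern[i] is the set of columns j > i with a nonzero factor entry.
    One pass is enough because a row can only affect rows with a larger index.
    """
    n = len(upper_pattern)
    change = [0] * n
    for i in directly_changed:
        change[i] = 1
    for i in range(n):
        if change[i] == 1:
            for j in upper_pattern[i]:
                change[j] = 1
    return change


# Rows 1..7 of the text are stored as indices 0..6.
pattern = [{2, 6}, {3}, {4, 6}, {6}, {5, 6}, {6}, set()]
change = mark_changed_rows(pattern, [0])   # row 1 of the text is changed
print(change)
```

The marked rows are rows 1, 3, 5, 6, and 7 of the text, while rows 2 and 4 keep the value zero and can be taken over from the old factor.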

Algorithm 4

do row i = 1, n
    if (CHANGE(i) = 1) then
        do k = all appropriate rows (k < i)
            RowTask (k, i)
        end do
        RowTask (i)
    end if
end do

In step 6 (Algorithm 4), only parts of factor are recalculated for elimination, and as a result, computation efficiency compared to full refactorization is improved.
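Steps 4 to 6 can be put together in a dense toy version and checked against a full factorization. This is a sketch under stated assumptions, not the authors' sparse implementation: the factor is held in a full NumPy array in compact upper storage, the nonzero pattern is read from the old factor numerically, the sparsity pattern of the matrix is assumed not to change, and the 7-by-7 test matrix with its values is made up for illustration. All function names are hypothetical.

```python
import numpy as np


def factor_rows(a, rows):
    """Run RowTask(k, i) for all contributing k < i, then RowTask(i)."""
    for i in rows:
        for k in range(i):
            if a[k, i] != 0.0:
                a[i, i:] -= a[k, k] * a[k, i] * a[k, i:]
        a[i, i + 1:] /= a[i, i]
    return a


def full_factorization(k_mat):
    a = np.triu(np.asarray(k_mat, dtype=float))
    return factor_rows(a, range(a.shape[0]))


def partial_refactorization(old_factor, k_new, directly_changed):
    n = old_factor.shape[0]
    change = np.zeros(n, dtype=int)
    change[list(directly_changed)] = 1
    for i in range(n):                        # step 4: spread the marks
        if change[i]:
            change[i + 1:][old_factor[i, i + 1:] != 0.0] = 1
    a = old_factor.copy()                     # unchanged rows are inherited
    rows = np.flatnonzero(change)
    for i in rows:                            # step 5: reload changed rows
        a[i, i:] = k_new[i, i:]
    factor_rows(a, rows)                      # step 6: refactorize them only
    return a, change


# Assumed test matrix: diagonal 6, off-diagonals -1 on a sparse pattern.
k_old = 6.0 * np.eye(7)
for i, j in [(0, 2), (0, 6), (1, 3), (2, 4), (2, 6), (3, 6), (4, 5), (4, 6), (5, 6)]:
    k_old[i, j] = k_old[j, i] = -1.0
k_new = k_old.copy()
k_new[0, 0] = 5.0                             # local change in the first row

old_factor = full_factorization(k_old)
partial, change = partial_refactorization(old_factor, k_new, [0])
print(change, np.allclose(partial, full_factorization(k_new)))
```

In this example two of the seven rows are inherited unchanged, and the partial result agrees with a full factorization of the new matrix, which reflects the point that no approximation is involved.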

5. Numerical Examples and Analysis

Two building structures, with 321,210 and 442,331 equations or DOFs (degrees of freedom), are selected to show the efficiency of the proposed algorithm; up to 5000 DOFs are changed. Both numerical examples are tested on a Windows 7 system with an Intel Xeon CPU E5-2620 (2.00 GHz; only one core used) and 32 GB of memory. The code has not yet been integrated into a nonlinear solver, and therefore only the refactorization part is presented. Moreover, only the computational effort and elapsed time of factorization are discussed, since the algorithm proposed in this paper involves no approximation and the solution is as accurate as a full recalculation.

Example 1. A multitower building, as shown in Figure 1(a), is calculated. The number of DOFs or equations is 321,210; the number of nonzero entries in stiffness matrix is 17,169,579, and that in factor is 62,281,728.

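Figure 1: Building models used in the numerical examples: (a) the multitower building of Example 1; (b) the two-tower building of Example 2.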

Suppose that a nonlinear phenomenon occurs between two adjacent iteration steps; for example, part of the steel structure reaches the plastic phase. For simplicity, in each test the element stiffness of one group is changed to 90% of its standard value. A group here normally consists of a certain number (from 1 to 2000) of well-connected elements on one floor, which represents a local change.

The performance of the proposed algorithm is shown in Table 1. The first row of data shows the result of full refactorization, a brute-force method of factorizing the new matrix, which is introduced as a reference for comparison. The remaining rows show the results of partial refactorization using the proposed algorithm. For each case (change of a certain number of elements) about 30 samples are selected randomly. For statistical purposes, the minimum, maximum, and average of the affected DOFs, MFLOP (million floating-point operations), and computing time are given. The percentage next to each average value is the ratio between this item and the corresponding value of full refactorization.

The numerical results indicate that a local change, up to 2% of the DOFs in this example, spreads only to a somewhat larger percentage of DOFs, up to 10% here. The elapsed time or computational effort is usually less than half of that of a full recalculation, which demonstrates that this method is more efficient than full recalculation. The ratio of the MFLOP of this algorithm to that of full recalculation is always much larger than the ratio of affected DOFs to total DOFs, because the reduction of some DOFs or equations involves a large number of other DOFs, both changed and unchanged. The results also show that the MFLOP ratio agrees with the elapsed time ratio.

Comparing the cases with different numbers of changed elements, it is concluded that the computational effort of refactorization depends only slightly on the number of modified or affected DOFs and that the relation is roughly monotonic but not necessarily linear. In this numerical example, when fewer than 200 elements are changed, the computational effort of refactorization is almost the same in all cases, about 15%, and even when 500 or more elements are changed, the computational effort does not increase much. In the worst case, when all DOFs are changed, the algorithm degrades to full refactorization.

Example 2. A two-tower building, as shown in Figure 1(b), is calculated. The number of DOFs or equations is 442,331; the number of nonzero entries in stiffness matrix is 27,160,169 and that in factor is 152,414,429.
Like Example 1, this structure is changed by a certain number (from 1 to 1000) of well-connected elements. The results are shown in Table 2. For each case about 100 samples are selected randomly.

The results demonstrate again that the proposed algorithm is efficient and factorization time decreases. Furthermore, in each case of these two examples, maximum (affected DOFs, MFLOP, and computing time) is usually less than twice the average, which indicates that the algorithm is robust.

Summarizing the results of numerical tests yields qualitatively the relationship between computational effort ratio and ratio of changed DOFs in Figure 2. It shows clearly that the algorithm proposed in this paper is efficient most of the time.

6. Conclusions

In this paper, a partial refactorization algorithm based on row-sparse solution and fill-in reduction is proposed for nonlinear finite element analysis, especially materially nonlinear problems and optimization problems. Instead of the full refactorization of traditional nonlinear analysis, the proposed procedure finds the changed factor of the tangent matrix at much lower cost.

It is concluded that this algorithm can significantly improve the efficiency compared with full refactorization. There are several advantages as follows.
(a) The algorithm does not affect the precision of the final results, since it involves no approximation and merely skips repetitive computation.
(b) A large number of elements or DOFs can be changed.
(c) The amplitude of change is not required to be small, as it usually is in approximate approaches.
(d) Only one additional array, CHANGE, is required to perform the computation.
(e) The implementation is simple, provided that one understands the row-sparse solution procedure.

We expect the proposed algorithm could be integrated in nonlinear analysis, especially Newton-Raphson method. It can significantly reduce the refactorization time and make Newton-Raphson method more efficient. In other words, it opens a new possibility for faster nonlinear analysis. The proposed algorithm can also be applied to many engineering problems, in which a series of linear systems of equations with step-by-step local change has to be solved, including structural optimization, progressive collapse analysis, and analysis of seepage.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 10972005) and National Basic Research Program of China (2010CB731503). The authors thank Mr. XL Wang from YJK Building Software Co. who provided the data of examples.

References

[1] P. Wriggers, Nonlinear Finite Element Methods, Springer, Berlin, Germany, 2008.
[2] J. Bonet and R. D. Wood, Nonlinear Continuum Mechanics for Finite Element Analysis, Cambridge University Press, Cambridge, UK, 1997.
[3] L. Lam and D. G. Fredlund, “Saturated-unsaturated transient finite element seepage model for geotechnical engineering,” in Finite Elements in Water Resources, pp. 113–122, Springer, Berlin, Germany, 1984.
[4] H. Zhou, S. Wu, and P. Chen, “Advances in direct solution technique of finite element analysis,” Advances in Mechanics, vol. 37, no. 2, pp. 175–188, 2007 (Chinese).
[5] E. L. Wilson, K. Bathe, and W. P. Doherty, “Direct solution of large systems of linear equations,” Computers and Structures, vol. 4, no. 2, pp. 363–372, 1974.
[6] B. M. Irons, “A frontal solution program for finite element analysis,” International Journal for Numerical Methods in Engineering, vol. 2, pp. 5–32, 1970.
[7] P. Chen, D. Zheng, S. Sun, and M. Yuan, “High performance sparse static solver in finite element analyses with loop-unrolling,” Advances in Engineering Software, vol. 34, no. 4, pp. 203–215, 2003.
[8] D. T. Nguyen, J. Qin, T. Y. Chang et al., “Efficient sparse equation solver with unrolling strategies for computational mechanics,” in Proceedings of the 4th International Conference on Concurrent Enterprising (ICES '97), pp. 676–681, San Jose, Costa Rica, August 1997.
[9] O. Schenk, K. Gärtner, W. Fichtner, and A. Stricker, “PARDISO: a high-performance serial and parallel sparse linear solver in semiconductor device simulation,” Future Generation Computer Systems, vol. 18, no. 1, pp. 69–78, 2001.
[10] S. Balay, W. D. Gropp, L. C. McInnes, and B. F. Smith, “Efficient management of parallelism in object oriented numerical software libraries,” in Modern Software Tools in Scientific Computing, pp. 163–202, Birkhauser Press, Boston, Mass, USA, 1997.
[11] I. S. Duff and J. K. Reid, “MA47, a Fortran code for direct solution of indefinite sparse symmetric linear systems,” Tech. Rep. RAL 95-001, Rutherford Appleton Laboratory, Oxford, UK, 1995.
[12] I. S. Duff and J. K. Reid, “The multifrontal solution of indefinite sparse symmetric linear equations,” ACM Transactions on Mathematical Software, vol. 9, no. 3, pp. 302–325, 1983.
[13] J. W. H. Liu, “The multifrontal method for sparse matrix solution: theory and practice,” SIAM Review, vol. 34, no. 1, pp. 82–109, 1992.
[14] B. Hendrickson and R. Leland, “A multilevel algorithm for partitioning graphs,” Tech. Rep. SAND93-1301, Sandia National Laboratories, Albuquerque, NM, USA, 1993.
[15] A. Gupta, “Fast and effective algorithms for graph partitioning and sparse matrix ordering,” Tech. Rep. RC, 20496 (90799), IBM T. J. Watson Research Center, Yorktown Heights, NY, USA, 1996.
[16] S. Wu, Algebraic multigrid analysis and its application in finite element method [M.S. thesis], Peking University, Beijing, China, 2007 (Chinese).
[17] J. M. Ortega, “The ijk forms of factorization methods. I. Vector computers,” Parallel Computing, vol. 7, no. 2, pp. 135–147, 1988.
[18] P. Chen and S. Sun, “A new high performance sparse static solver in finite element analysis with loop-unrolling,” Acta Mechanica Solida Sinica, vol. 18, no. 3, pp. 248–255, 2005.
[19] G. Karypis and V. Kumar, Metis: A Software Package for Partitioning Unstructured Graphs, Partitioning Meshes, and Computing Fill-Reducing Ordering of Sparse Matrices Version 4.0, University of Minnesota, Department of Computer Science, Minneapolis, Minn, USA, 1998.
[20] G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, Md, USA, 3rd edition, 1996.
[21] S. Pissanetzky, Sparse Matrix Technology, Academic Press, London, UK, 1984.
[22] D. Zheng and T. Y. P. Chang, “Parallel Cholesky method on MIMD with shared memory,” Computers and Structures, vol. 56, no. 1, pp. 25–38, 1995.

Copyright

Copyright © 2013 Qi Song et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
