Mathematical Problems in Engineering

Volume 2017 (2017), Article ID 2580820, 11 pages

https://doi.org/10.1155/2017/2580820

## Parallel Multiprojection Preconditioned Methods Based on Subspace Compression

Department of Electrical and Computer Engineering, School of Engineering, Democritus University of Thrace, University Campus, Kimmeria, 67100 Xanthi, Greece

Correspondence should be addressed to Christos K. Filelis-Papadopoulos; rg.htud.ee@dapapc

Received 1 February 2017; Revised 5 April 2017; Accepted 11 April 2017; Published 30 July 2017

Academic Editor: Damijan Markovic

Copyright © 2017 Byron E. Moutafis et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

During the last decades, the continuous expansion of supercomputing infrastructures necessitates the design of scalable and robust parallel numerical methods for solving large sparse linear systems. A new approach for the additive projection parallel preconditioned iterative method based on semiaggregation and a subspace compression technique, for general sparse linear systems, is presented. The subspace compression technique utilizes a subdomain adjacency matrix and breadth first search to discover and aggregate subdomains to limit the average size of the local linear systems, resulting in reduced memory requirements. The depth of aggregation is controlled by a user defined parameter. The local coefficient matrices use the aggregates computed during the formation of the subdomain adjacency matrix in order to avoid recomputation and improve performance. Moreover, the rows and columns corresponding to the newly formed aggregates are ordered last to further reduce fill-in during the factorization of the local coefficient matrices. Furthermore, the method is based on nonoverlapping domain decomposition in conjunction with algebraic graph partitioning techniques for separating the subdomains. Finally, the applicability and implementation issues are discussed and numerical results along with comparative results are presented.

#### 1. Introduction

Let us consider a sparse linear system of the following form:where is the coefficient matrix, is the right-hand-side vector, and is the solution vector. In order to solve the linear system, in (1), a direct or an iterative method can be used. However, a direct method is computationally expensive with excessive memory requirements. Krylov subspace iterative methods, compare [1], have gained the attention of the scientific community during the recent decades, due to their efficiency and memory requirements for solving large sparse linear systems. Preconditioned iterative methods improve the convergence rate and have been used extensively for solving large sparse linear systems. The left preconditioned form of the linear system, in (1), is as follows:where is a suitable preconditioning matrix. Moreover, there is the right and the symmetric preconditioned form, compare [1]. The preconditioner has to satisfy the following conditions: (i) should have a “clustered” spectrum, (ii) can be efficiently computed in parallel, and (iii) the product of “ by vector” should be fast to compute in parallel, compare [1, 2]. The domain decomposition method has been extensively used for solving (very) large sparse linear systems in parallel systems, compare [3–8]. Furthermore, domain decomposition-type methods have been combined with Krylov subspace methods or have been used in hybrid schemes in conjunction with direct solvers, compare [6, 9–11]. During the last decade, research efforts were directed towards solvers for general sparse linear systems that have improved convergence behavior and are scalable.

Various overlapping domain decomposition methods based on Lagrange multipliers have been proposed. The Finite Element Tearing and Interconnect (FETI), compare [12], and its variants, compare [13, 14], are scalable and efficient solvers for second-order and fourth-order partial differential equation problems. Related methods are the Balancing Domain Decomposition (BDD), compare [15], and the Balancing Domain Decomposition by constraints (BDDC), compare [16], which are multiplicative and additive domain decomposition methods, respectively. The aforementioned methods are suitable for solving linear systems derived from the finite element method. Additionally, there are domain decomposition-type solvers using graph partitioning algorithms. These algebraic domain decomposition methods do not require any geometric property of the problem as the partitioning is based on the corresponding graph of the coefficient matrix. Parallel hybrid algebraic domain decomposition-type solvers that combine direct and iterative methods have been proposed, compare [17, 18]. Additionally, algebraic domain decomposition method in conjunction with algebraic multigrid method, compare [19–22], has satisfactory convergence behavior and is suitable for parallel systems. An algebraic domain decomposition method, based on semiaggregation and subspace compression techniques, namely, multiprojection method with subspace compression (MPMSC), for solving large sparse linear systems is proposed. Several efficient software libraries for graph partitioning, such as Metis, compare [23], or Scotch, compare [24], can be used for algebraic partitioning. The aggregation techniques have been previously used in the context of algebraic multigrid methods and have been shown to be efficient for a wide variety of model problems, compare [25–27]. In the MPMSC scheme, a direct method is used for solving the local linear systems derived from the semiaggregated domains. The semiaggregation procedure results in multiple grids composed of a fine part, corresponding to the unknowns of a subdomain, and aggregated part, corresponding to unknowns related to other subdomains. The proposed scheme utilizes approximations for the fine part of each subdomain, obtained with respect to its aggregated components, as opposed to aggregation based algebraic multigrid methods, which obtain approximations for the solution of the entire domain, through smoothing, based on a hierarchy of coarser grids. Moreover, the proposed scheme does not require the computation of a hierarchy of coarser grids. The subspace compression technique is an algorithmic procedure, based on breadth first search with limited depth on the adjacency matrix of the subdomains. By this technique the neighboring subdomains are discovered and aggregated together to an aggregated (coarse) component. The aggregation process reduces the number of aggregated (coarse) unknowns of the local linear systems, resulting in a substantial reduction in memory requirements. The computation of the rows and columns corresponding to the aggregates, during the formation of the local coefficient matrices as well as the subdomain adjacency matrix, is reused to avoid recomputation. Avoiding the recomputation enhances the performance of the proposed scheme. Moreover, the rows and columns corresponding to the newly formed aggregates have a large number of nonzero elements and thus are ordered last to further reduce fill-in during the factorization of the local coefficient matrices. Further, the number of aggregates of the multiprojection method with subspace compression is controlled by a parameter that describes the level sets of aggregates to be compressed in a single aggregate.

In Section 2, the multiprojection method with subspace compression is presented, implementation details are given, and the aggregation strategy is discussed. In Section 3, numerical results illustrating the applicability and efficiency of the MPMSC scheme are presented. Moreover, comparative results for the performance and convergence behavior are given.

#### 2. Multiprojection Method with Subspace Compression

The multiprojection method with subspace compression is a domain decomposition-type method for solving large sparse linear systems (see (1)). The graph, corresponding to the coefficient matrix , has vertices, the unknowns of the linear system, in (1), and edges, the nonzero elements of the coefficient matrix . Since is expressed as a graph, the algebraic properties of the problem can be utilized in order to partition the graph into nonoverlapping subgraphs (subdomains). The graph partitioning process is performed using the algorithm provided in the Metis software library, compare [23]. Consider the domain partitioned into nonoverlapping subdomains with , so thatThe number of vertices, in each subdomain , is denoted by and is the index set defined by the following expression:where are the indices corresponding to the vertices of each subdomain.

##### 2.1. Subspace Compression

Let us consider the subspace , of the domain , which contains aggregated (coarse) components, corresponding to the subdomains , and the prolongation matrix from onto , , which is denoted according to the following expression:The projection of to subspace is the matrix of dimension () and is given by the following equation:The graph , corresponding to the matrix , concerns the relations between the aggregated (coarse) components. The subspace compression technique is based on the reaggregation of the (number of subdomains) aggregated (coarse) components into (number of aggregates). The number of aggregates is not known a priori and thus is a result of the subspace compression algorithm. Breadth first search with limited depth (BFS-LD) is performed on to find the distance neighborhood. The components in a distance neighborhood of are reaggregated together. Then the neighborhood’s vertices are removed from the set and the BFS-LD is performed again on the new graph. When every vertex is removed from the set , the procedure is terminated. The order in which the reaggregation process is performed may result in a slight variation of the number of aggregates; however this does not affect the efficiency or the convergence behavior of the proposed scheme significantly. The order of the vertices used in the reaggregation scheme is lexicographical with respect to the enumeration of the subdomains by the graph partitioning scheme. Other approaches can be used for the reaggregation process, compare [26, 27]; however the computational work of these schemes will affect negatively the performance of the preprocessing phase, especially for large numbers of subdomains. The subspace compression algorithm is as follows:(1), .(2)** For each ** or** until ** , where is the -distance neighborhood of component and is the index of the aggregated (coarse) components **End For**The parameter of the neighborhood specifies the maximum distance between the vertex and any other vertex in the neighborhood. The value of is selected with respect to the adjacency matrix , considering that the larger the value of , the less the number of aggregates. Distance one neighborhood is chosen by default; however for parallel systems with limited memory resources, distance two or distance three neighborhood should be chosen to reduce the memory requirements of the proposed preconditioning scheme. An increase in the value of might affect negatively the convergence behavior of the proposed scheme. The computational complexity for finding the aggregates depends on the number of subdomains and it is a searching procedure on the graph with negligible execution time compared to the whole initialization phase of the preconditioning scheme.

##### 2.2. Semiaggregation

Let us consider the semiaggregated subdomains , containing the fine components of subdomain and aggregated components. Furthermore, the semiaggregated subdomains are associated with their respective prolongation matrices , with . The matrices are restriction operators mapping an arbitrary full vector of the domain into a vector of the semiaggregated subdomain with size (). The first columns of the matrix are the columns of the identity matrix with dimension (), corresponding the indices . The remaining () column vectors of correspond to the aggregates. The components of the column vector that are included in the corresponding aggregated subdomain have value and the rest of the components are zero, where , with and .

Hence, the matrix is defined as follows:where is the column vectors corresponding to the aggregates as they are described above. Moreover, additional prolongation matrices are required, namely, . The prolongation matrices of dimension () are used to map a vector defined on the subdomain to a vector defined on the domain by prolonging only the components corresponding to the subdomain and discarding the aggregated (coarse) components. The th column of the matrix is given as follows:The linear systems corresponding to each subdomain are as follows:where is a coefficient matrix of order ().

The proposed method is used as a preconditioner to a Krylov subspace iterative method, such as preconditioned GMRES(), compare [28, 29]. The product of the preconditioning matrix by a vector can be described by the following compact algorithmic scheme: **For ** **End For**Considering that is the exact solution and that , the error at th iteration can be written as follows:or equivalentlywhere is an orthogonal projector with respect to the Euclidean inner product. Considering that if is symmetric positive definite, is an orthogonal projector with respect to the -norm inner product, compare [8, 30]. is a product of the two orthogonal projections, which are orthogonal to different inner products; thus the projection is not orthogonal, compare [31]. Taking into consideration the above, the method is categorized into the class of oblique projection procedures, compare [1].

The preconditioned matrix using the MPMSC scheme is given as follows:which is a sum of projections.

Let us consider the unit square domain , which is discretized with mesh size as depicted in Figure 1. The resulting grid is partitioned into 16 subdomains, containing 4 vertices each. The corresponding matrix is projected onto the subdomains , in (10), and the grid corresponding to the projected subdomain is presented schematically in Figure 2.