Research Article | Open Access
Norapon Sukuntee, Saifon Chaturantabut, "Parametric Nonlinear Model Reduction Using -Means Clustering for Miscible Flow Simulation", Journal of Applied Mathematics, vol. 2020, Article ID 3904606, 12 pages, 2020. https://doi.org/10.1155/2020/3904606
Parametric Nonlinear Model Reduction Using -Means Clustering for Miscible Flow Simulation
This work considers the model order reduction approach for parametrized viscous fingering in a horizontal flow through a 2D porous media domain. A technique for constructing an optimal low-dimensional basis for a multidimensional parameter domain is introduced by combining -means clustering with proper orthogonal decomposition (POD). In particular, we first randomly generate parameter vectors in multidimensional parameter domain of interest. Next, we perform the -means clustering algorithm on these parameter vectors to find the centroids. POD basis is then generated from the solutions of the parametrized systems corresponding to these parameter centroids. The resulting POD basis is then used with Galerkin projection to construct reduced-order systems for various parameter vectors in the given domain together with applying the discrete empirical interpolation method (DEIM) to further reduce the computational complexity in nonlinear terms of the miscible flow model. The numerical results with varying different parameters are demonstrated to be efficient in decreasing simulation time while maintaining accuracy compared to the full-order model for various parameter values.
Viscous fingering is the formation of patterns in a morphologically unstable interface between two fluids in a porous medium. It generally refers to the onset and evolution of instabilities that occur in the displacement of fluids . The mathematical description of viscous fingering can be given in terms of several partial differential equations. Many numerical methods are used to find the approximation of pressure, velocity, concentration, or temperature. In , the coupled set of partial differential equations is approximated by the semidiscrete SUPG stabilized finite element formulation plus a discontinuity capturing technique to improve stability around the moving sharp fronts. The pressure equation is discretized by the standard Galerkin method, and a postprocessing scheme is used to improve the numerical evaluation of Darcy’s velocity. Branston et al. developed a finite volume numerical simulator with a level of certainty where fingering perturbations in stochastic fields do not result into severe channeling as discussed in . The simulator development involves discretization of convection–diffusion and pressure equations with total variation diminishing and central differencing schemes, respectively. Correlated stochastic fields for simulation runs were created using convolution with a Gaussian filter achieved through 2D fast and inverse Fourier transform, Dykstra–Parsons coefficient, random porosity generator, and autocorrelation lengths. In , the Oldroyd model has been studied using a generalized incompressible Navier-Stokes equation via the interpolating stabilized element free Galerkin technique. In this work, the Crank-Nicolson method is first utilized to discrete the temporal direction, and then, the meshless element free Galerkin technique is employed to approximate the spatial direction. The equation contains the integral term which is approximated by using a difference scheme. Although the numerical methods discussed earlier are effective and accurate, the resulting simulations generally require enormous memory storage and consume excessively computational time. In order to manage this problem, model order reduction techniques have been proposed in many previous works for various applications.
Proper orthogonal decomposition (POD) is a commonly used technique for model order reduction. POD is used to generate the optimal basis that can describe the dominant features of a dynamical system. This basis is utilized with the Galerkin projection to construct a reduced-order model. As in , a surrogate model based on the proper orthogonal decomposition is developed in order to enable fast and reliable evaluations of aerodynamic fields. In 2017, Chang et al. introduced a reduced-order model for wall shear stress in abdominal aortic aneurysms by proper orthogonal decomposition . A more interesting detail on POD can be found in [7–9].
Using POD still has a limitation when the model has complicated nonlinear property. POD cannot handle the complexity issue arising in computing nonlinear terms. A powerful technique called the discrete empirical interpolation method (DEIM) is used here to overcome this problem. In particular, the DEIM is employed to construct selected interpolation indices, which will allow much less expensive evaluation for the nonlinear terms. The DEIM can be used to combine with POD to improve the efficiency of the nonlinear reduced-order model. In , the POD-DEIM approach has been applied to the groundwater equation and the time-dependent incompressible Navier-Strokes equation with variable density based on the meshfree RBF-FD technique. In , POD-DEIM is applied to a dynamic 2D catalytic reactor model. There are also many other applications of POD-DEIM on various problems, which can be found, for example, in [12–14].
In this paper, we apply the POD-DEIM technique on a finite difference discretization of a nonlinear miscible flow with viscous fingering in porous media. This model contains many parameters which can be changed in many different situations. Our additional effort is to construct a reduced-order model, which can be used to approximate the solution with any given parameter in the domain of interest. This issue in model reduction is crucial for various parametrized dynamical systems. There are a number of works that have been done for parametric model order reduction (PMOR). In , the PMOR method for laminated composite structures is presented. The parametrization is carried out for the angle between the layers of structure and the thickness of the structure. In , this work proposed the PMOR scheme for second-order systems that does not require a priori sampling of the parameter space. For the flow model, PMOR for probabilistic analysis of unsteady aerodynamic applications is discussed in . Also in , the PMOR approach is applied to the fluid flow governed by the Navier-Stokes equations at varying Reynolds number. In this work, we use -means clustering to efficiently construct a reduced system for a parametrized miscible flow model.
This paper is arranged as follows. We first formulate the model equations in Section 2. In Section 3, a general setup for constructing POD-DEIM reduced systems is discussed. Section 4 introduces a parametrized nonlinear model reduction using -means with the POD-DEIM approach for the miscible flow model. In Section 5, the numerical experiments are used to illustrate the efficiency of the proposed approach. We finally conclude the results and discuss some future research directions in Section 6.
2. Problem Formulation
We consider a viscous fingering in a horizontal flow through a 2D porous media domain  with a constant permeability given by where is the velocity with components in and coordinates, is the global pressure, is the concentration of the injected fluid, is the temperature, is the viscosity depending on and , is the reaction rate, and , , , , and are assumed to be constant. To solve (1)–(4), a general procedure nondimensionalizes the system which will be converted into the equations in terms of stream function and vorticity as follows: where and are constant parameters determining the effects of concentration and temperature to the viscosity, Da (Damköhler number) and Le (Lewis number) are dimensionless parameters, and . These equations (5)–(8) are defined on the spatial domain , where Pe (Péclet number) is a dimensionless constant aspect ratio and time domain . For this system, the boundary conditions are along top and bottom sides. The boundary conditions are along the remaining two sides. The initial conditions are where is the interface location. Let and be equally spaced grid points on -axis and -axis with and , respectively. Define , , , and as a vector of dimension containing approximate solutions , , , and , respectively, for and . The central finite difference is applied to construct the spatial discretization of (5)–(8), and it becomes a system of nonlinear ODEs in the form of where are parametrized constant matrices; is a parametrized constant vector; is a nonlinear function with is the vector of state variables, and is the vector of parameters. Note that are constant coefficient matrices, is the zero matrix, is the identity matrix, , are vectors indicating the boundary conditions, and are nonlinear functions that maps some state variables to . We strongly recommend that the reader should read more details in the derivation of (13) in Appendix A. The solution will generally depend on the parameter vector , and therefore, we will also denote as . More details on discretization and numerical schemes can be found in [19, 20].
3. Reduced-Order Modeling by the POD-DEIM Approach
This section provides a brief overview of the general procedure for constructing a reduced-order model of a differential equation using proper orthogonal decomposition (POD) and a discrete empirical interpolation method (DEIM).
3.1. Subspace from Proper Orthogonal Decomposition
Consider a matrix of snapshots where , . Suppose we want to approximate a snapshot by using a set of orthonormal vectors , which has rank . Then, the approximation is in the form where is a matrix of basis vectors and is the vector of unknown coefficients. To find , we use the fact that has orthonormal columns, i.e., , and the minimum error occurs when the residual is orthogonal to the column span of , i.e., , which implies that and the approximation becomes
Proper orthogonal decomposition (POD) provides an orthonormal basis that minimizes this approximation error in 2-norm for a given basis rank ; i.e., POD basis is the optimal solution to the minimization problem where is the identity matrix. It can be shown  that the POD basis defined above can be obtained from the left singular vector of the snapshot matrix . Let be the singular value decomposition of , where matrices and are matrices with orthonormal columns and is a diagonal matrix with . Then, the POD basis matrix of dimension is , . Moreover, this minimum error is given by  the sum of neglected singular values squared, i.e.,
As a result, the decay of singular values can be used to specify the dimension of the subspace of the POD basis used in the approximation. In most existing applications, POD can provide a basis that explains the overall dynamics of a given snapshot set and we can use reduced dimension that is much less than the original dimension , while maintaining high accuracy in the approximation.
3.2. Discrete Empirical Interpolation Method (DEIM)
In the context of nonlinear model reduction for dynamical systems, the DEIM  is generally used for reducing the complexity for evaluating the projected nonlinear term in the POD-Galerkin reduced system. To illustrate this issue, we consider the projected nonlinear function of defined by
For convenience, the notation will be simplified by leaving the dependence on time parameter . Notice that, for a given , the computational complexity for computing the above nonlinear term depends on the dimension when multiplying with . This can be avoided by using the DEIM approximation of the form where the matrix is the POD basis matrix of rank of the nonlinear snapshot matrix , for with coming from the available snapshot in ( from POD approximation), , and the matrix is constructed from the interpolation indices , which are generated by the DEIM algorithm shown in Algorithm 1. In particular, where is the column of the identity matrix for all for . Note that the complexity for evaluating (22) can depend only on the reduced-order dimension and because the term can be precomputed for any given input vector and the term can be considered as selecting components corresponding to the indices .
From Algorithm 1, the procedure initializes the first interpolation index by using the first input basis entry which has the largest magnitude. The rest of the indices for are selected from the residual with the largest magnitude. More details on this procedure as well as the proof for invertibility of can be found in .
3.3. The POD-DEIM Reduced System
Consider an initial value problem of the form where , , and are constant matrices; is the nonlinear function; and is the state variable. The POD reduced system can be obtained by applying the Galerkin projection with the POD basis of dimension (see Section 3.1), and this system can be written in the form where , , and are constant matrices; is the projected nonlinear function; and is the state variable that can be used to approximate the original state variable by . Finally, after approximating the nonlinear term as shown in Section 3.2, we obtain the POD-DEIM reduced system in the following form: where and as discussed in (22). The next section extends the previous setup for reduced-order modeling for the parametrized miscible flow model.
4. Model Reduction for the Parametrized Miscible Flow Model by Using the POD-DEIM Approach with -Means Clustering
This section applies -means clustering with the POD-DEIM reduced-order modeling described in Section 3 on the parametrized miscible flow model given in Section 2. In this work, the numbers of spatial grid points are and , i.e., the dimension of the full-order model is . The computation is terminated at the time with time stepsize . Moreover, we are interested in an exothermic reaction, which implies that . The parameters , , Da, and Pe are fixed to be 2, 0.1, 0.01, and 250, respectively. Furthermore, the random noise between 0 and 1 is added to the initial condition for concentration at the collocation points ( for all ) to generate the instability which employs the viscous fingering.
4.1. -Means Clustering
Let and be the closed intervals for the values of and Le, respectively. Two parameters are generated randomly in for 1000 samples and clustered into centroids using the -means clustering algorithm as shown in Figure 1.
The -means clustering algorithm for generating centroids is shown in Algorithm 2 as follows. From Algorithm 2, the requirements are number of desire cluster , number of iteration iter, and the data set which is the set RcLe that contains 1000 training samples of parameters in this work. The procedure starts with first randomly picking initial centroids, . Then, every point in training set measured the distance between each centroid. The points that have a minimum distance to centroid are gathered together in the cluster. The new centroids will be updated by averaging the whole samples in each cluster. This procedure continues and terminates at the . As discussed previously, we use in this work.
4.2. The POD-DEIM Parametrized Reduced Systems
The snapshots for constructing the POD basis are obtained from the full-order models corresponding to all selected centroids from the -means clustering algorithm. Let be the numerical solution of at time from the discretized system of (26) with , which is the th parameter centroid from -means clustering, i.e., we obtain the snapshots from solving the following 20 full-order systems:
The snapshots , , and are the numerical solutions of (26) that approximate , , , and , respectively, for and .
Let be the snapshot matrix for concentration. In our numerical experiment, and . To obtain the POD basis, the singular value decomposition (SVD) of is computed, i.e., . Then, the POD basis of dimension is the first columns of the left singular matrix of denoted by . Similarly, let , , and be the POD basis of dimension for temperature, stream function, and vorticity, respectively. The POD reduced system is constructed by applying the Galerkin projection on (26) with the basis . The state variable is approximated by , where is the state variable of the POD reduced-order system, i.e., , , , and . The resulting POD reduced system is in the form of where and are constant matrices for a given parameter vector , and is the projected nonlinear term. As described in Section 3, we can obtain a POD-DEIM reduced system from the POD reduced system (27) as shown below: where is the nonlinear term obtained from using the DEIM approximation for each nonlinear term in in (27). Note that is not directly constructed as explained in Section 3, since it consists of many nonlinear functions and given in (13) that depends on different state variables. Therefore, we separately apply this approximation to each nonlinear term. For example, let us first construct the nonlinear snapshot matrix for , where . The DEIM is then applied on the nonlinear term . Other nonlinear terms are done in the same manner. Details on the construction of each nonlinear term by the DEIM can be found in Appendix A which follows from . Finally, the construction of is just the combination of the DEIM results in each nonlinear term. The numerical simulations based on the techniques described in this section are illustrated and analyzed in Section 5.
5. Numerical Experiments
In this section, there are two main numerical tests that consider accuracy and efficiency in terms of reducing the simulation time of the model reduction framework presented in this paper. In Section 5.1, the numerical solutions of the proposed method and the original full-order system are compared for different parameter values that are not in the training set. In Section 5.2, the overall computational time reduction and average errors are considered for the solutions from the proposed method when using various randomly selected parameter values in the interested domain. The relevant MATLAB code contributed in this work is available at https://bit.ly/3airQ2u.
In this work, the number of spatial grid points is and , i.e., the dimension of the full-order model is . The computation is terminated at the time with (1500 time steps). The system temperature is set to be constant (isothermal case), i.e., it is equivalent to set .
To investigate the possibility of reducing the dimension, we first consider the singular values of both solution snapshots and nonlinear snapshots as depicted in Figure 2. Note that this work illustrates only the approximations of concentration. Other state variables have similar accuracy results since they are all coupled in the differential equations.
5.1. Numerical Test I
In this section, the POD and POD-DEIM reduced-order model are compared to the full-order model for different given parameters . To demonstrate the efficiency of the reduced-order model, the error at a certain time and the global error and its average are defined in terms of the Frobenius norm as follows: where and are the concentrations obtained from the full-order and reduced-order models, respectively, at the time step or the time , and is the number of snapshots. In this experiment, . The approximation can be the solution from the POD or POD-DEIM reduced-order model. Some cases of parameters are chosen arbitrarily, and the corresponding numerical results are shown below. In particular, we will consider the cases of , and , in Sections 5.1.1, 5.1.2, and 5.1.3, respectively.
In Sections 5.1.1 and 5.1.2, the parameters are chosen from different corners of our interested 2-dimensional parameter domain, i.e, and . These parameters are not in the set of training parameters. As shown in Figures 3 and 4, the resulting POD-DEIM approximation with dimension of concentration for both cases is compared to the full-order model at the time () and the final time (). Since these solutions are visually indistinguishable, the absolute errors at all grid points are also provided in Figures 3 and 4. These errors are shown to be in the order of or . More details on the error at specific time , , and average global error are given in Tables 1 and 2, respectively, for Sections 5.1.1 and 5.1.2. As expected, the increasing number of POD basis provides more accuracy as the error seems to be decreasing. However, this can increase the simulation time as shown in Table 3, which will be discussed in more detail later in Section 5.2.
Similarly, in Section 5.1.3, all results are demonstrated in the same manner as seen in Figure 5 and Table 4. The difference is the parameters that are chosen inside the parameter domain, i.e., . Table 4 shows that the error at a specific time and the average global error are smaller compared to the previous cases.
5.2. Numerical Test II
In this numerical test, we randomly generate 10 distinct parameters in our interested parameter domain . The solutions are obtained from solving the reduced-order model corresponding to those parameters. CPU time and global error are recorded and averaged as shown in Table 3.
From Table 3, applying the POD method can make the simulation faster, but it is only approximately twice or three times speedup due to the inefficiency in computing the nonlinear term. After the DEIM is applied, the average CPU time speedups by roughly the order of at equivalent accuracy. In particular, the POD method with uses approximately 1.3 hours (49,2821 seconds), while the POD-DEIM method with , uses roughly only 1.3 seconds. In this case, the average error for both POD and POD-DEIM approach is approximately the same. As shown in the previous numerical test, an increasing dimension can improve the accuracy with a trade-off in computational time.
The results in this section suggest that the POD-DEIM reduced-order model with the basis constructed from the -means approach is reliable in approximating the solution for any given parameter in the interested parameter domain , even though they are not in the training set.
This work provides a model order reduction technique for simulating the parametrized viscous fingering model in a horizontal flow through a porous media domain by using several techniques. The -means clustering algorithm is first used to cluster parameters equipped in the model. The snapshots are collected from solving the solution from each centroid. Then, the POD is applied in conjunction with the DEIM to construct a reduced-order model. The resulting reduced-order model can be calculated entirely in the reduced dimension without dependence on the original dimension for various parameter values that are not used in the training set obtained from the -means algorithm. The approximation admits faster evaluation while trading a small amount of accuracy. The error estimates of the proposed approach for this nonlinear miscible flow model can be obtained by incorporating the accuracy of -means clustering with the error analysis from  and from  for, respectively, a posteriori error and a priori error estimates. These theoretical results are beyond the scope of this work and left to be considered in the future.
A. Details on Discretized Systems
A.1. Discretized Full-Order System
The central finite difference is applied to construct the spatial discretization of (5)–(8), and it becomes a system of nonlinear ODEs as follows: where the nonlinear terms are defined as with denotes the componentwise multiplication in MATLAB, are constant coefficient matrices, and , are vectors indicating the boundary conditions. Then, the forward time integration with the predictor-corrector scheme is applied to (A.1)–(A.4) to obtain the approximation at each time step. However, the dimension of system can be very large to obtain an accurate numerical result which demands tremendous memory storage and overabundant computational time.
A.2. POD Reduced System
The POD reduced system is constructed by applying the Galerkin projection on (A.1)–(A.4) by first substituting their approximations , , , and and then premultiplying (A.1) and (A.2) by , (A.3) by , and (A.4) by . The POD reduced-order model is of the form where