Abstract

A third order accurate cellwise relaxation implicit Discontinuous Galerkin (DG) scheme for RANS simulations using unstructured hybrid meshes is presented. A scalar parabolic equation is first examined to clarify what is really important in construction of implicit matrix to keep its diagonal dominance for the third and fourth order cellwise relaxation implicit DG schemes. In addition, discussions are given to approximated construction of implicit matrix for reducing computational cost. Then, the third order accurate cellwise relaxation implicit DG scheme for RANS simulations is successfully developed utilizing the expertise learned in the study of solving the parabolic equation. Superior spatial accuracy of the third order accurate cellwise relaxation implicit DG scheme for RANS simulations, while retaining reasonable convergence properties, is demonstrated for typical aerospace applications.

1. Introduction

Computational fluid dynamics (CFD) is used in many engineering areas, where accurate predictions of flowfield over various vehicles such as aircrafts, automobiles, and ships, are highly required. Unstructured mesh methods are widely used because of its effectiveness in treating complicated vehicle geometries. For unstructured meshes involving various cell geometries, it is customary to employ finite volume method (FVM) for satisfying conservation laws rigorously. However, there is a serious problem that spatial accuracy of FVM is usually lower than that assumed in the formulation even for the second order case. The cause of this lower spatial accuracy comes from poor reconstruction of dependent variables especially when geometries of nearby cells are highly skewed. In addition, slope limiter can degrade the spatial accuracy of the reconstructed dependent variables at local minimum/maximum points. Furthermore, the choice of cell stencils for local reconstruction of dependent variables in two- or three-dimensional space is not unique. These issues make FVM for unstructured meshes not really effective for high order computing for flowfield of engineering interest.

Recently, Discontinuous Galerkin (DG) finite element method attracts attention because of its ability in achieving formal order of spatial accuracy even on unstructured meshes [1]. In this method, basis functions are introduced independently in each cell. The dependent variables are then reconstructed as a sum of these basis functions multiplied by the corresponding degree of freedoms (DOFs). Because DOFs are also independently obtained in each cell as solution unknowns, the resulting reconstructed dependent variables become discontinuous at cell interfaces. The numerical flux is then evaluated by using upwind schemes for convective flux function and by weakly continued viscous flux function. Note that high order spatial accuracy is achieved with information contained solely in each computational cell in DG method. This is in contrast with conventional high order FVMs which require information in nearby cells in reconstruction of dependent variables.

However, DG method has an obvious drawback that computational cost is very high due to increased number of equations for DOFs to be solved in each cell. In addition to implicit integration for achieving faster convergence for steady flow problems, parallel computation is definitely needed for DG method to reduce computing time. It is often pointed out in past works that DG method is quite suitable for parallel computation because of its compactness. Such compactness, however, has not been fully exploited in implicit DG method utilizing, for example, LU-SGS scheme [2]. Yasue et al. proposed the cellwise relaxation implicit DG scheme where implicit time integration was carried out in the same way as the point Jacobi scheme [3]. The second order accurate cellwise relaxation implicit DG scheme was shown to be stable with very large CFL number for solving the RANS equations. It was shown that the cellwise relaxation implicit DG scheme could be easily parallelized using MPI library to achieve more than 120 times speedup when 128 PEs were employed.

Efforts were then made to improve the spatial accuracy of the cellwise relaxation implicit DG scheme for the RANS equations, but a straightforward attempt to improve spatial accuracy was ended at emergence of numerical instability. In the work of Sawada and Yasue, a fourth order cellwise relaxation implicit DG scheme for computing wave propagation was shown to be stable [4]. Therefore, it seems likely that the numerical instability comes from the treatment of viscous terms. In order to clarify the cause of the numerical instability that appeared in the RANS solver, a scalar parabolic equation is instead considered to critically examine the implicit matrix of the cellwise relaxation implicit DG scheme.

In this paper, by examining a scalar parabolic equation, we will discuss rigorous treatment of implicit terms in cellwise relaxation scheme up to a fourth order case and how such treatment can be approximated while retaining numerical stability. We will then apply the rigorous and approximated treatments to third order accurate cellwise relaxation implicit DG method for the RANS equations. The outline of this paper is the following. In Section 2, the cellwise relaxation implicit DG method for a scalar parabolic equation is briefly described. In Section 3, we will show that the rigorous treatment of implicit matrix keeps the diagonal dominance for the third and fourth order accurate schemes, while an easier construction of the implicit matrix actually fails. We will then discuss some approximated construction of implicit matrix for reducing computational cost. In Section 4, the rigorous and approximated constructions of implicit matrix for the third order accurate cellwise relaxation implicit DG scheme are attempted for the RANS equations. The computed results for typical aerospace applications are shown. Finally, conclusions are given in Section 5.

2. Cellwise Relaxation Implicit DG Method for Scalar Parabolic Problem

2.1. Governing Equation

As a typical parabolic equation, we choose a scalar heat conduction equation given by where denotes the scalar variable and the scalar coefficient, assumed as for simplicity in the present study. In discretizing the above equation, we introduce the three different coordinate systems, namely, the physical coordinate , the reference coordinate , and the tensor coordinate . Surface integrals of the above equation are evaluated in the reference coordinate using Gaussian quadrature formulae [5, 6], while volume integrals are in the tensor coordinate. The cell geometries considered in this study are tetrahedrons, prisms, pyramids, and hexahedrons. For these cell geometries, the abovementioned three coordinate systems are shown in Figure 1.

2.2. Basis Function

In this study, the orthogonal bases functions in the tensor coordinate are chosen as described by Warburton [7]. For tetrahedral cells, the orthogonal bases functions are given by for hexahedral cells, for prismatic cells, and for pyramidal cells, In the above equations, denotes the Jacobi polynomials given as Then subscript in (2) denotes the index of the DOFs defined by a set of subscripts , , and . For the 3D second order case, for , for , for , and for . Thus, 4 DOFs are needed for the 3D second order case. Similarly, 10 DOFs are needed for the third order case and 20 DOFs for the fourth order case.

2.3. Discontinuous Galerkin Discretization

Let us consider DG discretization of the heat conduction equation. We first transform (1) in the reference coordinate as where the transformed dependent variable and the flux functions are given by and the transformation Jacobian is given by The flux functions can further be rewritten as where For later use, we also write the flux function in the form of

At first, we consider the BR2 formulation [8, 9] for a weak treatment of (10). Assuming that represents a test function, we can obtain the following relation: where denotes the interface of computational cell in the reference coordinate, the numerical flux, the dependent variable at the interface, and the unit outward normal vector of . In the BR2 formulation, is given by where the superscripts and denote the inside value of the cell interface and that at the opposite side of the cell interface, respectively. On the other hand, the subscripts and denote the cell interface inside of the computational domain and that on the domain boundary, respectively. Therefore, the interface value in (15) for interior interface is simply evaluated by taking an average of dependent variables of both sides, while it is given as the boundary condition on the domain boundary. Then, (14) becomes Therefore, the numerical flux function for component is weakly determined as where denotes the global lifting operator in direction in reference coordinate given by Similarly, the numerical flux functions for and component can be written as where

Next, we consider the weak formulation of (7). Multiplying (7) by the basis function as the test function and integrating over the computational cell, we have which then yields where denotes the averaged value of those inside and outside of the cell interface and the stabilizing coefficient [9]. The surface integral of the global lifting term in (21) is replaced by the projected local lifting term in (22). It is assumed that the projected local lifting term can be decomposed by the local orthogonal basis as Therefore, by multiplying the local basis function and integrating over the computational cell, we have where we utilize the orthogonal property of basis function denoted by The coefficient in (23) is then obtained as which gives

We note here that taking an average value in (27) is not straightforward because the local lifting term is defined in the reference space where the relation for coordinate transformation and thus the metric terms are not the same for those inside and outside of the interface. Therefore, the average is taken in the physical space and then transformed back to the reference space as shown below. From (22), we can write where can be found by substituting the following relations into (28) as Then, (28) can be further rewritten as where Note that in (30) can be found either from (29) or by evaluating them directly in the physical space.

Finally, we write the scalar variable as a sum of basis functions multiplied by the related DOFs as Therefore, (22) can be read as

2.4. Cellwise Relaxation Implicit Scheme

We first write the time linearization of the flux functions in the reference space as where superscript denotes time step. Therefore, the implicit treatment of the second term in (33) is given by Similarly, the third term in (33) becomes Next, the fourth term corresponding to the surface integral of lifting terms in (33) is approximated as where In the above equation, the surface integration is taken only on the local interface to maintain the compactness. On the other hand, the fifth term corresponding to the volume integral of lifting terms in (33) is approximated as Finally, we approximate the first term in (33) as

We now obtain an algebraic equation for temporal change of transformed DOFs, , in each computational cell given by where The size of matrix depends on the number of DOFs. For example, the size becomes for 3D second order case and for 3D third order case.

Here, we emphasize that in (41) only needs the information of current cell. Therefore, (41) can be calculated in each cell independently as if it were an explicit scheme. We also note that a conventional implicit scheme such as LU-SGS scheme [2] probably can converge faster than the present cellwise one because the information over the entire computational domain is used for every time step through the implicit matrix inversion. However, the present implicit scheme can possibly have higher efficiency of parallelization than other conventional implicit schemes. We would like to see the consequence of choosing the present cellwise approach for DG scheme.

3. Computed Results for Scalar Parabolic Equation

In this chapter, we first show the computed results for the scalar parabolic equation using the second, third, and fourth order accurate cellwise relaxation implicit DG schemes. Spatial accuracies will be examined by comparing with the exact solution. Convergence properties will also be indicated. Then, we consider approximated construction of in (41).

3.1. Numerical Conditions

We consider a scalar heat conduction equation. At first, a cubic domain is divided by Cartesian meshes having 5, 10, and 20 intervals along each coordinate line. The resulting hexahedral cell is further divided into 6 tetrahedral cells. Therefore, the total number of cells becomes 750, 6,000, and 48,000, respectively. We denote these unstructured meshes by Mesh-1a, -1b, and -1c. Then, these meshes are compressed in one coordinate direction until the cell aspect ratio becomes as large as 1,000. We denote the compressed unstructured meshes by Mesh-2a, -2b, and -2c. Figure 2 shows Mesh-1c and Mesh-2c. The boundary condition assumed on all the boundary surfaces of computational domain is given by the following analytic solution for steady state of a scalar heat conduction equation: where is cyclically taken as The steady analytic solution for the entire computational domain is also given by (44). The initial condition is to assume for all computational cell. The time step is so determined that CFL number becomes 106 if a convection velocity of 1.0 is assumed.

The stabilizing coefficient is 1.0 for the case in which mesh aspect ratio is 1 and 1.5 for 1,000. All computations in this section are carried out using Intel(R) Core(TM) i7 CPU 3.47 GHz.

3.2. Computed Results

At first we examine the convergence characteristics and spatial accuracies of present schemes. In Table 1, the number of time steps needed for machine zero convergence and the CPU time needed for advancing one time step are summarized. For all spatial accuracies, the present implicit scheme gives machine zero convergence. However, the convergence becomes slower due to larger stabilizing coefficient when a cell aspect ratio of 1,000 is chosen. The computational cost of the fourth order scheme for advancing one time step is about 20 times higher than that for the second order scheme. In addition, because the convergence rate for the fourth order scheme is slower than that of the second order scheme, the computational cost of the fourth order scheme for attaining machine zero convergence is about 55 times higher than that of the second order scheme. The error norms obtained for converged solutions are also shown in Table 1. One can evaluate the actual spatial accuracy from the slope of fitted line for error norms with different cell intervals. The attained actual spatial accuracies are also shown in Table 1. The present second, third, and fourth order DG schemes will attain the formal spatial accuracy not only for a cell aspect ratio of 1 but also for 1,000.

Next, let us consider the possible reason why the second order cellwsie relaxation implicit DG scheme of Yasue et al. was numerically stable for solving the RANS equations while the third order scheme became unstable. In solving the RANS equations, Yasue et al. ignored the linearization of the derivative terms for conservative variables. Those second and third terms in (42) were not accounted for in their scheme. We denote the matrix where the second and third terms are ignored by reduction-matrix while the matrix in (42) by the original-matrix . In Figures 3 and 4, comparisons of the convergence histories utilizing and are shown. In all calculations, Mesh-2b is employed. For the second order case, the converging rate with the reduction-matrix is almost the same as that with the original-matrix , while the computational cost using is reduced. However, for the third and fourth order cases, the computing with the reduction-matrix becomes unstable.

It seems that the observed instability comes from loss of diagonal dominance of matrix . In order to see whether it is the case or not, let us consider for a sake of simplicity a cubic domain divided by Cartesian meshes and then compress only in direction to create a cuboid domain having hexahedral cells. Then, the matrix in (11) has a diagonal form as where can be much larger than and . When the second and third terms in (42) are ignored, the remaining terms containing become Because the first term in the above equation contains , the surface integration in the reference space becomes nonzero only on the interfaces and . Furthermore, on these surfaces, needs to be even function of and . Otherwise, the surface integral on these surfaces vanishes. Similarly, the second term in (47) contains . Again, the surface integration can have a nonzero value only on the interfaces and , where needs to be even function of and . For the second order case, the bases functions for a hexahedron are given by which are the first order polynomials. Because cannot be an even function of and , except for , the first term in (47) vanishes for . For , the sum of surface integration on and vanishes. Similarly, the second term in (47) vanishes for except for . For , the sum of surface integration on and again vanishes. Therefore, the diagonal dominance is maintained for the second order case. For the third order case, the bases functions are given in addition to (48) as which are the second order polynomials. For these bases functions, (47) can have a nonzero value when or become even functions of and . Although the diagonal term is also reinforced with these bases functions, we numerically observed that the resulting reduction-matrix has no diagonal dominance. The situation is the same for the fourth order case. Although Yasue et al. employed the reduction-matrix in their cellwise relaxation implicit DG scheme, the diagonal dominance was kept because the spatial accuracy of their code was second order.

An easier approach to maintain the diagonal dominance of the reduction-matrix is to remove all the off-diagonal terms. We denote this matrix by the diagonal-matrix . The convergence histories for computing with are also shown in Figures 3 and 4. For the third order case, the computing with is found to be stable, but convergence is slower than that with the original-matrix . On the other hand, for the fourth order case, the computing with is unstable. Therefore, we should take into account the terms arising from the linearization of the derivative terms for conservative variables in the third and fourth order cellwise relaxation implicit DG scheme.

Now that the linearization of the derivative terms for conservative variables should be included in implicit matrix for diagonal dominance; the computing cost to obtain the original-matrix in each cell is very high. Therefore, it is favorable to examine a possibility of simplifying the original-matrix in which some components of matrix are removed for reducing computational cost while retaining the diagonal dominance. For the second order case, we employ the simplified form of as while, for the third order case, In (2), a set of determines the basis function. The simplified matrices in the above equations are derived by considering the relation of , , and . For example, assume that . Because the first order polynomials of and in the corresponding basis function can be derived from the combination of , , and , we should retain these components of , , , and , while we all ignore , , , , , and . Similarly, a simplified form of the original-matrix for the fourth order case can be obtained. It is found that the present approach of simplifying the original-matrix is quite effective in reducing computational cost particularly for the higher order cases. In addition, we apply the successive overrelaxation (SOR) method in solving (41) because the size of matrices for higher order cases becomes very large. In the present SOR method, we first prepare the initial guess for the second order components as Using the initial guess for the second order components, the forward sweep to obtain the third order components is carried out as Then the backward sweep for improving the second order components is followed as In above equations, the relaxation coefficient is chosen as which can give fast convergence while retaining numerical stability. Solutions are iteratively improved by repeating (53) and (54) for 3 times in the present study. In Figures 5 and 6, the convergence histories for computing with the original-matrix and the matrix-simplification and that with SOR method are shown. One can find that the matrix-simplification is quite effective in reducing computational cost without affecting converging rate. In addition, the computational cost is further reduced when SOR method is used. As a result, we can successfully reduce 5%, 22%, and 39% of the total CPU times for the second, third, and fourth order cases, respectively, when we employ the matrix-simplification with the SOR method. We emphasize that the matrix-simplification with SOR method is more effective in reducing computational cost as the spatial accuracy becomes higher.

4. Computed Results for Navier-Stokes Equations or RANS Equations

In this section, we show the computed results for the Navier-Stokes equations using the second and third order accurate cellwise relaxation implicit DG schemes. The second order cellwise relaxation implicit DG scheme for RANS equations including the treatment of convection terms is thoroughly described in [3]. In this study, the second and third terms in (42) are included in the coefficient matrices, and those bases functions for high order terms, their associated DOFs, and sufficient number of Gaussian points are simply introduced to attain the third order spatial accuracy. In the following, we at first show the computed results for the laminar boundary layer flow and compare them with the Blasius solution. Then, we show the computed results for the turbulent boundary layer flow. Finally, the computed results for the vortical flowfield over a double-delta wing are shown.

4.1. Laminar Boundary Layer

The computational domain for the laminar boundary layer flow is set as A uniform inflow is set at the inlet boundary and the outflow condition with a prescribed pressure is set at the outlet boundary and at the top boundary . The symmetric boundary is assumed not only at the side boundaries but also at the bottom boundary of upstream region . At the bottom boundary of downstream region , the nonslip boundary condition is set. A typical example of computational mesh is shown in Figure 7. The present computational mesh consists of 1,688 tetrahedral cells and 1,000 prismatic cells. The minimum gird spacing for the prismatic layer on the wall is set to be and 10 layers with a growth rate of 1.3 are given in the boundary layer region. The free-stream Mach number is chosen to be 0.2 and the Reynolds number is for a unit length. The computation is carried out by the cellwise relaxation implicit scheme in which the matrix-simplification with SOR method is employed. In the present case, we choose the relaxation coefficient in SOR method to be 0.9 and the number of iterations to be 10 in order to avoid emergence of numerical instability. A CFL number of is assumed in the implicit time integration. The stabilizing coefficient is assumed as . All computations in this subsection are carried out using Intel(R) Core(TM) i7 CPU 3.47 GHz.

In Figures 8 and 9, one can find that the computed velocity profiles obtained by the second and third order schemes agree well with Blasius solution at several streamwise locations. On the other hand, in Figures 10 and 11, the velocity profiles at are shown to duplicate each other for both the second and third order cases suggesting that the use of the matrix-simplification with/without SOR method does not alter the converged solutions. The convergence histories determined by density residuals are shown in Figure 12. It is evident that the use of the matrix-simplification with/without SOR method does not influence convergence property. In Table 2, the CPU time needed for advancing one time step is summarized. One can find that the use of the matrix-simplification reduces the needed CPU time by 6.75% and 35.58% for the second and third order schemes, respectively. When the matrix-simplification with SOR method is employed, the CPU time is further reduced by 9.65% and 42.51% for the second and third order schemes, respectively.

4.2. Turbulent Boundary Layer

Next, we consider the turbulent boundary layer flow over a flat plate. An example of the computational mesh is shown in Figure 13. The computational domain is set as The boundary conditions are the same for those used in the calculation of the laminar boundary layer flow. The computational mesh consists of 1,496 tetrahedral cells and 2,100 prismatic cells. The minimum grid spacing is set to be . As many as 15 prismatic layers are employed with a growth rate of 1.7 in the boundary layer region. The free-stream Mach number is 0.5, and the Reynolds number for a unit length is . The cellwise relaxation implicit DG scheme is applied for time integration. A CFL number of is assumed. In order to reduce the computational cost, the matrix-simplification with SOR method is employed. The relaxation coefficient and the number of iterations are set to be 0.7 and 15 for this case. The stabilizing coefficient is assumed as . All computations are carried out again using Intel(R) Core(TM) i7 CPU 3.47 GHz.

The computed velocity profiles obtained by the second and third order schemes with asymptotic solutions of and are shown in Figure 14. For both the second and third order schemes, good agreements with the asymptotic solutions are obtained. In Figure 15, the computed results at obtained by the second and third order schemes are shown. As in the laminar boundary layer case, use of the matrix-simplification with/without SOR method does not influence the converged solutions. Figure 16 shows the velocity profiles of the converged solutions at given by the second and third order scheme. The velocity profiles obtained by the third order scheme better agree with those asymptotic solutions. The convergence histories determined by the density residuals are shown in Figure 17. Again, the use of the matrix-simplification with/without SOR method does not alter the convergence properties, although the density residual fluctuates in a slightly different way particularly in the third order case. The CPU time needed for advancing one time step for both of the spatial accuracies is summarized in Table 3. The use of the matrix-simplification reduces the needed CPU time by 4.30% and 35.90% for the second and third order schemes, while the matrix-simplification with SOR method by 9.95% and 43.18% for the second and third order scheme, respectively.

4.3. Vortical Flowfield over a Double-Delta Wing

Finally, the computed results for the flowfield over a double-delta wing are shown. The geometry of the double-delta wing is the same as the model used in the experiment conducted by Brennenstuhl and Hummel [10]. The root chord length is  [m]. The kink location is assumed to be the middle of the root chord. The swept-back angles are 80 degree in the upstream region of the kink position and 60 degree in the downstream region, respectively. The strake and main wing are flat plates having a thickness of  [m], while the leading edge is rounded by a radius of half wing thickness. An example of the computational mesh is shown in Figure 18. It has 305,619 tetrahedral cells, 343,469 prismatic cells, and 4,095 pyramidal cells. The minimum grid spacing is  [m]. As many as 31 prismatic layers are employed with a growth rate of 1.25 in the boundary layer region. The outer boundary is assumed as a half sphere whose radius is 20 times of the root chord length. The free-stream Mach number is 0.3, and the Reynolds number is defined by where denotes the free-stream velocity and the kinematic viscosity. The angle of attack is 12.0 degree. The cellwise relaxation implicit DG scheme is employed. A CFL number of is chosen with the local time stepping for a fast convergence. The stabilizing coefficient is assumed as . For the present case, a parallel computation is performed using 16 processors of the Intel(R) Xeon(R) CPU 3.40 GHz. The METIS grid partitioning [11] and the Message Passing Interface (MPI) library are used in parallelization. In the third order computation, the converged solution of the second order scheme is used as the initial flowfield.

The total pressure contour plots in several streamwise cross sections obtained by the second and third order accurate schemes are shown in Figures 19(a) and 19(b), respectively. In both results, two longitudinal vortices from the strake and the leading edge of the main wing are clearly captured which are almost merged at the downstream cross section. In addition, a secondary vortex can be found in this cross section. In Figures 20(a) and 20(b), the contour lines in 62.5% chordwise cross section computed by the second and third order accurate schemes, respectively, are compared. One can find that the outboard longitudinal vortex is better captured by the third order scheme. This is well indicated in the pressure coefficient distributions in 62.5% chordwise cross section shown in Figure 21. Apparently, two negative pressure peaks corresponding to these two longitudinal vortices are clearly observed in the third order computation resulting in better agreement with the experimental data.

In this case, the original-matrix is employed in the calculation. The CPU time needed for advancing one time step is 4.8 [s] and 32.3 [s] for the second and third order schemes, respectively, resulting in a ratio of 6.73. This value is close to the ratio of 6.62 obtained in the calculation of the turbulent boundary layer flow using the original-matrix without simplification. We expect to have a better computational efficiency when matrix-simplification and SOR method are employed in the simulation of vortical flowfield over the double-delta wing.

5. Conclusions

A third order accurate cellwise relaxation implicit DG scheme for RANS simulations using unstructured hybrid meshes is presented. In order to find a proper construction method of the implicit matrix for the viscous terms, we first consider a scalar parabolic equation. It is shown that those terms corresponding to linearization of the derivative terms for the conservative variables should be retained in the implicit matrix for the third and fourth order accurate cellwise relaxation implicit DG scheme. In addition, the matrix-simplification with SOR method is proposed to substantially reduce the computational cost. Utilizing the expertise learned in the study of solving the parabolic equation, the third order accurate cellwise relaxation implicit DG scheme for RANS equations is successfully developed. In the calculations of laminar and turbulent boundary layer flow and also vortical flowfield over a double-delta wing, not only the marked numerical stability but also the superior spatial accuracy of the present third order cellwise relaxation implicit DG scheme are well demonstrated.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.