#### Abstract

A surrogate model based on the proper orthogonal decomposition is developed in order to enable fast and reliable evaluations of aerodynamic fields. The proposed method is applied to subsonic turbulent flows and the proper orthogonal decomposition is based on an ensemble of high-fidelity computations. For the construction of the ensemble, fractional and full factorial planes together with central composite design-of-experiment strategies are applied. For the continuous representation of the projection coefficients in the parameter space, response surface methods are employed. Three case studies are presented. In the first case, the boundary shape of the problem is deformed and the flow past a backward facing step with variable step slope is studied. In the second case, a two-dimensional flow past a NACA 0012 airfoil is considered and the surrogate model is constructed in the (Mach, angle of attack) parameter space. In the last case, the aerodynamic optimization of an automotive shape is considered. The results demonstrate how a reduced-order model based on the proper orthogonal decomposition applied to a small number of high-fidelity solutions can be used to generate aerodynamic data with good accuracy at a low cost.

#### 1. Introduction

Despite flow nonlinearities and geometrical complexities, the level of maturity reached by the numerical methods of computational fluid dynamics enables performing detailed numerical simulations of problems of practical interest. However, there exist applications, such as aerodynamic shape optimization, which plays nowadays an important role in the design process for aerospace and automotive engineering, demanding multiple simulations to perform optimization loops. Optimization techniques can require the prediction of certain quantities, such as lift and drag, as functionals of the field variables associated with a parametric partial differential system describing the physical behaviour of the problem. The evaluation of the implicit relationship between the inputs (the problem parameters), which identify specific system configurations, and the outputs (quantities such as lift and drag) requires the solution of the partial differential system. Numerous optimization methods and models have been developed and can be distinguished between local, global, and hybrid methods, based on gradient, evolutionary, or genetic algorithms [1]. Whatever optimization algorithm is applied, many evaluations of the functional relationship are needed. To keep the requirements of computational resources within certain limits, reduced-order models (ROMs) (or surrogate models) can be employed. A ROM provides a rapid and reliable estimate of the input-output relationship, with a considerable reduction of the computational cost.

The choice of the particular ROM however is quite critical, as it must preserve the essential physics and predictive capability of the high-fidelity partial differential model. The ROM definition can be sample based, employing statistical analysis such as kriging, polynomial chaos expansions, or Principal Component Analysis, for global surrogate models [2], or based on dimensionality considerations: the solution of the partial differential problem evolves in a low-dimensional manifold induced by the parametric dependence; therefore, the high dimensionality of the discretization space can be reduced constructing an approximation of this manifold, leading to a reduced-basis method [3, 4].

The definition of the ROM used in this work relies on the proper orthogonal decomposition (POD), a statistical technique able to extract the essential physics of some input information. The statistical analysis of the data allows expressing the flow field in terms of a set of low rank basis vectors and the ROM is obtained using this reduced set. In literature, the POD appears in different equivalent forms such as Karhunen-Loève decomposition, Principal Component Analysis (PCA), and Singular Value Decomposition (SVD) [5]. According to POD, an optimal linear basis, named POD basis, may be interpreted as a solution of minimization of the projection error of the original system, equivalent to maximizing the energy in the projection [6]. Therefore, the optimality of the basis vector is in an energetic sense. If a problem is described by a representative number of high-fidelity calculations from which a set of basis vectors may be extracted, the singular values become rapidly small and a small number of basis vectors are sufficient to approximate the initial data. In this way, POD provides an efficient statistical tool to capture the dominant features of a model characterized by many degrees of freedom and to represent it to the desired accuracy by using a reduced set of modes. The ROM is derived by projecting the high-fidelity model onto a reduced space spanned by only some of POD modes.

The POD has been applied in many different fields: data compression, image processing, dynamical systems, and fluid mechanics. Its application in fluid dynamics was first introduced by Lumley [7] for the detection of coherent structures in a turbulent flow. In fluid dynamics, the POD is usually employed to find a basis for the projection of the Navier-Stokes equations and to obtain a ROM composed of a system of ordinary differential equations for the time dependent POD expansion coefficients [8]. Less commonly, the POD is applied in the frequency space [9] or in a parameter space. In this latter case, as an example, the POD can be used to describe flow fields around modified body shapes, using the information about the flow past few selected geometries of the body, which form the snapshots for the POD. Examples of this approach can be found in the works of LeGresley and Alonso [10], Bui-Thanh et al. [11], Mifsud et al. [12], and Tang et al. [13].

The application of POD-based ROMs in the parameter space for aerodynamic shape optimization is quite recent and still under active development. In the above-cited references, the focus is on high-speed inviscid flows, in the transonic and supersonic regimes. The main results of the present work are the extension of the POD application in the parameter space to low-speed viscous flows and the use of a low number of snapshots for the initial set. This reduction is possible for two reasons: firstly, the locations of the high-fidelity calculations in the parameter space are chosen using a central composite design exploiting the statistical inference theory, and secondly the errors of the surrogate model, with respect to the full model solution, have a rapid decay with the number of snapshots. The POD expansion coefficients are functions of the parameters and are continuously extended in the parameter space by the response surface method (RSM) [14] using different techniques, such as least-square regression and radial basis functions (RBF) method [15]. This approach is termed POD with interpolation (PODI) [16].

It is not trivial to underline that, with respect to the application of the response surface methodology (RSM), the PODI method is able to provide the description of the entire fields of the original variables from which the objective function is calculated. The information is much more complete and justifies the minimum rise of computational effort. With RSM, on the other hand, it is possible to obtain only the value of the objective function in the desired point. More detailed information about this comparison is presented in Section 3.

In the present work, the PODI is applied to three test cases. The first one is the low-speed viscous flow past a backward facing step, with the step slope as parameter. In the second case, the flow past a NACA 0012 airfoil is reconstructed by the ROM in the two-parameter space (Mach, angle of attack). Finally, in the third case, the POD is coupled to a genetic algorithm; four shape parameters are optimized in order to minimize the drag coefficient of an automotive shape. In all the applications, the high-fidelity calculations are obtained from steady solutions of the Navier-Stokes equations.

In the first part of the paper, POD, RSM, and the interpolation strategies are briefly described. The second part reports the results obtained for the three cases under study, and in the last section some concluding remarks are given.

#### 2. Proper Orthogonal Decomposition with Interpolation

##### 2.1. Proper Orthogonal Decomposition

Several POD methods can be found in literature: the Karhunen-Loève decomposition, the Principal Component Analysis, and the Singular Value Decomposition (SVD). It can be shown that they are all equivalent [6]. For steady-state problems, each high-fidelity calculation, the* snapshot*, is represented as a set of discrete data, a vector of dimension , and the number of grid points or cells. In this case, the POD described as SVD is more straightforward [17].

###### 2.1.1. SVD as Proper Orthogonal Decomposition

The central issue of the POD is to approximate the snapshots simultaneously by a single, normalized vector * as well as possible*, that is, to solve the optimization problemwhere .

In (1), is equal to the number of snapshots.

Considering the Lagrangian functional associated with (1), the following eigenvalue problem can be written:with the condition

The orthonormal vectors , found solving eigenvalue problem (3), are called* POD basis of rank *. The approximation of the snapshots , columns of , by the first vectors is* optimal among all rank ** approximations* to the columns of . This method is the so-called* method of snapshots* [18].

The choice of is of central importance for the POD. No a priori rule is available; rather, it is possible to apply a heuristic criterion based on the ratio between the modeled energy and the total energy contained in the system which can be expressed as

##### 2.2. Continuous Extension of the POD Projection Coefficients

The vector represents a vector of scalar functions of grid points (or cells), such as the primitive variables of the flow field. The POD is applied to each nondimensional variable to compute a distinct basis. As shown in the previous section, each snapshot can be expanded as, setting , where the projection coefficients are discrete functions in the parameter space, with values defined at the points corresponding to the individual snapshots . To use the derived ROM as a prediction tool, it is necessary to extend the discrete functions in continuous functions in the parameter space, and the field variable in a generic point of the parameter space may be approximated by the linear combination

The combination of the POD and the continuous extension of the projection coefficients is termed POD with interpolation (PODI) [16].

In the case of one-dimensional parameter spaces, the continuous extension is obtained by linear or spline interpolation. For multidimensional spaces, the method of the response surface is adopted. The response surface method (RSM) describes the continuous behaviour of a dependent variable by a set of simple basis functions [14]. Response surfaces are generally valid in a large region only in the case of few parameters; when a great number of parameters are involved, such as in the case of optimization problems, the RSM must be treated carefully. RSM has been originally developed for experimental data, employing regression techniques. In this way, random experimental fluctuations are smoothed out. When the data set is provided by numerical simulations, a response surface obtained with a regression method in general does not fit exactly the data, introducing an undesirable smoothing. Therefore, in this work, other than a least square regression technique, interpolating methods, based on radial basis functions, are studied.

It is important to underline that number and topology of the discretization should not vary in the high-fidelity calculations. In this way, the correct correlations between the different CFD fields, corresponding to different solutions, can be performed. Only the position of the nodes on the geometrical mesh can change, exploiting the procedure of morphing.

###### 2.2.1. Least Square Regression

In the framework of response surface methodology, least square regression can be a simple technique to estimate a best fit approximation of the POD coefficients .

The second-order model of a response surface in the space defined by the parameters can be written as

In (8), with are indicated the regressors of the model and is an estimate of the error of the approximation. The second-order model can be useful in the event of a strong curvature in the true input-output relation. On the other hand, there can be cases where a first-order expression can fit properly the system behaviour with the relation

In the present work, both regression types are used.

Applying relation (8) or relation (9) to the points where the coefficient is evaluated for specific settings of the parameters , we have the system in matrix notation:where , is the model matrix, , and contains the regressors of the model. System (10) can be solved through least square minimization, assuming that has zero mean:

Once the vector of the regressors is found, the response surface is defined (8) and the POD coefficients are known as continuous functions in the parameter space.

###### 2.2.2. Radial Basis Functions

Least squares method does not provide an exact fit of the computed values , which is provided if interpolation is used. One of the primary tools for interpolating multidimensional data is the radial basis functions (RBF) method. The guiding principle behind this general method is to use translations of a single basis function that depends only on the Euclidean distance from its center, therefore radially symmetric about its center, in order to create a multidimensional interpolant [15]. The model of response surface for the POD coefficients that can be built using RBF has the general formwith being the centers and the radial basis functions, each one weighted by the expansion coefficient . In our work, two typical radial basis functions are used: The Gaussian . The multiquadric .

The parameter is called shape parameter and it is related to the width of the basis function. In Figure 1 [19], the influence of the shape parameter can be understood for Gaussian e multiquadric basis.

**(a)**

**(b)**

The expansion coefficients are determined from the interpolation conditions for , leading to a symmetric linear system, unconditionally nonsingular if the data points are distinct [20]:

Formulation (12) implies the constraint of using as many radial basis functions as the number of data points that in the present work is equal to the number of snapshots. If the snapshot number is much bigger than the number of freedom degrees required to generate an acceptable fit, the linear system can be ill-conditioned [20]. To overcome this problem, a relaxation of the interpolation conditions can be made. In order to do this, it is necessary to distinguish between the radial basis function centers and the data points. In this way, the problem becomes overdetermined, the matrix is no longer square, and a unique inverse no longer exists. The previous exact problem becomes a problem of linear optimization. The Moore-Penrose pseudoinverse of matrix can be introduced; the problem becomes which can be solved in a least square sense. Another way to overcome the problem of the ill-conditioning of matrix is to add polynomial terms to the RBF approximation [21]. In the so-called augmented RBF method, (12) becomeswhere are the monomial terms in the polynomials and are additional constants. If we consider a two-dimensional problem where , the monomial terms are equal to . The order of the polynomials has to be one degree less than the RBF. Normally linear or quadratic polynomials are adopted due to the fact that the use of high-order polynomials can be too computationally expensive.

For the case , the additional constant term is unique. Since now the problem is underdetermined, the orthogonality condition should be imposed [12]. Equation (14) in matrix form becomes where the coefficients of the matrix can be obtained from

#### 3. Computational Results

In this section, the results of three test cases are presented. In these applications, the capability of the POD-ROM to obtain some fields of interest not belonging to the initial snapshot simulations is verified.

In the first problem, the snapshot set is obtained varying a single geometry parameter. In the second case, the two-dimensional flow field past a NACA 0012 airfoil is investigated. The Mach number of the uniform flow and the angle of attack of the airfoil are chosen as parameters of the POD surrogate model. In the last case, a ROM is applied to the optimization of a particular automotive geometry with four shape parameters as design variables.

##### 3.1. Backward Facing Step with Variable Step Slope

###### 3.1.1. Problem Setting

This test case is set according to the experimental work of Ruck and Makiola [22]. The flow enters a channel from the inlet at a prescribed velocity and then encounters a step. The section of the channel increases causing the generation of a recirculation bubble in the flow. The length of the recirculation bubble is strongly dependent on length and slope of the step. A visualization of the problem setting is shown in Figure 2. The geometry of the problem is characterized by the Expansion Ratio (ER), that is, the ratio between the height of the channel () and the height of the inlet . In this work, ER is equal to 2 with an inlet height of . In order to have a fully developed channel flow before the step, the value of the length of the first part of the channel has to be chosen fulfilling the inequality ; therefore, in this work, is equal to 1. Similarly, to have a fully developed flow in the channel behind the step, the length of the duct, measured from the step, is set to 40.

A set of four snapshots is obtained, modifying the slope angle of the step. The angle assumes the values of , , , and . For each snapshot, a full CFD simulation is realized. The initial velocity is set equal to 2.5 m/s in all cases with a corresponding Reynolds number, referring to the height of the step , . The simulation is done with the open source software OpenFOAM using the* simpleFoam* solver with a turbulence model. A visualization of the four different geometries used for the snapshot set is shown in Figure 3. The number of cells is about 350000. The geometry variation is obtained through a morphing of the mesh from the base configuration, keeping the number of cells constant.

**(a)**°

**(b)**°

**(c)**°

**(d)**°###### 3.1.2. POD Reconstruction

In Figure 4, a visualization of the four POD basis vectors generated from the decomposition of the -component of the velocity field is shown. These basis vectors are the basic components of the surrogate model and are calculated as explained in Section 2.1.1. Then, velocity and pressure fields for a geometry characterized by a slope angle of the step of , not belonging to the initial set of snapshots, are reconstructed using the POD surrogate model. The parameter space is one-dimensional and piecewise linear interpolation is used to compute the POD coefficients.

**(a) POD mode 1**

**(b) POD mode 2**

**(c) POD mode 3**

**(d) POD mode 4**

We computed a measure of the error aswhere indicates the value of the field of interest calculated with the CFD full model and the corresponding value obtained with PODI. The values of the error are for the pressure field and for the velocity field. The surrogate model therefore is able to predict the behaviour of the system for the point in the parameter space, within an acceptable accuracy.

The first and most energetic POD mode visualized in Figure 4 is qualitatively identical to the reconstructed field. An estimate based on (4) indicates that the first POD mode contains 99.8% of the total field energy. The fact that the first POD mode is able to represent the most field energy is intrinsic in the POD technique because it maximizes the mean of the norm of the squared field projection along the POD modes. Nevertheless, in this work, all the POD modes were used for the reconstruction because the saving of computational effort and time of retaining fewer modes was negligible.

In addition, selecting two sections (Figure 5), the first one placed in the recirculation bubble and the second one close to the end of it, it is possible to compare the velocity profile obtained with the POD reconstruction with the high-fidelity solution. In Figure 6, this comparison is represented. The maximum relative error , calculated asis 1.3% in the recirculation bubble. indicates the corresponding value of the undisturbed flow. For example, for the velocity, m/s. Probably the maximum error takes place in this area because it is a low energy zone and the POD technique is optimal in the energetic sense.

**(a)**Section (recirculation bubble)

**(b)**Section (end of the recirculation bubble)###### 3.1.3. POD Reconstruction with Different Sets of Snapshots

To quantify the influence of the number of snapshots on the accuracy of the surrogate model, a further test is made. The problem setting is the same as that of Section 3.1.1 but now four different surrogate models are constructed and compared using 3, 6, 11, and 21 snapshots. In this case, the geometry is fixed for all the snapshots, with step slope of , and each snapshot is calculated imposing a different inlet velocity. The inlet velocity varies in the range between 10 and 30 m/s. In Table 1, a summary of the four sets of snapshots used in the different types of reconstruction is reported.

The surrogate model is used to reconstruct pressure and velocity fields for an initial velocity of 15.5 m/s not belonging to any of the initial sets of snapshots. Based on estimate (4), it can be remarked how the energy captured by the first mode decreases with the number of snapshots composing the set. Moreover, the first mode is always able to get more than the 99.99% of the total energy.

A comparison between the reconstruction error generated by the four surrogate models for the pressure field is shown in Figure 7. The error is calculated along the center line of the duct. As expected, this difference is decreasing as the number of snapshots increases. However, in the three-snapshot case, the maximum error is already acceptable and below 2.5 Pa.

In Figure 8, the variation of the reconstruction error of the pressure field using the four different snapshot sets can be seen. The error trend is clear and the use of 11 snapshots for the surrogate model seems to be optimal, in terms of computational time and accuracy, compared with the 21-snapshot case. As expected, the result with many snapshots is more precise, but in the three-snapshot reconstruction, the error is already acceptable. The POD decomposition is optimal in the energetic sense; therefore, in a steady problem, the first POD mode is quite always able to get a relevant part of the field energy and a reduced number of modes and snapshots can be used to build a surrogate model.

##### 3.2. NACA 0012 Airfoil

In this section, the problem of the two-dimensional steady subsonic flow past a NACA 0012 airfoil is analyzed. The parameter space of the surrogate model is two-dimensional: the angle of attack of the airfoil and the Mach number of the uniform flow. The POD-ROM is applied to obtain flow fields for parameter combinations not belonging to the initial set of snapshots. A 2-level full factorial design is applied to define the initial snapshot set composed of configurations that will be solved using the CFD full model of the problem. In Figure 9, a representation of the design in the parameter space can be seen, together with the positions of the reconstructed fields, and the values of the combinations used to create the snapshots are summarized. One objective of the test is to compare different interpolation and regression techniques, as explained in Section 2.2, for the PODI.

The numerical solution of the Navier-Stokes equations represents the high-fidelity solution of this problem. The solver used in this work is* simpleFoam*, from the open source software OpenFOAM. The Spalart-Allmaras model with wall functions is used as turbulence model. The geometry is discretized with about 500000 cells. In Figure 10, the grid in the computational domain and a zoom of the region around the airfoil are shown.

**(a) Full computational domain**

**(b) Zoom of the airfoil region**

Pressure and velocity fields are reconstructed with the PODI technique for three different points: , , and , with , 0.08, and 0.11, respectively. Four techniques are used to define the response surface of the POD coefficients: a least square regression of the first order, radial basis functions using Gaussian basis, with and without polynomial term, and multiquadric basis considering the polynomial term. A comparison of the results can be seen in Figure 11, where only the biggest error of the three reconstruction points is considered. The best overall result for the reconstruction of the pressure field is obtained using radial basis functions with Gaussian basis, no relaxation of the interpolation condition, and a value of the shape parameter of 1.05. In this application, a constant parameter is chosen for all the different radial basis functions; therefore, , .

In the case of the POD reconstruction of the velocity field, the behaviour of the response surfaces is different. In this case, the lower error is obtained using a multiquadric basis, with the orthogonality condition. These interpolation methods will be used for the airfoil surrogate models built in Section 3.2.1.

In all the three test points, the surrogate model shows good agreement with the high-fidelity solution.

###### 3.2.1. Influence of the Number of Snapshots

In a similar way to the backward facing step case, the influence of different numbers of snapshots is investigated. The aim is to show the rapid decay of the errors when increasing the snapshot number and therefore the possibility of using a ROM with a reduced snapshot set. In particular, four PODI surrogate models are constructed using 4, 9, 16, and 25 snapshots corresponding to a 2-, 3-, 4-, and 5-level full factorial design for the two parameters, Mach number and angle of attack . With respect to the surrogate model of the previous section, the parameter ranges are extended: now is varying from 0.05 to 0.25 and is between and . In Figure 12, a visualization of the snapshot position is shown. The surrogate models are used to reconstruct pressure and velocity fields in seven random points in the parameter space. The positions of the reconstructed cases in the () plane are summarized in Table 2.

The error with respect to the solution of the CFD full model is computed using the expressionwhere is the cell value of the field of interest calculated with the CFD full model and the corresponding value obtained with PODI.

In Figures 13 and 14, the error trends obtained with (20) are shown for the pressure and velocity fields. As expected, the errors are decreasing as the snapshot number increases. In the case of the velocity field, being the values under 0.01, the error is slowly decreasing. Therefore, depending on the a priori threshold of the surrogate model error, the use of 25 snapshots can be avoided and 9- or 16-snapshot sets can be used, saving computational time and effort. For the pressure field, higher errors are generated, but again the use of the 16-snapshot set can fulfill accuracy requirements.

###### 3.2.2. Comparison between Response Surface Methodology and POD Method

The response surface methodology (RSM) is an interpolation/regression technique able to predict values for desired parameter combinations, starting from information acquired from a known data set. In the POD surrogate method described in this paper, the RSM is used to obtain the POD coefficients corresponding to parameters not belonging to the initial snapshot set.

In this section, a comparison between the POD method and the RSM stand-alone will be presented. Gaussian basis is used to build a response surface with the radial basis function technique.

The most important difference between the two methods is that with RSM the output will be a single number (e.g., the objective function of the problem) and with the POD method the entire field of every fluid dynamic variable of interest can be obtained.

The NACA 0012 airfoil problem is considered and lift and drag coefficients are computed using a response surface with Gaussian basis and shape parameter . The 16-snapshot set of the previous section is used in the POD method and the aerodynamic coefficients calculated in the same points are then exploited to build the response surface.

The same 6 test points, A, B, C, D, E, and F, of the previous section are used to compare the methods.

In Figure 15, the relative errors on the computation of the lift coefficient with respect to the CFD solution can be seen for RSM and POD methodology. In Figure 16, the errors are reported for the computation of the drag coefficient.

The results are comparable and furthermore the outputs of the POD method are not single values but entire fields.

##### 3.3. Drag Coefficient Optimization of an Automotive Shape

###### 3.3.1. Problem and CFD Setting

In the last test case, the POD-ROM is employed in an optimization loop to obtain the minimum drag coefficient for an automotive shape. The base form is the open source DrivAer car model from the Technical University of Munich [23]. The drag coefficient is minimized acting on four shape parameters: the length , width , and height of the trunk and the height of the diffuser (Figure 17). The parameters are changed through mesh morphing. This concept is clarified in Figure 18 where the variation of the trunk width can be visualized. In Figure 19, the variation of the trunk height can be seen: starting from the base configuration, the minimum () and maximum () variations are shown.

**(a) Base trunk width**

**(b) Reduction of the trunk width**

**(a) Maximum reduction of the trunk height**

**(b) Maximum increase of the trunk height**

The optimization problem can be formulated as with being the vehicle drag, the density, and the velocity of the undisturbed flow and a reference area equal to the vehicle maximum frontal area.

The geometry is discretized with 1959410 cells and the CFD simulation is performed with the OpenFOAM solver* simpleFoam* using a turbulence model. is 40 m/s and the Reynolds number . The correct deformations corresponding to the desired values of the design variables are imposed with mesh morphing. Therefore, the cell number remains constant for each geometry and only some point positions are modified.

###### 3.3.2. Optimization

In this specific problem, the cost function evaluations correspond to the calculations of the drag coefficient. As explained in the previous sections, the adoption of a reduced-order model is very attractive to avoid a large number of high-fidelity computations; therefore, the POD method is used: pressure, velocity, turbulence kinetic energy, specific turbulence dissipation, turbulence eddy viscosity , and the mass flow through the cell faces fields are reconstructed and then the drag coefficient is computed. The selection of the set of snapshots is made on the basis of the central composite design theory. The number of snapshots determines the total number of CFD calculations. An appropriate selection of the snapshots is extremely important because their calculations represent the most time consuming step of the entire optimization procedure. The calculation of the POD basis is done following the method described in Section 2. Response surfaces are generated with a least square method of the second order or interpolating using radial basis functions as explained in Section 2.2.

The optimization has to be performed in a 4-dimensional parameter space; therefore, willing to adopt a full level factorial design, we have to construct snapshots only for the 2-level case and this number would grow exponentially increasing the levels. Taking into account the error trends of the surrogate models built in the two previous test cases, a reduced initial snapshot set can be used in this practical case. A 2-level fractional factorial design , adding the central point, is adopted instead of a full level and the snapshot set is composed only of 9 snapshots. In Table 3, a description of the design parameter combinations used to generate the snapshots is reported. In this table, the highest and lowest values of the variables are represented with and , respectively. The real values of the high and low levels of the design parameters are determined by the optimization constraints and are for , , and and for .

Once the snapshots are calculated, the remaining function evaluations required by the optimization algorithm are obtained using the PODI surrogate model.

The optimization algorithm used in this application is the SOGA (single-objective genetic algorithm) implemented in the JEGA library of the open source software Dakota. SOGA is a classical single-objective genetic algorithm that performs optimizations of a single cost function. Obviously, the use of a genetic algorithm is not mandatory: the PODI surrogate model can be linked to any other optimizer.

In this problem, the design variables are represented in floating point, with a random initialization and control to avoid duplications. The number of individuals composing the initial population is 50. A shuffle random crossover type is set with a rate of 0.8. With this particular crossover, the parent chromosome sequences, once selected, are randomly shuffled and then the single-point crossover is performed. This operation is useful to eliminate the positional bias associated with the length of each chromosome. A random mutation rate of 0.08 is imposed. This kind of mutation corresponds to a random selection of an individual and a random selection of a design variable at which a random valid value is assigned. After 7 generations, the genetic algorithm is able to identify an optimal solution calculated using the surrogate model. The optimal drag coefficient is 0.3013, starting from a base configuration (all parameters at 0 level) of 0.3111, with an improvement of the 1.8%. The error with respect to the CFD solution is 1.36% for the optimal point.

The adoption of a PODI surrogate model dramatically reduces the computing time of the optimization: instead of ~240 h without the adoption of the surrogate model, ~90 h are necessary with PODI, using 2 Intel Xeon E5440 Quad core processors.

Figure 20 shows considerations on the energy associated with the POD modes. The results have a direct analogy with the 2 previous test cases and in particular with the backward facing step problem. Considering, for example, the -component of the velocity, the first POD mode is able to represent the greater amount (99.3%) of the energy and its qualitative appearance is comparable to the reconstructed field. The other POD modes add information on the zones of the field with minimal energy, therefore mostly on the vehicle wake. All the POD modes are used for the surrogate model evaluation, as shown in the previous test cases, since this addition is not computationally demanding.

**(a)**-Component of the velocity**(b) First POD mode**

**(c) Second POD mode**

**(d) Third POD mode**

**(e) Ninth POD mode**

In order to further compare the results of the proper orthogonal reconstruction for the optimal point configuration, a section in the wake of the vehicle is defined as shown in Figure 21. The velocity and pressure profiles on this section, along the centerline of the vehicle, are plotted in Figure 22. The maximum relative errors, calculated as in Section 3.1.2, are 0.73% for the velocity and 3.14% for the pressure.

**(a) Velocity profile**

**(b) Pressure profile**

###### 3.3.3. POD Coefficient Calculation

To test the surrogate model, 14 random points in the parameter space are selected. These points belong to the first generation of individuals created by the optimization algorithm. In order to have a visualization of the points, a slice in the 4-dimensional parameter space is made and the point positions in the plane can be seen in Figure 23. A comparison between some of the different methods to build the response surface for the POD coefficients is shown in Figure 24. The comparison is made between four kinds of interpolation: RBF using Gaussian basis, RBF with Gaussian basis, and relaxation of the interpolation conditions using only 5 nodes; RBF with Gaussian basis and polynomial term; multiquadric basis and polynomial term. No regression was used because in the preliminary test cases the errors of these types of reconstruction were much higher than using a radial basis function interpolation.

The relative error is not calculated considering the reconstructed fields but using the drag coefficient , that is, the objective function of the optimization problem and the CFD simulation, is considered as reference solution: Analyzing Figure 24, it can be seen that the errors increase using RBF with Gaussian bases without any modification. The relaxation of the interpolation conditions and the addition of a polynomial term, used to improve the conditioning of matrix , give a reduction of the percent error. Both the multiquadric and the Gaussian bases are able to generate a response surface that foresees accurately the values of the PODI reconstruction coefficients . The relaxation of the interpolation conditions gives good results but the initial choice of the number and node position is an additional unknown which has to be handled during the building process of the surrogate model. On the other hand, the addition of a polynomial term is easier to implement and therefore for the construction of the surrogate model linked to the optimizer, a multiquadric basis with no relaxation and addition of the polynomial term has been adopted.

#### 4. Conclusions

Three applications of the proper orthogonal decomposition with interpolation (PODI) for the construction of surrogate models were presented. In the first case, a PODI/ROM was built for a backward facing step problem varying the slope of the step. The technique provided acceptable results showing its capability to substitute CFD simulations once the initial set of snapshots was obtained.

The second problem was the construction of an optimal PODI/ROM in order to analyze the subsonic steady flow field around a NACA 0012 airfoil. The parameter space was two-dimensional and was generated varying the Mach number of the undisturbed flow and the angle of attack of the airfoil. Four surrogate models were built and compared using different snapshot numbers and different response surfaces for the interpolation of the PODI coefficients.

The third case was a practical case and the surrogate model was linked to a single-objective genetic algorithm. Four shape parameters were chosen as design variables and the drag coefficient of a full vehicle was the cost function to minimize. The function evaluations required by the genetic algorithm were performed through the PODI/ROM avoiding other CFD simulations except the ones required to generate the initial set of snapshots. The saved amount of time was remarkable.

In all the three cases considered in this work, the PODI method was able to produce a consistent low error surrogate model that can be used for fast evaluations of the flow fields of interest using reduced initial sets. The number and position of the snapshots are crucial features that affect both the computational effort in building the surrogate model and its accuracy with respect to the full model solution. The optimal position and number of the snapshot set can be a useful subject of future investigations, together with a rational choice of the particular type of response function and a methodology to compute a reliable a priori estimate of the ROM accuracy.

#### Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.