Abstract

For complicated aerodynamic design problems, the efficient global optimization (EGO) method suffered from an inaccurate portrayal of the design space, resulting in poor global convergence and efficiency. To this end, a Kriging-based global optimization method, named the Kriging-based space exploration (KSE) method, was proposed in this paper. It selected multiple promising local minima and classified them into partially and fully explored minima according to the fitting quality of the surrogate model. The partially explored minima were then exploited further. During the local search, an enhanced trust-region method was adopted for deep exploitation. By combining local and global searches, the proposed method improved the fitting quality of the surrogate model and the optimization efficiency. The KSE was compared to other surrogate-based global optimization methods on 12 bound-constrained test functions with 2 to 16 design variables and 2 aerodynamic optimization problems with 24 to 77 design variables. The results indicated that the KSE generally took fewer function evaluations to find the global optima or reach the target value on most test problems, exhibiting better efficiency and robustness.

1. Introduction

In aerodynamic shape design, computational fluid dynamics (CFD) solvers and optimization algorithms were essential. Optimization algorithms have been divided into two categories: gradient-free and gradient-based methods. Generally, gradient-based methods required many gradient calculations and inherently converged to local minima, which made them unsuitable for complex multimodal problems [1]. Gradient-free methods, which did not need gradient information, could be easily incorporated into existing frameworks and have found many engineering applications [2, 3]. They could be divided into several groups: biology-based, physics-based, social-based, music-based, chemical-based, sport-based, swarm-based, and hybrid methods that combined these [4, 5]. Since the optimization field was expansive, the suitable gradient-free method was problem dependent. However, gradient-free methods required massive numbers of simulations under high-fidelity design requirements, leading to an enormous computational burden. Therefore, the efficient global optimization (EGO) method [6] greatly advanced the research and development of the class of global optimization methods called surrogate-based optimization (SBO) [7-12], which had the potential to uncover unconventional aircraft configurations in aerodynamic design. For multimodal problems, the accuracy of the surrogate model largely depended on its capability to capture the nonlinearity of the problem; ignoring this would result in suboptimal or infeasible solutions [13-16]. The EGO method soon acquired a reputation for its flexibility in imitating complicated functions, for it provided a random error estimate between the true and surrogate functions to quantitatively measure the uncertainty in prediction [3, 17].

The most attractive characteristic of the EGO method was that it could meet two requirements: the exploitation of local minima (local search ability) and the exploration of uncertain areas (global search ability). The balance of global and local search ability relied largely on the predicted standard error. However, numerical experiments showed that the EGO did not perform very well on multimodal problems, and it also suffered from the curse of dimensionality on problems with more than 30 design variables [18-20]. Thus, it would take a considerable amount of computational cost to improve the surrogate model accuracy [18] for complicated aerodynamic design problems with high multimodality and dimensionality [21]. Therefore, enhancing and balancing these two abilities was essential to improve the optimization efficiency [22]. Various extensions and modifications of EGO have been developed and applied to engineering design problems. Sun et al. [23] proposed two different multiobjective EGO algorithms, where the infill criterion formed by the “Kriging believer” and “multiple good local optima” strategies was adopted to compromise between exploitation and exploration. By extending the generalized EGO algorithm and adopting a constant liar strategy, Li et al. [24] used a Kriging-based unconstrained global optimization method to obtain multiple infill sample points. Xing et al. [25] proposed a P-EI method, combined with an effective design domain simplification technique, to search multiple peaks of the expected improvement (EI) function. Bouhlel et al. [26] proposed a regional extreme criterion combined with the partial least squares method (the KPLS(+K) model) to deal with high-dimensional problems.

Besides, there were many meaningful works to enhance and balance the global and local searches for surrogate-based global optimization. Wang and Simpson [27] applied the fuzzy c-means clustering method to many cheap sample points to reduce the design space. A stochastic RBF method for the global optimization of expensive functions was proposed by Regis and Shoemaker [28], which improved the Gutmann-RBF method by varying the size of the subdomain in different iterations [29]. Villanueva et al. [30] used the k-means clustering method to partition the initial design space into different subregions and explored them separately by multiagent systems. Long et al. [31] combined a set of intelligent space exploration strategies with TR-ARSM for global optimization. In addition, the multistart approach has also proved to be an efficient global optimization strategy. Regis and Shoemaker [18] introduced the A QUAsi-multistart Response Surface (AQUARS) framework, which located the promising regions of the objective function by performing local searches around the local minima of the surrogate model and bounced between the local minima to obtain the global optima. Dong et al. [32] combined the multistart approach with a space reduction strategy, introducing the multistart space reduction (MSSR) method, which reduced the space size automatically based on the current best sample point. Eriksson et al. [33] proposed the TuRBO algorithm, which applied multiple local surrogate models to approximate the objective function and used an implicit bandit method to perform a principled global allocation of samples across these models. Bartoli et al. [34] proposed a superefficient global optimization coupled with a mixture of experts (SEGOMOE) method, where an enrichment strategy based on a mixture of experts and adaptive surrogate models was applied to tackle multimodal problems with complex constraints. Diouane et al. [35] proposed a trust-region-like EGO method (TREGO), which alternated between traditional EGO steps and local steps within a trust region. To deal with constrained black-box problems, some studies adopted different infill criteria to identify the feasible regions during the optimization process [36-38], and some integrated surrogate models with evolutionary algorithms to handle expensive constraints [39-41], which could dramatically improve the optimization efficiency and convergence on constrained optimization problems.

These studies indicated that the design space reduction approach was an efficient optimization strategy for multimodal and computationally expensive black-box problems. However, two main issues significantly affected the optimization stability, convergence, and efficiency: (a) The local minima of the surrogate model could change significantly as more sample points were evaluated, and the new subspaces were generally centered at a single best point. Thus, it was difficult to select suitable local minima to construct promising subspaces, and once the search was trapped in a deceptive region, it would cost plenty of computational effort to escape and explore other possibilities [32]. (b) Some algorithms tended to search for new sample points around well-explored local minima, wasting computing resources and ignoring potential regions where better sample points might exist.

The Kriging-based space exploration method (KSE) was proposed in this paper to address these problems by using a multiple subregion local search strategy and an enhanced trust-region method. It combined global search over the entire space with local search in the subspaces to balance and enhance global exploration and local exploitation, thereby addressing problem (a). During the local searches, not all local minima needed to be further exploited: the local minima were classified into fully and partially explored minima in terms of root mean square error (RMSE), and only the local minima with large RMSE needed further exploitation, during which an enhanced trust-region method was proposed to emphasize the optimization of the partially explored minima, thereby addressing problem (b). Besides, to overcome the limitations of the original EGO, the infill criterion was switched to the minima prediction (MP) criterion [6] for further exploitation when all local minima were fully explored and the improvement of the EI criterion fell below a preset threshold.

The proposed KSE method was compared to alternative global optimization methods on 12 test functions with 2 to 16 design variables and 2 aerodynamic design problems with 24 to 77 design variables. The numerical results indicated that the KSE method generally used fewer function evaluations to find the global minima on most test problems compared to alternative methods. Moreover, it could use fewer function evaluations to reach a reasonably good solution to aerodynamic design problems compared to alternatives. Hence, the KSE was very promising for the global optimization of engineering design.

The rest of the paper was organized as follows: part 2 introduced the background, while part 3 focused on the framework of the proposed method. Parts 4 and 5 described the computational experiments, their results, and the related discussion. Finally, the last part provided a summary and the conclusions.

2. Methodology

During the last decade, the EGO method [6] has become one of the most popular methods for global optimization. The EGO method was a sequential planning strategy that used a Gaussian process-based surrogate model, e.g., Kriging, and a variance-based merit function or criterion to generate candidate solutions. Therefore, the theories of EGO were briefly depicted in this section.

2.1. The Kriging Model

Kriging was first proposed in the field of geostatistics by Krige [42] and Matheron [43]. After Sacks et al. [44] applied it to computer-based engineering design, the Kriging model became popular in engineering design and optimization.

In the Kriging model, the values of the black-box function were assumed to be the outcomes of a stochastic process. In particular, the function value $y(x)$ was assumed to be a realization of a random variable that was normally distributed with mean $\mu$ and variance $\sigma^2$. For any two points $x^{(i)}$ and $x^{(j)}$, the correlation between $y(x^{(i)})$ and $y(x^{(j)})$ could be modeled as

$$\mathrm{Corr}\big[y(x^{(i)}), y(x^{(j)})\big] = \exp\Big(-\sum_{l=1}^{d} \theta_l \big|x_l^{(i)} - x_l^{(j)}\big|^{p_l}\Big), \quad (1)$$

where $\theta_l \ge 0$ and $p_l \in [1, 2]$ were hyperparameters of the model, and they provided more degrees of freedom for the predictor to imitate the behavior of multimodal functions. Let $X = \{x^{(1)}, \dots, x^{(n)}\}$ be the sample points and $\mathbf{y} = (y^{(1)}, \dots, y^{(n)})^{T}$ be the vector of function values. Fitting the surrogate model through the samples was equivalent to finding the maximum likelihood estimates (MLEs) of the parameters $(\mu, \sigma^2, \theta_l, p_l)$. Then, the Kriging predictor at a new point $x$ was presented as

$$\hat{y}(x) = \hat{\mu} + \mathbf{r}^{T}\mathbf{R}^{-1}(\mathbf{y} - \mathbf{1}\hat{\mu}), \quad (2)$$

where $\mathbf{R}$ was the $n \times n$ correlation matrix of the sample points and $\mathbf{r}$ was the vector of correlations between $x$ and the sample points. The uncertainty at $x$ could be given by the mean squared error (MSE):

$$s^2(x) = \hat{\sigma}^2\Big[1 - \mathbf{r}^{T}\mathbf{R}^{-1}\mathbf{r} + \frac{(1 - \mathbf{1}^{T}\mathbf{R}^{-1}\mathbf{r})^2}{\mathbf{1}^{T}\mathbf{R}^{-1}\mathbf{1}}\Big], \quad (3)$$

where $\hat{\mu} = (\mathbf{1}^{T}\mathbf{R}^{-1}\mathbf{y})/(\mathbf{1}^{T}\mathbf{R}^{-1}\mathbf{1})$ and $\hat{\sigma}^2 = (\mathbf{y} - \mathbf{1}\hat{\mu})^{T}\mathbf{R}^{-1}(\mathbf{y} - \mathbf{1}\hat{\mu})/n$. The first term of (3) represented the uncertainty due to the correlation of the sample points, and the second term described the uncertainty of the estimated parameter $\hat{\mu}$. For more details, Jones [21] gave a taxonomy of surrogate-based optimization methods.
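As an illustration, the ordinary Kriging predictor and its MSE can be sketched in a few lines of NumPy. This is a minimal didactic implementation with fixed hyperparameters (in practice, they are estimated by maximizing the likelihood, and ill-conditioning of the correlation matrix is handled more carefully); the function names are ours, not from the paper.

```python
import numpy as np

def kriging_fit(X, y, theta, p=2.0):
    """Fit an ordinary Kriging model with FIXED hyperparameters theta, p
    (a sketch: real codes estimate theta, p by maximum likelihood)."""
    n = X.shape[0]
    # Correlation matrix R_ij = exp(-sum_l theta_l |x_il - x_jl|^p)
    diff = np.abs(X[:, None, :] - X[None, :, :]) ** p
    R = np.exp(-np.tensordot(diff, theta, axes=([2], [0])))
    R += 1e-10 * np.eye(n)                       # tiny nugget for stability
    Rinv = np.linalg.inv(R)
    ones = np.ones(n)
    mu = (ones @ Rinv @ y) / (ones @ Rinv @ ones)          # MLE of the mean
    sigma2 = (y - mu) @ Rinv @ (y - mu) / n                # MLE of the variance
    return dict(X=X, y=y, theta=np.asarray(theta), p=p,
                Rinv=Rinv, mu=mu, sigma2=sigma2)

def kriging_predict(model, x):
    """Return the Kriging prediction y_hat(x) and mean squared error s^2(x)."""
    X, y = model["X"], model["y"]
    Rinv, mu, sigma2 = model["Rinv"], model["mu"], model["sigma2"]
    r = np.exp(-(np.abs(X - x) ** model["p"]) @ model["theta"])
    ones = np.ones(len(y))
    y_hat = mu + r @ Rinv @ (y - mu * ones)
    mse = sigma2 * (1.0 - r @ Rinv @ r
                    + (1.0 - ones @ Rinv @ r) ** 2 / (ones @ Rinv @ ones))
    return float(y_hat), max(float(mse), 0.0)
```

Note how the predictor interpolates the sample points (the MSE vanishes there) and the MSE grows away from them, which is exactly what the EI criterion of the next section exploits.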

2.2. The EGO Method

The EGO method belonged to the so-called “two-stage” methods: the first stage consisted of fitting the surrogate model and estimating the associated hyperparameters; the second stage consisted of using this surrogate model instead of the true function to search for promising sample points. New sample points were found by maximizing the expected improvement in (4):

$$\mathrm{EI}(x) = \big(f_{\min} - \hat{y}(x)\big)\,\Phi(z) + s(x)\,\phi(z), \quad z = \frac{f_{\min} - \hat{y}(x)}{s(x)}. \quad (4)$$

Here, $f_{\min}$ was the current best objective function value, and $\Phi$ and $\phi$ were the cumulative distribution function and the probability density function of the standard normal distribution, respectively.
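The expected improvement is cheap to evaluate once the Kriging prediction and standard error at a point are available. A minimal sketch using only the standard library (names are illustrative):

```python
import math

def expected_improvement(y_hat, s, f_min):
    """EI = (f_min - y_hat) * Phi(z) + s * phi(z), with z = (f_min - y_hat)/s.
    y_hat, s: Kriging prediction and standard error; f_min: current best value."""
    if s <= 0.0:                       # no uncertainty -> no expected improvement
        return 0.0
    z = (f_min - y_hat) / s
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # std normal CDF
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # std normal PDF
    return (f_min - y_hat) * Phi + s * phi
```

The first term rewards points predicted to improve on the current best (exploitation); the second rewards points with large standard error (exploration), which is the balance discussed above.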

For an inequality constraint function $g(x) \le 0$, the constraint function was also modeled by a Gaussian process, and this model could be used to calculate the probability that the constraint was met. The “probability of feasibility” for each constraint was calculated as in (5):

$$P\big[G(x) \le 0\big] = \Phi\Big(\frac{0 - \hat{g}(x)}{s_g(x)}\Big), \quad (5)$$

where $G(x)$ was a normally distributed random variable with mean $\hat{g}(x)$ and variance $s_g^2(x)$, and (5) measured the feasibility of new points. Provided that the constraint and objective functions were all independent models, Sasena et al. [45] proposed the constrained expected improvement (CEI) method to handle constraints by multiplying the expected improvement of the objective function by the probabilities of feasibility:

$$\mathrm{CEI}(x) = \mathrm{EI}(x) \prod_{i=1}^{m} P\big[G_i(x) \le 0\big]. \quad (6)$$
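The probability of feasibility and the CEI product can be sketched in the same style (names are illustrative; the mean and standard error of each constraint come from that constraint's own Gaussian process model):

```python
import math

def prob_feasibility(g_hat, s_g):
    """P[g(x) <= 0] for a Gaussian-process-modeled constraint g(x) <= 0,
    given the constraint model's prediction g_hat and standard error s_g."""
    if s_g <= 0.0:
        return 1.0 if g_hat <= 0.0 else 0.0
    return 0.5 * (1.0 + math.erf(-g_hat / (s_g * math.sqrt(2.0))))

def constrained_ei(ei_value, g_hats, s_gs):
    """CEI = EI times the product of per-constraint feasibility probabilities,
    assuming independent objective and constraint models."""
    cei = ei_value
    for g_hat, s_g in zip(g_hats, s_gs):
        cei *= prob_feasibility(g_hat, s_g)
    return cei
```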

Locatelli [46] proved that the EGO method generally took plenty of function evaluations to find the global optima. During the EI-based global optimization, the predicted standard error $s(x)$ was usually slightly lower than the true error [47]. Unless the uncertainty of the current region became relatively low, this underestimation would cost many iterations to explore the local spaces, especially for deceptive regions with lower estimated standard deviation, causing a slower convergence rate.

2.3. Trust-Region Method

The trust-region method (TRM) [48] has proven to be an efficient space reduction method for its good performance in optimization [49]. It identified the promising space, where the actual global optima probably existed, according to how well the current surrogate model predicted the improvement of the objective function. In TRM, the current best sample point was chosen as the center of the new space. The trust-region factor $\rho_k$ and trust-region radius were computed through

$$\rho_k = \frac{f(x_{k-1}) - f(x_k)}{\hat{y}(x_{k-1}) - \hat{y}(x_k)}. \quad (7)$$

Here, $f$ and $\hat{y}$ were the true and predicted objective values, respectively. The trust-region factor $\rho_k$ measured the improvement of the objective function and the accuracy of the surrogate model at the same time through $f$ and $\hat{y}$. If $\rho_k > 0$, the current surrogate model was capable of decreasing the true objective value, and the trust radius was updated according to (8) [49] with its typical threshold and scaling values; the upper bound of the trust radius was usually set to 0.5, and $\|\cdot\|$ indicated the Euclidean norm. Otherwise, the current surrogate model could not improve the objective value because of its bad fitting quality; in this case, a small minimum trust radius was used, and TRM was forced to exploit a small neighborhood around the present best point. The TRM allowed the optimization to focus on a small region around the current point with dynamic bounds, and it was very efficient in local search, avoiding the need for prior knowledge in constructing the local space.
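A single trust-region radius update might be sketched as follows. The threshold and scaling constants here (0.25, 0.75, 0.5, 2.0) are common textbook defaults, not values taken from [48, 49], and are flagged as such in the code:

```python
def trust_region_update(f_prev, f_new, yhat_prev, yhat_new, radius,
                        radius_max=0.5, radius_min=1e-3,
                        shrink=0.5, expand=2.0, low=0.25, high=0.75):
    """One trust-region radius update.  The thresholds and scaling factors
    (low, high, shrink, expand) are common textbook choices, NOT values
    stated in the paper."""
    pred_red = yhat_prev - yhat_new          # improvement predicted by surrogate
    if abs(pred_red) < 1e-12:
        return radius_min                    # surrogate predicts no improvement
    rho = (f_prev - f_new) / pred_red        # trust-region factor, cf. Eq. (7)
    if rho <= 0.0:
        return radius_min                    # model misleading: minimum radius
    if rho < low:
        return max(shrink * radius, radius_min)   # poor agreement: shrink
    if rho > high:
        return min(expand * radius, radius_max)   # good agreement: expand
    return radius                            # acceptable agreement: keep radius
```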

3. Proposed Method

3.1. Classification of Local Minima

From the discussion above, the EGO method attempted to explore the entire design space, which required numerous computational resources. Once it focused on deceptive local minima, it was hard for it to escape and explore other potential regions in time. An unbalanced treatment of exploitation and exploration usually caused a poor depiction of the design space and degraded the convergence and efficiency of the optimization.

To overcome these issues, this paper proposed the Kriging-based space exploration method (KSE). A multiple subregion local search strategy was used to explore the local regions in the entire design space, improving the depiction of the design space and the surrogate model accuracy. One of the main aspects of this strategy was to select reasonable potential regions for deep exploitation. It classified the local minima of the surrogate model into fully and partially explored local minima, from which the partially explored minima were selected for further exploitation. The classification relied on the depiction of the local regions, quantified by the root mean square error (RMSE) within each local region. Poor fitting quality of the surrogate model led to substantial fluctuations in the design space, producing misleading results for the algorithm, whereas a good depiction of the entire design space, which relied on depicting the vicinities of the local minima, improved the fitting quality of the surrogate model. Thus, the main purpose of the multiple subregion local search strategy was to properly depict the potential local regions and provide efficient information for the global optimization.

Let $\Omega \subset \mathbb{R}^d$ be the design space and $f$ be any function from $\Omega$ to $\mathbb{R}$ whose local regions were nearly linear. Suppose $x^{*}$ was a local minimum of the objective function over $\Omega$; then, there existed a radius $r > 0$ such that $f(x^{*}) \le f(x)$ for all $x \in B(x^{*}, r) \cap \Omega$. Similarly, $\hat{y}(\hat{x}^{*}) \le \hat{y}(x)$ for the surrogate model $\hat{y}$ with local minimum $\hat{x}^{*}$ over the local space $B(\hat{x}^{*}, \hat{r}) \cap \Omega$. For a well-performing surrogate model, $\hat{x}^{*}$ would be consistent with $x^{*}$, and the trend of $\hat{y}$ in the surrogate model would be similar to that of $f$ in the objective function.

Otherwise, more sample points were needed to depict the neighborhood properly, as in Figure 1. Under this circumstance, the root mean square error (RMSE) of the sample points within a local region was used to decide whether the corresponding local minimum was fully explored or not, and the partially explored local minima would need further exploration. During these processes, the exploitation of partially explored local minima of the surrogate model improved the fitting quality of the surrogate model in the local space, and the exploitation of partially explored local minima of the objective function searched for better points. A local minimum of the surrogate model did not always correspond to a local minimum of the objective function. However, deep exploitation in the vicinity of a local minimum of the surrogate model could eliminate such “fake local minima” and obtain the true local minima of the objective function. Since local minima with better performance were more likely to be close to the global optima, further exploitation of the local minima of the objective function moved the algorithm toward the global optima more efficiently. In combination, the global exploration for the target value was promoted.
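The RMSE-based classification described above can be sketched as follows, assuming a `loo_predict` helper (a hypothetical interface, not from the paper) that returns the surrogate's leave-one-out cross-validation prediction at a sample point:

```python
import math

def classify_local_minima(clusters, loo_predict, rmse_threshold):
    """Split cluster centers into fully vs. partially explored local minima.

    clusters:    list of (center, points), with points = [(x, f_true), ...]
    loo_predict: callable (x, points) -> surrogate prediction at x when the
                 model is refit without (x, f_true) -- an assumed interface.
    """
    fully, partially = [], []
    for center, points in clusters:
        # cross-validated RMSE of the surrogate inside this local region
        errs = [(loo_predict(x, points) - f_true) ** 2 for x, f_true in points]
        rmse = math.sqrt(sum(errs) / len(errs))
        (fully if rmse <= rmse_threshold else partially).append(center)
    return fully, partially
```

Only the centers returned in `partially` are passed on to the local search stage; the `fully` ones are taken as trustworthy local minima of the objective function.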

The above conditions could not guarantee that the surrogate model achieved a good approximation to the objective function in some neighborhoods of local minima. However, if $f$ had a relatively simple surface in the neighborhood of $\hat{x}^{*}$ (e.g., if it was nearly linear), then those conditions increased the possibility that $f$ was well approximated by the surrogate model in that local region. Thus, the fully explored local minima of the surrogate model were more likely to be local minima of the objective function, as shown in Figure 1.

3.2. The Enhanced Trust-Region Method

Although the traditional trust-region method (TRM) has shown advantages in improving optimization efficiency, it might not search the local regions sufficiently, since the radius of the new design space depended linearly on the previous best point, which could lead to premature convergence of the optimization. To avoid this shortcoming, this paper proposed an enhanced trust-region method (ETRM) combining an independent/dependent strategy with a stochastic perturbation strategy, as shown below.

Here, the new local space was defined by its center and bounds, which were computed by different strategies, as shown in (9). The space radius of the ETRM changed in terms of the actual improvement of the EI criterion, while the original TRM adopted the predicted improvement based on the surrogate model, as shown in (8). The trust factor $\rho$ was computed through (10) with typical threshold and scaling values. If $\rho$ exceeded the upper threshold, the EI criterion was capable of decreasing the objective function value, and the space radius was expanded according to (11); if $\rho$ fell between the two thresholds, the space radius remained unchanged; and if $\rho$ fell below the lower threshold, the surrogate model was unable to obtain better points in the current space, and a conservative space radius was selected. Equation (12) used the independent strategy, based on the current best point $x_{\mathrm{best}}$ and the entire design space, to construct a uniform box-shaped trust region, where $x_{\mathrm{lb}}$ and $x_{\mathrm{ub}}$ were the lower and upper bounds of the entire design space, respectively, and $\|\cdot\|$ was the Euclidean norm. The purpose of this strategy was to reduce the influence of the previous sample points and make the optimization more independent, so as to find diverse sample points in each iteration. However, the dependent information from the previous best point could retain efficient dimension information that helped the algorithm move toward the global optima quickly. Thus, the dependent information was used to improve the dimension setting of the trust region, indicated by the vector direction from the previous best point to the current best point, as shown in (13). Besides, to fully explore the local region, a stochastic strategy for the trust region was presented in (14), where $p_t$ was the probability of perturbing any dimension of the design space and $T_{\max}$ was the maximum number of iterations.
For iteration $t$, the number of dimensions to be perturbed was a random variable following a binomial distribution with mean $p_t d$, where $d$ was the number of dimensions of the design space. In (14), $p_t$ was given by a strictly decreasing function and decreased as the optimization carried on. The set of indices of the perturbed dimensions was decided by $p_t$ and a uniform random number drawn from $[0, 1]$ for each dimension, which differed in each iteration to maintain the diversity of the design space. The trust region was then computed from the trust radius and the current best point $x_{\mathrm{best}}$. The use of a stochastic strategy for constructing the trust region was expected to be helpful for problems with many local minima and design variables.
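The stochastic selection of perturbed dimensions might be sketched as below. The linear decay schedule for the perturbation probability is an assumption for illustration; the text only states that it is strictly decreasing in the iteration count:

```python
import random

def perturbed_dimensions(t, t_max, n_dims, p0=0.5, rng=None):
    """Pick the set of dimensions to perturb at iteration t.

    p_t decreases as the optimization proceeds (here linearly, an ASSUMED
    schedule), so early iterations perturb many dimensions and late ones few.
    Each dimension j is perturbed when a uniform draw falls below p_t, so the
    number of perturbed dimensions is Binomial(n_dims, p_t) with mean p_t * n_dims.
    """
    rng = rng if rng is not None else random.Random()
    p_t = p0 * (1.0 - t / float(t_max))        # strictly decreasing in t
    chosen = [j for j in range(n_dims) if rng.random() < p_t]
    if not chosen:                             # always perturb at least one dim
        chosen = [rng.randrange(n_dims)]
    return chosen
```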

At the initial stage of optimization, the distribution of the best sample points could be very sparse, and their positions could change significantly between iterations, since the surrogate model was not yet very accurate. Under this condition, the dependent strategy could make use of useful dimension information to accelerate convergence toward better points, and the stochastic strategy helped the algorithm search for multiple diverse sample points by perturbing a large fraction of the dimensions. As the optimization progressed, the distribution of the best points became more centralized and the distances between them smaller. By perturbing a progressively smaller fraction of the dimensions, the vicinity of the current best point was less likely to be disturbed. At this stage, the trust region depended largely on the independent strategy, which ensured that the neighborhood of the current best point remained the focus throughout the optimization.

The ETRM measured the actual improvement of the objective function and combined an independent/dependent strategy with a stochastic perturbation strategy to construct promising local regions. These strategies handled different situations across the optimization process and, in combination, preserved the diversity of the design space and prevented premature convergence, improving the efficiency and convergence of the optimization.

3.3. Detail of Proposed Method

The proposed KSE method consisted of four parts: (1) search for effective sample points and select local minima; (2) classify the local minima into fully and partially explored local minima; (3) exploit the partially explored local minima; and (4) combine global exploration and local exploitation. The flowchart of the method is illustrated in Figure 2, and the detailed steps were as follows.

Step 1 (initial design). In the beginning, the Latin hypercube sampling (LHS) method [50] was adopted to generate the initial sample points.
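For illustration, a minimal Latin hypercube sampler in pure Python (libraries such as SciPy's `scipy.stats.qmc.LatinHypercube` provide production-quality versions):

```python
import random

def latin_hypercube(n, d, rng=None):
    """Generate n Latin hypercube samples in [0, 1]^d: each dimension is
    split into n equal strata, and each stratum contains exactly one sample."""
    rng = rng if rng is not None else random.Random()
    samples = [[0.0] * d for _ in range(n)]
    for j in range(d):
        # one random point inside each stratum, then shuffled across samples
        col = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(col)
        for i in range(n):
            samples[i][j] = col[i]
    return samples
```

The samples can then be rescaled affinely from the unit cube to the actual design-variable bounds.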

Step 2 (global search by EI criterion). The EGO method based on DACE, with a Gaussian kernel function, was used to explore the entire design space, during which the particle swarm optimization (PSO) algorithm was adopted as the optimizer. For problems with multiple constraints, the constrained expected improvement (CEI) criterion [45] was adopted to handle constraints in KSE.

Step 3 (identify the local minima). The sample points with good performance were selected as local minima, together with their corresponding sample clusters.

Step 4 (select the partially explored local minima). Classify the local minima into fully and partially explored local minima.

Step 5 (multiple subregion local search). Deep exploitation of all partially explored local minima, based on the ETRM, was carried out to improve the surrogate accuracy. After the current partially explored local minima had been fully exploited, the algorithm returned to Step 2, and this process repeated until no partially explored local minima could be identified.

Step 6 (switch criterion function). At the last stage of optimization, the MP criterion was adopted instead of the EI criterion based on the improvement of the EI criterion. The objective function was set as in (15), and the function transition factor measured the improvement of the EI criterion, as shown in (16). Here, $f_{\min}$ was the objective value of the current best sample point, and $\Delta \mathrm{EI}$ was the improvement of the EI criterion. In this paper, the threshold was set to 0.01, which meant that if $\Delta \mathrm{EI}$ was less than 1% of $f_{\min}$, the objective function switched to the MP criterion.
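The switching logic of Step 6 amounts to a one-line test (a sketch; names are illustrative):

```python
def select_criterion(f_min, ei_improvement, eta=0.01):
    """Return the infill criterion for the next iteration: switch from EI to
    MP (minimizing the surrogate prediction) once the EI criterion's best
    improvement drops below eta (here 1%) of the current best objective value."""
    if abs(ei_improvement) < eta * abs(f_min):
        return "MP"
    return "EI"
```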

The KSE for Global Optimization
Input: (1) original design space Ω;
(2) global exploration iteration number N_g;
(3) cluster number N_c, cluster radius r_c;
(4) maximum iteration number T_max;
(5) f_obj: EI function;
(6) threshold of RMSE: ε_RMSE
Output: the best solution as global optima x*;
Begin
Choose an initial set of sample points X_0 by the LHS sampling method.
Evaluate X_0 to obtain the corresponding responses Y_0, and store them in the database D.
While t ≤ T_max
for i = 1 : N_g
Employ the EGO algorithm with f_obj;
Store the optimized solution x_t and its function response value f(x_t) in the database D;
t = t + 1;
end
Select N_c clusters from the database D, and identify the cluster centers c_j; other sample points within the distance r_c of c_j would be classified into the corresponding cluster C_j. The distance between sample points and cluster centers was computed by the Euclidean distance.
for j = 1 : N_c
Sort all sample points of C_j in terms of their function response values in ascending order;
Use cross-validation to obtain the predicted values of all sample points;
Compute RMSE_j of cluster C_j;
if RMSE_j ≤ ε_RMSE
Store c_j into the fully explored local minima dataset M_f;
else
Store c_j into the partially explored local minima dataset M_p;
end
end
if M_p ≠ ∅
for each x_p ∈ M_p
Define local space Ω_p centered at x_p by the ETRM algorithm;
Employ the EGO algorithm with f_obj in Ω_p;
Store the optimized solution x_t and its function response value f(x_t) in the database D;
t = t + 1;
end
else
Select the current best sample point value f_min.
if the transition factor fell below its threshold
f_obj = MP criterion;
else
f_obj = EI criterion;
end
end
end
End

4. Illustrative Examples

4.1. Example of the EGO Method

To illustrate the limitations of the traditional EGO method, the Goldstein-Price (GP) function [51] was used as an example. This function was usually evaluated on the domain $[-2, 2]^2$, with 3 local minima, and the global optimum was located at $(0, -1)$ with the value of 3.000. Figure 3 describes the original GP function, while Figure 4 shows the surrogate space constructed by the Kriging model with the initial sample points chosen as in [18] according to the dimension $d$. It demonstrated that the initial surrogate model could not achieve a good approximation of the objective function.
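For reference, the Goldstein-Price function used in this example has the standard form:

```python
def goldstein_price(x, y):
    """Goldstein-Price test function on [-2, 2]^2; global minimum 3 at (0, -1)."""
    a = 1 + (x + y + 1) ** 2 * (19 - 14 * x + 3 * x ** 2
                                - 14 * y + 6 * x * y + 3 * y ** 2)
    b = 30 + (2 * x - 3 * y) ** 2 * (18 - 32 * x + 12 * x ** 2
                                     + 48 * y - 36 * x * y + 27 * y ** 2)
    return a * b
```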

The convergence history of the GP function is shown in Figure 5, where two obvious stagnations occurred. The first stagnation happened in the global exploration process, during which the maximal EI value was very large and the optimizer tended to explore the entire design space and improve the surrogate accuracy, as presented in Figure 6. During the late stages of optimization, the optimizer searched the neighborhood of the global optima. Since the objective values of the sample points in that local space were close to that of the global optima, the EI criterion showed very little improvement and the optimizer tended toward a stochastic search. Thus, the optimization was trapped again.

The sample infilling process is presented in Figure 7, in which the contour is the space constructed by the Kriging model and the white triangles are the infilled sample points. At the beginning of optimization, the fitting quality of the Kriging model was poor, and it took a large number of sample points to explore the entire design space, during which the first stagnation happened, as shown in Figure 7(a). After that, the surrogate accuracy was improved, and the optimizer turned to an efficient searching process during which the objective value decreased rapidly. Figure 7(b) shows that the surrogate model achieved a much better approximation to the objective function, and sample points with good results were obtained. However, at the later stage of the optimization, the first term of the EI function became small, since the difference between the prediction at unknown points and the current best point was very small, and the second term of the EI function dominated the optimization trend. Thus, the optimizer tended toward a stochastic search in unknown regions, during which the EI criterion could not make substantial improvement, and the optimization was trapped again, as shown in Figure 7(c). After 45 sample points were updated, the global optima was found at the 99th iteration. Figure 7(d) presents the final surrogate model.

The process above demonstrated that the EGO method explored the entire design space at the price of considerable computational cost, for it took a large number of sample points to improve the fitting quality of the surrogate model over the entire design space and to escape deceptive regions. Besides, the EI criterion tended to be maximized in regions close to the bounds of the design space and far away from the previous sample points, which made it inefficient for finding the global optima. To overcome these problems, this paper proposed the Kriging-based space exploration method.

4.2. Example of the Proposed Method

To better demonstrate the KSE method, the optimization process on the Goldstein-Price function is shown below. Figures 8 and 9 present the convergence histories of the objective function value and the EI value, respectively. There were four main phases during the optimization process. Phase 1 was the global exploration process based on the EI criterion; however, during this process, the objective value improved little after many iterations owing to the bad fitting quality of the Kriging model. Then, three local minima were found, and the multiple subregion local searches started deep exploitation, during which the objective value decreased considerably, as shown in phase 2. After that, global exploration started again to search for the global optima, demonstrated as phase 3. During this process, the improvement of the EI criterion was very small. To speed up convergence, the objective function was switched to the MP criterion, and the global optima was soon found, shown as phase 4.

The update of sample points is shown in Figure 10. Figure 10(a) presents the global exploration based on the EI criterion. Then, three partially explored local minima were selected for further exploitation by the multiple subregion local search strategy. To perform better exploitation, the ETRM algorithm was adopted to restrict optimization to a small region with dynamic bounds, where the red-dashed rectangles in Figure 10(b) were the initial local design spaces. The exploitation result is presented in Figure 10(c), in which the local minima disappeared and the surrogate model approximated the objective function well. To detect new partially explored local minima, the global search continued and several points were updated. In Figure 10(d), all local minima were fully explored. Then, in terms of the value of the transition factor, the objective function was switched to the MP criterion, and it took only 3 evaluations to locate the global optima. In total, the global optima was found at the 58th iteration.

The KSE method could enhance and balance the exploitation and exploration abilities of the optimization by locating potential local minima and combining multiple objective functions, improving the optimization efficiency and preventing premature convergence.

5. Test Functions

5.1. Experimental Setup

The performance of KSE was studied in terms of global convergence capability, efficiency, and robustness on a broad set of global optimization test problems with 2 to 16 design variables. These included the seven Dixon [51] test functions (Branin, Goldstein-Price, Hartman3, Shekel5, Shekel7, Shekel10, and Hartman6), two test functions (SC and Shubert) with multiple local minima, and three higher-dimensional test functions (Trid10, Sphere10, and F16) [32]. These test problems each had unique characteristics, and in combination, they represented many different situations arising in design problems. The detailed forms of these functions are given in Table 1. The initial LHS samples with size were the same for all optimization algorithms on all problems. Thirty runs were made on each of these benchmark problems.
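As a sketch of how such a shared initial design could be generated (using SciPy's quasi-Monte Carlo module; the actual sample size used in the paper is elided in the text, so the size and domain below are placeholders):

```python
import numpy as np
from scipy.stats import qmc

def lhs_samples(n, bounds, seed=0):
    """Latin hypercube samples scaled to the design-space bounds.

    bounds: list of (lower, upper) pairs, one per design variable.
    Fixing the seed reproduces the same initial design for every
    algorithm, as the experimental setup requires.
    """
    lower, upper = map(np.asarray, zip(*bounds))
    sampler = qmc.LatinHypercube(d=len(bounds), seed=seed)
    unit = sampler.random(n)              # stratified points in [0, 1)^d
    return qmc.scale(unit, lower, upper)  # affine map to the design box

# e.g. 20 points on the Branin domain [-5, 10] x [0, 15]
X = lhs_samples(20, [(-5, 10), (0, 15)])
```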

The experiments for all test cases were performed on an Intel(R) Core(TM) i7-8700 CPU @ 3.20 GHz computer. The obtained statistical results were compared with those of several well-known global optimization methods. One alternative was the efficient global optimization method (EGO) [6] as implemented in MATLAB by Viana [52], which was one of the most widely used surrogate-based global optimization methods, especially for engineering design problems. The model-pursuing sampling (MPS) method [53] was an adaptive surrogate model method producing discriminative samples toward the actual global optima. The mesh adaptive direct search (MADS) [54] algorithm was implemented as NOMADm [55], using a surrogate model to make it suitable for computationally expensive problems. A universal multistart approach called multilevel single linkage (MLSL) [56], with sequential quadratic programming (SQP) as the local optimization solver, was implemented with the MATLAB optimization toolbox [57]. Another alternative was the trust-region-based adaptive response surface method (TR-ARSM) [49], an efficient space reduction method for multimodal problems. Trust-region implementation in Kriging-based optimization with expected improvement with restart strategy (TRIKE-Restart) was a trust-region-like approach in which the EI function was combined with a trust region and a restart strategy [20]. We also compared the proposed method with two multistart methods: A QUAsi-multistart Response Surface (AQUARS) [18], in which the initial points were obtained from the CGRBF method [29], and the multistart space reduction method (MSSR) [32], a combination of the multistart approach and a space reduction strategy. The parameter settings of these methods are summarized in Table 2. The statistical results of AQUARS, TR-ARSM, TRIKE-Restart, and MSSR were collected from the published papers due to the lack of publicly available software implementing these competitor methods.

These methods were compared in terms of the average number of function evaluations (over 30 trials) required to get a solution with a relative error of no more than 1%. If y* was the global minimum value over the domain and y_best was the best value obtained by an algorithm, then the relative error of the algorithm was given by E_r = (y_best - y*)/|y*|.
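A minimal sketch of this stopping test, with symbol names assumed since they are elided in the extracted text (the guard against a zero global minimum is also an assumption, relevant e.g. for Sphere10):

```python
def relative_error(y_best, y_star, eps=1e-12):
    """Relative error of the best value found with respect to the
    known global minimum y_star. The 1% threshold used in the
    comparisons corresponds to stopping once this value <= 0.01."""
    return abs(y_best - y_star) / max(abs(y_star), eps)
```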

5.2. Results and Discussion

Table 3 gives the average number of function evaluations required by each method to reach the target value with [18]. If all 30 trials got within 1% of the global optima, then the number inside the parentheses is the standard error of the mean. The symbol “” indicates that at least one of the trials could not reach the target value, and is the number of trials that did not get within 1% of the global minimum value within 500 function evaluations.

From Table 3, the proposed KSE method performed much better than the EGO, MPS, and MLSL-SQP methods on all test problems. The EGO method generally required more evaluations to reach the target value, especially for problems with high dimensionality and multimodality: its new sample points were more likely to be updated around local minima that were already well explored. The MPS method performed slightly better than EGO on the high-dimensional problems (Trid10 and Sphere10), but it was not efficient for complicated problems with strong multimodality (the Shekel problems and the Shubert function), since it was hard for MPS to detect the promising regions in a design space with multiple local minima. MLSL-SQP was much better than the EGO and MPS methods on 8 test problems, except for the difficult Shekel problems with deep and narrow global optimum basins. However, the local searches of MLSL-SQP tended to converge fully around all local minima even when these local minima were unlikely to lead to better points. These three methods were well known in engineering design, so these results suggested that the KSE method was very promising for engineering optimization problems.

The results also showed that KSE could reach the target value faster than the 4 alternative methods (TR-ARSM, MSSR, AQUARS, and TRIKE-Restart) on 8 test problems (Branin, SC, Hartman3, Shekel5, Shekel7, Trid10, Sphere10, and F16), and it was comparable with the best results attained by the alternative methods on 2 other problems (Goldstein-Price and Shubert). Although the alternatives based on the multistart approach (MSSR and AQUARS) performed better than the KSE on 3 test problems (Hartman6, Shekel10, and Shubert), it could not be concluded that the multistart approach had a decisive advantage over the KSE, since KSE performed much better than MLSL-SQP on all test problems. It might be that these methods put more emphasis on local search. Besides, the different initial LHS methods and surrogate models used in the compared papers could also influence the results. In any case, the KSE was quite robust on the test cases listed in Table 3, showing its stability and efficiency in reaching the target value in all trials on several test problems.

Figure 11 shows the box plots of the number of function evaluations required to get within 1% of the target value for the KSE and several alternatives on all test problems. Many trials of EGO and MPS could not achieve the desired target value on many test problems, making it impossible to create accurate box plots for these methods; hence, they were excluded. The box plots showed that MADS was competitive with the KSE on the Shekel and Shubert problems and that MLSL-SQP was much worse than the KSE on all test problems in terms of the median number of function evaluations required to get within 1% of the global minimum value. The distributions of the number of function evaluations were much less spread out for the KSE than for the other methods on most test problems. Hence, the KSE was generally more robust and efficient than the other methods.

Rather than demonstrating the superiority of the proposed algorithm over other existing algorithms, the objective of this study was to show that the proposed algorithm could be a viable alternative for computationally expensive black-box global optimization, as demonstrated on a wide variety of benchmark test problems.

5.3. Sensitivity Analysis

In this section, the KSE was tested with various cluster numbers and function transition factors. Table 4 shows the results obtained by applying KSE with different hyperparameters, including (1) , (2) , (3) , (4) , (5) , and (6) . Cases (1) to (3) were compared to study the influence of the cluster number. The comparison of the remaining cases together with case (1) was meant to study the influence of the function transition factor; case (6) presents the results without function transition, i.e., the objective function was always the EI criterion. The initial samples with the size of were the same for all cases. Each test problem was repeated 30 times.

From Table 4, case (1) and case (2) achieved very good results on the Branin and Goldstein-Price problems, and case (3) also obtained competitive results on these two test problems, which showed that the proposed method was not very sensitive to the cluster number for problems with low multimodality and dimensionality. With increasing multimodality and dimensionality, a more comprehensive clustering strategy was needed in KSE to identify the promising local minima. Case (2) obtained the best results on the Hartman3, Shekel5, Hartman6, Shekel7, Shekel10, Shubert, and Trid10 test problems and competitive results on the Sphere10 and F16 test problems, while case (3) also obtained competitive results on the Shekel5, SC, Shekel7, Shekel10, and Trid10 test problems.

As for the function transition factor , the KSE did not appear to be sensitive to it given an appropriate cluster number, such as . There was little difference among the results obtained by cases (1), (4), and (5) on all test problems: case (1) obtained the best results on 8 test problems (all except Branin, Hartman3, Sphere10, and F16) and competitive results on the Branin, Hartman3, and Sphere10 test problems. In contrast, case (6) performed worse on most test problems than the cases with a function transition factor , suggesting that it was important to incorporate the function transition factor as well.

Figure 12 shows the box plots of the number of function evaluations required to get within 1% of the target value for all cases. The results obtained by case (1) were better than those of the other cases on 8 of the 12 test problems (all except Branin, Shekel5, Trid10, and Sphere10), and the distribution of function evaluations for case (1) was more concentrated than that of the other cases, although the results of cases (4) and (5) were competitive on all test problems, indicating that the KSE was not very sensitive within the current interval.

In summary, these results suggested that the cluster number and function transition factor were both necessary for good performance on most test problems, and that the KSE was not very sensitive to the function transition factor given a suitable cluster number.

6. Test on Benchmark Problems

This section applied the proposed KSE method to two aerodynamic optimization problems. To validate the proposed algorithm's performance, three other global optimization methods were employed for comparison. One was MLSL coupled with the mesh adaptive direct search (MADS) [54] algorithm, implemented as NOMADm [55], using a surrogate model to make it suitable for computationally expensive problems. Another was the constraint-importance model-pursuing sampling (CiMPS) method, an extension of the MPS method developed for handling expensive constraints by Kazemi et al. [58]. The last method was the EGO method, one of the most well-known global optimization methods for engineering problems. The default or recommended tuning parameters of these methods were used. Ten runs were made on each problem by all methods.

In this section, a high-fidelity CFD solver, CFL3D [59], was adopted to evaluate the aerodynamic characteristics of airfoils and wings. The Navier-Stokes equations, closed with the shear stress transport (SST) two-equation turbulence model, were used to simulate the flow fields of the airfoils and wings. To perform the aerodynamic design optimization, a more powerful workstation equipped with AMD Epyc 7763 64-core @ 2.45 GHz CPUs and 512 GB of RAM was used.

6.1. Case I. Transonic Airfoil Design

Case I was provided by the AIAA aerodynamic design optimization discussion group (ADODG): the drag minimization of the RAE 2822 airfoil in transonic viscous flow. The Mach number was 0.734, and the Reynolds number was 6.5e6. The RAE 2822 airfoil was optimized with a fixed lift coefficient of 0.824. The pitching moment should be no less than -0.092, and the area should not be less than that of the RAE 2822. The problem statement is summarized in Table 5. More details on this case can be found in [60, 61]. The problem could be expressed as a standard nonlinear optimization problem:
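Based on the constraints stated in the text, the nonlinear programming form would read (symbol names assumed; A denotes the airfoil area):

```latex
\begin{aligned}
\min_{\mathbf{x}} \quad & C_d \\
\text{s.t.} \quad & C_l = 0.824, \\
& C_m \ge -0.092, \\
& A \ge A_{\mathrm{RAE\,2822}},
\end{aligned}
```

where the design vector x collects the 24 CST shape parameters described below.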

This was a standard airfoil optimization problem with medium multimodality [62], which could be used to test the proposed method. The baseline shape was the RAE 2822 airfoil. The design variables were the -coordinates of the CST parameterization, with 24 shape parameters in total. The design space is shown in Figure 13, and the computational grid is shown in Figure 14. 100 sample points were generated in the initial design space with the LHS method, and 300 sample points were updated during the optimization process.
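For illustration, a minimal CST evaluation for one airfoil surface is sketched below (class function exponents 0.5 and 1.0 for a round-nose, sharp-trailing-edge airfoil; the split of the 24 parameters into 12 per surface is an assumption, not stated in the text):

```python
import numpy as np
from math import comb

def cst_surface(a, x, n1=0.5, n2=1.0, dz_te=0.0):
    """Class/Shape Transformation (CST) curve for one airfoil surface.

    a     : Bernstein shape coefficients for this surface
            (12 per surface would give 24 design variables in total);
    x     : chordwise stations normalized to [0, 1];
    dz_te : trailing-edge half-thickness offset.
    """
    x = np.asarray(x, dtype=float)
    n = len(a) - 1
    cls = x**n1 * (1.0 - x)**n2                      # class function
    shape = sum(a[i] * comb(n, i) * x**i * (1.0 - x)**(n - i)
                for i in range(n + 1))               # Bernstein shape function
    return cls * shape + x * dz_te
```

With dz_te = 0, the curve is pinned to zero at both the leading and trailing edges, so the optimizer only perturbs the interior shape through the coefficients a.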

In this case, the EGO method tried to explore everywhere in the design space, leading to poor optimization convergence. The CiMPS method was very efficient in the early stage of optimization, but at that stage the surrogate model was not yet stable, and it was easily trapped in local minima, leading to premature convergence on this problem. For the multistart approach, the starting points could easily lie in the neighborhood of previously found local minima, wasting computational cost on re-searching these well-explored local minima. Figure 15 presents the design results of the ten trials of all methods; most of the design results of the KSE method were better than those of the other methods, indicating that the KSE method was more stable than the alternatives. Figure 16 shows that the KSE took only 17% of the iterations of MADS to reach a similar drag coefficient value and converged to the target shape faster. Thus, the KSE method was more efficient and robust.

The best design result of each method was selected for comparison. Table 6 presents the comparison of the different methods, which indicated that all the constraints were satisfied. Under the same limit on objective function evaluations, the computational cost of MADS was the cheapest, but the KSE obtained better design results than the other optimization methods. The drag coefficient of the design result of the KSE was around 110 counts, similar to the result in reference [62]; therefore, it was very likely close to the global optima. Figures 17 and 18 show the comparison of the optimized airfoil shapes and pressure coefficient distributions. The shape trend of all optimized airfoils was the same: the locations of maximum thickness moved backward compared to the baseline, and the radii of the leading edges were reduced. The upper surfaces of the optimized airfoils were very similar, and the differences mainly occurred on the lower surface, which was the main source of local minima. From Figure 18, the suction peaks of all optimized airfoils became much stronger and the shock waves were weakened compared to the baseline. However, the KSE nearly eliminated the shock wave, while the other methods retained an obvious shock wave, leading to worse aerodynamic performance.

6.2. Case II. CRM Wing Optimization

The baseline for Case II was a wing with a blunt trailing edge selected from the CRM wing-body geometry [63], which was representative of a contemporary transonic commercial transport, similar to the Boeing 777. The geometry and specifications were also given by the ADODG; the fuselage and tail were removed from the original CRM, and the root of the remaining wing was moved to the symmetry plane. This baseline geometry is shown in Figure 19. All coordinates were scaled by the mean aerodynamic chord (275.8 in). The resulting reference chord was 1.0, and the reference area was 3.407014. The moment reference point was located at .

The problem statement is shown in Table 7. The objective was to minimize the drag, subject to a lift constraint () and a pitching moment constraint (). The shape design variables were the -coordinate movements of 72 control points on the FFD volume and 5 twist angles, shown in Figure 20. The control points at the trailing edge were constrained to keep the trailing edge unchanged, and the leading-edge control points at the wing root were also constrained to maintain a constant incidence for the root section. There were 500 thickness constraints imposed on a 25 chordwise by 20 spanwise grid covering the full span, each requiring the thickness to be no smaller than 25% of the baseline thickness. Finally, the internal volume was constrained to be greater than or equal to the baseline volume. The optimization problem was shown below.
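Since the lift and pitching moment constraint values are elided in the extracted text, the problem can be stated only symbolically (the target values C_L* and C_M* are placeholders):

```latex
\begin{aligned}
\min_{\mathbf{x}} \quad & C_D \\
\text{s.t.} \quad & C_L = C_L^{\ast}, \\
& C_M \ge C_M^{\ast}, \\
& t_i \ge 0.25\, t_i^{\mathrm{base}}, \quad i = 1, \dots, 500, \\
& V \ge V_{\mathrm{base}},
\end{aligned}
```

where x collects the 72 FFD control-point movements and 5 twist angles, t_i are the sectional thicknesses on the 25 by 20 grid, and V is the internal volume.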

The computational grid is shown in Figure 21. The mesh was marched out from the surface mesh using an O-grid topology to a far-field located at a distance of 25 times the span. The cruise Mach number was 0.85 with a Reynolds number of 5 million based on the mean aerodynamic chord. The mesh generated for optimization contained 15.5 million cells. It was a complicated 3D aerodynamic optimization problem with relatively high dimensionality and multimodality [64].

200 sample points were generated in the initial design space by the LHS method, and 500 sample points were updated as the optimization progressed. Figure 22 presents the cost-effectiveness comparison and the optimized results of the ten trials of each method, indicating that the KSE method generally achieved better results within the default computational budget and was more efficient than the other methods. Figure 23 shows that the KSE had a better convergence speed toward the ideal design results. Table 8 summarizes the results of all methods. At the optimum points, the lift coefficient and pitching moment coefficient constraints were all met. The discrepancy in the computational cost of the optimization algorithms was not large under the same number of objective function evaluations. The optima obtained by the KSE were the best of all the global optimization methods, with the drag coefficient reduced by 14.4 counts compared to the baseline. The drag coefficient obtained by the CiMPS method was similar to that of the MADS method, and both were slightly better than that of the EGO method.

Pressure distributions on the planforms of all optimized results are shown in Figures 24–27, in which the baseline is shown on the left and the optimized result on the right. For all the optimized wings, the lift coefficient targets were met and the pitching moment constraints were satisfied. In the optimized result of the KSE method, the shock was largely reduced, which was achieved by fine-tuning the twist distribution and airfoil shapes. The optimized results of the CiMPS and MADS methods were similar to each other: the low-pressure regions extended, which was beneficial to drag reduction, but an obvious shock still existed across the middle of the wings. As for the optimized result of the EGO method, the shock wave on the inner wing was eliminated, while on the outer wing, the shock extended to the wingtip.

Pressure distributions on different cross-sections are compared in Figure 28. At , the airfoil on the optimized results became thicker, while the airfoils on the other sections became thinner, since the thicknesses were allowed to decrease to 25% of the original thickness and the wave drag was strongly influenced by the reduced thickness. Because the volume was constrained to be no smaller than the baseline volume, the optimizer decreased the thickness of the airfoils on the outer wing while increasing the thickness near the root section. Lyu et al. [63] observed similar trends in their results. Most of the optimized results eliminated the shock at , except for those obtained by the EGO method. At , , and , the shock was eliminated on the optimized wing of the KSE method, while for the other results, the shocks still existed. All results shared a similar trend at Y = 2.115 m, where a double weak shock replaced the strong shock. And at , the shocks moved forward without being obviously weakened.

7. Conclusion

In this paper, a Kriging-based space exploration method (KSE) was proposed for multimodal and computationally expensive black-box problems. This method combined a multiple-subregion local search strategy with global search to balance and enhance the exploration and exploitation abilities. Through the multiple local search strategy, the exploitation of local minima was enhanced, which was beneficial to promoting the exploration of the entire space and improving the optimization convergence and efficiency. During this process, an enhanced trust-region method (ETRM) was adopted to make efficient further exploitation of the promising local minima. Hence, this method was expected to perform better than traditional global optimization methods.

In the numerical experiments, the KSE was compared with other global optimization methods on 12 analytical functions. The results indicated that the KSE method was robust and generally took fewer function evaluations to reach the global optima. It was then demonstrated on aerodynamic shape optimizations of the benchmark RAE 2822 airfoil and the high-dimensional CRM wing. Generally speaking, it was difficult for the original EGO algorithm to obtain ideal results because of the curse of dimensionality. The proposed KSE could overcome this limitation of the original EGO and was more effective and stable in all optimization cases compared to the other alternatives.

However, the current work still had some limitations. When dealing with high-dimensional problems, a very large number of samples would be required to fit a high-quality surrogate model, and considerable time would be spent training the hyperparameters and constructing the Kriging model, which made KSE inefficient for very high-dimensional problems. Besides, KSE might still be trapped in local minima when a large number of local minima made the design space strongly nonlinear in local regions. Therefore, our future work will focus on training the hyperparameters with fewer samples and improving the surrogate accuracy in nonlinear local regions, to make KSE more robust on complicated problems.

Data Availability

Code used to support the findings of this study will be made available on request.

Conflicts of Interest

The authors declare that they have no conflict of interest.

Acknowledgments

The authors appreciate the online OPTIP toolkits from G.G. Wang. This work was supported by the Key Laboratory Funding with the reference number 6142201200106.