Abstract
When building a surrogate model, it is important to identify a proper mean function for the kriging model. The commonly used variable selection method is the penalized blind kriging (PBK) method. But this method could lead to a low time efficiency, which is not suitable for experiments with timesensitive data. In this paper, we propose three stepbystep approaches for constructing an appropriate mean function to improve the prediction accuracy and time efficiency of the PBK method. Several functions and two engineering examples are used to prove the effectiveness of the proposed methods. From simulation results, we can see that Method 1 (M1) and Method 2 (M2) have been significantly improved in both the prediction accuracy and the time efficiency compared with PBK. Especially, in the Test function, compared with the traditional PBK method, the prediction accuracy of M2 is improved by and , respectively, under the penalty of Lasso and Elastic Net, and the time efficiency of M1 is improved by and , respectively, under the penalty of Lasso and Elastic Net. In addition, Method 3 (M3) has been significantly improved in prediction accuracy compared with PBK.
1. Introduction
The surrogate model has been widely used to replace computationintensive engineering simulation models or blackbox systems [1–3]. There are several commonly used surrogate models, such as kriging, support vector regression, and radial basis functions [4]. Their application fields involve the structural dynamics of aeroengine casings, dynamic reliability estimation of turbine blisk, and computationally expensive constrained optimization problems.
The kriging model can be used to fit computer experiments because of its interpolate character. Its error estimation can be extended to many aspects such as failure probability estimation [5], robust design optimization [6], uncertainty analysis [7], and global sequential sampling [8]. As the combination of mean function and stochastic Gaussian process (GP), the kriging model is called ordinary kriging (OK) if using a constant mean to fit the overall trend or universal kriging (UK) if supposing some specified variables in the mean function. However, their prediction accuracy will be reduced if the mean function is specified inappropriately [9]. Computer experiments usually include a large number of input variables, inactive variables which hardly have an impact on the response tending to reduce prediction accuracy; thus, it is a matter to identify active variables into a mean function [10].
In this paper, we consider the problem of variable selection for the kriging model. We want to use the method of variable selection to construct an interpretable kriging model which can select the active variables and remove inactive variables with high prediction accuracy and high time efficiency. The commonly used variable selection method in the kriging model is the penalty likelihood method. For the penalty likelihood method, Hung [11] proposed the penalized blind kriging (PBK) approach to select variables. This method, however, suffers from the limitation of low time efficiency and prediction accuracy. In the field of engineering, because of the time sensitive of data, researchers always want to improve time efficiency greatly while maintaining preferable prediction accuracy. Therefore, efficient variable selection methods are called for.
The idea here is to improve the time efficiency and prediction accuracy of PBK using stepbystep approaches. Three new approaches are proposed for variable selection. The first method is to select the variables using a linear model at the first and then estimate the parameters in the kriging model. The second method is first to estimate the parameters with the OK model and then select the variables via the penalized kriging model and finally refit the kriging model. The third method is an improvement of PBK, which is a refitting based on the variable selection of PBK. These methods are proved to improve the prediction accuracy and time efficiency by several analytic functions and two examples.
The remainder of the paper is organized as follows: in Section 2, three new methods are proposed for variable selection. Section 3 illustrates the effectiveness of proposed methods with four simulated examples and compares the prediction accuracy, identification, and time efficiency. Section 4 implements the proposed methods for two engineering examples. Section 5 provides some discussion and concluding remarks.
2. StepbyStep Variable Selection Methods
In this section, we first review some of the details; then, three methods are proposed for variable selection based on PBK methods.
2.1. Kriging Preliminaries
The kriging model [12] assumes that the response is a realization from the stochastic process:where is the basis functions, and is a vector of unknown coefficients, respectively. The stochastic process has mean 0 and covariance , where is a correlation function between and depending on parameter . In most engineering applications, the most commonly used correlation function is the stationary Gaussian correlation function , which is chosen in our research.
Given some sampled points and the responses , the best linear unbiased predictor (BLUP) can be obtained as follows [12]:where is a correlation between the unsampled point and the sampled points ; the matrix is constructed by at the sampled points.
The loglikelihood function can be written as follows:Then, the parameters , , and could be estimated via maximum likelihood estimation (MLE). For given correlation parameters , the MLE of is as follows:and the MLE of is as follows:
Since there is no analytical solution for the parameters , optimization methods are needed to obtain as follows:
Finally, the estimated parameters are substituted into equation (2) to obtain the predictor of responses for the unsampled points.
2.2. StepbyStep Variable Selection Methods
Although the kriging model has a character of interpolator due to its stochastic process, penalty methods still could be used to select the important variables from the candidates to improve prediction accuracy [13].
Two penalty functions are used for variable selection in the PBK model by Hung [11], i.e., Lasso [14] and adaptive Lasso [15]. It estimates the regression coefficients by minimizing the negative penalized loglikelihood function.where is the regularization term called “penalty function,” and is the regularization parameter. In order to efficiently obtain the estimation of parameters in the penalty likelihood function, Hung [11] proposed the iteratively reweighted least angle regression (IRLARS) algorithm for Lasso penalty and adaptive Lasso penalty.
Besides above penalty methods, Elastic Net penalty proposed by Zou and Hastie [16] also could be used for variable selection. It has regularization parameters and . The Elastic Net estimation of the parameters can be solved by the following formulation:where and . Thus, it can be seen that the penalty function of Elastic Net is a convex combination of ridge regression and Lasso. Particularly, the Elastic Net penalty becomes ridge regression when , and the Elastic Net penalty becomes Lasso penalty when . According to the IRLARS algorithm for Lasso penalty, we describe the IRLARS algorithm for Elastic Net penalty in Algorithm 1.

Regularization parameters are obtained based on 10fold cross validation, and the estimation of parameters are obtained based on the IRLARS algorithm in this research.
Although PBK effectively reduces the prediction error, further improvements are needed for low time efficiency. Thus, we propose three stepbystep variable selection methods to improve the prediction accuracy and time efficiency. We describe them in detail as follows:
2.2.1. Method One (M1)
We integrate penalty into the linear model firstly, where ; then, we select active variables by solving the penalized likelihood function for :where is obtained based on cross validation. Then, selected basis functions corresponding nonzero regression coefficients are substituted into the Kriging model (1) as new variables, and correlation parameters are updated as . are estimated by formulas (4)–(6), respectively. Finally, the estimated parameters are used to predict the responses of unsampled points by substituting into formula (2), and the predicted model can be expressed as follows:
The proposed method M1 is given in Algorithm 2.
2.2.2. Method Two (M2)
Firstly, we estimate the parameters and with the ordinary kriging model by solving the following formulas:where is an vector of ones. Then, we can select active variables by minimizing the negative penalized loglikelihood function and update correlation parameters. Finally, the new kriging model is fitted similarly to that in M1.The proposed method M2 is given in Algorithm 3.

2.2.3. Method Three (M3)
Firstly, the PBK method is used to select active variables based on the kriging model via minimizing the negative penalized loglikelihood function (7). It considers correlation between sample points, which is different from M1. Then, the selected variables based on PBK were substituted into the model (1) as new variables, and correlation parameters are updated. Finally, the estimated parameters estimated by formulas (4)–(6) are used to predict the responses of unsampled points by substituting into (2).
The variable selection part of M3 is using IRLARS of PBK in Algorithm 1. The fitting steps are the same as Steps 2–4 in Algorithm 2 after selecting variables. We will not go into details here.
Theoretically, M1 and M2 do not need the iterative step of parameter estimation of such as PBK in variable selection; thus, they can improve time efficiency greatly. In addition, the last step of the proposed three methods is to update and refit the kriging model, so the prediction accuracy could be improved.
3. Simulation
The effectiveness of the proposed methods is proved in terms of simulation performance. We validate the simulation performance of the methods compared with back propagation (BP) neural network, generalized regression (GR) neural network, and PBK method according to time efficiency and prediction accuracy. We use the following indicators to measure time and prediction accuracy.(a)The root mean square prediction error (RMSPE) is defined as where is the number of testing samples. The mean of RMSPE (MRMSPE) of the simulations is calculated to evaluate the prediction performance.(b)The time of consumption (CPU) shows the uptime of the system. Especially, we need identification to measure their accuracy of variable selection in linear functions using the following indicators:(c)The average number of inactive effect identified rate (IEIR).(d)The average number of active effect identified rate (AEIR).(e)The average size of identified mean function (MEAN).
Obviously, when AEIR is larger, MRMSPE, IEIR, and CPU are smaller, and if MEAN is closer to the target, the model is better.
Four analytical functions are displayed in the following equations:(1)A known function [11]:(2)Test function [17]:(3)Gramacy and Lee function [13]:(4)Borehole function [18]:
A known function (KF) includes twelvedimensional inputs, where the first six variables decrease the effects on the responses, the remaining variables , are unrelated (i.e., zero coefficients) to the outputs, and has mean 0 and covariance , where and . The Test function (TF) includes twelvedimensional inputs, where the last six variables are irrelevant to the outputs. The Gramacy and Lee function (GL) includes sixdimensional inputs, where the last two variables are irrelevant to the outputs. The Borehole function (BF) includes eightdimensional inputs. And the nonlinearity gradually gets more obvious from the function KF to function BF. In function KF, we consider 60, 80, and 100 sample points for training in order to compare the effect of different sample sizes. In function TF, GL, and BF, we consider 100 sample points for training in order to compare the effect of nonlinearity. 1,000 sample points are randomly selected as testing data. We evaluate the accuracy of prediction, identification, and time efficiency based on 500 simulations.
In order to compare the effectiveness of the proposed methods, simulation studies are conducted to evaluate the performance of the BP neural network and GR neural network methods as other surrogate models firstly. The number of neurons in the hidden layer of the BP neural network is the optimal number of neurons among the 15 candidate neurons. The transfer functions from the input layer to the hidden layer and the hidden layer to the output layer choose “logsig” and “purelin,” respectively. The simulation results for four functions are given in Table 1.
Due to the lack of interpretability of a neural network, neural network methods cannot select variables. In this paper, we will focus on variable selection for the kriging model. We carry out simulation studies using PBK, three methods in Section 2 with Lasso penalty and Elastic Net penalty. The firstorder polynomial basis function and the stationary Gaussian correlation function are used. And Latin Hypercube designs [19] are used to select sample points for training data in the design domain because they can be produced with minimal computational cost, and the design space is fulfilled well. The simulation results for four functions are given in Tables 2–6, respectively.
The principal results of these simulations are as follows:(1)According to Tables 2–6, the prediction accuracy of the proposed three methods is better than that of the PBK, BP, and GR neural networks, where M3 is the better one. In addition, their high prediction accuracy is obvious with escalating nonlinearity. Particularly, we take Table 4 as an example; the RMSPE for the predictors with Lasso penalty and Elastic Net in M3 gets and smaller than the PBK predictors, respectively.(2)According to simulation results of the function KF (i.e., Tables 2 and 3), the method M1 could identify more variables, which lead to its highest AEIR and IEIR. Although AEIR of M2 is not high, its IEIR is the lowest and MEAN is close to the target. As to M3, its identification is basically the same as PBK. It indicates that M2 and M3 have better identification in variable selection.(3)In the aspect of time, the efficiency of M1 and M2 is obviously better than that of PBK, BP, and GR neural networks, where M1 is the most efficient. But the efficiency of M3 is little worse than that of PBK. Particularly, we take in Tables 2 and 3 as an example the time of M1 with Lasso penalty and Elastic Net penalty gets low to and smaller than PBK, respectively. For other cases, the optimization of time efficiency of M1 is almost higher than .
In summary, the BP neural network is mostly worse than the traditional PBK method in the accuracy of prediction, and it is not faster than the PBK method with Lasso penalty in the time efficiency. Although the GR neural network is faster than the PBK method in the time efficiency, it is mostly worse than the traditional PBK method in the accuracy of prediction. We take in function KF as an example; the RMSPE for BP neural network, the GR neural network and PBK is 0.107 7, 0.108 9, and 0.065 8, respectively, and their time is , , and . In addition, due to the lack of interpretability of BP and GR neural networks, these methods do not have identification. Thus, the neural network is not the preferred candidate method in variable selection methods compared with PBK. Although M1 greatly improves time efficiency by reducing the iterative steps of parameter estimation of , it identified more inactive variables than PBK, which lead to its high AEIR and IEIR, as the correlation between sample points is not considered in the variable selection. As to M2, it not only maintains the advantage of M1 in time efficiency but also makes up for the shortcomings of M1 by considering the correlation between sample points. Therefore, M2 is superior to M1 in both identification and prediction accuracy and can be used to replace PBK. Although M3 is refitted on the basis of the variable selection of PBK, which leads to its low time efficiency, it has a significant improvement in the prediction accuracy. Therefore, we can use M3 as an improved method of PBK.
4. Empirical Application
4.1. Circuit Simulation
The proposed methods are applied to the circuitsimulation code [12]. Six input variables and one response are in this experiment. The simulation experiments were conducted 32 times. This data are also used to analyze the relationship between output and input [20]. It can be useful for illustrating the prediction accuracy and time efficiency. The candidate variable set includes all linear effects. To compare the prediction accuracy and time efficiency of different methods, the dataset was separated into two sections: half of the data are used for training and the other half for testing. The experimental results with Lasso and Elastic Net penalty are listed in Tables 7 and 8, respectively.
From the second and third columns of Tables 7 and 8, we could find the prediction accuracy of M1 (0.220 7 and 0.2207), M2 (0.2207 and 0.2207), and M3 (0.1520 and 0.1591) which can achieve better prediction than PBK (0.286 9 and 0.274 0), and the prediction of M1 and M2 are all improved by and , those of M3 are improved by and relative to PBK.
As displayed in the last two columns of Tables 7 and 8, we could find the operation time of M1 (0.37 s and 0.81 s) and M2 (0.43 s and 1.08 s) which are less than those of PBK (2.26 s and 8.65 s) and M3 (2.29 s and 9.01 s), and the approximation efficiency of M1 is improved by and , those of M2 are improved by and relative to PBK.
Therefore, M1 and M2 can be used as variable selection methods to replacing the traditional PBK method according to prediction accuracy and time efficiency for fitting small samples. And M3 can be used as an improved method of PBK due to better prediction accuracy.
4.2. Piston Clap Noise
Piston secondary motion would cause an engine noise which is unwanted. It is the departure of a piston from the nominal motion prescribed by the mechanism. Six variables were used to minimize the piston clap noise. The variables are the cylinder liner , location of peak pressure , skirt length , skirt profile , skirt ovality , and pin offset , respectively.
The relevant dataset of the example is derived from Fang et al. [2]. The data include 100 observations, 6 input variables, and the candidate variable set includes all linear effects. In this paper, we divide the 100 data into 5 sets, select one of the sets as the test set for calculating RMSPE, and others as the train sets; this process is repeated 5 times and train sets are different every time. PBK, M1, M2, and M3are compared, and the running time is recorded. The simulation results with Lasso and Elastic Net penalty are shown in Tables 9 and 10, respectively.
From the second and third columns of Tables 9 and 10, we could find the prediction accuracy of M2 (0.5466 and 0.5466) which can achieve better prediction than PBK (1.0069 and 0.9942), M1 (0.8316 and 0.7108), and M3 (0.6357 and 0.5466), and the prediction of M1 is improved by and , those of M2 are improved by and , and those of M3 are improved by and relative to PBK.
As displayed in the last two columns of Tables 9 and 10, we can find that the operation time of the M1 (2.11 s and 5.75 s) and M2 (5.20 s and 16.89 s) which are less than those of PBK (18.50 s and 189.45 s) and M3 (21.66 s and 207.76 s), and the approximation efficiency of M1 is improved by and , and those of M2 are improved by and relative to PBK.
Therefore, M1 and M2 can achieve the goal which replace the PBK method both in terms of prediction accuracy and time efficiency. And M3 can be viewed as an improvement of PBK.
5. Conclusion and Discussion
In the field of engineering, we always use a simple surrogate model to fit the complex true model for time saving and interpretability [21]. In order to improve the accuracy of prediction and save the cost of time for the PBK method, three new methods are proposed to obtain the trend function to improve the prediction accuracy and time efficiency. M1 is to select the variables using the linear model at the first and then to estimate the parameters in the kriging model. M2 is first to estimate the parameters with the OK model and then to select the variables via the penalized kriging model and finally refit kriging model. M3 is an improvement of PBK, which is a refitting based on the variable selection of PBK.
Several analysis functions with escalating nonlinearity and two engineering examples with different sample sizes are used to test our proposed stepbystep variable selection methods. The experimental results show that M2 which is superior to PBK, BP, and GR neural networks and other methods can be used as a favorable method of variable selection in both prediction accuracy and time efficiency. If we want to pursue higher time efficiency while maintaining certain prediction accuracy, M1 might be a good choice. As well, M3 could be used as an improved method of PBK due to good prediction accuracy.
Data Availability
The data used to support the findings of this study are included in this paper.
Disclosure
Ziheng Feng and Min Li are cofirst authors.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Nos. 11871294 and 12101346) and Shandong Provincial Natural Foundation, China (Nos. ZR2021QA053 and ZR2021QA044).