Abstract
Fluctuations in industrial process operation parameters can severely degrade production. Finding operation parameters that are both optimal and robust to such fluctuations is an effective way to address this problem. In this paper, a scheme based on a data-driven model and parameter fluctuation analysis is proposed to obtain the robust optimal operation parameters of an industrial process. The data-driven modelling method, multivariate Gaussian process regression (MGPR), is based on Bayesian statistical learning theory; it maps the process operation parameters to the objective performance, offering flexible nonparametric inference and self-adaptive determination of hyperparameters. Following the minimum variance criterion, the parameter fluctuation analysis is performed with a multiobjective evolutionary algorithm on the MGPR model. To analyze the influence of a single parameter on robustness, cross validation is applied to evaluate the model output under a 2% fluctuation. The resulting robust optimal operation parameters can then be applied to guide production. The effectiveness and reliability of the proposed method are verified on the hydrogen cyanide production process and compared with other modelling methods and a single-objective optimization method.
1. Introduction
With increasing attention paid to controlling production quality and costs in industrial production, various studies on reducing costs and increasing production benefits have been carried out in recent years [1, 2]. Operation parameter optimization is an effective way to raise the profit of industrial production processes. It applies a parameter optimization algorithm to a model of the process in order to choose the optimal operation parameters [3, 4]. The critical step is to build a precise model of the production process. However, most industrial processes are nonlinear and difficult to describe with a mathematical model. In recent years, with the rapid development of automation and digitalization, abundant data have been recorded in most industrial production processes. These data carry production information that reflects the operating condition. Data-driven methods based on machine learning are an effective means of analysis and have been widely used [5–7]; they can describe the system without much prior knowledge. Based on the data-driven model, an optimization method can then be applied to the operation parameters. Because industrial production processes are generally nonlinear, traditional optimization methods are not well suited to obtaining the optimal operation parameters. Evolutionary optimization algorithms can find optimal solutions without requiring a complicated mathematical model of the process [8–11].
After the optimal operation parameters are obtained and set on the production equipment, they may deviate because of the harsh environment and equipment performance degradation. Because most objectives are very sensitive to such deviations, the actual objective performance tends to fluctuate. Therefore, fluctuation analysis is needed. The robustness of the parameters reflects this fluctuation characteristic, and robust optimal operation parameters reduce the fluctuation of the objective performance. Many studies have focused on robust optimization. A sigma point method for robust multiobjective dynamic optimization of chemical processes was presented in reference [12]. A Bayesian description of parametric uncertainty was used to optimize chemical processes by solving a robust optimization problem in reference [13]. By applying the Taguchi method, a robust design method was used to optimize the bakers’ yeast process in reference [14]. In these methods, robust optimization must consider both the performance of the optimal operation parameters and a robustness metric at the same time. Hence, the fluctuation analysis of operation parameters is a multiobjective problem. Multiobjective evolutionary optimization algorithms, which imitate biological evolution and search for the Pareto frontier, can solve multiobjective optimization problems efficiently.
In this paper, a scheme based on a data-driven model and fluctuation analysis is proposed to obtain the robust optimal operation parameters of industrial processes. Gaussian process regression, based on Bayesian statistical learning theory, builds the model between the operation parameters and the objective performance, offering flexible nonparametric inference and self-adaptive determination of hyperparameters. This data-driven method does not require much information about the operating mechanism. To analyze the fluctuation of the parameters, two objective functions are designed for the multiobjective method NSGA-II: (1) maximization of the expected objective performance and (2) minimization of a robustness metric based on the minimum variance criterion. To analyze the influence of a single variable on robustness and obtain more accurate decision information for production, cross validation is applied to evaluate the mean of the model output under a 2% fluctuation. The effectiveness and reliability of the proposed method are verified on the hydrogen cyanide (HCN) production process.
This paper is organized as follows: the data-driven Gaussian process regression model, NSGA-II, the multiobjective robust optimization design criterion, and the fluctuation analysis are introduced in Section 2. The verification experiments and comparison results are given in Section 3. Section 4 presents the conclusions.
2. Robust Optimization of Operation Parameter Design
2.1. Data-Driven Multivariate Gaussian Process Regression Theory
Multivariate Gaussian process regression is a GPR model with multiple inputs and multiple outputs. Optimization modelling means that a model is built to calculate the optimal solution. $X = [x_1, \ldots, x_n]^T$ and $y = [y_1, \ldots, y_n]^T$ are the input and output vectors of the training data, where $n$ is the number of training samples.
The Gaussian process regression model is built between the input vectors and the output vectors. If a new sample $x_*$ is available, the predicted output $y_*$ can be obtained from the built model.
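The Gaussian process regression model [15] assumes that the regression function has a zero-mean Gaussian prior distribution, as shown in the following equation:

$$y \sim \mathcal{N}(0, C), \tag{1}$$

wherein $C$ is the $n \times n$ covariance matrix with elements $C_{ij} = k(x_i, x_j)$; here, $k(\cdot, \cdot)$ is the commonly used squared exponential covariance function, given in the following equation:

$$k(x_i, x_j) = \sigma_f^2 \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2 l^2}\right) + \delta_{ij}\,\sigma_n^2, \tag{2}$$

where $\delta_{ij} = 1$ only when $i = j$ and $\delta_{ij} = 0$ otherwise, $l$ is the length scale, and $\sigma_f^2$ and $\sigma_n^2$ are the signal and noise variance, respectively.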
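As an illustration, the squared exponential covariance function described above can be sketched in a few lines of dependency-free Python. This is a minimal sketch, not the implementation used in the experiments; the function names, the toy inputs, and the hyperparameter values are assumptions made for the example.

```python
import math

def sq_exp_kernel(xi, xj, length_scale=1.0, signal_var=1.0):
    """Squared exponential covariance between two input vectors (noise-free part)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return signal_var * math.exp(-sq_dist / (2.0 * length_scale ** 2))

def covariance_matrix(X, length_scale=1.0, signal_var=1.0, noise_var=1e-2):
    """Covariance matrix C; the noise variance is added only where i == j."""
    n = len(X)
    return [
        [
            sq_exp_kernel(X[i], X[j], length_scale, signal_var)
            + (noise_var if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]

# Hypothetical toy inputs (two operation parameters per sample), not plant data.
X_demo = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
C_demo = covariance_matrix(X_demo, length_scale=1.0, signal_var=2.0, noise_var=0.5)
```

The diagonal of `C_demo` equals the signal variance plus the noise variance, and the off-diagonal entries shrink as the distance between two samples grows, which is how the length scale controls how quickly the correlation between operating points decays.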
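An appropriate hyperparameter set is crucial for a GPR model to predict the dominant variable accurately. Hence, the hyperparameters $\theta = \{l, \sigma_f^2, \sigma_n^2\}$ are optimized in the training process by maximizing the log-likelihood function shown in the following equation, where $C$ is the covariance matrix of the training inputs:

$$\log p(y \mid X, \theta) = -\frac{1}{2} y^T C^{-1} y - \frac{1}{2}\log\lvert C\rvert - \frac{n}{2}\log 2\pi. \tag{3}$$

Once the optimal hyperparameter set is obtained, the GPR model can predict the distribution of the output $y_*$ for a corresponding input $x_*$. When a new input $x_*$ arrives, according to the property of the multivariate Gaussian distribution, the posterior distribution of the output is $y_* \mid X, y, x_* \sim \mathcal{N}(\hat{y}_*, \sigma_*^2)$, where $\hat{y}_*$ and $\sigma_*^2$ are the mean and variance of the distribution, calculated in the following equations:

$$\hat{y}_* = k_*^T C^{-1} y, \tag{4}$$

$$\sigma_*^2 = k(x_*, x_*) - k_*^T C^{-1} k_*, \tag{5}$$

where $k_* = [k(x_*, x_1), \ldots, k(x_*, x_n)]^T$ is the covariance vector between the new input and the training inputs.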
Finally, the expectation of the posterior distribution is taken as the predicted result of the GPR model.
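The training and prediction steps above can be sketched as follows. This is a minimal, dependency-free sketch under stated assumptions: the hyperparameters are fixed by hand rather than optimized, the output is one-dimensional, and all function names and toy data are hypothetical. It uses the standard Cholesky-based formulas for the log-likelihood, the posterior mean, and the posterior variance of a Gaussian process.

```python
import math

def kernel(a, b, length_scale, signal_var):
    sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
    return signal_var * math.exp(-sq_dist / (2.0 * length_scale ** 2))

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def solve_lower_transposed(L, z):
    """Back substitution: solve L^T a = z."""
    n = len(z)
    a = [0.0] * n
    for i in reversed(range(n)):
        a[i] = (z[i] - sum(L[k][i] * a[k] for k in range(i + 1, n))) / L[i][i]
    return a

def gp_fit(X, y, length_scale=1.0, signal_var=1.0, noise_var=1e-2):
    n = len(X)
    C = [[kernel(X[i], X[j], length_scale, signal_var) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(C)
    alpha = solve_lower_transposed(L, solve_lower(L, y))  # alpha = C^{-1} y
    # Log-likelihood; log|C| = 2 * sum(log L_ii).
    log_lik = (-0.5 * sum(yi * ai for yi, ai in zip(y, alpha))
               - sum(math.log(L[i][i]) for i in range(n))
               - 0.5 * n * math.log(2.0 * math.pi))
    return {"X": X, "L": L, "alpha": alpha, "log_lik": log_lik,
            "length_scale": length_scale, "signal_var": signal_var,
            "noise_var": noise_var}

def gp_predict(model, x_new):
    """Posterior mean and variance at a new input."""
    k_star = [kernel(xi, x_new, model["length_scale"], model["signal_var"])
              for xi in model["X"]]
    mean = sum(k * a for k, a in zip(k_star, model["alpha"]))
    v = solve_lower(model["L"], k_star)  # v^T v = k_*^T C^{-1} k_*
    var = model["signal_var"] + model["noise_var"] - sum(vi * vi for vi in v)
    return mean, var

# Hypothetical 1-D toy data, not the HCN data set.
gp = gp_fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 0.0], noise_var=1e-6)
mean_near, var_near = gp_predict(gp, [1.0])    # at a training point
mean_far, var_far = gp_predict(gp, [100.0])    # far from all training data
```

Near a training sample the predicted mean reproduces the observed output with a small variance, while far from the data the prediction falls back to the zero-mean prior with the full prior variance. In practice the hyperparameters would be chosen by maximizing `log_lik`, for example with a gradient-based optimizer.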
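The accuracy of the optimization model is assessed by using the mean square error (MSE) and the mean average relative error (MARE) of the test samples:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2, \qquad \mathrm{MARE} = \frac{1}{N}\sum_{i=1}^{N}\left\lvert\frac{y_i - \hat{y}_i}{y_i}\right\rvert, \tag{6}$$

where $y_i$ is the expected output, $\hat{y}_i$ is the predicted output of the built GPR, and $N$ is the length of the data.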
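The two error measures can be computed directly from their definitions. The short sketch below assumes plain Python lists of expected and predicted outputs; the numbers are made up for illustration and are not taken from the experiments.

```python
def mse(y_true, y_pred):
    """Mean square error over the test samples."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mare(y_true, y_pred):
    """Mean of the absolute relative errors; y_true must not contain zeros."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical expected and predicted values.
expected = [2.0, 4.0]
predicted = [1.0, 6.0]
mse_demo = mse(expected, predicted)    # (1 + 4) / 2
mare_demo = mare(expected, predicted)  # (0.5 + 0.5) / 2
```

MSE is expressed in squared output units, whereas MARE is dimensionless, so the two are best read together rather than compared with each other.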
2.2. Robust Optimization and Criterion Design
The robustness of the parameters is illustrated in Figure 1. Both A and B are operation parameters in the parameter solution set. When the parameters in A have a fluctuation $\Delta X$, the objective performance decreases by $\Delta Y_1$. When the parameters in B have the same fluctuation $\Delta X$, the objective performance decreases by $\Delta Y_2$. Since $\Delta Y_2$ is less than $\Delta Y_1$, B is more robust than A: B is the robust parameter, and A is the optimal parameter in the solution set.
As previously mentioned, the multiobjective fluctuation analysis has two objective functions: (1) maximization of the expected objective performance and (2) minimization of a robustness metric based on the minimum variance criterion. Thus, the criterion for robustness is given in the following equations:

$$\text{objective 1:}\quad \max_{x}\; f_1(x) = \hat{y}(x), \tag{7}$$

$$\text{objective 2:}\quad \min_{x}\; f_2(x) = \operatorname{Var}\bigl(\hat{y}(x')\bigr), \quad x' \in [0.98x,\, 1.02x], \tag{8}$$

where objective 1 is the objective performance of the model; in this paper, it is the MGPR predictive value $\hat{y}(x)$. $f_2$ is the robustness metric in objective 2. Objective 2 is created with reference to the minimum variance criterion: it is assumed that $x$ is the optimal solution and that the related objective value of MGPR is $\hat{y}(x)$. A number of operation parameters $x'$ are randomly generated within 2% of $x$, the outputs $\hat{y}(x')$ are calculated with the MGPR model, and the variance of $\hat{y}(x')$ is the robustness metric $f_2$. According to the minimum variance criterion, the smaller the variance, the better the performance. To obtain the robust optimal operation parameters, objective 1 should be maximized and objective 2 should be minimized.
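The robustness metric described above can be sketched as a small Monte Carlo routine. The sketch assumes that each parameter is perturbed independently and uniformly within the relative range, that 100 samples are drawn, and that the population variance is used; the text does not specify these details, so they are assumptions, as are the function names and the toy models standing in for the MGPR predictor.

```python
import random
import statistics

def robustness_metric(model, x, rel_range=0.02, n_samples=100, seed=0):
    """Variance of the model output when every parameter fluctuates within rel_range."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        # Assumed sampling scheme: independent uniform relative perturbations.
        x_pert = [xi * (1.0 + rng.uniform(-rel_range, rel_range)) for xi in x]
        outputs.append(model(x_pert))
    return statistics.pvariance(outputs)

def objectives(model, x):
    """Objective 1 (to maximize) and objective 2 (to minimize) for one candidate."""
    return model(x), robustness_metric(model, x)

# Toy stand-ins for the trained model, chosen only to show the behaviour.
flat = lambda x: 5.0               # output ignores the parameters
gentle = lambda x: x[0]            # weak dependence on the first parameter
steep = lambda x: 100.0 * x[0]     # strong dependence on the first parameter

var_flat = robustness_metric(flat, [1.0, 2.0])
var_gentle = robustness_metric(gentle, [1.0, 2.0])
var_steep = robustness_metric(steep, [1.0, 2.0])
```

A model that does not react to the perturbation has zero variance, and the steeper the response around the candidate, the larger the metric, which matches the situation of points A and B in Figure 1.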
2.3. NSGA-II
Nondominated sorting genetic algorithm II (NSGA-II) [16, 17] is a multiobjective algorithm, upgraded from NSGA, that explores the search space by simulating biological evolution. It is an automatic search algorithm with a parallel global search strategy and performs well in global search. Compared with NSGA, NSGA-II adds the crowded comparison operator and a fast nondominated sorting approach. NSGA-II has a computational complexity of $O(mN^2)$ (where $m$ is the number of objectives and $N$ is the population size), which is better than the $O(mN^3)$ of NSGA.
The main steps of NSGA-II are briefly described below:
(1) Generate a random parent population $P_t$ and apply the genetic algorithm operations to obtain the offspring population $Q_t$. The size of both the parent population and the offspring population is $N$.
(2) Put $P_t$ and $Q_t$ into the variable $R_t$; the size of $R_t$ is $2N$.
(3) Rank the population by nondomination to construct the nondominated sets of different levels $F_1$, $F_2$, $F_3$, and so on.
(4) Calculate the crowding distance of each individual as required, and then fill the next parent population $P_{t+1}$ until its size is $N$.
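The nondominated ranking in step (3) can be sketched as follows. This is a compact illustration of the usual fast nondominated sorting procedure, assuming that every objective is minimized (a maximized objective such as the model output is negated first); the function names and the sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_nondominated_sort(points):
    """Return a list of fronts, each a list of indices into points."""
    n = len(points)
    dominated = [[] for _ in range(n)]  # indices that point i dominates
    dom_count = [0] * n                 # number of points that dominate i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                dominated[i].append(j)
            elif dominates(points[j], points[i]):
                dom_count[i] += 1
        if dom_count[i] == 0:
            fronts[0].append(i)
    k = 0
    while fronts[k]:
        next_front = []
        for i in fronts[k]:
            for j in dominated[i]:
                dom_count[j] -= 1
                if dom_count[j] == 0:
                    next_front.append(j)
        fronts.append(next_front)
        k += 1
    return fronts[:-1]  # drop the trailing empty front

# Hypothetical (negated performance, variance) pairs, both to be minimized.
pts = [(-70.0, 9.0), (-57.0, 0.0), (-60.0, 5.0), (-58.0, 6.0), (-55.0, 8.0)]
fronts_demo = fast_nondominated_sort(pts)
```

The first front contains the candidates that no other candidate beats in both objectives at once; later fronts are filled only after all points that dominate them have been assigned.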
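NSGA-II uses the crowding distance to select the parent population of the next generation. The crowding distance is formulated as

$$d_i = \sum_{k=1}^{m} \frac{f_k(i+1) - f_k(i-1)}{f_k^{\max} - f_k^{\min}}, \tag{9}$$

where $f_k(i+1)$ and $f_k(i-1)$ are the values of objective $k$ for the two neighbours of individual $i$ when the set is sorted by objective $k$, and $f_k^{\max}$ and $f_k^{\min}$ are the maximum and minimum values of objective $k$.
The new population is selected by comparing how crowded the individuals are. Within the same nondominated level, a larger crowding distance is preferred so as to maintain the diversity of the parameter space.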
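A crowding distance computation for one nondominated set can be sketched as below. It follows the common convention of giving the boundary individuals of each objective an infinite distance so that they are always kept; that convention, the function name, and the sample front are assumptions of this sketch.

```python
def crowding_distance(front):
    """Crowding distance of each objective vector in one nondominated set."""
    n = len(front)
    if n == 0:
        return []
    dist = [0.0] * n
    for k in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][k])
        f_min, f_max = front[order[0]][k], front[order[-1]][k]
        # Boundary individuals are always preferred (assumed convention).
        dist[order[0]] = dist[order[-1]] = float("inf")
        if f_max == f_min:
            continue
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][k] - front[order[pos - 1]][k]
            dist[order[pos]] += gap / (f_max - f_min)
    return dist

# Hypothetical two-objective front.
front_demo = [(0.0, 10.0), (1.0, 8.0), (5.0, 5.0), (10.0, 0.0)]
dist_demo = crowding_distance(front_demo)
```

In this example the third point has a larger distance than the second because its neighbours are further away, so it would be preferred when the last front has to be truncated.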
2.4. Robust Optimization Design of Operation Parameters
Based on the MGPR algorithm, the process model is built with the industrial process operation parameters X as input and the objective performance Y as output. NSGA-II is then applied to the trained model to obtain the robust optimal parameters of the nonlinear production process. Robust optimization is treated as a multiobjective problem: objectives 1 and 2 in equations (7) and (8) are the fitness functions of NSGA-II. The flow chart is shown in Figure 2 (Algorithm 1).
To illustrate it clearly, the presented algorithm is summarized as below.

3. Experiments and Discussions
3.1. Production Process of Hydrogen Cyanide
In this paper, the effectiveness of the proposed method is verified on the hydrogen cyanide (HCN) production process. HCN is a volatile, colorless, and highly toxic liquid with a bitter almond-like odor [18]. It is widely used in the production of organic glass and metal cyanide. The raw materials for HCN production include ammonia, natural gas, and air. These raw gases need to be purified, mixed, oxidized, and pickled; the detailed steps are described in reference [18]. As the equipment is exposed to the air during production, the process is influenced by temperature, humidity, equipment aging, raw materials, and many other uncertain factors. The objective performance of the production process is measured by the conversion rate of HCN. According to the information related to the HCN production process, 9 operation parameters were selected as decision parameters, as shown in Table 1.
3.2. MGPR Process Model
With the help of MGPR, the HCN model was built with the 9 operation parameters as the input variables and the conversion rate of HCN as the output. As the environmental humidity and temperature change moderately from March to June, the experimental data consist of 1918 samples taken from March to June of 2017. The MGPR model is based on statistical learning theory, so the training sample should not be too small. In this experiment, 30, 45, and 55 samples were used, respectively, as the testing samples, and the rest of the data were used as training data.
The kernel function used in MGPR is the squared exponential function, as shown in equation (2). The predicted results and percentage errors of the testing samples are shown in Figure 3.
Figure 3 shows the predicted results and the percentage errors of MGPR. It is worth noting that the percentage errors in all three cases are less than 0.02%, indicating that the model has a good generalization performance. The block line in Figure 3 is the 95% confidence range of the predicted value. To show the prediction effect and assess the accuracy of the model more directly, the MSE and MARE for the 30, 45, and 55 testing samples were calculated, as shown in Table 2:
It can be seen from Table 2 that MSE and MARE are less than 0.2%, indicating that MGPR performs well in building the model of the HCN production process. The generalization performance of the model is very high, and the MGPR model can be applied to the follow-up robustness optimization.
3.3. Fluctuation Analysis: Robust Optimization
To obtain the robust optimal parameters, fluctuation analysis of the parameters is required. In the robustness optimization of the HCN process based on the MGPR model, the output of the MGPR model and the created robustness criterion are used as the fitness functions of the multiobjective optimization algorithm NSGA-II. The two objective functions conflict: a higher model output generally comes with weaker robustness (a larger variance), and vice versa. NSGA-II is used to search the Pareto frontier for the operation parameters that best balance the MGPR output and the robustness metric. The 9 operation parameters need to be optimized, and each of them has a range of values, as shown in Table 3.
Considering the computation time and complexity, the population size, the maximum number of generations, the crossover probability, and the mutation probability are set to 200, 500, 0.7, and 0.1, respectively. Figure 4 shows the Pareto frontier obtained by NSGA-II after 500 generations.
From this figure, it can be seen that the value of objective 1 ranges from 57 to 70, while the value of objective 2 ranges from 0 to 9. In this paper, the robust optimal solution $x^*$ is taken from this Pareto frontier. To examine its robustness, 100 operation parameters are randomly generated within $x^*(1 \pm \delta)$, where $x^*$ is the robust optimal parameter and $\delta$ is the fluctuation range of the optimal parameters, which can be 2%, 5%, 10%, 15%, 20%, and 30%. The mean of the HCN model output is the performance indicator under the different fluctuation conditions. The fluctuation of the robust optimal parameters is shown in Figure 5:
In Figure 5, the red point is the robust optimal solution. When the operation parameters have a fluctuation of 2%, all of the objective performances (HCN conversion rate) are larger than 69%. When the fluctuation is 5%, all of the objective performances are larger than 67.5%. Even when the fluctuation is 30%, all of the objective performances are larger than 68%. This shows that the robust optimal solution tolerates fluctuations in the parameters well.
For comparison, the robustness of the global optimal operation parameters is also examined. The global optimal operation parameters are obtained by a genetic algorithm. The fluctuation results of the global optimization are shown in Figure 6.
Figure 6 shows that when the global optimal solution fluctuates, the objective performance fluctuates strongly. When the global optimal operation parameters have a 2% fluctuation, the worst objective performance is 66.1%. As the fluctuation increases, the conversion rate decreases to approximately 60%. This indicates that the robustness of the global optimal solution is poor.
To show the effect more directly, the variance and the mean of the objective values under fluctuation are calculated. Table 4 compares the robust optimal solution with the global optimal solution.
Table 4 shows that the objective performance of the robust optimal solution is 69.68%, lower than the 72.67% of the global optimal solution. When the solutions fluctuate, however, the objective performance of the robust optimal solution is better than that of the global optimal solution.
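The trade-off between nominal performance and robustness can be reproduced on a toy problem. The sketch below uses a hypothetical one-dimensional response with a narrow high peak (playing the role of the global optimum) and a broad lower peak (playing the role of the robust optimum); it is not the HCN model, and the peak heights, widths, and uniform sampling scheme are assumptions made for illustration.

```python
import math
import random
import statistics

def toy_performance(x):
    """Hypothetical response: narrow peak at x = 1, broad lower peak at x = 3."""
    sharp = 72.0 * math.exp(-((x - 1.0) / 0.02) ** 2)
    broad = 69.0 * math.exp(-((x - 3.0) / 1.0) ** 2)
    return sharp + broad

def mean_under_fluctuation(model, x_opt, rel_range, n_samples=100, seed=0):
    """Mean model output when x_opt fluctuates uniformly within rel_range."""
    rng = random.Random(seed)
    outputs = [model(x_opt * (1.0 + rng.uniform(-rel_range, rel_range)))
               for _ in range(n_samples)]
    return statistics.mean(outputs)

levels = [0.02, 0.05, 0.10, 0.15, 0.20, 0.30]
x_global, x_robust = 1.0, 3.0
# For each fluctuation level: (mean at the global optimum, mean at the robust optimum).
table = {d: (mean_under_fluctuation(toy_performance, x_global, d),
             mean_under_fluctuation(toy_performance, x_robust, d))
         for d in levels}
```

Without fluctuation the narrow peak gives the higher output, but at every fluctuation level the mean output around the broad peak is higher, which is the same qualitative pattern as the comparison between the global and the robust optimal solutions.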
Based on the historical data of the HCN production system, the fluctuation of each parameter is usually within 2%. To analyze the influence of a single parameter on robustness, cross validation is applied to evaluate the model output under a 2% fluctuation: one parameter is perturbed by up to 2% while the other parameters are kept unchanged. For each parameter, 100 operation parameters are randomly generated from $x^*(1 \pm 2\%)$, where $x^*$ is the robust optimal solution of the NSGA-II algorithm, and the mean and variance of the corresponding 100 GPR outputs are calculated. The results are shown in Table 5.
According to the minimum variance criterion, the parameter with the smallest output variance is the most robust: PN has the maximum robustness, and PP has the minimum robustness. This information can be used to guide production.
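The single-parameter analysis above can be sketched as a loop that perturbs one parameter at a time. The toy model, the parameter names `p1` to `p3`, and the uniform sampling scheme are hypothetical and stand in for the trained GPR model and the 9 operation parameters of Table 1.

```python
import random
import statistics

def single_parameter_fluctuation(model, x_star, names, rel_range=0.02,
                                 n_samples=100, seed=0):
    """Mean and variance of the model output when one parameter at a time fluctuates."""
    rng = random.Random(seed)
    results = {}
    for idx, name in enumerate(names):
        outputs = []
        for _ in range(n_samples):
            x = list(x_star)  # all other parameters stay at the optimum
            x[idx] *= 1.0 + rng.uniform(-rel_range, rel_range)
            outputs.append(model(x))
        results[name] = (statistics.mean(outputs), statistics.pvariance(outputs))
    return results

# Hypothetical model with a strong, a weak, and no dependence on its three inputs.
toy_model = lambda x: 2.0 * x[0] + 0.1 * x[1] + 0.0 * x[2]
res = single_parameter_fluctuation(toy_model, [1.0, 1.0, 1.0], ["p1", "p2", "p3"])
most_robust = min(res, key=lambda name: res[name][1])
least_robust = max(res, key=lambda name: res[name][1])
```

Ranking the parameters by output variance in this way indicates which settings need the tightest control in production and which can tolerate drift.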
3.4. Comparison Experiment
To further verify the effectiveness of the proposed method, this paper performs comparison experiments.
(1) Comparison with other modelling methods.
This paper compares the MGPR model with backpropagation neural network (BPNN) and support vector machine (SVM) modelling methods. The details of BPNN and SVM are described in reference [19]. The model settings are consistent with those of the MGPR model. The results of 45 testing samples are shown in Figure 7.
As can be seen from the results in Figure 7, MGPR gives the best model of the production process; BPNN and SVM both perform worse. This indicates that MGPR is better suited than BPNN and SVM to modelling the HCN production process. The MSE and MARE of the 3 modelling methods are shown in Table 6:
(2) Comparison with the single-objective weighted method.
The literature [20] employs a weighted method to perform single-objective robustness optimization, with GA applied to optimize the robust operation parameters. The experiment employs GA to search for the optimal parameter $x^*$ and then randomly generates 100 solutions around $x^*$ to assess the robustness of the optimized parameters. The settings of the single objective are consistent with those in the reference. The comparison results are shown in Table 7.
Table 7 shows that the robustness of the single-objective weighted method is worse than that of NSGA-II. This indicates that NSGA-II performs better in the robustness optimization of nonlinear chemical process operation parameters.
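To make the contrast with the multiobjective approach concrete, the sketch below shows one common form of a weighted method, the weighted sum of the two objectives. Whether reference [20] uses exactly this form is not stated here, so the fitness function, the weights, and the candidate values (invented, loosely within the ranges reported for Figure 4) are all assumptions.

```python
def weighted_fitness(f1, f2, weight):
    """Scalar fitness: reward performance f1, penalise variance f2 (assumed form)."""
    return weight * f1 - (1.0 - weight) * f2

# Hypothetical (performance, variance) pairs.
candidates = [(70.0, 9.0), (65.0, 3.0), (57.0, 0.0)]

def best_for_weight(weight):
    """Candidate with the highest weighted fitness for a given weight."""
    return max(candidates, key=lambda c: weighted_fitness(c[0], c[1], weight))

picks = {w: best_for_weight(w) for w in (0.9, 0.5, 0.1)}
```

Each weight yields a single solution, and the choice of weight fixes the trade-off before the search starts, whereas a Pareto frontier exposes the whole range of trade-offs and leaves the choice to the decision maker.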
4. Conclusions
This paper proposes a robust industrial process parameter optimization method based on a data-driven model and fluctuation analysis. The data-driven method MGPR is applied to model the relationship between the operation parameters and the output objective performance of the production process. The modelling results show that it has an excellent generalization performance. The robustness criterion is established from the model output and the minimum variance criterion. The robustness criterion and the model output are used as the fitness functions of NSGA-II for robustness optimization and fluctuation analysis. To analyze the influence of a single parameter on robustness, cross validation is applied to evaluate the model output under a 2% fluctuation. It has been verified on the HCN production process that the robust optimal solution is more robust than the single-objective optimal solution. The comparison with other modelling methods and with the single-objective weighted method also shows the better performance of the proposed method.
Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the Scientific and Technological Research Program of Chongqing Municipal Education Commission, KJZDK201801502.