#### Abstract

Recently, Gaussian Process (GP) has attracted considerable attention from industry. This article focuses on coal fired boiler combustion and uses GP to design a strategy for reducing Unburned Carbon Content in Fly Ash (UCC-FA), which is the most important indicator of boiler combustion efficiency. Because a data-driven model such as GP avoids the complicated physical mechanisms, it is an effective way to address this problem. First, GP is used to model the relationship between the UCC-FA and the boiler combustion operation parameters. The hyperparameters of the GP model are optimized via Genetic Algorithm (GA). Then, the UCC-FA predicted by the GP model serves as the objective of a second GA framework that searches for the optimal operation plan for the boiler combustion. Based on 670 sets of real data from a high capacity tangentially fired boiler, two GP models with 21 and 13 inputs, respectively, are developed. In the experimental results, the model with 21 inputs provides better prediction performance than the model with 13 inputs. Using the 21-input model, the UCC-FA decreases from 2.7% to 1.7% by optimizing some of the operational parameters, which is a reasonable achievement for the boiler combustion.

#### 1. Introduction

With the rapid development of data science in traditional industries, the emergence of comprehensive data repositories has resulted in an explosion of information. However, there is a growing gap between the massive data (both useful and useless) and the users’ ability to deal with the information effectively. A large number of data mining algorithms try to bridge this gap. Some of them have been successfully used in various industrial applications, for example, process control [1], quality improvement in industry [2], enhancement in semiconductor manufacturing [3], improving personnel selection and enhancing human capital [4], soft sensors [5], drug design [6], and robotic systems [7].

Among the developed data mining technologies, Gaussian Process (GP), a generalization of a multivariate Gaussian distribution to infinitely many variables, is attracting increasing attention because of its flexible nonparametric nature and computational simplicity. GP models are constructed from classical statistical models by replacing latent parametric functions with random processes that have a Gaussian prior [8]. GP predictors are widely used in nonparametric Bayesian approaches to supervised learning problems [9]. They can also be used as components for other tasks including unsupervised learning [10] and dependent processes for a variety of applications [11, 12]. Unlike many other algorithms, GP provides a confidence estimate for each prediction. GP has proved useful and powerful for regression in various fields, such as biological systems [13], environmental systems [14], and chemical engineering [15]. Our work focuses on an application in the power generation industry: we use a GP model to describe the process of coal fired boiler combustion and to optimize the combustion for high efficiency.

Coal was the primary energy resource during the last century and will remain the main fuel for power generation worldwide at least up to 2030 [16]. The International Energy Agency (IEA) reported that a total of 85 GW of coal fired power generation capacity came on line worldwide between 2010 and 2015. In China, coal fired power generation capacity grew by about 724 GW between 2000 and 2015. Among the different types of coal fired power generation units in the world, pulverized coal combustion units account for about 98% of the total electricity produced from coal [17]. As the most important indicator of combustion efficiency, Unburned Carbon Content in Fly Ash (UCC-FA) is usually used to evaluate the efficiency of a pulverized coal fired boiler; high UCC-FA reflects large losses of fuel [18]. Moreover, the performance of electrostatic precipitators is degraded by high UCC-FA because carbon is highly conductive and loses electric charge quickly. As a consequence, the stack opacity may increase considerably, making it difficult to comply with particulate emission regulations. UCC-FA is usually in the range of 1–12%, but it can exceed 20% in some cases [19, 20]. Acar and Atalay’s study [21] showed that the world production of coal fly ash was estimated at about 500 million tonnes per year and was forecast to increase, since the global reliance on coal fired power generation would remain heavy in the next few decades. Power plant operators are highly interested in technologies for reducing UCC-FA for economic reasons. From the viewpoint of the basic mechanism of unburned carbon formation in fly ash, both high combustion temperature and high oxygen concentration can reduce the UCC-FA during the combustion process.
However, this mechanism for reducing UCC-FA conflicts with the combustion conditions for low NOx emission, which require low combustion temperature and low oxygen concentration. As is well known, NOx is the most alarming pollutant from power plants because of its great impact on the environment and human health. Coal combustion for electric power generation is a main source of NOx emission. In China, more than 60 percent of NOx emissions, about 7.7 million tons, come from coal fired power plant boilers because of the large share of coal fired power generation in total power generation [22]. To reduce NOx emission from coal fired power plants, a number of laws and regulations have been passed to limit NOx emission from coal fired boilers. In addition, the installation of low NOx technology on boilers has been made mandatory. This means that boiler operators can no longer apply their existing models and experience to achieve low UCC-FA as before. Therefore, rebuilding the UCC-FA reduction model becomes very important for these power plants.

For optimization, UCC-FA must be predicted for various parameter settings. Traditionally, there are three ways to achieve this goal: experimental, computational, and indexed [23]. The experimental way is direct but expensive, since it requires carrying out combustion experiments in test furnaces or operating plants. The computational way is based on computational fluid dynamics (CFD) codes and simplified mechanism formulations. Building CFD models is complex and time-consuming and is mainly done for boiler design or for retrofitting old boilers. The indexed method predicts combustion behavior using indices associated with the natural properties of coals. Mean vitrinite reflectance [24], heat value, volatile matter of the coals [25], and fuel ratio [26] are commonly used as simple indices for burnout prediction. Besides these three traditional ways, data-driven modeling is gaining popularity for its flexibility as effective methods are discovered and computing power increases.

The main contributions of this article are as follows:

(i) We build GP models to predict UCC-FA with selected inputs and covariance function. Unlike most other approaches, the GP model provides an error bar for each prediction, which indicates the degree of confidence of the model. A Genetic Algorithm (GA) framework is used to optimize the GP model.

(ii) With the predicted UCC-FA, we present an optimization algorithm for the combustion in order to minimize the UCC-FA in the subsequent coal firing process. GA is also the optimizer in this procedure.

(iii) Based on 670 sets of real data, two models with different numbers of inputs are implemented and their UCC-FA prediction performances are compared.

#### 2. Related Works

Many researchers have tried various ways of using data to model the relationship between UCC-FA and combustion operation parameters. Hao et al. [27] introduced artificial neural networks (ANN) to build a model for UCC-FA prediction based on 21 sets of experimental data. One set of data was utilized for combustion optimization. Zhao et al. [28] used backpropagation (BP) neural networks to model the relationship between UCC-FA and 11 boiler operation parameters for a four-corner tangentially fired pulverized coal utility boiler. An 11-23-1 type BP neural network model was used and the prediction error on the model-building data was less than 6%. Cai et al. [29] developed a Support Vector Machine (SVM) model to predict the UCC-FA of a coal fired boiler. The evaluation indices were the root mean square error (RMSE) and the mean relative error (MRE), which describe the prediction performance. However, the prediction performance is difficult to evaluate for a new combustion condition. Bian et al. [30] used SVM to predict UCC-FA and particle swarm optimization (PSO) to optimize the parameters of the model. The simulation results showed highly accurate predictions and good generalization ability. This work, based on experimental data from a 1000 MW ultrasupercritical unit, explored soft sensing of the UCC-FA. Zhu et al. [31] combined the artificial bee colony algorithm with SVM (ABC-SVM) to build a model to predict UCC-FA for a 1000 MW utility boiler with 48 sets of experimental data and achieved better prediction performance than a BP model. Zhang and Bao [32] developed a method based on adaptive weighted fusion and least squares SVM algorithms. With 30 groups of data, this method achieved good prediction performance. Their work also aimed at soft sensing of the UCC-FA. All of the above studies confirm that data-driven modeling is an effective way to predict UCC-FA for boiler combustion.
However, most existing methods do not consider the uncertainty in prediction and may therefore be overconfident in their predictions. This is because the given data may not be enough to describe the real form of the physical properties. As a result, using such models may mislead the optimization process. In practice, we need to know whether the given data are sufficient for modeling. If not, the boiler manager should provide more data by increasing the sensing frequency.

#### 3. Preliminaries

##### 3.1. Boiler

Our research is based on a 330 MW dual-furnace tangentially fired boiler that adopts the air-staged low NOx combustion technique. This boiler is one of the main boilers in the JianBi power plant, which is located in Jiangsu province, China. The dimensions of the furnace are about 17 m × 8.475 m × 46 m. The boiler is equipped with four elevations of first air burners (AA–DD), five elevations of secondary air burners (A–E), and one elevation of over fire air (OFA) burners. Figure 1 shows the arrangement of the burners distributed in each corner of the furnace. The OFA is extracted from the secondary air and is also known as the F level of the secondary air. Four medium-speed coal grinding mills and four corresponding pulverized coal feeders supply pulverized coal to these burners. The coal-air mixture streams are directed at the circumference of an imaginary circle at the center of the furnace. This boiler is designed for bituminous coal and is equipped with a distributed control system (DCS) which can measure and report the operating parameters.

**Figure 1: (a) dimension of the furnace; (b) arrangement of the burners; (c) cross section of the furnace.**

A total of 670 sets of real operation data of this boiler are collected from the DCS, including four levels of first wind speeds, five levels of second wind speeds, the OFA speed, the rotation speeds of the four coal feeders, the oxygen concentration at the furnace outlet, the load, and the total air flow rate. The UCC-FA corresponding to each set of data is acquired from the UCC-FA monitoring system. Coal quality properties, including volatile content, ash content, moisture content, and heating value, are obtained from running records. All these data come from the normal production process over 5 days, and the coal is not changed during this period.

##### 3.2. Gaussian Process

The GP algorithm for supervised learning was popularized by Rasmussen and Williams [33] and was inspired by Neal’s work [34], which shows that Bayesian Neural Networks converge to GP in the limit of an infinite number of units under certain conditions.

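Suppose that a set of $N$ data points $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$ exists ($\mathbf{x}_i$ is the $d$-dimensional input vector and $y_i$ is the corresponding output). We assume that each $y_i$ depends on a latent variable $f_i$: $y_i = f_i + \varepsilon_i$, where $\varepsilon_i \sim \mathcal{N}(0, \sigma_n^2)$ is the noise. The latent variables are collected as $\mathbf{f} = [f_1, \ldots, f_N]^{\top}$. For regression, a zero-mean multivariate Gaussian prior distribution is placed over $\mathbf{f}$:

$$p(\mathbf{f} \mid X, \theta) = \mathcal{N}(\mathbf{0}, K),$$

where $K$ is the $N \times N$ covariance matrix dependent on the training inputs $X = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ and the hyperparameter $\theta$. The $(i, j)$th element of $K$ is $k(\mathbf{x}_i, \mathbf{x}_j)$, where $k$ is a positive definite function parameterised by $\theta$; $k$ is also known as the covariance function. Given a set of data and a covariance function, predictions can be made. To do so, we consider a test point $\mathbf{x}_*$ and the associated latent variable $f_*$. Under the GP framework, the joint distribution of $\mathbf{f}$ and $f_*$ is also a zero-mean multivariate Gaussian:

$$\begin{bmatrix} \mathbf{f} \\ f_* \end{bmatrix} \sim \mathcal{N}\left(\mathbf{0}, \begin{bmatrix} K & \mathbf{k}_* \\ \mathbf{k}_*^{\top} & k_{**} \end{bmatrix}\right),$$

where $\mathbf{k}_*$ is the vector of covariances between $\mathbf{x}_*$ and the training inputs and $k_{**} = k(\mathbf{x}_*, \mathbf{x}_*)$. Given the Gaussian noise assumption, we can express the joint distribution over the observed targets $\mathbf{y} = [y_1, \ldots, y_N]^{\top}$ and the unobserved target $y_*$:

$$\begin{bmatrix} \mathbf{y} \\ y_* \end{bmatrix} \sim \mathcal{N}\left(\mathbf{0}, \begin{bmatrix} K + \sigma_n^2 I & \mathbf{k}_* \\ \mathbf{k}_*^{\top} & k_{**} + \sigma_n^2 \end{bmatrix}\right).$$

Because the joint distribution is Gaussian, the conditional distribution of $y_*$ follows from the standard formula:

$$p(y_* \mid \mathbf{y}) = \mathcal{N}(\mu_*, \sigma_*^2),$$

where $\mu_* = \mathbf{k}_*^{\top} (K + \sigma_n^2 I)^{-1} \mathbf{y}$ and $\sigma_*^2 = k_{**} + \sigma_n^2 - \mathbf{k}_*^{\top} (K + \sigma_n^2 I)^{-1} \mathbf{k}_*$.

Hence, with a covariance function defined by $\theta$, we can calculate a Gaussian predictive distribution for any test point $\mathbf{x}_*$. We can also calculate the multivariate Gaussian predictive distribution for any set of $M$ test points as follows:

$$p(\mathbf{y}_* \mid \mathbf{y}) = \mathcal{N}(\boldsymbol{\mu}_*, \Sigma_*), \qquad \boldsymbol{\mu}_* = K_*^{\top} (K + \sigma_n^2 I)^{-1} \mathbf{y}, \qquad \Sigma_* = K_{**} + \sigma_n^2 I - K_*^{\top} (K + \sigma_n^2 I)^{-1} K_*,$$

where $K_*$ is the $N \times M$ matrix of covariances between the training inputs and the test points. The $M \times M$ matrix $K_{**}$ consists of the covariances between each pair of test points. More details on how to design and use GP are described in [33].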
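As an illustration of the predictive computation described above, the following is a minimal NumPy sketch rather than the authors' implementation. It assumes a squared-exponential covariance function with fixed, arbitrarily chosen hyperparameters (the article selects a different covariance function in Section 4.1), and the names `se_kernel` and `gp_predict`, the noise variance, and the synthetic data are all hypothetical.

```python
import numpy as np

def se_kernel(A, B, signal_var=1.0, length_scale=1.0):
    """Squared-exponential covariance between the rows of A and B (assumed kernel)."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_predict(X, y, X_star, noise_var=1e-2, kernel=se_kernel):
    """Predictive mean and variance of the noisy targets at each row of X_star."""
    K_noisy = kernel(X, X) + noise_var * np.eye(len(X))  # training covariance plus noise
    K_star = kernel(X, X_star)                           # N x M cross-covariance
    K_ss = kernel(X_star, X_star)                        # M x M test covariance
    L = np.linalg.cholesky(K_noisy)                      # stable alternative to an inverse
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_star.T @ alpha
    V = np.linalg.solve(L, K_star)
    cov = K_ss + noise_var * np.eye(len(X_star)) - V.T @ V
    return mean, np.diag(cov)

# Synthetic one-dimensional data, for illustration only.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(30, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(30)

# One test point inside the training range and one far outside it.
X_star = np.array([[0.0], [10.0]])
mean, var = gp_predict(X, y, X_star)
```

The predictive variance plays the role of the error bar: it stays small near the training data and grows back towards the prior variance plus the noise variance for the test point far from any observation.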

#### 4. UCC-FA Modeling and Optimization

In this section, a nonlinear model to determine UCC-FA properties using GP is presented and the optimal operation parameter for minimizing UCC-FA is obtained based on the predicted UCC-FA values. The entire process of the system is illustrated in Figure 2.

##### 4.1. Covariance Function

When building a nonlinear GP model, selecting the covariance function is challenging and important because it has a great influence on the prediction result. To date, there is no theory about how to choose a covariance function for a given problem. In our work, we tried many different covariance functions, such as the radial basis function (RBF), sigmoid function, and polynomial function, to model these data, and compared the prediction performances of the resulting models. The Rational Quadratic covariance function, which has four parameters, is selected for this work.

With the Rational Quadratic covariance function selected, its four parameters need to be determined. These parameters have a significant influence on the GP model, but how to determine or calculate them is a complex problem that has not been solved theoretically. In this study, GA is used to optimize these parameters: the parameter values at which the evaluation function reaches its minimum are taken as the best values.
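The hyperparameter search can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the exact four-parameter form of the covariance function is not recoverable here, so the sketch assumes the standard Rational Quadratic form (signal variance, length scale, shape parameter `alpha`) with a noise variance as the fourth parameter. The GA operators (binary tournament, arithmetic crossover, Gaussian mutation, elitism) are likewise assumptions. The population size and the crossover and mutation probabilities follow the values reported in Section 5.1, while the number of generations is reduced to keep the example fast.

```python
import numpy as np

def rq_kernel(A, B, signal_var, length_scale, alpha):
    """Standard Rational Quadratic covariance (assumed form)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    sq = np.maximum(sq, 0.0)
    return signal_var * (1.0 + sq / (2.0 * alpha * length_scale**2)) ** (-alpha)

def neg_log_marginal_likelihood(log_theta, X, y):
    """Evaluation function: negative log marginal likelihood of a zero-mean GP."""
    signal_var, length_scale, alpha, noise_var = np.exp(log_theta)
    K = rq_kernel(X, X, signal_var, length_scale, alpha)
    K = K + (noise_var + 1e-8) * np.eye(len(X))
    try:
        L = np.linalg.cholesky(K)
    except np.linalg.LinAlgError:
        return np.inf  # reject numerically invalid hyperparameters
    a = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ a + np.sum(np.log(np.diag(L))) + 0.5 * len(X) * np.log(2 * np.pi)

def ga_minimize(fitness, bounds, pop_size=50, p_cross=0.8, p_mut=0.25,
                generations=40, seed=0):
    """Small real-coded GA; returns the best individual, its score, and the score history."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    pop = rng.uniform(lo, hi, size=(pop_size, len(lo)))
    scores = np.array([fitness(ind) for ind in pop])
    history = [scores.min()]

    def tournament():
        i, j = rng.integers(pop_size, size=2)
        return pop[i] if scores[i] <= scores[j] else pop[j]

    for _gen in range(generations):
        children = [pop[np.argmin(scores)].copy()]  # elitism keeps the best so far
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            child = p1.copy()
            if rng.random() < p_cross:              # arithmetic crossover
                w = rng.random(len(lo))
                child = w * p1 + (1.0 - w) * p2
            if rng.random() < p_mut:                # Gaussian mutation
                child = child + rng.normal(0.0, 0.1 * (hi - lo))
            children.append(np.clip(child, lo, hi))
        pop = np.array(children)
        scores = np.array([fitness(ind) for ind in pop])
        history.append(scores.min())
    best = np.argmin(scores)
    return pop[best], scores[best], history

# Synthetic data standing in for the boiler records (illustration only).
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(40, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(40)

bounds = np.array([[-3.0, 3.0]] * 4)  # search box for the log-hyperparameters
best_log_theta, best_nlml, history = ga_minimize(
    lambda t: neg_log_marginal_likelihood(t, X, y), bounds
)
```

Searching in log space keeps all four parameters positive without explicit constraints, and elitism guarantees that the best evaluation value never gets worse from one generation to the next.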

##### 4.2. Input Determination

Modeling the UCC-FA for a boiler that has already been fitted with low NOx combustion technologies is a difficult problem because it involves chemical reactions, thermal phenomena such as turbulent flow and turbulent transfer processes, coal particle motion, and turbulent diffusion [35]. Many factors, for example, the coal properties, the combustion system design, and the operating conditions, have significant influences on the UCC-FA. For instance, a high moisture content can impair char burnout and reduce the flame temperature, which increases the UCC-FA. The carbon content in fly ash commonly has a similar impact on the UCC-FA as the moisture content. In contrast, the volatile content is negatively correlated with UCC-FA. For a pulverized coal fired boiler, the particle size also strongly influences the UCC-FA. A smaller coal particle size leads to lower heat transfer resistance and a more porous structure, which makes the penetration of oxygen easier, promotes char oxidation reactions, and reduces the UCC-FA. Firing the design coal is another important factor in decreasing the UCC-FA and improving combustion performance. However, under the pressure of competition and tightening emission regulations, power plant managers are concerned about fuel costs and sometimes prefer off-design coal for financial reasons. This preference strongly influences the combustion performance and burnout.

Besides the above factors, the combustion operating conditions also influence UCC-FA greatly, for example, reducing air/fuel distribution imbalances; adjusting the excess air ratio/oxygen concentration; and changing the distribution of the first air speeds, the second air speeds, and the over fire air (OFA) speed. For a given boiler and a given type of coal, the reduction of UCC-FA relies on the adjustment of the combustion operation parameters. A suitable combustion operation strategy can achieve a low UCC-FA. Variation in either fuel properties or operating conditions may cause difficulties in coal combustion and lead to increased UCC-FA [26]. Adjusting the operation parameters (distribution of the first and second air speeds, oxygen concentration at the outlet of the furnace, etc.) can change unburned carbon burnout [36]; keeping the air/coal distribution balanced among the burners and matching the operating conditions to the coal characteristics are particularly crucial for UCC-FA [27].

Based on the analysis above, we should use all 21 inputs, including all operation parameters acquired from the DCS and the coal properties acquired from running records. The input variables are shown in Table 1. However, considering more inputs is computationally expensive. Although the modeling step is offline and its runtime can be ignored, the combustion optimization should be an online process, in which latency may significantly affect the performance of the boiler in practice. Therefore, we also consider a model with fewer inputs. Following the suggestion of the boiler operators, at least 13 inputs are retained: four levels of the first wind speeds, five levels of the second wind speeds, the OFA speed, the oxygen concentration at the outlet of the furnace, the load, and the total air rate. Among them, the load and the total air rate describe the overall background of the combustion, and the remaining inputs are commonly used by operators to tune the combustion process of the boiler.

In this article, we propose two models with 21 and 13 inputs, respectively, for UCC-FA prediction. There is a trade-off between the two models, which is discussed later.

##### 4.3. Combustion Optimization

With the predicted UCC-FA values, the combustion can be optimized by adjusting the operation parameters. For a given coal fired boiler in the normal production process, some parameters are not adjustable, such as the load and the properties of the coal. The load is the demand from the client and cannot be adjusted by boiler operators; in fact, the operators tune the combustion process according to the load. The properties of the coal are also fixed. In addition, the pulverized coal feeder speed is seldom adjusted if the load does not change. For the boiler we investigated, the parameters adjusted most often are the first wind speeds, the second wind speeds, the OFA speed, and the oxygen concentration at the outlet of the furnace. Based on the UCC-FA prediction results, we again use a GA framework to optimize the operational plan (the adjustable parameters) to achieve low UCC-FA. Unlike some existing works, we supply the GA with the prediction range given by the error bar of the GP model, instead of single point values with weak confidence.
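The idea of optimizing only the adjustable parameters while using the prediction range can be sketched as below. This is a hedged illustration, not the authors' procedure: how the error bar enters the GA is not specified in detail, so the sketch assumes the objective is the upper end of the range, mean plus `kappa` standard deviations. The predictor `toy_predict`, the three-element input vector (oxygen concentration, one second wind speed, load), the bounds, and the GA operators are all hypothetical stand-ins.

```python
import numpy as np

def toy_predict(x):
    """Hypothetical stand-in for a trained GP: returns (mean, std) of UCC-FA in %."""
    mean = 1.5 + 0.3 * (x[0] - 4.0) ** 2 + 0.2 * (x[1] - 30.0) ** 2 / 25.0 + 0.001 * x[2]
    std = 0.05 + 0.02 * abs(x[0] - 3.5)
    return mean, std

def optimise_adjustable(predict, x_current, adjustable, bounds, kappa=2.0,
                        pop_size=50, p_cross=0.8, p_mut=0.25, generations=60, seed=0):
    """GA over the adjustable inputs only; the other inputs stay at their current values."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]

    def objective(genes):
        x = x_current.copy()
        x[adjustable] = genes
        mean, std = predict(x)
        return mean + kappa * std  # assumed objective: upper end of the error bar

    pop = rng.uniform(lo, hi, size=(pop_size, len(adjustable)))
    pop[0] = x_current[adjustable]  # start from the current operating point
    for _gen in range(generations):
        scores = np.array([objective(g) for g in pop])
        order = np.argsort(scores)
        parents = pop[order[: pop_size // 2]]  # truncation selection
        children = [pop[order[0]].copy()]      # elitism
        while len(children) < pop_size:
            p1, p2 = parents[rng.integers(len(parents), size=2)]
            if rng.random() < p_cross:         # uniform crossover
                child = np.where(rng.random(len(lo)) < 0.5, p1, p2)
            else:
                child = p1.copy()
            if rng.random() < p_mut:           # Gaussian mutation
                child = child + rng.normal(0.0, 0.05 * (hi - lo))
            children.append(np.clip(child, lo, hi))
        pop = np.array(children)
    scores = np.array([objective(g) for g in pop])
    x_best = x_current.copy()
    x_best[adjustable] = pop[np.argmin(scores)]
    return x_best, scores.min()

# Hypothetical operating point: oxygen (%), second wind speed (m/s), load (MW).
x_current = np.array([3.0, 25.0, 300.0])
adjustable = [0, 1]                           # the load (index 2) is not adjustable
bounds = np.array([[2.5, 5.0], [20.0, 40.0]])  # allowed range of each adjustable input
x_best, best_score = optimise_adjustable(toy_predict, x_current, adjustable, bounds)
```

Seeding the population with the current operating point and keeping the best individual in every generation ensures that the returned plan is never worse, under this objective, than the plan already in use.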

#### 5. Performance Evaluation

In this section, we first test the 21-input and 13-input models on UCC-FA prediction and then optimize the combustion to reduce the UCC-FA based on the better of the two models.

##### 5.1. UCC-FA Prediction

Tenfold cross-validation (10 CV) is applied to the 670 sets of data to build the GP models for predicting the UCC-FA of the boiler. The training data and the test data are the same for the two models in each fold. The parameters of GA are set as follows: the population size is 50, the probability of crossover is 0.8, the probability of mutation is 0.25, the maximum number of generations is 1000, and the evaluation function is the negative log marginal likelihood. The mean relative errors of the two models over the ten folds are illustrated in Figure 3. It can be seen that the mean error of the 21-input model is smaller than that of the 13-input model in all 10 trials. Therefore, the model with 21 inputs is selected to optimize the combustion in the next step. The relationship between the measured and predicted data for the selected model over all 670 sets of data is shown in Figure 4.
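The evaluation protocol, tenfold cross-validation with the mean relative error computed per fold on identical splits for both models, can be sketched as follows. To keep the example short, a least-squares linear model stands in for the GP; that substitution, the synthetic data, and the choice of the first 13 columns as the reduced input set are assumptions made only for illustration.

```python
import numpy as np

def mean_relative_error(y_true, y_pred):
    """Mean of |prediction - measurement| / |measurement|."""
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

def linear_fit_predict(X_train, y_train, X_test):
    """Stand-in model (least squares with intercept); the article uses GP here."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef = np.linalg.lstsq(A, y_train, rcond=None)[0]
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def k_fold_mre(fit_predict, X, y, k=10, seed=0):
    """Return the mean relative error of each of the k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    errors = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        y_pred = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        errors.append(mean_relative_error(y[test_idx], y_pred))
    return errors

# Synthetic stand-in for the 670 records with 21 inputs (illustration only).
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(670, 21))
y = 2.5 + 0.3 * X[:, 0] + 0.3 * X[:, 20] + 0.02 * rng.standard_normal(670)

# Same seed, hence identical folds, for the full and the reduced input sets.
errors_21 = k_fold_mre(linear_fit_predict, X, y, k=10, seed=0)
errors_13 = k_fold_mre(linear_fit_predict, X[:, :13], y, k=10, seed=0)
```

Because the synthetic target depends on a column that the reduced set omits, the 13-input stand-in model shows a larger error in this toy setting; the comparison on the real data is the one reported in Figure 3.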

##### 5.2. Combustion Optimization

GA is also used in the combustion optimization with the following parameters: the population size is 50, the probability of crossover is 0.8, the probability of mutation is 0.25, the maximum number of generations is 1000, and the evaluation function is the predicted UCC-FA, which is minimized. Table 2 displays the optimization results. The adjustable parameters of Case 1 are those of the normal production process. The adjustable parameters of Case 2 are obtained by optimizing the normal combustion condition. The UCC-FA of Case 1 is acquired from the continuous monitoring system, and the UCC-FA of Case 2 is predicted using the GP model. As shown in Figure 5, the UCC-FA decreases from 2.7% to 1.7% over the optimization iterations.

Comparing Case 1 (manually optimized combustion condition) with Case 2 (optimized combustion condition) in Table 2, we can see that the oxygen concentration at the outlet of the furnace and the distribution pattern of the secondary winds are changed. The increase in oxygen concentration agrees with the mechanism of decreasing UCC-FA in combustion. The change in the distribution of the second wind speeds, which strengthens the top level (E level), also helps reduce UCC-FA according to experiments carried out on this boiler. The changes in the first wind speeds are less obvious than those in the second wind speeds because, for combustion safety, the operational range of the first wind speed is limited to a very narrow interval. For this boiler, the influences of the OFA and of the distribution of secondary winds on UCC-FA should be considered jointly, because the OFA is extracted from the secondary wind.

#### 6. Conclusions

Unlike many existing studies with a similar goal, our work models the relationship between UCC-FA and the boiler combustion operation parameters with GP, which gives an error bar for each prediction. GA is used as the optimizer in both the modeling step and the UCC-FA reduction step. The results show that the proposed approach is feasible.

The comparison of the predictions of the two models shows that the model with fewer inputs has slightly larger mean errors. On the other hand, with fewer inputs, both offline modeling and online combustion optimization are computationally cheaper. Therefore, if slightly lower prediction accuracy is acceptable, the 13-input model is more suitable for online optimization in practice.

In addition, we do not optimize the pulverized coal feeder speed because the managers of the boiler conservatively prefer to optimize only the most frequently tuned parameters. Nevertheless, the distribution of the coal feeder speeds has some influence on UCC-FA because it may change the air/coal ratio and the balance of the furnaces. Intuitively, we may obtain a better result if we also optimize the pulverized coal feeder speed, which we leave for future work.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This study was supported by the State Nature Science Foundation of China (no. 61375078; no. 61304211) and China Scholarship Council.