#### Abstract

Producing fuel ethanol by straw fermentation is a complex process with many variables, large time lags, and strong nonlinearity. Key parameters such as ethanol concentration and cell concentration are difficult to measure directly online. To address this problem, a soft sensing model of straw-fermented ethanol based on improved support vector regression (SVR) is proposed. Based on an analysis of the process of ethanol production from straw fermentation, the Bayesian method is used to optimize support vector regression (BSVR), and the concepts of generation prior and generation likelihood are introduced to optimize the data prediction model. Comparative model training and testing experiments are carried out. The simulation results show that the proposed BSVR method outperforms SVR: it improves the generalization ability and the anti-interference capability of the model, and its prediction accuracy and stability are higher.

#### 1. Introduction

In recent years, with the rapid development of biological fermentation technology, bioenergy has been increasingly applied in everyday life. Bioenergy includes synthetic methane, ethanol, and butanol [1], most of which can be used as fuel. China is a large agricultural country, and about 932 million tons of straw is left over every year. Straw of all kinds is a substrate for biological fermentation and a main raw material for bioenergy. Recycling straw prevents the environmental pollution caused by straw burning and, at the same time, yields bioenergy that benefits environmental protection, the country, and the people. The use of straw as a raw material has been studied for many years; as early as 2004, Iogen Company chose straw for conversion into ethanol [2]. Limited by fermentation technology and control technology, the ethanol yield of straw fermentation is not yet high, and the low production capacity hinders industrialization and civil use. The production of ethanol by straw fermentation is a complex multivariable, strongly coupled, and strongly nonlinear system influenced by many environmental conditions, such as temperature, initial pH, dissolved oxygen (DO), strain quantity, rotational speed, and organic acids [3]. The mechanism is complex, and a precise model of the controlled object is difficult to establish. During ethanol production, some key parameters that strongly affect the quality of the fermentation process (such as cell concentration, substrate concentration, and ethanol concentration) are difficult to measure directly [4–6]. At present, these key parameters are obtained through periodic offline assay, analysis, and measurement in the laboratory, which delays the availability of variable information and makes it difficult to optimize and control the fermentation process in real time.
Moreover, offline sampling increases the risk of bacterial contamination during fermentation, which reduces ethanol yield and quality [7, 8]. Therefore, achieving online measurement of the key parameters is of great significance for improving ethanol yield and quality and for promoting the industrial production of fuel ethanol.

Soft measurement technology has been used in the biological fermentation industry. However, soft measurement techniques differ between fermentation processes, and modeling should be conducted according to the specific fermentation process [9]. The accuracy of the model directly affects the accuracy of the soft measurement and thus the efficiency and quality of the fermentation, so building accurate soft measurement models is especially important. Producing ethanol by straw fermentation involves many parameters and complex control. Conventional PID control cannot meet the accuracy requirements, so soft measurement technology is required. The core of soft measurement technology is to model the fermentation process through parameters, auxiliary variables, and data preprocessing. There are two main modeling methods: mechanism modeling and data-driven modeling [10, 11]. Mechanism modeling analyzes the fermentation process to determine the biomass quantities to be measured and the factors that influence them, and then represents them with mathematical functions to create the model. The Danish scholar K. V. Gernaey established a relatively general mechanistic model [12] according to the characteristics of the fermentation process. The Japanese scholar Kouki Sakimoto established an enzymatic hydrolysis and kinetic model for ethanol production [13] and compared this model with the actual fermentation process [14]. Because the fermentation mechanism is complex, it is difficult to express clearly in mathematical form, and accurate control based on it is therefore difficult. Data-driven modeling uses intelligent algorithms to process the measurement data and trains a learning algorithm to obtain the predicted values.
Reference [15] proposes a soft measurement strategy based on kernel principal component analysis and a radial basis function neural network (KPCA-RBFNN) to meet the data requirements of real-time control in microbial fermentation production. Reference [16] adopts a neural network inverse control strategy to optimize the control of L-lysine production. Because this modeling approach does not depend on a model of the fermentation mechanism, its predictions inevitably deviate to some degree. Support vector machines (SVM) are often used for statistical regression learning [17], and partial least squares has been used to handle the collinearity problem [10]. Artificial neural networks (ANN) [18] combined with other statistical data analysis methods have been applied to construct soft measurement models [19] for nonlinear processes. ANNs require a large amount of data for training and testing [20], and the computational load is heavy. Moreover, the straw fermentation process remains a multivariable, complex dynamic system after the model has been fixed [21], so the accuracy and real-time performance of system control decrease to a certain extent. The Bayesian approach belongs to probabilistic statistics [22] and can handle the uncertainty of the system parameters. Bayesian inference has been used to select the SVM model and the support vector regression (SVR) parameter values, as well as to optimize the soft measurement model [23], and to select SVR parameter values for an SVM model [24]. This approach has been applied to soft measurement in biological fermentation, such as butanol fermentation [25], with some success.

In the production of ethanol by straw fermentation, the many parameters, large data volume, and uncertain model make ethanol control difficult and the ethanol yield low. In view of these problems, this paper proposes a soft measurement modeling method for straw-fermented fuel ethanol based on an improved support vector machine. The main contributions of this paper are as follows:

(1) The shortcomings of existing measurement and control technology in the production of ethanol by straw fermentation are analyzed, and a soft sensing model is proposed.

(2) Because the straw fermentation process involves many parameters and an uncertain model, statistical learning analysis carries certain errors, and the conventional SVR method has insufficient generalization ability and easily falls into local optima. To solve this problem, the Bayesian method is used to optimize SVR, and a soft sensing model of straw-fermented ethanol based on the BSVR method is proposed.

(3) Test functions and a multi-parameter data set are established. After training of the soft sensing model, the simulation results show that the proposed BSVR method outperforms SVR, improves the generalization ability and the anti-interference capability of the model, and has higher prediction accuracy and stronger stability.

#### 2. Straw Fermentation to Produce Ethanol Technology

To establish a soft measurement model of fuel ethanol production by straw fermentation, it is first necessary to understand the basic process and principle of ethanol production by straw fermentation. Suitable fermenters and supporting equipment are designed according to the requirements of straw fermentation, and the control system of the fermentation process is designed according to the control requirements of the process. Liquid fermentation is generally used in large-scale straw fermentation for ethanol production. The process generally consists of four stages: straw pretreatment, cellulose hydrolysis, saccharification fermentation, and ethanol separation.

The process flowchart is shown in Figure 1.

As shown in Figure 1, the principle of ethanol production by straw fermentation is to break down the complex structure of the straw by crushing so that cellulose, hemicellulose, and lignin can be separated. Under the action of enzyme preparations, cellulose and hemicellulose are converted into six-carbon and five-carbon sugars. After straw pretreatment, cellulase hydrolysis, and selection of the fermentation strains and fermentation processes, liquid fermentation and solid fermentation are combined. A two-step fermentation method for ethanol production is thereby formed, and the liquid synchronous fermentation process is controlled. The straw is crushed, pickled, pretreated, boiled, and filtered. The fermentation strain is added to the filtrate, and fuel ethanol is obtained through liquid fermentation and distillation. The fermentation strain is also added to the filter residue, and fuel ethanol is obtained by solid fermentation. The solid waste residue and the waste liquid are fermented by *Candida utilis* yeast to produce high-protein feed (single-cell protein, SCP). According to the process and mechanism analysis, the initial pH, temperature, ammonia or ammonium bicarbonate concentration, dilute sulfuric acid concentration, solid-liquid ratio, time, and stirring speed of the fermentation process are determined as the inputs of the soft measurement, and the cell concentration and ethanol yield are determined as its outputs.
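The input and output variables listed above can be collected in a small data structure before any model is trained. The following is a minimal sketch only: the class names, field names, and units are hypothetical, and the sample values are placeholders except for the pH of 7.12 and the temperature of 50°C, which are taken from the experimental section.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class SoftSensorInput:
    """Auxiliary variables of the soft measurement (field names are hypothetical)."""
    initial_ph: float
    temperature_c: float
    ammonia_conc: float          # ammonia or ammonium bicarbonate concentration
    sulfuric_acid_conc: float    # dilute sulfuric acid concentration
    solid_liquid_ratio: float
    time_h: float
    stirring_speed_rpm: float


@dataclass(frozen=True)
class SoftSensorOutput:
    """Key parameters to be estimated by the soft measurement."""
    cell_conc: float
    ethanol_yield: float


def to_feature_vector(sample: SoftSensorInput) -> list[float]:
    """Flatten one input record into the ordered feature vector fed to a regressor."""
    return list(astuple(sample))


# Placeholder record; only the pH and temperature come from the paper.
sample = SoftSensorInput(
    initial_ph=7.12,
    temperature_c=50.0,
    ammonia_conc=0.5,
    sulfuric_acid_conc=1.0,
    solid_liquid_ratio=0.1,
    time_h=12.0,
    stirring_speed_rpm=150.0,
)
features = to_feature_vector(sample)
print(features)
```

Keeping the field order fixed in one place makes it harder to feed the variables to the model in an inconsistent order between training and prediction.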

#### 3. Soft Measurement Model of Support Vector Regression

##### 3.1. Support Vector Regression

The straw fermentation process for ethanol production requires the measurement and control of multiple biological parameters. Because the model is uncertain and few initial sample data are available for microbial fermentation, the fermentation process should be modeled and controlled predictively. SVR is developed from SVM and is suitable for statistical analysis and modeling of data. For an input *x* and an output *y*, the relationship between them can be described in statistical terms by the joint probability distribution *F*(*x*, *y*). Given independent and identically distributed samples (*x*_{1}, *y*_{1}), (*x*_{2}, *y*_{2}), …, (*x*_{n}, *y*_{n}), machine learning seeks, within a given function set, the particular function *f*(*x*) that minimizes the expected risk.

The expected risk can be expressed as *R*(*ω*) = ∫*L*(*y*, *f*(*x*, *ω*)) *dF*(*x*, *y*), where *f*(*x*, *ω*) is the prediction function, *ω* is the generalized parameter, and *L*(*y*, *f*(*x*, *ω*)) is the loss function. The label values that SVM uses for prediction are −1 and +1, which limits the problems it can handle. To solve more complex problems, more label values are needed, so regression is added to the SVM. This largely retains the good generalization ability of SVM. Moreover, soft measurement modeling needs exactly this kind of regression, which predicts arbitrary continuous values.

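If the sample set is {(*x*_{i}, *y*_{i})}, *i* = 1, …, *n*, the relationship between *x* and *y* can be expressed as follows:

$$f(x) = w^{T}x + b, \tag{1}$$

where *w* is the weight vector and *b* is the threshold value. An insensitive function is added to achieve the regression effect. The *ε*-insensitive loss function is

$$L_{\varepsilon}\bigl(y, f(x)\bigr) = \max\bigl(0,\ |y - f(x)| - \varepsilon\bigr). \tag{2}$$

To allow for some error in the constraints, the relaxation variables *ξ*_{i} and *ξ*_{i}^{*} are introduced. After the introduction of the insensitive function and the relaxation variables, the optimization problem is

$$
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}}\quad & \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\bigl(\xi_i + \xi_i^{*}\bigr) \\
\text{s.t.}\quad & y_i - w^{T}x_i - b \le \varepsilon + \xi_i, \\
& w^{T}x_i + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i \ge 0,\ \xi_i^{*} \ge 0,\quad i = 1, \ldots, n.
\end{aligned}
\tag{3}
$$

*C* is the penalty coefficient, and the kernel is the Gaussian kernel function:

$$K(x_i, x_j) = \exp\!\left(-\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}}\right). \tag{4}$$

Here *σ* is the width of the Gaussian kernel function in equation (4).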
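The two building blocks named above, the insensitive loss and the Gaussian kernel, are short enough to sketch directly. This is an illustration only: the function names are hypothetical, and the kernel is assumed to use the common normalization exp(−‖x − z‖²/(2σ²)), which may differ from the paper's exact form by a constant in the exponent.

```python
import math


def eps_insensitive_loss(y_true, y_pred, eps):
    """Loss is zero inside the tube |y_true - y_pred| <= eps, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def gaussian_kernel(x, z, sigma):
    """Gaussian (RBF) kernel with width sigma; assumed form exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# A prediction inside the tube costs nothing; one outside costs the excess.
inside = eps_insensitive_loss(1.0, 1.05, eps=0.1)
outside = eps_insensitive_loss(1.0, 1.5, eps=0.1)

# Identical points have kernel value 1; the value decays with distance.
same = gaussian_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0)
apart = gaussian_kernel([0.0], [1.0], sigma=1.0)
print(inside, outside, same, apart)
```

A larger *σ* makes the kernel decay more slowly, so distant samples influence each prediction more.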

When the SVR method is used, the feature dimension of the model variables may exceed the number of sample points. The kernel mapping in formula (4) is of very high dimension, which leads to excessive computation, insufficient generalization ability, and sensitivity to abnormal data, and thus degrades the accuracy of soft sensing. To improve the adaptability of the soft sensing model, SVR needs to be improved.

##### 3.2. Bayesian Method to Optimize SVR

Because the straw fermentation process for ethanol involves many parameters and an uncertain model, statistical learning analysis carries certain errors. The Bayesian reasoning method can be used to optimize support vector regression: it effectively suppresses the explosion of the data dimension, reduces the amount of computation, and improves the real-time performance of soft sensing. The Bayesian method also has good scalability, integrates well with other algorithms, and can handle data deviation and missing data. Therefore, the Bayesian method was used to optimize the SVR, and the resulting Bayesian-optimized support vector regression model (BSVR) is compared with SVR. Bayesian methods learn mainly through probabilities; the main task is to solve for the posterior distribution and the maximum a posteriori estimate of a problem. To do so here, the notions of generation prior and generation likelihood need to be introduced. This solution method is suitable for unsupervised learning and equally for supervised learning, to which SVR belongs. SVM is often used in soft measurement, and the variant used in soft measurement models is mainly SVR. Consider a classification problem in a low-dimensional space *R*^{d}. The sample set {(*x*_{i}, *y*_{i})}, *i* = 1, …, *n*, with labels *y*_{i} ∈ {−1, +1}, is mapped into a high-dimensional space in which the optimal support vectors are solved. The optimal hyperplane is $w^{T}x + b = 0$, and the discriminant function is $f(x) = w^{T}x + b$. Minimizing the empirical risk of the optimization problem then translates to

$$\min_{w,\,b}\ \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\max\bigl(0,\ 1 - y_i\,(w^{T}x_i + b)\bigr). \tag{5}$$

To explain the derivation more intuitively, one dimension is first appended to the feature representation of each sample. This dimension is set to the constant 1, so the sample can be expressed as $\tilde{x} = (x^{T}, 1)^{T}$. To simplify the discriminant function further, the weights *w* and the bias *b* are merged into a new vector *θ*, which can be expressed as $\theta = (w^{T}, b)^{T}$. Substituting these into the original discriminant converts the expression to $f(x) = \theta^{T}\tilde{x}$, so the discriminant reduces to the inner product of the new parameter vector and the sample features.
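The effect of appending a constant 1 to the features and merging the bias into the weights can be checked numerically. The sketch below uses hypothetical helper names and arbitrary illustrative numbers; it only demonstrates that the two forms of the discriminant give the same value.

```python
def decision_value(w, b, x):
    """Original discriminant: inner product of weights and features plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def augment(x):
    """Append the constant feature 1 to a sample."""
    return list(x) + [1.0]


def merge(w, b):
    """Merge the weights and the bias into a single parameter vector."""
    return list(w) + [b]


def decision_value_augmented(theta, x_aug):
    """Discriminant written purely as an inner product."""
    return sum(t * xi for t, xi in zip(theta, x_aug))


w, b, x = [0.5, -1.0], 0.25, [2.0, 3.0]
original = decision_value(w, b, x)
augmented = decision_value_augmented(merge(w, b), augment(x))
print(original, augmented)
```

Because the bias is now just one more weight, later formulas can be written without a separate *b* term.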

This processing is applied to every element of the sample set. Because the bias *b* has been incorporated into the new parameter vector *θ*, it no longer appears explicitly, so by default $\theta = (w^{T}, b)^{T}$. For convenience in the subsequent derivation, the merged weights are still denoted by *w* and the augmented samples by *x*_{i}. At the same time, introduce the scaling factor *λ* = 2/*C*. Multiplying the objective in formula (5) by 2/*C* gives

$$\min_{w}\ \frac{\lambda}{2}\|w\|^{2} + 2\sum_{i=1}^{n}\max\bigl(0,\ 1 - y_i\,w^{T}x_i\bigr). \tag{6}$$

The scaling factor *λ* is a positive constant, so introducing it does not change the solution of the optimization problem.
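That the factor *λ* = 2/*C* leaves the solution unchanged can be illustrated with a small numerical check. The sketch assumes that formula (5) has the standard soft-margin form with a hinge loss, and that formula (6) is the same objective multiplied by 2/*C*; the source does not show either formula, so both forms, the toy data, and the function names are assumptions. *C* = 8 is used only as an illustrative value.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hinge(y, score):
    """Hinge loss for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * score)


def objective_c(w, data, C):
    """Assumed form of formula (5): 0.5*||w||^2 + C * sum of hinge losses."""
    return 0.5 * dot(w, w) + C * sum(hinge(y, dot(w, x)) for x, y in data)


def objective_lambda(w, data, lam):
    """Assumed form of formula (6): 0.5*lam*||w||^2 + 2 * sum of hinge losses."""
    return 0.5 * lam * dot(w, w) + 2.0 * sum(hinge(y, dot(w, x)) for x, y in data)


# Hypothetical toy data: augmented features (x, 1) with labels in {-1, +1}.
data = [([1.0, 1.0], 1), ([2.0, 1.0], 1), ([-1.0, 1.0], -1), ([-2.0, 1.0], -1)]
C = 8.0
lam = 2.0 / C

# Coarse grid of candidate weight vectors; both objectives are minimized over it.
grid = [[a / 4.0, b / 4.0] for a in range(-8, 9) for b in range(-8, 9)]
best_c = min(grid, key=lambda w: objective_c(w, data, C))
best_lam = min(grid, key=lambda w: objective_lambda(w, data, lam))
print(best_c, best_lam)
```

Since the second objective is the first multiplied by the positive constant 2/*C*, every candidate is ranked identically and the minimizer is the same.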

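In solving formula (6), the generation prior-likelihood pair is first derived from the function that minimizes the empirical risk, and the posterior distribution is then obtained with the Bayesian method; in this way, the Bayesian method is used to optimize the SVM. A unique prior-likelihood pair is inferred backward from the generation prior-likelihood pair. At this point, the likelihood function is expressed as the product of the per-sample terms, $\phi(y \mid X, w) = \prod_{i=1}^{n}\phi(y_i \mid x_i, w)$. Expressions for the generation prior probability and the generation likelihood function can be derived from the above conditions:

$$p_0(w) \propto \exp\!\Bigl(-\frac{\lambda}{2}\|w\|^{2}\Bigr), \qquad \phi(y_i \mid x_i, w) = \exp\!\bigl(-2\max(0,\ 1 - y_i\,w^{T}x_i)\bigr). \tag{7}$$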

The generation prior probability can be obtained from formula (7) and written in Gaussian form:

$$p_0(w) = \mathcal{N}\bigl(w \mid 0,\ \lambda^{-1}I\bigr), \tag{8}$$

where 0 is the zero mean vector and *I* is the identity matrix. However, the actual prior probability is not identical to the generation prior probability. The two differ by a factor that follows from the expression of the generation likelihood function; this factor is a function of *w* and is not a constant.
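As a worked step linking the probabilistic view back to the optimization problem, the derivation below assumes the usual construction for max-margin models: a Gaussian generation prior with precision *λ* and an exponentiated hinge loss as the generation likelihood. These forms, and the symbols $p_0$, $\phi$, and $D$ (the sample set), are assumptions chosen to be consistent with *λ* = 2/*C*; the source does not confirm them.

```latex
\begin{aligned}
p(w \mid D) &\propto p_0(w)\prod_{i=1}^{n}\phi(y_i \mid x_i, w),\\[2pt]
p_0(w) &\propto \exp\!\Bigl(-\tfrac{\lambda}{2}\lVert w\rVert^{2}\Bigr),\qquad
\phi(y_i \mid x_i, w) = \exp\!\bigl(-2\max(0,\ 1 - y_i\,w^{T}x_i)\bigr),\\[2pt]
-\log p(w \mid D) &= \frac{\lambda}{2}\lVert w\rVert^{2}
  + 2\sum_{i=1}^{n}\max\bigl(0,\ 1 - y_i\,w^{T}x_i\bigr) + \text{const}.
\end{aligned}
```

Under these assumptions, maximizing the posterior is the same as minimizing the scaled hinge-loss objective, so the maximum a posteriori estimate coincides with the solution of the original optimization problem.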

In the straw fermentation process for ethanol, the generation prior and generation likelihood are used to optimize SVR in a Bayesian manner. This improves the data adaptability of the soft sensing model and eliminates interference data. The effective data are taken as the soft sensing input, and the key parameters of the fermentation process are obtained through training and learning.

#### 4. Experimental Simulation Analysis

To verify the Bayesian optimization method for SVR, the new model was tested experimentally. The process data of straw ethanol fermentation required by the experiment were provided by the fermentation control system platform of Jiangsu University. Batch fermentation was carried out with the medium used in the standard fermentation process, with the fermentation temperature set at *T* = 50°C, the dissolved oxygen concentration *C*_{L} set to 35%∼40%, and pH = 7.12. The ethanol fermentation time was 72 hours. Ten batches of data were taken: 4 batches for the training set, 4 batches for the test set, and the remaining 2 batches as verification data for the soft sensing model. Through testing, the pH, temperature, ammonia or ammonium bicarbonate concentration, dilute sulfuric acid concentration, solid-liquid ratio, time, acetic acid, etc., were determined as auxiliary variables, and cell concentration, reducing sugar consumption, and ethanol concentration were taken as key parameters. The experimental data extracted from the straw fermentation process were fed into the BSVR soft measurement model for prediction experiments and compared with the SVR soft measurement model. Test functions were also selected for verification, and a small amount of interference data was added to the test to verify whether the BSVR soft measurement model achieves high anti-interference capability. The effect of the soft sensing model is judged comprehensively from the test results. The selected test functions include the noise terms *γ*_{1} and *γ*_{2}, where *γ*_{1} is Gaussian noise with a mean of 0 and a variance of 1, and *γ*_{2} is Gaussian noise with a mean of 0 and a variance of 0.1. Using repeated cross-validation on the data, the optimal kernel width and penalty coefficient of the support vector regression were determined as *σ* = 1 and *C* = 8.
For the two functions, 120 groups of sample data were collected for training, and then 60 groups of sample data were collected for testing. The test error is evaluated with the root mean squared error (RMSE) and the mean absolute error (MAE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(x_i - \hat{x}_i\bigr)^{2}}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\bigl|x_i - \hat{x}_i\bigr|,$$

where *x*_{i} is the true value, $\hat{x}_i$ is the predicted value, and *n* is the number of test samples.
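The two error measures can be computed in a few lines. The sketch below uses the standard definitions of RMSE and MAE; the function names and the four-point data set are illustrative and are not the paper's data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


# Illustrative values: three exact predictions and one that is off by 2.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]

print(rmse(actual, predicted))  # 1.0
print(mae(actual, predicted))   # 0.5
```

A single large error raises RMSE more than MAE, which is why reporting both gives a fuller picture of how a model reacts to interference data.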

The test results are shown in Table 1.

From the error data of the four test functions, it can be seen that the predicted data deviate somewhat from the actual data, overshoot is present, and the anti-interference ability needs to be improved. However, the error of the BSVR algorithm is smaller than that of the SVR method, which indicates that the Bayesian method filters the interference signal and reduces the error, and that the generalization ability of BSVR is higher than that of SVR.

To further verify the effectiveness of the BSVR soft measurement modeling method, the inputs were chosen, according to the process and mechanism analysis of ethanol production by straw fermentation, as the initial pH, temperature, ammonia or ammonium bicarbonate concentration, dilute sulfuric acid concentration, solid-liquid ratio, time, etc., while the cell concentration, the consumption of five-carbon and six-carbon sugars, and the ethanol production were chosen as outputs. The two batches of ethanol concentration data obtained were used to build the BSVR and SVR soft measurement models, respectively. Both models were trained on the training sample set and then used to predict the ethanol and cell concentrations.

Figures 2 and 3 show the online real-time prediction value and prediction relative error curve of cell concentration using the BSVR and SVR soft sensing models, respectively. It can be seen from the figures that the cell concentration predicted by the BSVR soft sensing model is closer to the real value. Although there are large errors in the initial stage because of insufficient training samples, once the number of samples exceeds 40 the predicted value fluctuates around the real value. The error of the SVR soft sensing model is significantly greater than that of the BSVR model, which shows that the prediction performance of the proposed method is better.

Figures 4 and 5 are the online real-time prediction values of product ethanol concentration using BSVR and SVR soft sensing models, respectively. Figures 6 and 7 are the relative error curves of ethanol concentration obtained using BSVR and SVR soft sensing models, respectively.

Comparing the two groups of predicted values with the real values obtained by laboratory analysis shows that both soft sensing models track the real values well and that the BSVR-based model is superior to the SVR-based model in fitting accuracy; its predictions are basically consistent with the offline analysis and test values. This shows that the BSVR soft sensing model tracks the real value more accurately and has better robustness and generalization ability.

It can be seen from Figures 6 and 7 that the ethanol concentration predicted by the BSVR soft sensing model follows the actual output well, and the relative error stays within ±0.03. The ethanol concentration predicted by the SVR soft sensing model follows the actual output well in the early stage of fermentation, but as fermentation proceeds the tracking deteriorates and the prediction error fluctuates greatly.
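A relative error band such as ±0.03 can be checked programmatically. The sketch below assumes the relative error is defined as (predicted − actual) / actual, which the source does not state explicitly; the function names and the concentration values are illustrative only.

```python
def relative_errors(actual, predicted):
    """Signed relative error of each prediction (assumed definition)."""
    return [(p - a) / a for a, p in zip(actual, predicted)]


def within_band(errors, band):
    """True if every relative error lies inside [-band, +band]."""
    return all(abs(e) <= band for e in errors)


# Illustrative ethanol concentrations: every prediction is off by about 2%.
actual = [10.0, 20.0, 40.0]
predicted = [10.2, 19.6, 40.8]

errs = relative_errors(actual, predicted)
print(within_band(errs, 0.03), within_band(errs, 0.01))
```

Applied to a full fermentation run, the same check shows at which sampling times a model leaves the band, which is the behavior described for SVR in the later stage of fermentation.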

To further verify the effectiveness of this method, a state observer (SO) is introduced in this paper [26]. SO is often used to predict the value of a variable at the next time step. The ethanol fermentation process is a fully observable system, so SO can be used to predict the key parameters. In this paper, the mean square error (MSE) is used as the performance index to analyze the prediction results of the BSVR, SVR, and SO soft sensing models. MSE is expressed as

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\bigl(x_i - \hat{x}_i\bigr)^{2},$$

where *x*_{i} is the true value, $\hat{x}_i$ is the predicted value of soft sensing, *i* is the sampling time, and *n* is the number of samples.

Table 2 shows the mean square error of the ethanol concentration predictions of the BSVR, SVR, and SO soft sensing models. The table shows that the prediction accuracy of the BSVR soft sensing model is higher than that of the SVR and SO soft sensing models. These results indicate that the BSVR soft sensing model can effectively improve the prediction accuracy of ethanol concentration in the production of fuel ethanol by straw fermentation.

#### 5. Conclusions

To reduce the environmental pollution caused by straw and turn waste into a resource, producing bioenergy by straw fermentation has become an important way to address the energy crisis. This paper presents an improved SVR learning framework to deal with the many parameters and influencing factors in producing fuel ethanol by straw fermentation. The BSVR soft measurement model was designed by optimizing SVR with Bayesian inference. Effective data were selected and interference data were eliminated, which avoids the model uncertainty caused by interference and data error. Data training and test experiments were conducted. The experimental simulation and analysis for straw fermentation show that the proposed BSVR soft measurement modeling method outperforms the traditional SVR method in fitting accuracy and generalization ability, with higher prediction accuracy and stability, which verifies the effectiveness and superiority of the proposed method.

#### Data Availability

The work is still being tested, and some of the data can be provided.

#### Conflicts of Interest

The authors declare no conflicts of interest.

#### Authors’ Contributions

Xu Feng contributed to the analysis and design of the soft sensing model, the tests, and the simulation. Tang Hong-yu contributed to the overview and arrangement and to the overall model design. Wang Bo developed the dynamic model and the technological process design of straw biological fermentation. Zhu Xiang-lin took part in the scheme discussion and provided experimental guidance.

#### Acknowledgments

This study was supported by the Zhenjiang Key R&D Project (SH2020005) and Natural Science Foundation of Jiangsu Province (BK20191225).