Abstract

The anaerobic treatment process is a complicated multivariable system that is nonlinear and time varying, and the biogas production rate is an important indicator of its operational performance. In this work, a novel fuzzy wavelet neural network based on the genetic algorithm (GA-FWNN), which combines the advantages of the genetic algorithm, fuzzy logic, neural network, and wavelet transform, was established for prediction of effluent quality and biogas production rates in a full-scale anaerobic wastewater treatment process. The dataset was preprocessed via a self-adapted fuzzy c-means clustering before training the network, and a hybrid algorithm was used to acquire the optimal parameters of the multiscale GA-FWNN and thereby improve the network precision. The analysis results indicate that the FWNN with the optimal algorithm had a high speed of convergence and good quality of prediction, and the FWNN model was more advantageous than the traditional intelligent coupling models (NN, WNN, and FNN) in prediction accuracy and robustness. The determination coefficients R² of the FWNN models for predicting both the effluent quality and biogas production rates were over 0.95. The proposed model can be used for analyzing both biogas (methane) production rates and effluent quality over the operational time period, which plays an important role in saving energy and eliminating pollutant discharge in the wastewater treatment system.

1. Introduction

Because of its economic advantages and low generation of excess sludge, the anaerobic biological treatment process is an efficient process for treating high-concentration organic wastewater, such as paper-mill wastewater, in which complex organic contaminants can be converted into clean energy (methane gas) [1–3]. However, the anaerobic treatment process is a complicated multivariable system influenced by various influent characteristics and operating conditions, which makes its behavior difficult to resolve within a short time [4, 5]. Consequently, biogas (methane) production rates are also influenced by these influent characteristics and operating conditions [6].

Because of the nonlinearity, uncertainty, and time lag of the anaerobic treatment process, it is difficult to operate and control. Modeling is a significant method for increasing the stability and reliability of the process, and it can be used in the control, operation, and optimization of the anaerobic treatment process at a reasonable cost [7]. In recent years, numerous studies have been carried out and various modeling methods have been developed to control and simulate the anaerobic treatment process [8–11]. However, because the mechanisms associated with the anaerobic treatment process are only superficially understood, it is difficult to analyze and estimate the underlying phenomena in anaerobic digestion using conventional mathematical models. Therefore, to reduce complexity and difficulty and to improve applicability, more practical, secure, and simple models need to be investigated [4, 12].

Because artificial intelligence offers logical reasoning, fast processing capability, and nonlinear characteristics, it can approximate any continuous nonlinear function to arbitrary precision. The commonly used artificial intelligence methods are the neural network (NN), fuzzy logic (FL), wavelet transform (WT), genetic algorithm (GA), and metaheuristic algorithms [13, 14]. Hence, a model based on artificial intelligence can achieve precise simulation results in the wastewater treatment process.

In recent years, a variety of NN-based models for estimating the performance of the anaerobic treatment process have been developed by many researchers [15]. A backpropagation neural network (BPNN) model integrating the additional momentum method with the adaptive learning rate method was developed to estimate the operational status of the upflow anaerobic sludge bed (UASB) [16]. The results indicated that the model can predict and optimize the control parameters and propose control strategies for the reactor. In addition, another BPNN model based on the Levenberg–Marquardt algorithm was designed by Sridevi et al. [17], which can be used to successfully predict the biodegradation and biohydrogen production in a hybrid UASB reactor treating distillery wastewater. Overall, a model based on the NN can efficiently simulate and predict the nonlinear characteristics of the anaerobic wastewater treatment system. However, the NN has some defects, such as slow convergence and a tendency to become trapped in local minima [18, 19].

Therefore, many neural network coupling algorithms, such as the wavelet neural network (WNN) and fuzzy neural network (FNN), have been proposed to solve the problems faced by the ordinary NN [20–22]. The FNN, which is based on fuzzy logic (FL) and the NN, realizes FL through the NN. The coupled algorithm can capture fuzzy rules effectively and realize fuzzy reasoning by using the NN structure. Hence, an FNN applied to the wastewater treatment system can be expected to simulate that system more effectively.

Many studies on modeling the anaerobic wastewater treatment process using the hybrid FNN have been carried out in recent years [23–25]. Erdirencelebi and Yalpir integrated FL and NN to develop a hybrid FNN model for simulating the anaerobic wastewater treatment process [2]. The results illustrated that the developed hybrid FNN model could forecast the effluent quality accurately in a UASB system. To monitor degradation of penicillin-G wastewater in an anaerobic hybrid reactor, a hybrid FNN model was established by Mullai et al. [26] using the adaptive network-based fuzzy inference system (ANFIS). The simulation results showed that the developed hybrid model was effective and that the correlation coefficient (R²) of the model for chemical oxygen demand (COD) values was high. Therefore, the FNN methodology can be considered a suitable approach for evaluating real-time effluent quality and biogas (methane) production rates, which is necessary to control the anaerobic process and to establish fault diagnosis. Nevertheless, the FNN also has drawbacks: it lacks time-frequency localization characteristics, which can easily lead to a low convergence rate and low accuracy. Time-frequency localization is exactly the advantage of the wavelet transform (WT). Hanbay et al. [27] successfully used wavelet packet decomposition and NN for prediction in an anaerobic wastewater treatment plant. Furthermore, on the basis of kernel principal component analysis and WNN, a soft sensor system could realize real-time detection of redox potential, dissolved oxygen, pH, and COD in the wastewater treatment process [28].

Hence, a new system with the fuzzy wavelet neural network (FWNN) was established by integrating advantages of various intelligent techniques. This network could effectively increase the detection rate and reliability of the model by improving the discernment, generalization, and approximation capacities [3, 29, 30]. Such an integrated intelligent system can overcome the shortcomings mentioned above. Therefore, the hybrid FWNN offers a more efficient method for modeling, simulation, control, and operation optimization of the complex process system, such as the wastewater treatment process.

The performance of the anaerobic treatment process is very complicated and changes markedly with influent characteristics and operating conditions, such as organic loading rates (OLRs), pH, hydraulic retention time (HRT), and toxic organic compounds. An artificial intelligence-based model for real-time evaluation of effluent quality and biogas production rates offers several potential advantages, such as withstanding various shock loads caused by substantial influent fluctuations, optimizing operational parameters of the process for controlling operational cost, providing an online evaluation and estimation of emissions on an energetic basis, and building a continuous early-warning strategy without requiring a complicated model structure. However, studies on modeling biodegradation and biogas (methane) production rates in a full-scale mesophilic internal circulation (IC) anaerobic reactor treating paper-mill wastewater using the FWNN are very limited.

Based on the relationship between the effluent COD and the biogas flow rate under various operating parameters such as influent COD (CODinf), HRT, OLR, pH in the reactor (pH), and alkalinity in the reactor (ALK), an FWNN model is developed to predict and estimate the effluent quality and biogas production rates based on the existing historical data. The key objective of this study was to develop a novel hybrid genetic algorithm evolving FWNN model for simulating the functioning problem of a full-scale internal circulation (IC) anaerobic wastewater treatment plant. The proposed hybrid model may be used for analyzing the biogas production rate and effluent quality over the operational time period, which plays an important role in saving energy and eliminating pollutant discharge in the wastewater treatment system.

2. Materials and Methods

2.1. Reactor System

A full-scale IC anaerobic treatment plant was selected as the demonstration site. The treatment system used in the study is located in Guangdong, China. As shown in Figure 1, the wastewater treatment process, which includes four IC reactors, was operated to treat approximately 3 × 10⁴ m³ of paper-mill wastewater per day. Each IC reactor has a diameter of 9 m and a volume of 1100 m³. The treatment system is equipped with online flow, pH, DO, ORP, temperature, COD, and gas flow meter (HACH®) sensors. Signals from these sensors were also used to control the peristaltic pumps, stirrers, and air blower. The model used data from the full-scale sequential system that were collected over a period of 150 days. Other chemical indexes were determined according to standard methods [31, 32].

2.2. Genetic Algorithm Evolving Fuzzy Wavelet Neural Network (FWNN)
2.2.1. Identification of Model Parameters

The identification of model parameters is one of the key requirements in modeling anaerobic wastewater treatment processes. An appropriate choice of model components, one that accurately reflects the running state of the anaerobic treatment process, can help improve the management efficiency and reduce the operating costs of the system [6].

OLR is used to measure the biological conversion ability. This parameter is a vital factor, which can significantly influence microbial ecology and performance characteristics of anaerobic treatment systems.

HRT is an important variable in the anaerobic treatment system. It measures the amount of time the wastewater remains in the system. If the retention time of the feed in the system is too short to complete the entire treatment process, biogas production will be restrained.

pH is a chief parameter, which significantly affects the performance characteristics of anaerobic treatment systems. pH has a substantial effect on methanogenic bacteria.

ALK reflects the capacity of the solution to neutralize acids to the equivalence point of carbonate or bicarbonate in the anaerobic treatment system. To control pH in the anaerobic treatment system, sufficient ALK must be ensured, which is effective in preventing dramatic changes of pH.

COD is used to measure the organic compounds in wastewater. This parameter refers to substrate utilization proficiency and microbial metabolic activity in the anaerobic treatment systems.

The biogas production rate is usually used to indicate the processing efficiency of the anaerobic treatment system. In the anaerobic treatment system, the most significant operational goal is to control the effluent quality and maximize the rate of biogas production by breaking down pollutants.

Therefore, influent COD (CODinf), HRT, OLR, pH in the reactor (pH), and alkalinity in the reactor (ALK) were selected as the input parameters of the proposed FWNN model. Biogas production rates and effluent COD (CODeff) were selected as the output parameters of the proposed FWNN model.
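These inputs are not fully independent: for a reactor of fixed volume, the OLR follows from the influent COD and the HRT. The sketch below illustrates that textbook relationship. The function names, the equal split of the plant flow across the four reactors, and the influent COD of 2000 mg/L are assumptions made for illustration; only the reactor volume and total plant flow come from Section 2.1.

```python
def hydraulic_retention_time_h(volume_m3, flow_m3_per_day):
    """HRT in hours: reactor volume divided by volumetric flow rate."""
    return volume_m3 / flow_m3_per_day * 24.0


def organic_loading_rate(cod_inf_mg_per_l, hrt_h):
    """OLR in kg COD/(m^3*d): influent COD mass applied per unit volume per day."""
    cod_kg_per_m3 = cod_inf_mg_per_l / 1000.0  # 1 mg/L = 0.001 kg/m^3
    return cod_kg_per_m3 / (hrt_h / 24.0)


# Assumption: the 3 x 10^4 m^3/d plant flow is split equally over four reactors.
flow_per_reactor = 3.0e4 / 4.0
hrt = hydraulic_retention_time_h(volume_m3=1100.0, flow_m3_per_day=flow_per_reactor)

# Assumption: a hypothetical influent COD of 2000 mg/L (not a measured value).
olr = organic_loading_rate(cod_inf_mg_per_l=2000.0, hrt_h=hrt)
print(f"HRT = {hrt:.2f} h, OLR = {olr:.2f} kg COD/(m^3*d)")
```

Because OLR is derived from CODinf and HRT in this way, the three inputs carry partly overlapping information.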

2.2.2. Structure of the Proposed FWNN

The architecture of the FWNN for modeling the anaerobic treatment system is illustrated in Figure 2. For the FWNN, the wavelet was used for the neuron’s activation functions on the basis of the five-layer NN, and fuzzy inference can be realized [33, 34]. The FWNN includes five layers as follows.

The first layer consists of all input factors that act as the input layer. The layer data of input factors x1; x2; …; xn are the input mode. In this layer, there are five input parameters that are CODinf, HRT, OLR, pH, and ALK, so n = 5.

The second layer is the fuzzy layer. Fuzzy set theory was employed to process linguistic variables, and the selected membership function was the Gaussian function. The input characteristic variables were translated into fuzzy variables in this layer by the Gaussian membership functions, where cij and σij are the center and width parameters of the membership functions, respectively, and i and j index the input parameters and linguistic variables in the FWNN, respectively.

A self-adapted fuzzy c-means clustering was used in this work to address the fuzzy factors, and 18 sets of fuzzy control rules were established by analyzing the actual knowledge database. The third layer is the fuzzy rule layer. This layer consists of a number of hidden units representing fuzzy logic rules and fuzzy partitions. The fuzzy rule base is generated from the given input and output data so that logical inference can be realized, with n denoting the number of fuzzy rules in this layer.

The fourth layer is the wavelet network. In this layer, a wavelet network is designed using wavelet functions as the activation functions of its neurons, based on the good local performance of the wavelet transform. The WNNs are used for the consequent part of the FWNN. The output of the WNN with the jth wavelet neuron is determined by its dilation parameters aij, its translation parameters bij, and its weight.

The fifth layer is the output layer. The total output of the FWNN (y) in this layer is defined as follows:

In this proposed design, to monitor the anaerobic treatment system’s operational status, effluent COD and production rates of biogas (methane) were chosen as the network outputs.
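The five layers described above can be sketched as a single forward pass. This is only an illustration under stated assumptions, not the authors' implementation: the exact Gaussian form, the product rule for rule firing, the Mexican-hat wavelet, the summation over inputs inside each wavelet neuron, and the normalized weighted output are all common FWNN choices assumed here. The function and argument names are hypothetical.

```python
import math


def gaussian_membership(x, c, sigma):
    # Layer 2 (assumed form): Gaussian membership with center c and width sigma.
    return math.exp(-((x - c) ** 2) / (sigma ** 2))


def mexican_hat(z):
    # Assumed mother wavelet; the wavelet family is not specified here.
    return (1.0 - z * z) * math.exp(-0.5 * z * z)


def fwnn_forward(x, centers, widths, dilations, translations, weights):
    """Forward pass for one sample.

    x: list of n inputs. centers, widths, dilations, translations: one list of
    n values per rule. weights: one value per rule.
    """
    firing, consequents = [], []
    for c_j, s_j, a_j, b_j, w_j in zip(centers, widths, dilations, translations, weights):
        # Layer 3 (assumed): rule firing strength as the product of memberships.
        mu = 1.0
        for xi, c, s in zip(x, c_j, s_j):
            mu *= gaussian_membership(xi, c, s)
        # Layer 4 (assumed): weighted sum of dilated and translated wavelets.
        psi = sum(mexican_hat((xi - b) / a) for xi, a, b in zip(x, a_j, b_j))
        firing.append(mu)
        consequents.append(w_j * psi)
    # Layer 5 (assumed): firing-strength-weighted average of rule consequents.
    total = sum(firing)
    if total == 0.0:
        return 0.0
    return sum(m * y for m, y in zip(firing, consequents)) / total
```

With five inputs and 18 rules, each of the four rule-wise parameter lists would hold 18 entries of five values, and one such network would be built per predicted output.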

2.2.3. Training Algorithm to Optimize the Proposed FWNN

A hybrid learning algorithm was applied to train and optimize the network parameters to further improve the prediction capabilities of the network. It has integrated genetic algorithm (GA) into gradient descent algorithm (GDA) to enhance the efficiency and robustness of the network.

GA is a well-rounded global optimization method with strong robustness and broad applicability [35]. Since the GDA easily falls into local optima and is sensitive to the initial values, the initial values of the network’s parameters are first determined by a real-coded GA, and then the GDA is used to train the network, thereby greatly accelerating its convergence. In this work, the objective function is formulated from the error between ydk, the desired value, and yk, the output value of the FWNN, over n samples. The output of the FWNN is computed for the parameter set encoded by the s-th chromosome.

GA is an artificial intelligence method that simulates natural evolution using three main operations (selection, crossover, and mutation) to produce individuals with better fitness. The goal of the selection operation is to give population members (or solutions) with better fitness values more reproductive opportunities. Crossover and mutation operations produce new individuals by combining the information contained in two parents, and they can ensure that the new chromosomes are always feasible. Tournament selection is used to obtain the new generation: the member with the better fitness is selected for the next generation.

Hence, each chromosome can be represented as a real-coded set of the FWNN parameters.

Thus, the optimal initial variables of the FWNN are finally obtained through the three genetic operations of selection, crossover, and mutation. In this design, the initial population size Npop is 100, the crossover rate Pc is 0.7, and the mutation rate Pm is 0.01.
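The genetic operations described above can be sketched as a small real-coded GA. The population size, crossover rate, and mutation rate default to the values given in the text, and the 200 generations follow Section 3.3. Everything else is an assumption made for illustration: arithmetic (convex-combination) crossover, Gaussian mutation clipped to the bounds, binary tournaments, single-member elitism, and the function names themselves.

```python
import random


def tournament(pop, fit, rng, k=2):
    # Pick k random members and return the one with the lowest objective value.
    contenders = rng.sample(range(len(pop)), k)
    return pop[min(contenders, key=lambda i: fit[i])]


def ga_minimize(objective, n_genes, n_pop=100, pc=0.7, pm=0.01,
                generations=200, bounds=(0.0, 1.0), seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_genes)] for _ in range(n_pop)]
    history = []  # best objective value seen in each generation
    for _ in range(generations):
        fit = [objective(ch) for ch in pop]
        best = min(range(n_pop), key=lambda i: fit[i])
        history.append(fit[best])
        children = [pop[best][:]]  # elitism (assumed): keep the best unchanged
        while len(children) < n_pop:
            p1, p2 = tournament(pop, fit, rng), tournament(pop, fit, rng)
            if rng.random() < pc:
                # Arithmetic crossover: a convex combination stays within bounds.
                alpha = rng.random()
                child = [alpha * g1 + (1.0 - alpha) * g2 for g1, g2 in zip(p1, p2)]
            else:
                child = p1[:]
            # Gaussian mutation per gene with probability pm, clipped to bounds.
            child = [min(hi, max(lo, g + rng.gauss(0.0, 0.1 * (hi - lo))))
                     if rng.random() < pm else g for g in child]
            children.append(child)
        pop = children
    fit = [objective(ch) for ch in pop]
    best = min(range(n_pop), key=lambda i: fit[i])
    history.append(fit[best])
    return pop[best], history
```

In the hybrid scheme, `objective` would evaluate the FWNN prediction error for the parameters encoded in a chromosome, and the returned chromosome would seed the gradient-based training.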

2.2.4. Parameter Updating through Gradient Descent Algorithm

After the parameters of the network were initialized by the GA, the parameters of the FWNN model were verified and revised by the GDA [36]. All the parameters of the developed FWNN, namely, the center and width parameters of the Gaussian functions and the dilation, translation, and weight parameters of the WNNs, were simultaneously optimized by minimizing the error between the desired value and the output value of the FWNN. Accordingly, the parameter values of the FWNN were updated iteratively using a learning rate and a momentum factor.
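A gradient descent update with a momentum term can be sketched as follows. The defaults of 0.02 for the learning rate and 0.5 for the momentum factor are the values reported in Section 3.3. The classical momentum form used here, the finite-difference gradient (chosen so the sketch works for any loss without hand-derived derivatives), and the function names are assumptions.

```python
def numerical_gradient(loss, params, eps=1e-6):
    # Central finite differences; an analytic gradient would normally be used.
    grad = []
    for i in range(len(params)):
        up, down = params[:], params[:]
        up[i] += eps
        down[i] -= eps
        grad.append((loss(up) - loss(down)) / (2.0 * eps))
    return grad


def momentum_descent(loss, params, lr=0.02, momentum=0.5, steps=500):
    params = list(params)
    velocity = [0.0] * len(params)
    for _ in range(steps):
        grad = numerical_gradient(loss, params)
        # New step = momentum * previous step - learning rate * gradient.
        velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
        params = [p + v for p, v in zip(params, velocity)]
    return params


# Toy quadratic loss with its minimum at (1, -2); a stand-in for the FWNN error.
target = [1.0, -2.0]
fitted = momentum_descent(
    lambda p: 0.5 * sum((a - b) ** 2 for a, b in zip(p, target)), [0.0, 0.0])
```

In the hybrid scheme, `params` would start from the GA result rather than from zeros.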

2.3. Self-Adapted Fuzzy c-Means Clustering

In this work, according to the characteristics of the anaerobic treatment system, a self-adapted fuzzy c-means (FCM) clustering algorithm was proposed to deal with the fuzzy factors and thus determine the number of the FWNN’s fuzzy rules. Objects are strictly divided into clusters based on the fuzzy clustering method, and the best class number is obtained by clustering validity analysis [37]. The calculating equations, written here in the standard FCM notation, are as follows:

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}, \qquad \min J_m(U, V),$$

where $J_m(U, V)$ represents the sum of weighted squared Euclidean distances, $\min J_m(U, V)$ is the objective function representing the minimum square sum of weighted Euclidean distances, $c$ is the number of clusters, $n$ is the number of objects, $x_j$ is the observed value, and $m$ is the weighted exponent.

$d_{ij}$ represent the Euclidean distance and can be designed as follows:

$$d_{ij} = \lVert x_j - v_i \rVert.$$

$u_{ij}$ are the membership function values and can be represented as follows:

$$u_{ij} = \left[ \sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{kj}} \right)^{2/(m-1)} \right]^{-1}.$$

$v_i$ are the cluster centers, and the formula for their specific calculation is as follows:

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}}.$$
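The alternating update of cluster centers and memberships can be sketched in a few lines. This is plain FCM with a fixed number of clusters; the self-adapted selection of the cluster number by validity analysis is not reproduced. The fuzzifier m = 2, the random initialization, the stopping tolerance, and the function name are assumptions.

```python
import math
import random


def fuzzy_c_means(data, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Return (centers, u), where u[j][i] is the membership of object j in cluster i."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial memberships, normalized so each object's row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(max_iter):
        # Center update: membership-weighted mean of the objects.
        centers = []
        for i in range(n_clusters):
            w = [u[j][i] ** m for j in range(n)]
            w_sum = sum(w)
            centers.append([sum(w[j] * data[j][d] for j in range(n)) / w_sum
                            for d in range(dim)])
        # Membership update from the relative distances to all centers.
        shift = 0.0
        for j in range(n):
            dist = [max(math.dist(data[j], centers[i]), 1e-12)
                    for i in range(n_clusters)]
            new_row = [1.0 / sum((dist[i] / dist[k]) ** (2.0 / (m - 1.0))
                                 for k in range(n_clusters))
                       for i in range(n_clusters)]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[j])))
            u[j] = new_row
        if shift < tol:
            break
    return centers, u
```

A self-adapted variant would typically run this for several candidate cluster numbers and keep the one with the best validity score.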

3. Results and Discussion

3.1. Data Collection and Preprocessing

In order to evaluate the hybrid FWNN model for the anaerobic wastewater process, 150 datasets were collected; the network was trained with 120 datasets, and the remaining 30 were used for testing. Normalization, which places all variables on a common scale, was used to improve the FWNN’s performance. In this work, all datasets were converted to the range between 0 and 1 through scaling.
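The scaling and the 120/30 partition can be sketched as follows. Min-max scaling is one common way to map data onto the range 0 to 1 and is assumed here, as is splitting the records in their collected order; the text does not state how the 30 testing sets were chosen. The function names and sample values are hypothetical.

```python
def min_max_scale(column):
    """Scale one variable to [0, 1]; a constant column maps to 0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]


def split_train_test(rows, n_train=120):
    """Assumed split: the first n_train records for training, the rest for testing."""
    return rows[:n_train], rows[n_train:]


# Hypothetical pH readings, for illustration only.
scaled_ph = min_max_scale([6.8, 7.1, 7.4])
train, test = split_train_test(list(range(150)))
print(scaled_ph, len(train), len(test))
```

In practice the minimum and maximum would be taken from the training records only and then reused for the testing records, so that no information from the testing sets leaks into training.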

3.2. FWNN Development

Using all these data, the effluent COD and biogas (methane) production rates were predicted using an FWNN model. In addition, the datasets were analyzed using a self-adapted fuzzy c-means clustering, and the optimal clustering number of 18 sets was identified. The model structure shown in Figure 2 was determined based on the analysis of technology and experimental data as well as the forecast target. It included three FWNN models (FWNNCOD, FWNNQ, and FWNNCH4) for COD, Qgas, and CH4 prediction, respectively. Each model had a separate rule base, but the models’ input parameters were the same.

A hybrid learning algorithm was applied after initializing the model structure and parameter to train and optimize network parameters. Because the GDA easily falls into local optimum and is sensitive to the initial values, the initial values of parameters of the network were firstly determined by a real-coded GA, and then the GDA was used to train the network, thereby greatly accelerating its convergence.

3.3. Simulation of FWNN Model

Three FWNN-based models were simulated and verified by the experimental data using the MATLAB program. The initial population size Npop, crossover rate Pc, mutation rate Pm, maximum number of generations, learning rate, and momentum factor are 100, 0.7, 0.01, 200, 0.02, and 0.5, respectively. Figure 3 shows the training process of the developed FWNN (taking FWNNCOD as an example). Figure 3 indicates that this network has good memory, fast convergence, and strong stability. Consequently, the new parameters of the FWNN models were obtained by repeated training through the hybrid learning algorithm, as shown in Tables 1 and 2.

Figure 4(a) shows the predictive values of the FWNN models according to the testing datasets. As shown in Figure 4(a), it is easily found that the predicted values are in good conformity with those observed values. In this work, in order to assess the performance of models, various indicators were used to analyze and estimate the developed FWNN models, such as the determination coefficient (R2), correlation coefficient (R), root mean square error (RMSE), mean square error (MSE), and mean absolute percentage error (MAPE). As shown in Table 3, the performance indicators of the proposed FWNN models were acquired by comparing the predicted results with real values.
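The five indicators can be computed from paired observed and predicted values as sketched below. The definitions used are the common textbook ones and are assumptions, since the exact formulas are not given: R² as one minus the ratio of residual to total sum of squares, R as the Pearson correlation, and MAPE as a percentage. The function name and sample values are hypothetical.

```python
import math


def evaluate(observed, predicted):
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))  # residual SS
    sst = sum((o - mean_o) ** 2 for o in observed)                # total SS
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    mse = sse / n
    return {
        "R2": 1.0 - sse / sst,
        "R": cov / math.sqrt(sst * var_p),
        "RMSE": math.sqrt(mse),
        "MSE": mse,
        "MAPE": 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted)),
    }


# Hypothetical values, for illustration only (not the plant data).
scores = evaluate([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])
print(scores)
```

Under these definitions RMSE is the square root of MSE, and 1 − R² is the share of the total variation left unexplained by the model.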

Table 3 clearly shows that, using the FWNN, MAPE values of 2.9083%, 3.3563%, and 4.0660% were achieved for COD, Qgas, and CH4, respectively. R² values were 0.9647, 0.9681, and 0.9501, and R values were 0.9822, 0.9839, and 0.9747, respectively. The RMSE values were 28.7439, 199.2556, and 155.0499, respectively. Simulations showed that the proposed model not only could accomplish parameter calibration rapidly and find the optimal parameter values accurately but also could improve the convergence rate and the stability of the models. The predicted results showed good agreement with the experimental values. For the three FWNN models, the predictive performance on effluent quality and biogas production rates was satisfactory, with determination coefficients (R²) all over 0.95. In other words, only 3.53%, 3.19%, and 4.99% of the total variations for COD, Qgas, and CH4, respectively, were not explained by the proposed FWNN models. In addition, the high R values of the three FWNN models illustrate good agreement between the predicted and experimental values. The other evaluation indicators (MAPE, RMSE, and MSE) were also small, which shows that the developed model had high predictive accuracy and satisfactory robustness and fitness, making the system highly adaptable.

3.4. Comparisons with FNN, WNN, and NN

The developed FWNN models were compared with FNN, WNN, and NN models to demonstrate the correctness, efficiency, and benefits of the hybrid network. As shown in Table 3, the FWNN models have lower RMSE (or MSE) and MAPE values and higher R² and R values. Taking CODeff as an example, the R, R², MAPE, RMSE, and MSE values were 0.9822, 0.9647, 2.9083%, 28.7439, and 826.2142, respectively, using the FWNN. In contrast, when using the FNN, WNN, and NN models, R values were 0.9645, 0.9351, and 0.8222, respectively; R² values were 0.9302, 0.7697, and 0.6760, respectively; MAPE values were 4.077%, 4.4575%, and 8.3163%, respectively; RMSE values were 41.1297, 55.8223, and 88.2468, respectively; and MSE values were 1.6917 × 10³, 3.1161 × 10³, and 7.7875 × 10³, respectively.

Table 3 shows that FWNN models have higher estimation accuracy and better robustness than FNN, WNN, and NN models for predicting effluent quality and biogas (methane) production rates. The results of this study suggest that the FWNN model was highly capable of capturing the dynamic changes of the IC system. Considering the nonlinearity, complexity, and randomness of the anaerobic treatment process, such good predictive performance is particularly important for modeling the wastewater treatment process. The simulated models based on the FWNN can be effectively applied to a full-scale IC anaerobic reactor to cope with influent variations, and the anaerobic wastewater treatment process can be better described with the FWNN than with the FNN, WNN, and NN. While maintaining environmental standards, FWNN models can help achieve the IC anaerobic system’s environmental and economic goals in real time. In the future, in order to optimize the anaerobic treatment system, a control system will be developed to monitor and control the system based on the FWNN model.

3.5. Multidimensional Graphs of Affecting Factors and Regulating Strategies of IC

Using the partitioning connection weights (PCW) method, the importance of the influencing factors could generally be analyzed. In this work, four-dimensional graphs with two outputs were used for analyzing the importance of input parameters to outputs.
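For reference, a weight-partitioning calculation of input importance can be sketched as below. It assumes that PCW refers to a Garson-style partitioning of absolute connection weights in a network with one hidden layer and a single output; that reading, the function name, and the sample weights are assumptions, and the study itself relies on four-dimensional graphs instead.

```python
def garson_importance(w_in, w_out):
    """Relative importance of each input (the values sum to 1).

    w_in[i][j]: weight from input i to hidden unit j.
    w_out[j]:   weight from hidden unit j to the single output.
    """
    n_in, n_hid = len(w_in), len(w_out)
    contrib = [0.0] * n_in
    for j in range(n_hid):
        # Share of hidden unit j's incoming weight that belongs to each input.
        col_sum = sum(abs(w_in[i][j]) for i in range(n_in))
        for i in range(n_in):
            contrib[i] += abs(w_in[i][j]) / col_sum * abs(w_out[j])
    total = sum(contrib)
    return [c / total for c in contrib]


# Hypothetical weights for two inputs and two hidden units.
importance = garson_importance([[1.0, 1.0], [3.0, 1.0]], [1.0, 1.0])
print(importance)
```

Because only absolute values enter the calculation, the result ranks inputs by strength of influence and says nothing about its direction, which is one reason response-surface graphs are a useful complement.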

3.5.1. Influence of pH and OLR on COD Removal Rate and CH4 Production Rate

Figure 5(a) shows the influence of pH and OLR on the COD removal rate and the CH4 production rate. When pH varied from 6.8 to 7.4 and the OLR varied from 5 to 15 kg COD/m³⋅d, both the COD removal rate and the CH4 production rate increased. The treatment system was particularly sensitive to changes in pH when the OLR was high. However, when the OLR was above 15 kg COD/m³⋅d, changes in pH rarely affected the performance of the treatment system. When the OLR exceeded 15 kg COD/m³⋅d or pH was above 7.5, there was a negative effect on the COD removal rate and the CH4 production rate, and the negative effect caused by the increased OLR was smaller than that caused by low pH. Hence, when the OLR of the treatment system was raised by shortening HRT or increasing the influent COD, adding alkali to raise the pH was conducive to the stability of the treatment system.

3.5.2. Influence of pH and ALK on COD Removal Rate and CH4 Production Rate

Figure 5(b) shows the influence of pH and ALK on the COD removal rate and CH4 production rate. Regardless of the pH in the system, low ALK was unfavorable to the COD removal rate and the CH4 production rate. The treatment system also became inactive at low pH. When the ALK exceeded 2500 mg/L and the pH in the treatment system exceeded 7.5, the COD removal rate and the CH4 production rate increased. Therefore, when the influent concentration of COD was high, pH and ALK values were kept higher than 7.5 and 2500 mg/L, respectively.

3.5.3. Influence of OLR and ALK on COD Removal Rate and CH4 Production Rate

Figure 5(c) shows the influence of OLR and ALK on the treatment system. When the OLR was lower than 15 kg COD/m³⋅d, the treatment system was rarely affected by ALK, and the CH4 production rate was low. When ALK was higher than 2500 mg/L, especially when it increased from 3000 mg/L to 3500 mg/L, the CH4 production rate decreased dramatically with the changes of ALK. The COD removal rate was low when the OLR was over 18 kg COD/m³⋅d. If the OLR remained at such high levels, the performance of the treatment system would continue to deteriorate. With the OLR held constant, the variation of the COD removal rate with ALK could be obtained. Moreover, the optimal influent OLR was about 15 kg COD/m³⋅d when the treatment system ran at a pH of 7.5 and an alkalinity of 3000 mg/L.

4. Conclusion

This study established an artificial intelligence-based model for a full-scale anaerobic wastewater treatment system. Combining the benefits of the NN, FL, and WT, the FWNN could be used successfully to predict effluent quality and the biogas production rate by capturing the strong nonlinear relationship between its inputs and outputs. The FWNN model showed higher estimation accuracy and better robustness than the FNN, WNN, and NN models and achieved better performance in predicting effluent quality and biogas production rates, with determination coefficients R² over 0.95. Meanwhile, the FWNN model can be used for analyzing the importance of the affecting factors. The proposed hybrid approach provides an effective and cost-efficient tool for modeling the anaerobic process that helps engineers monitor operational parameters to improve the performance of anaerobic treatment.

Data Availability

The data used to support the findings of this study have not been made available because the authors are asked to sign a confidentiality agreement with the Chinese government which provides the basis of the data.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Zehua Huang, Renren Wu, and XiaoHui Yi contributed equally to this work and share the first authorship. Zehua Huang and Renren Wu conducted the experiments and wrote the main manuscript text, Xiaohui Yi contributed to manuscript preparation and revision, Hongbin Liu prepared Figures 1–3, Jiannan Cai prepared Figures 4 and 5, and Guoqiang Niu prepared Tables 1–3. Mingzhi Huang designed this research and supervised analyses including data interpretation and discussion and manuscript preparation as a principal investigator (PI). Guangguo Ying was involved in experimental design and supervised manuscript preparation.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (nos. 41977300 and 41907297), Guangdong Provincial Natural Science Foundation (no. 2016A030306033), Science and Technology Program of Guangzhou (no. 907224176081) and Guangdong Foundation for Program of Science and Technology Research (no. 2017B030314057).