Abstract

Heavy metals are widely recognized as being hazardous to human health and environmentally aggressive. The literature reports different approaches for lead removal, for example, water hyacinths. Heavy metal sorption isotherm modeling represents an important tool towards the study of equilibrium conditions. Fractional calculus represents a novel approach and a growing research field for process modeling, based on derivatives of arbitrary order. Recently, a novel isotherm based on fractional calculus was proposed for lead sorption using water hyacinth (Eichhornia crassipes). This paper reports a general procedure on error analysis and its influence on parameter estimation. It was applied to mathematical models based on fractional differential equations, focusing on a heavy metal novel isotherm sorption model. Parameter variance was calculated by using two different approaches (with the complete Hessian matrix and with a simplified Hessian matrix), and joint parameter confidence regions were generated, being successfully able to show that the fractional nature of the model is statistically valid.

1. Introduction

Heavy metals are widely recognized as being hazardous to human health and environmentally aggressive, being continuously generated by different chemical plants. The use of lead in the battery industry [1] is an important example. The literature reports different approaches for heavy metals removal, such as chemical precipitation [2], ion exchange [3], and electrochemical [4] and water hyacinths [5]. Mathematical models represent an essential tool for in-depth process studies, design, optimization, and control [6]. Therefore, heavy metal sorption isotherm modeling represents an important way towards the study of equilibrium conditions, which play a key role in sorption process design. The most common approach for this task consists in the use of classical models [7], such as Langmuir, Freundlich, and Redlich-Peterson among others, followed by proper parameter estimation and model discrimination analysis.

Fractional calculus represents a novel approach and a growing research field for process modeling, being based on derivatives of arbitrary order [814]. The literature reports a broad range of applications, concerning systems engineering [15], diffusion processes [16], heat transfer [17], solid mixing [18], biological systems [19], and fluid mechanics [20] among others [21]. Recently, dos Santos et al. [22] proposed a novel isotherm based on fractional calculus for lead sorption using water hyacinths (Eichhornia crassipes). The reported isotherm can successfully predict equilibrium concentrations of lead between the aqueous solution and the water hyacinth after. The model was validated using synthetic effluent [1]. It is important to highlight that the proposed model also leads to better performances when compared to classical models (Langmuir, Freundlich, and Redlich-Peterson), which were used for sake of comparison.

Error analysis represents a crucial step in model validation and further applications [23]. Recently, Joshi et al. [24] presented a detailed model analysis concerning classical sorption models. Regarding fractional-calculus-based models, Gabano and Poinot [25], Khemane et al. [26], and Isfer et al. [15] report the calculation of parametric variance.

It is important to state that one may identify a mathematical model of fractional order for a given set of experimental data [27]; however, only a precise error analysis can ensure that the derivative is in fact fractional. If the variance of the estimated fractional order of the derivative is large enough, the fractional order can be statistically regarded as integer order for a given confidence level. Consequently, the analysis of parameter joint confidence region becomes an essential tool, as the region indicates, for a given confidence level, the possible set of parameters that could generate the experimental data [28]. To the best of our knowledge, the generation and analysis of joint confidence regions have not yet been reported for models based on fractional calculus. This paper reports a detailed study on an error analysis procedure applied to mathematical models based on fractional differential equations. After the development of a theoretical framework concerning parameter estimation, parameter variance estimation and joint confidence region determination, the fractional model proposed by dos Santos [22] was used as a case study for validation purposes.

2. Theoretical Framework

2.1. Mathematical Model

Further details regarding the experimental data set can be obtained from dos Santos and Lenzi [1]. It needs to be highlighted that experimental data was normalized in the interval for proper parameter estimation [22, 29]. The mathematical model used in this work was firstly proposed by dos Santos et al. [22] for describing the lead equilibrium sorption. According to the authors, a large number of experimental results on equilibrium systems dealing with heavy metals have the following behavior: when the heavy metal concentration in the aqueous phase is low, the equilibrium concentration in the solid phase may largely change for a given modification in the concentration of the fluid phase. On the other hand, for higher concentrations of the heavy metal in the fluid phase, the equilibrium concentration in the solid may be less sensitive, indicating some kind of saturation. These features resemble to a certain degree in an exponential behavior, obtained, for example, from first-order differential equations. Consequently, an exponential model for heavy metal sorption isotherm, as given by (1), can explain some normalized experimental results, where parameters and depend on the sorption process features, like the type of heavy metal, solid matrix used for sorption process, and so on:

Therefore, by using fractional calculus, the previous model can be generalized to (2), by considering a fractional order for the differential equation. One can note that parameter also depends on the features of the sorption process: where is the 0th-order Epsilon function defined by Podlubny [30], which uses the Mittag-Leffler function;    and    are dummy variables. Details regarding the Gamma function ( ) can be found in the appendix.

2.2. Parameter Estimation

Parameter estimation was carried out using a genetic algorithm procedure as reported by Isfer et al. [15]. More specifically, the initial population consisted of 250 sets of values for parameters , which iterated until the difference of each parameter , the best set of two consecutive iterations, was lower than . Crossover and mutation probabilities were of 80% assuring a good macroscopic search and of 10% assuring a good microscopic (refinement), respectively. In order to avoid a local optimum solution, estimation was performed using different initial populations. Parameters , , and were estimated using (3) as the objective function of the optimization problem, representing a normalized quadratic error analysis [31]. Experimental variances were considered constant and equal to for all experiments and experimental covariances were assumed equal to zero. Consequently, matrix is a diagonal matrix. According to Bard [32], can be approximated by (4):

2.3. Parameter Variance

According to Bard [32], for the objective function defined by (3), the parametric variance matrix is given by (5), which uses the Hessian matrix and matrix , given by (6):

It can be observed that matrix can also be obtained by using a sensitivity matrix [23], , which is given by (8):

Also, according to Bard [32], the elements of the Hessian matrix are given by

For linear estimation problems, TERM2 automatically vanishes. However, this is not the case for nonlinear problems, but, according to Alberton et al. [23], this term can be neglected in some scenarios. Therefore, the parametric variance matrix is usually approximated by

2.4. Parameter Confidence Interval and Joint Confidence Region

According to Himmelblau [33], parameter confidence intervals can be obtained by using (11) for a given confidence level of %, but the use different values of has been reported. For a small number of experimental data, the use of , obtained from Student’s -distribution, is recommended. On the other hand, , obtained from the normal distribution, can be used for a large number of experimental data:

Based on parametric variance, joint confidence region also needs to be obtained. For a given confidence level, this region provides the set of parameters that could actually generate the experimental data set. This is an important analysis tool, as although a given set of parameter may be within the confidence interval, it may not be inside the joint confidence region [33]. Usually, these joint confidence regions can be obtained by (12), which considers a linearization of the estimation problem [28]. Thus, a key issue to be addressed is concerns on the influence of the approach used to calculate matrix ((5) or (10)) on the shape of the confidence region:

It needs to be stressed that (12) is only an approximation for nonlinear problems. A much more realistic approach [28, 34] given by (13) which considers the intrinsic nonlinear features of the estimation problem, leading to joint confidence regions usually larger than the ones obtained by (12):

2.5. Model Prediction Variance

Pinto and Schwaab [35] report that the variance of the model predictions of the experimental data used for parameter estimation is given by (14). The main feature of this equation is that not only experimental and modeling error themselves are taken into account, but also the correlation between them is also considered for the variance prediction:

For prediction of the variance either of future experiments (for a given value of ) or available experimental data not used during parameter estimation, (15) needs to be used. Only used data for parameter estimation contribute to the correlation between experimental and modeling errors: where is the model gradient vector given by (16) and evaluated for :

2.6. Analytical Expressions

By using the definitions of and the Mittag-Leffler function, the mathematical model given by (2) can be rewritten as

The objective function used for estimating the parameter set is given by (18)

The elements of matrix and matrix are given by

The objective function gradient is given by

The elements of matrix are obtained by

Finally, the elements of result from

Further details regarding Gamma (Γ) and Psi (Ψ) functions can be found in the appendix.

3. Results

Heavy metal sorption processes involve a solid phase, which in many scenarios consist in irregular and disordered structures; consequently, many classical models usually cannot adequately explain observed experimental behavior. Towards this, fractional calculus plays a key role in mathematical modeling of transport phenomena in irregular structure, mainly due to memory effects as already reported in the literature [36, 37]. It is important to note that the model proposed by dos Santos et al. [22] presents a novel characteristic as according to the value of parameter , the Epsilon function turns into a different mathematical function; for example, if is equal to 1, an exponential form is obtained [30].

Table 1 presents a summary of the parameter estimation procedure. The row EPSILON 1 presents the results considering (5) to calculate the parameter variance matrix. The results in row EPSILON 2 considered (10) to calculate the parameter variance matrix. Firstly, it can be seen that the model reported by dos Santos et al. [22] presents a very good correlation coefficient and a low value of the objective function.

It can also be observed that the parameter variance is small when compared to the value of the parameter; consequently, all the parameters can be considered significant. This fact occurred independently of the approach used for the calculations of the parametric variance. This conclusion is also obtained after analyzing the parameter confidence intervals calculated for 95% of confidence level, regardless of the value used for in (11). It can be seen that all parameters are significant as they are all statistically different from zero. However, it is important to emphasize that for this nonlinear parameter estimation study, the choice of the approach used to calculate the parameter variance led to differences of one order of magnitude for parameters and . Besides, some interesting effects regarding parameter correlation showed up; more specifically, by choosing the simplified approach, correlation between parameters and and parameters and considerably increased.

Figure 1 presents the experimental values plotted against the model predictions. Bar errors were obtained using (4) to obtain an approximation of the experimental error and (14) to obtain the error of the model predictions of the experimental data used in the parameter estimation task. It can be observed that model predictions are in good agreement with the experimental values as the points are close to the straight line, which indicates the ideal case that model predictions are equal to the experimental values. Figure 2 presents a comparison of experimental data and model predictions plotted against the independent variable. As mentioned before, experimental error was calculated using (4). Again, one can see that the fractional model adequately describes the experimental data, especially for low concentrations of lead in the aqueous solutions, where low variations cause large variations in the lead concentration in the hyacinth. Besides, Figure 2 also presents the confidence interval limits ( ) considering (15) to calculate the model prediction error and using the complete Hessian to calculate .

Parameter plays a key role in the model because, as mentioned before, according to its value the equation can assume a different form. Besides, it is the order of the fractional differential equation (see (2)). Therefore, it is important to prove that is statistically different from 1, otherwise an integer order differential equation would be a model with the same efficiency. Initially, this analysis considers the confidence interval presented in Table 1. However, this may not be enough, as the parameter joint confidence region plays a key role. Figure 3 presents the region obtained by (12) considering the parametric variance matrix obtained by (5) (solid region) and considering the parametric variance matrix obtained by (9) (wired region). One can observe that is different from 1; consequently, the use of fractional order derivative is significant. Secondly, the approach used for calculating the parameter variance significantly influences the parameter confidence region as it can be seen by the difference in size and shape of the ellipsoids. This is an important consequence as a set of parameters which is inside the wired region may be outside the solid region; therefore, it may not be statistically significant.

Finally, it needs to be remembered that the joint confidence regions shown in Figure 3 were obtained by a simplified though useful approach. For nonlinear parameter estimation problems, a more accurate parameter confidence region is obtained by (13), which is presented by Figure 4. It is essential to stress that the size and the shape of the region may considerably change as (13) preserves the nonlinear features of the problem, remembering that (12) is somehow a linearization of (13). Moreover, it must be emphasized that parameter is still lower than 1 (see Figure 4); consequently, the model nature and experimental data behavior can be adequately described by a fractional order model.

It is important to analyze the variance of the model predictions of future experiments, which can be obtained by (15). Figure 5 presents the variance predictions plotted against the independent variable, that is, lead concentration in the aqueous solution. The variances were calculated using the different parametric variance matrixes; that is, Epsilon1 used (5) and Epsilon2 used (10). One can observe that using (10) the variance of future experiments predictions considerably changes. It must be remembered that, although often used, (10) is an approximation of the calculation of the parametric variance. Therefore, for nonlinear problems as the one present in fractional calculus applications, (5) should be chosen. It is also important to note that the minimum values of variance are obtained for lower values of the independent variable. For higher values, the variance tends to increase.

4. Conclusions

The availability of mathematical models plays a key role in understanding heavy metal equilibrium phenomena. The literature reports different approaches for modeling the sorption process, particularly lead. Recently, a model based on fractional calculus was proposed to describe experimental data concerning lead sorption by hyacinths. This paper reported the use of an error analysis procedure for mathematical models based on fractional order differential equations in order to show that the fractional order is in fact fractional.

Parametric variance matrix was calculated by two different approaches, one considering the complete Hessian matrix and the other considering a simplification of its elements. It was observed that the use of the complete Hessian matrix leads to different results; consequently, the simplified approach may not be recommended for some nonlinear parameter estimation problems, such as the one reported in this paper. Joint confidence regions played a key role in the analysis of parameter confidence intervals, especially in the order of the fractional differential equation. It is also important to conclude that the fractional model considered was in fact fractional, as the estimated order of the derivative was higher than its error and also statistically different from 1, showing that the fractional nature of the model is valid.

Appendix

Gamma function is defined by and Psi function, and some properties can be obtained from Lebedev [38]:

Nomenclature

:Objective function
: Value of distribution considering ( ) as confidence level and NP and NE−NP as degrees of freedom and
NE: Number of experiments
NP: Number of parameters
: Correlation coefficient
Value of the Student -distribution for a confidence level of and (NE−NP) degrees of freedom, where
: Independent variable-lead concentration in the aqueous solution
: Dependent variable-lead concentration in the hyacinth
:Value of the normal distribution for a confidence level of , where .
Greek Letters
:Dummy variable
: Variance
: th parameter.
Superscript
–1: Inverse
: Transpose
: Model
: Experiment
: Optimized or estimated value
: Predicted value of the dependent variable of a given
future experiment.

Acknowledgments

The authors thank CNPQ, CAPES, and Fundação Araucária for providing financial support and scholarships.