Comparative Study between Generalized Maximum Entropy and Bayes Methods to Estimate the Four Parameter Weibull Growth Model

Kamar, Saifaldin Hashim; Msallam, Basim Shlaibah

doi:https://doi.org/10.1155/2020/7967345

Journal of Probability and Statistics

Research Article | Open Access

Volume 2020 | Article ID 7967345 | https://doi.org/10.1155/2020/7967345

Academic Editor: Dongchu Sun

Received 28 Mar 2019

Revised 23 May 2019

Accepted 19 Aug 2019

Published 14 Jan 2020

Abstract

The Weibull growth model is an important model especially for describing the growth instability; therefore, in this paper, three methods, namely, generalized maximum entropy, Bayes, and maximum a posteriori, for estimating the four parameter Weibull growth model have been presented and compared. To achieve this aim, it is necessary to use a simulation technique to generate the samples and perform the required comparisons, using varying sample sizes (10, 12, 15, 20, 25, and 30) and models depending on the standard deviation (0.5). It has been shown from the computational results that the Bayes method gives the best estimates.

1. Introduction

Growth curve models describe the increase over time in weight, height, length, or another quantity, depending on the phenomenon studied. These models differ in the shape of the curve and in the number of parameters. The most important of them are the Weibull growth models with one, two, three, and four parameters. The last of these, the four parameter model, is preferred by most researchers when the growth of the phenomenon follows the direction of the curve. For this reason, this study focuses on the four parameter Weibull growth model, which is estimated using three methods: Bayes, maximum a posteriori, and generalized maximum entropy.

The properties and applications of growth curves, together with their advantages and disadvantages, have been studied in detail by many researchers. Shafii et al. [1] presented a set of growth models, including the Weibull curve. Cumulative germination of onion seeds was represented using these curves, which were evaluated and found appropriate as mathematical and empirical models while avoiding the constraints associated with the method of moments. The four parameter Weibull model gave the best fit across a variety of seed species and germination conditions. Fekedulegn et al. [2] introduced the partial derivatives of some nonlinear growth models, including the Weibull model. These partial derivatives were used to estimate the parameters of the different models with the Marquardt iterative method of nonlinear regression, relating top height to age of Norway spruce from the Bowmont Norway spruce thinning experiment. Formulas that provide good initial values of the parameters were specified. Clear definitions of the parameters of the nonlinear models in the context of the system being modelled were found to be critically important in the process of parameter estimation. Narushin and Takma [3] conducted a study to choose the best predictive model for the accurate description of the average flock growth of laying hens and of the daily egg mass produced by the layers during the productive period. The calculations were undertaken with generalized data on the weight growth and daily egg mass produced by a commercial flock of the Shaver White laying hen breed. The Narushin–Takma model was compared with different growth models, including the Weibull growth model. The other was highly accurate and convergent, and the Weibull model was the most accurate of the models compared for the body growth curve; moreover, the logistic growth model was the best for the egg mass production curve.

Tabatabai et al. [4] studied the application of the logistic, Gompertz, Richards, and Weibull models to medical and biomedical studies. These mathematical models describe growth kinetics, which are very important for predicting many biological phenomena such as tumor volume, speed of disease progression, and determination of an optimal radiation and/or chemotherapy schedule. The Newton–Raphson method was used to estimate the parameters. A new family of growth models that predict the volumetric growth behavior of multicellular tumor spheroids with a high degree of accuracy was developed. The study concluded that the family of hyperbolastic models can be a valuable predictive tool in many areas of biomedical and epidemiological research such as cancer or stem cell growth and infectious disease outbreaks. Eruygur [5] presented a paper in two parts. The first part formulated the generalized maximum entropy (GME) technique. The second part reported Monte Carlo simulation results comparing GME with the OLS method in the context of nonnormal disturbances. The researcher concluded that the performance of the GME estimator is remarkably good when compared with that of the OLS estimator, especially for small sample sizes, and that this advantage becomes more pronounced with nonnormal disturbances. Ciavolino [6] also presented a study in two parts. The first part introduced two estimation methods, namely, Generalized Maximum Entropy (GME) and Partial Least Squares (PLS), and gave an overview of the main characteristics of both. The second part gave a brief introduction to the job satisfaction model, used simulations to compare the estimates of both methods in the presence of a multicollinearity problem, and found that the GME method gave the best estimation.
Ciavolino and Al-Nasser [7] applied both the Generalized Maximum Entropy (GME) and Maximum Likelihood (MLE) methods, based on two different sampling techniques, to improve the estimation of Gompertz's model, which describes the relationship between the force of mortality and age. Simulations were used to compare the two methods, and GME was found to be better than MLE in terms of MSE. Cousineau and Helie [8] wrote a tutorial describing parameter estimation with the maximum a posteriori method. The estimates are based on the mode of the posterior distribution of a Bayesian analysis. The relationship between maximum a posteriori estimation, maximum likelihood estimation, and Bayesian estimation is discussed, and example simulations are presented using the Weibull distribution. They show that, for the Weibull distribution, the mode produces a less biased and more reliable point estimate of the parameters than the mean or the median of the posterior distribution. The study also discussed the advantages and limitations of maximum a posteriori estimation. Raji et al. [9] compared seven growth models using body weight measurements for 300 progeny obtained from bred parents representing the Japanese quail, which lives in Nigeria. The study, which lasted for 20 weeks, was carried out at the University of Maiduguri Livestock Teaching and Research farm. The Weibull growth model gave the highest value of R² compared with the other models and the lowest values of mean square error, standard deviation, and Akaike's information criterion. Mahanta and Borah [10] discussed some of the characteristics of the two, three, and four parameter Weibull growth models and their application to forest trees. The parameters of these models were estimated using the Newton–Raphson iteration method for the mean diameter at breast height data and top height growth data from the Bowmont Norway spruce thinning experiment. The average height of 12 weeping Higan Cherry trees planted in Washington, D.C. was also used. The study found that the four parameter Weibull growth model was the best for the forestry growth data sets.

The aim of this study is to present and estimate the parameters of the four parameter Weibull growth model as a nonlinear model; therefore, we used generalized maximum entropy, Bayes, and maximum a posteriori methods to compare the estimates of the parameters; moreover, we used the simulation technique to obtain the results.

To achieve the aim of the research, the paper is divided into six sections: the first section presents the introduction, including the literature review, with a clarification of the aim and methodology, while the second section presents the four parameter Weibull growth model. The third section presents the methods used to estimate the parameters of the Weibull model. The fourth section describes the simulation experiment, and the results of the research are discussed in the fifth section. Finally, the sixth section presents the conclusions.

2. Materials and Methods

2.1. Weibull Growth Model

The four parameter Weibull growth model is a nonlinear model, so working with it involves the complexity that accompanies such models. This model can take the following form [2]:

$$y(t) = \theta_1 - \theta_2 \exp\left(-\theta_3 t^{\theta_4}\right) + \varepsilon, \tag{1}$$

where $y(t)$ is the size of growth at time $t$, $\theta_1$ is the largest growth size as $t \to \infty$, $\theta_2$ is the scale parameter related to the initial value, $\theta_3$ is the relative growth rate parameter, $\theta_4$ is the shape parameter, and $\varepsilon$ is the random error.

All parameters are greater than zero in the biological applications, in addition to .

The Weibull growth model is a sigmoidal symmetric growth curve at ; otherwise, it has no inflexion point [11, pp. 48‐49], and like a sigmoidal symmetric growth model, it has a part in which the shape of the curve increases exponentially and then inflects at a certain point to decline, creating a sigmoidal shape in which the two sides of the curve are symmetric, whereas in biological growth analyses, and are always positive [10].
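As a small illustration of the curve described above, the following sketch evaluates a four parameter Weibull growth curve and the time of its inflexion point. It assumes the common form asymptote − scale · exp(−rate · t^shape); the function names and the parameter values are hypothetical and are not taken from the study. Setting the second derivative to zero gives t^shape = (shape − 1)/(rate · shape), which has a positive solution only when the shape parameter exceeds 1.

```python
import math

def weibull_growth(t, theta):
    """Mean growth at time t for theta = (asymptote, scale, rate, shape)."""
    asymptote, scale, rate, shape = theta
    return asymptote - scale * math.exp(-rate * t ** shape)

def inflection_time(theta):
    """Time where the second derivative is zero, or None if shape <= 1."""
    _, _, rate, shape = theta
    if shape <= 1:
        return None  # the curve has no inflexion point in this case
    return ((shape - 1) / (rate * shape)) ** (1 / shape)

# Hypothetical parameter values, chosen only for illustration.
theta = (100.0, 90.0, 0.05, 2.0)
print(weibull_growth(0.0, theta))  # initial size: asymptote - scale
print(inflection_time(theta))
```

At t = 0 the curve starts at asymptote − scale, which is why both quantities being positive and ordered matters in biological applications.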

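The Weibull growth model in equation (1) can be reformulated as a regression model as follows:

$$y_i = f(t_i, \theta) + \varepsilon_i, \tag{2}$$

where $f(t_i, \theta)$ represents the nonlinear response function given in equation (1) with parameter vector $\theta = (\theta_1, \theta_2, \theta_3, \theta_4)'$, and $y_i$ is the response variable at $i$, where $i = 1, 2, \ldots, n$.

Assume that the random errors are uncorrelated and follow a multivariate normal distribution, that is, $\varepsilon \sim N(0, \sigma^2 I_n)$. Expanding the nonlinear response function in equation (2) in a Taylor series about initial values $\theta^0$ and truncating it after the first derivative gives [12, pp. 2281–2296]

$$f(t_i, \theta) \approx f(t_i, \theta^0) + \sum_{j=1}^{4} \left[\frac{\partial f(t_i, \theta)}{\partial \theta_j}\right]_{\theta = \theta^0} \left(\theta_j - \theta_j^0\right). \tag{3}$$

Assume $f_i^0 = f(t_i, \theta^0)$, $x_{ij} = \left[\partial f(t_i, \theta) / \partial \theta_j\right]_{\theta = \theta^0}$, and $\beta_j = \theta_j - \theta_j^0$.

Equation (3) can be substituted into model (2) to obtain

$$y_i \approx f_i^0 + \sum_{j=1}^{4} x_{ij} \beta_j + \varepsilon_i. \tag{4}$$

Also, the following equation can be obtained:

$$y_i - f_i^0 \approx \sum_{j=1}^{4} x_{ij} \beta_j + \varepsilon_i. \tag{5}$$

Equation (5) is linear in the parameters and can be written in matrix form as follows:

$$Y = X\beta + \varepsilon, \tag{6}$$

where $Y$ is the vector of dimension (n × 1) of the response variable $y_i - f_i^0$; $X$ is the matrix of dimension (n × 4) of the explanatory variables $x_{ij}$; $\beta$ is the vector of dimension (4 × 1) of unknown parameters; and $\varepsilon$ is the vector of dimension (n × 1) of random errors, with $\varepsilon \sim N(0, \sigma^2 I_n)$.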
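The linearization step can be sketched in code as follows. This is a minimal sketch, assuming the model form asymptote − scale · exp(−rate · t^shape) with strictly positive times (the derivative with respect to the shape parameter involves ln t). The helper names `jacobian_row` and `linearize` are hypothetical; they build the matrix of first derivatives at the initial values and the adjusted response (observation minus fitted value at the initial values).

```python
import math

def weibull_growth(t, theta):
    a, b, k, m = theta
    return a - b * math.exp(-k * t ** m)

def jacobian_row(t, theta):
    """Partial derivatives of the mean curve with respect to (a, b, k, m), t > 0."""
    a, b, k, m = theta
    e = math.exp(-k * t ** m)
    return [
        1.0,                              # d/da
        -e,                               # d/db
        b * t ** m * e,                   # d/dk
        b * k * t ** m * math.log(t) * e  # d/dm
    ]

def linearize(times, y, theta0):
    """Return (X, Z): derivative matrix and adjusted response at theta0."""
    X = [jacobian_row(t, theta0) for t in times]
    Z = [yi - weibull_growth(t, theta0) for t, yi in zip(times, y)]
    return X, Z
```

Each row of `X` corresponds to one observation time and each column to one parameter, so for n observations the matrix is n × 4.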

2.2. Bayes Method (BAY)

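Writing the linear model in equation (6) with response vector $Y$, explanatory matrix $X$, parameter vector $\beta$, and error standard deviation $\sigma$, we can find the joint density of the observations (the likelihood function) as follows:

$$L(\beta, \sigma \mid Y) = \left(2\pi\sigma^2\right)^{-n/2} \exp\left[-\frac{1}{2\sigma^2}(Y - X\beta)'(Y - X\beta)\right]. \tag{7}$$

Taking the natural logarithm, differentiating with respect to the parameter vector, setting the derivatives to zero, and solving the equations, we obtain the following estimation formula:

$$\hat{\beta} = (X'X)^{-1}X'Y. \tag{8}$$

Equation (8) is the maximum likelihood estimator (MLE) of the parameters, which is equivalent to the least squares (LS) estimator. Adding the estimate in equation (8) to the initial values gives the MLE of the parameters of the nonlinear model in equation (2), and iterating this step gives the final MLE of the parameters.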
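The iteration described above (estimate an increment from the linearized model, add it to the current values, and repeat) can be sketched as follows. This is an illustrative Gauss–Newton loop, not the authors' implementation: the model form asymptote − scale · exp(−rate · t^shape), the step-halving safeguard, the stopping tolerance, and all function names are assumptions made for this sketch.

```python
import math

def f(t, th):
    return th[0] - th[1] * math.exp(-th[2] * t ** th[3])

def jac_row(t, th):
    e = math.exp(-th[2] * t ** th[3])
    return [1.0, -e, th[1] * t ** th[3] * e,
            th[1] * th[2] * t ** th[3] * math.log(t) * e]

def sse(times, y, th):
    """Sum of squared errors; infinite if the trial values overflow."""
    try:
        return sum((yi - f(t, th)) ** 2 for t, yi in zip(times, y))
    except OverflowError:
        return float("inf")

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            fac = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= fac * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def gauss_newton(times, y, th0, iters=50, tol=1e-10):
    th = list(th0)
    for _ in range(iters):
        X = [jac_row(t, th) for t in times]
        Z = [yi - f(t, th) for t, yi in zip(times, y)]
        XtX = [[sum(r[i] * r[j] for r in X) for j in range(4)] for i in range(4)]
        XtZ = [sum(r[i] * z for r, z in zip(X, Z)) for i in range(4)]
        step = solve(XtX, XtZ)           # least squares increment
        lam, current = 1.0, sse(times, y, th)
        while lam > 1e-8:                # halve the step until the SSE decreases
            trial = [a + lam * s for a, s in zip(th, step)]
            if sse(times, y, trial) < current:
                break
            lam /= 2
        else:
            return th                    # no improving step was found
        th = trial
        if max(abs(lam * s) for s in step) < tol:
            return th
    return th
```

Because a step is accepted only when it lowers the sum of squared errors, the returned values never fit worse than the initial values, although convergence to the global optimum is not guaranteed for poor starting points.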

Using the Bayes method, we can estimate the linear model parameters that are explained in Equation (6), depending on the likelihood function shown in Equation (7); accordingly, probability density function of can be written proportionally as follows:where

In order to obtain the function of the posterior probability density through which a Bayes estimator is extracted, it is necessary to know the function of the prior probability density. The natural conjugate prior probability density function has been adopted. This type of prior distributions has an appropriate probability density function assumed depending on available information about the parameters [13, p. 97].

Assuming the multiple linear regression model described in equation (6), and when only simple theoretical information about the standard deviation parameter is available in the form of lower and upper limits, the second rule proposed by Jeffreys can be used to obtain the following distribution [14]: where the vector of parameters has a multivariate normal distribution:

We can obtain the joint prior probability density function of the parameters and from equations (10) and (11), that is,

Multiplying equation (12) with the likelihood function shown in equation (7), we obtain the joint posterior probability density function of the parameters and , that is,

Joint posterior probability density function of the parameters can be obtained by integrating equation (13) with respect to , that is,where

Equation (14) is the multivariate t distribution with n degrees of freedom and mean , which represents the Bayes estimator for the parameter under a squared error loss function: where the values of the vector are determined depending on the default values of the parameters ; moreover, the values of the matrix are as follows:

The estimate from equation (15) is added to the initial values to obtain the Bayes estimate of the parameters of the nonlinear model shown in equation (2), and the final estimates of the parameters are reached by iterating this step.

2.3. Maximum A Posteriori Method (MAP)

This method is a form of Bayes estimation based on the posterior distribution of the parameters of the model in equation (6). It uses the mode of the posterior distribution instead of its mean as the estimate. Writing $\beta$ for the parameter vector and $Y$ for the response vector of model (6), and assuming that $\sigma$ is known, the following equation can be obtained:

$$p(\beta \mid Y) = \frac{p(\beta)\, L(Y \mid \beta)}{p(Y)}, \tag{17}$$

where $p(\beta)$ is the prior distribution for $\beta$ with known $\sigma$, which does not depend on the data $Y$, and $L(Y \mid \beta)$ is the likelihood function. $p(Y)$ is the probability distribution function of the observed data, which is independent of $\beta$ and is obtained as follows:

$$p(Y) = \int p(\beta)\, L(Y \mid \beta)\, d\beta. \tag{18}$$

The maximum a posteriori estimate is obtained by simplifying equation (17) as follows [8]:

$$\hat{\beta}_{\mathrm{MAP}} = \arg\max_{\beta}\; p(\beta)\, L(Y \mid \beta). \tag{19}$$

Equation (19) is equivalent to maximum likelihood estimation with the prior distribution included as an additional factor.

The estimate from equation (19) is added to the initial values to obtain the maximum a posteriori estimate of the parameters of the nonlinear model shown in equation (2), and the final estimates of the parameters are reached by iterating this step.

Although maximum a posteriori estimation in general gives larger variances than Bayes estimation, it is easy to apply to nonlinear growth models, because the posterior distribution is not required in its full form but only up to a normalizing constant, which means that no numerical integration is needed.
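To make the maximization concrete, the following worked derivation shows the maximum a posteriori estimate for the linear model under one illustrative choice of prior. The normal prior $\beta \sim N(\beta_0, \Sigma_0)$ with known $\sigma$, and the symbols $\beta_0$ and $\Sigma_0$, are assumptions introduced for this sketch; they are not necessarily the prior used in the study.

```latex
% Log posterior up to an additive constant (normal likelihood, assumed normal prior)
\ln p(\beta \mid Y) = -\frac{1}{2\sigma^{2}} (Y - X\beta)'(Y - X\beta)
                      - \frac{1}{2} (\beta - \beta_0)' \Sigma_0^{-1} (\beta - \beta_0)
                      + \text{const}

% Setting the gradient with respect to \beta to zero
\frac{1}{\sigma^{2}} X'(Y - X\beta) - \Sigma_0^{-1} (\beta - \beta_0) = 0

% Solving for the posterior mode
\hat{\beta}_{\mathrm{MAP}} =
  \left( \frac{1}{\sigma^{2}} X'X + \Sigma_0^{-1} \right)^{-1}
  \left( \frac{1}{\sigma^{2}} X'Y + \Sigma_0^{-1} \beta_0 \right)
```

As the prior becomes flat ($\Sigma_0^{-1} \to 0$), this expression reduces to the least squares formula $(X'X)^{-1}X'Y$, which illustrates how the prior acts as an additional factor on top of the likelihood.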

2.4. Generalized Maximum Entropy Method (GME)

According to this method, we reparameterize and reformulate the linear regression model (6) in terms of expected values by writing its parameters $\beta_j$, j = 1, 2, 3, 4, and random errors $\varepsilon_i$, i = 1, 2, …, n, as convex combinations of a finite set of support values, whose number usually ranges from 2 to 7 according to Ciavolino [6]. Thus, five support values were identified for each parameter, d_j = (d_j1, d_j2, d_j3, d_j4, d_j5), and for each error term, r_i = (r_i1, r_i2, r_i3, r_i4, r_i5). Note that the support values of the parameters are not necessarily equal to those of the random errors.

The convex combination for the parameters of (6) can be defined as follows:

$$\beta_j = \sum_{k=1}^{5} d_{jk}\, p_{jk}, \qquad j = 1, 2, 3, 4, \tag{20}$$

where the probabilities satisfy $0 \le p_{jk} \le 1$ and $\sum_{k=1}^{5} p_{jk} = 1$, and then, by maximizing Shannon's entropy function, we can estimate $p_{jk}$.

Equation (20) can be rewritten in terms of matrices as follows:

$$\beta = DP, \tag{21}$$

where D is the block-diagonal matrix of dimension 4 × 20 of the support values defined in equation (20), and P is the vector of dimension 20 × 1 of the probabilities defined in equation (20).

Also, the support values of each parameter are placed symmetrically around zero within bounds set by a constant c. Increasing c leads to an increase in the mean squared error (MSE) [5], so c is chosen as 1.
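The reparameterization can be illustrated with a short sketch: a parameter is written as the expected value of its support values under a probability vector, and Shannon's entropy measures how uninformative that probability vector is. The equally spaced layout of the five support points on [−c, c] and the function names are assumptions made for this illustration; the source states only that the support is symmetric around zero with c = 1.

```python
import math

def support(c=1.0, points=5):
    """Equally spaced support values on [-c, c] (assumed layout)."""
    step = 2 * c / (points - 1)
    return [-c + k * step for k in range(points)]

def expected_value(d, p):
    """Parameter value implied by support d and probabilities p."""
    return sum(dk * pk for dk, pk in zip(d, p))

def shannon_entropy(p):
    """Shannon entropy -sum(p ln p), with 0 ln 0 treated as 0."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

d = support(1.0)
uniform = [0.2] * 5                     # maximum entropy: no information
skewed = [0.05, 0.05, 0.1, 0.2, 0.6]    # hypothetical probabilities

print(expected_value(d, uniform), shannon_entropy(uniform))
print(expected_value(d, skewed), shannon_entropy(skewed))
```

With uniform probabilities the implied parameter sits at the centre of the support and the entropy is at its maximum, ln 5. The data constraint pulls the probabilities away from uniform only as far as needed to reproduce the observations.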

Similarly, the same steps can be applied to the random errors as follows:

$$\varepsilon_i = \sum_{h=1}^{5} r_{ih}\, m_{ih}, \qquad i = 1, 2, \ldots, n, \tag{23}$$

where the probabilities satisfy $0 \le m_{ih} \le 1$ and $\sum_{h=1}^{5} m_{ih} = 1$, and then, by maximizing Shannon's entropy function, we can estimate $m_{ih}$.

Equation (23) can be rewritten in terms of matrices as follows:

$$\varepsilon = RM, \tag{24}$$

where R is the block-diagonal matrix of dimension n × 5n of the support values defined in equation (23), and M is the vector of dimension 5n × 1 of the probabilities defined in equation (23). Also,

Using equations (21) and (24) in the linearized Weibull growth model (6), with response vector $Y$ and explanatory matrix $X$, we can rewrite it as follows:

$$Y = XDP + RM. \tag{26}$$

Note that the vectors of probabilities for the parameters and random errors in model (26) are unknown, so they must be estimated; the parameter values are then obtained as the expected value of a discrete random variable that takes the support values with the estimated probabilities. To estimate the vectors M and P, we maximize Shannon's entropy function using Lagrange multipliers after determining the constraints as follows:

$$\max_{P,\,M}\; H(P, M) = -P' \ln P - M' \ln M, \tag{27}$$

where $H(P, M)$ is Shannon's entropy function.

Also,where , vector of one (1 1 1 1 1) and , defined as (20).

Using the Kronecker product of the matrix in equation (29), we get the final form of parameter constraints as follows:

As well as the random error constraints of the Weibull growth model using the previous steps (28)–(30), we obtain the final value of random error constraints as follows:

Also, the last constraint is the Weibull growth model constraint defined in (26).

Shannon’s entropy function can be written as Lagrange function:where λ, T, and Γ are the Lagrange coefficients.

By taking the gradient of L, it is possible to derive (32) with respect to (Γ, Θ, λ, M, P) and equal to zero, and the numerical technique should be used to compute the estimators [7]. Note that this method is difficult and more complex with the partial derivative, so Ciavolino [6], directly dealing with equation (27), rewrite it as follows:

Depending on the following function in Matlab,where lb and ub are, respectively, the lower and the upper bound and p₀ is the vector of the initialization values, whose elements are between zero and one, and

The function (34) iterates until two successive estimates converge. After estimating the vectors M and P, which are the outputs of function (34), we substitute them into equation (26) to obtain the predicted values.

3. Results and Discussion

We generated data by using Matlab for experiments with size of 1000 to represent the random errors which have normal distribution with mean zero and standard deviation 0.5 to simulate the oil production in Iraq for the period from 2003 to 2015 [15]. To achieve that, we used six experiments based on the method of finding initial values assumed by Mahanta and Borah [10], as shown in Table 1.
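The data-generating step can be sketched as follows. This is a minimal sketch that adds normal errors with mean zero and standard deviation 0.5 to a Weibull growth curve for the sample sizes listed in the abstract. The study used Matlab; the Python function `simulate`, the equally spaced times 1, …, n, the seed handling, and the parameter values are assumptions for illustration and do not reproduce Table 1.

```python
import math
import random

def simulate(n, theta, sigma=0.5, seed=0):
    """Simulate n observations: Weibull growth mean plus N(0, sigma^2) errors."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    times = list(range(1, n + 1))
    y = [
        theta[0] - theta[1] * math.exp(-theta[2] * t ** theta[3])
        + rng.gauss(0.0, sigma)
        for t in times
    ]
    return times, y

# Hypothetical default parameter values (not the values in Table 1).
theta = (100.0, 90.0, 0.05, 2.0)
for n in (10, 12, 15, 20, 25, 30):
    times, y = simulate(n, theta)
    print(n, round(y[-1], 3))
```

Repeating the call with different seeds gives independent replications, from which the estimation methods can be compared.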

The results of estimation of the parameters of the Weibull growth model were compared using Bayes (BAY), Maximum A Posteriori (MAP), and Generalized Maximum Entropy (GME). Consequently, Table 2 shows the values of the estimated parameters; moreover, Tables 3 and 4 show the results of the comparison of estimation methods based on Mean Squares Error (MSE) and Total Deviation (TD) for the estimated parameters, that is,where β is the default parameter and is the estimated parameter by using one of the methods that are used in this research.
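The two comparison criteria can be computed as in the sketch below. The exact formulas are not reproduced here, so the definitions used are assumptions: the mean squared error is taken as the average squared difference between the estimates over replications and the default value, and the total deviation as the sum over parameters of the absolute relative deviation of the estimate from the default value.

```python
def mse(estimates, true_value):
    """Average squared error of one parameter's estimates over replications."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def total_deviation(estimated, true_values):
    """Sum of absolute relative deviations across parameters (assumed definition)."""
    return sum(abs((e - t) / t) for e, t in zip(estimated, true_values))

# Hypothetical numbers, for illustration only.
print(mse([1.0, 3.0], 2.0))
print(total_deviation([110.0, 45.0], [100.0, 50.0]))
```

Smaller values of both criteria indicate estimates closer to the default parameter values.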

Through Tables 3 and 4, it can be seen that Bayes is the best compared with the other methods, as it gave the minimum mean squares error and total deviation for most experiments; nevertheless, we also note that the generalized maximum entropy showed better results when the sample size was 10 and 12 only.

4. Conclusions

The empirical results showed that the Bayes method gave the best estimates of the parameters of the four parameter Weibull growth model compared with the other methods, based on the mean squares error and total deviation. Moreover, the maximum a posteriori method is easy to apply to multiparameter nonlinear growth models, because it does not require the posterior distribution in full form, but only as an approximation. The generalized maximum entropy method gave better estimates of the four parameter Weibull growth model than the other methods for very small sample sizes, but its estimates deteriorate as the sample size increases.

Data Availability

The data used to support this study are available on the website of the Organization of Arab Petroleum Exporting Countries OAPEC, (http://oapecdbsys.oapecorg.org:8080/apex/f?p=101:23:::NO:::). The six experiments performed in this study were based on the method of finding initial values assumed in Ref. [10] and are shown in Table 1.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] B. Shafii, W. J. Price, J. B. Swensen, and G. A. Murray, “Nonlinear estimation of growth curve models for germination data analysis,” in Proceedings of the 3rd Annual Conference Proceedings, Annual Conference of Applied Statistics in Agriculture, pp. 19–42, Kansas State University, Manhattan, KS, USA, 1991.
[2] D. Fekedulegn, M. P. Mac Siurtain, and J. J. Colbert, “Parameter estimation of nonlinear growth models in forestry,” Silva Fennica, vol. 33, no. 4, pp. 327–336, 1999.
[3] V. G. Narushin and C. Takma, “Sigmoid model for the evaluation of growth and production curves in laying hens,” Biosystems Engineering, vol. 84, no. 3, pp. 343–348, 2003.
[4] M. Tabatabai, D. K. Williams, and Z. Bursac, “Hyperbolastic growth models: theory and application,” Theoretical Biology and Medical Modelling, vol. 2, no. 14, pp. 1–13, 2005.
[5] H. O. Eruygur, “Generalized maximum entropy (GME) estimator: formulation and a Monte Carlo study,” in Proceedings of the National Symposium on Econometrics and Statistics, Istanbul, Turkey, May, 2005, https://mpra.ub.uni-muenchen.de/12459/1/MPRA_paper_12459.pdf.
[6] E. Ciavolino, “Modelling GME and PLS estimation methods for evaluating the job satisfaction in the public sector,” in Proceedings of the QMod 07, University of Helsingborg, Linköping University Electronic Press, Helsingborg, Sweden, June 2007.
[7] E. Ciavolino and A. D. Al-Nasser, “Information theoretic estimation improvement to the nonlinear Gompertz’s model based on ranked set sampling,” Journal of Applied Quantitative Methods, vol. 5, no. 2, pp. 317–330, 2010.
[8] D. Cousineau and S. Hélie, “Improving maximum likelihood estimation using prior probabilities: a tutorial on maximum a posteriori estimation and an examination of the Weibull distribution,” Tutorials in Quantitative Methods for Psychology, vol. 9, no. 2, pp. 61–71, 2013.
[9] A. O. Raji, S. T. Mbap, and J. Aliyu, “Comparison of different models to describe growth of the Japanese quail,” Trakia J. Sci., vol. 2, pp. 182–188, 2014.
[10] D. J. Mahanta and M. Borah, “Parameter estimation of Weibull growth models in forestry,” International Journal of Mathematics Trends and Technology, vol. 8, no. 3, pp. 157–163, 2014.
[11] Ph. Grosjean, “Growth model of the reared sea urchin Paracentrotus lividus (Lamarck, 1816),” Agronomic Sciences and Biological Engineering, Université Libre de Bruxelles, Brussels, Belgium, 2001, Ph.D. thesis.
[12] J. K. Sohn and S. G. Kang, “Bayesian estimation procedure in multiprocessor non–linear dynamic generalized model,” in Communications in Statistics: Theory and Method, vol. 25, Taylor & Francis Group, London, UK, 1996.
[13] G. G. Judge, W. E. Griffiths, R. C. Hill, and T. C. Lee, Theory and Practice of Econometrics, John Wiley & Sons, New York, NY, USA, 1980.
[14] E. F. Halpern, “Polynomial regression from a Bayesian approach,” Journal of the American Statistical Association, vol. 68, no. 341, pp. 137–143, 1973.
[15] Organization of Arab Petroleum Exporting Countries OAPEC, Annual Statistical Report, Organization of Arab Petroleum Exporting Countries OAPEC, Beirut, Lebanon, 2017, http://oapecdbsys.oapecorg.org:8080/apex/f?p=101:4:943634539090::NO::P4_COUNTRY:106.

Copyright

Copyright © 2020 Saifaldin Hashim Kamar and Basim Shlaibah Msallam. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
