Abstract

The Weibull growth model is an important model, especially for describing growth instability; therefore, in this paper, three methods, namely, generalized maximum entropy, Bayes, and maximum a posteriori, for estimating the parameters of the four-parameter Weibull growth model are presented and compared. To this end, a simulation technique is used to generate the samples and perform the required comparisons, using varying sample sizes (10, 12, 15, 20, 25, and 30) and models with a standard deviation of 0.5. The computational results show that the Bayes method gives the best estimates.

1. Introduction

Growth curve models describe the increase in weight, height, length, or other quantities over time, depending on the type of phenomenon studied. These models vary according to the shape of the curve and the number of parameters. The most important of these are the Weibull growth models with one, two, three, and four parameters. The latter model is preferred by most researchers when studying phenomena whose growth follows the direction of the curve. For this reason, the study focuses on the four-parameter Weibull growth model, which is estimated using the Bayes, maximum a posteriori, and generalized maximum entropy methods.

Properties of growth curves and their applications, as well as their advantages and disadvantages, have been studied in detail by many researchers. Shafii et al. [1] presented a set of growth models, including the Weibull curve. Cumulative germination of onion seeds was represented using these curves, which were evaluated and found appropriate as mathematical and experimental models while avoiding the constraints associated with the method of moments. The four-parameter Weibull model gave the best fit across a variety of seed species and germination conditions. Fekedulegn et al. [2] introduced the partial derivatives of some nonlinear growth models, including the Weibull model. These partial derivatives were used to estimate the parameters of the different models with the Marquardt iterative method of nonlinear regression, relating top height to age of Norway spruce from the Bowmont Norway spruce thinning experiment. Formulas that provide good initial values of the parameters were specified. Clear definitions of the parameters of the nonlinear models in the context of the system being modelled were found to be critically important in the process of parameter estimation. Narushin and Takma [3] conducted a study to choose the best predictive model for accurately describing the average flock growth of laying hens and the daily egg mass produced by the layers during the productive period. The calculations used generalized data on the weight growth and daily egg mass of a commercial flock of the Shaver White breed of laying hens. The Narushin–Takma model was compared with different growth models, including the Weibull growth model. The Narushin–Takma model was highly accurate and convergent, the Weibull model was the most accurate of the models compared for the body growth curve, and the logistic growth model was the best for the egg mass production curve.

Tabatabai et al. [4] studied the application of the logistic, Gompertz, Richards, and Weibull models to medical and biomedical studies. These mathematical models describe growth kinetics, which are very important for predicting many biological phenomena such as tumor volume, speed of disease progression, and determination of an optimal radiation and/or chemotherapy schedule. The Newton–Raphson method was used to estimate the parameters. A new family of growth models that predict the volumetric growth behavior of multicellular tumor spheroids with a high degree of accuracy was developed. The study concluded that the family of hyperbolastic models can be a valuable predictive tool in many areas of biomedical and epidemiological research such as cancer or stem cell growth and infectious disease outbreaks.

Eruygur [5] presented a paper in two parts: the first formulated the new technique (generalized maximum entropy, GME), and the second reported Monte Carlo simulation results comparing GME with the OLS method in the context of nonnormal disturbances. The researcher concluded that the GME estimator performs remarkably well compared with the OLS estimator, especially for small sample sizes, and that this advantage becomes more pronounced under nonnormal disturbances. Ciavolino [6] discussed two main parts. The first presented two estimation methods, namely, Generalized Maximum Entropy (GME) and Partial Least Squares (PLS), with an overview of the main characteristics of both. The second gave a brief introduction to the job satisfaction model, used simulations to compare the two sets of estimates in the presence of a multicollinearity problem, and found that the GME method gave the best estimation. Ciavolino and Al-Nasser [7] applied both the Generalized Maximum Entropy (GME) and Maximum Likelihood (MLE) methods, based on two different sampling techniques, to improve the estimation of the Gompertz model, which describes the relationship between the force of mortality and age. Simulations were used to compare the two methods and showed that GME is better than MLE in terms of MSE.

Cousineau and Helie [8] presented a tutorial describing the technique of estimating parameters using the maximum a posteriori method. The estimates are based on the mode of the posterior distribution of a Bayesian analysis. The relationship between maximum a posteriori estimation, maximum likelihood estimation, and Bayesian estimation is discussed, and example simulations are presented using the Weibull distribution. They show that, for the Weibull distribution, the mode produces a less biased and more reliable point estimate of the parameters than the mean or the median of the posterior distribution. The study also discussed the advantages and limitations of maximum a posteriori estimation. Raji et al. [9] compared seven growth models using body weight measurements for 300 progeny obtained from bred parents of Japanese quail in Nigeria. The study, which lasted for 20 weeks, was carried out at the University of Maiduguri Livestock Teaching and Research farm. Compared with the other models, the Weibull growth model gave the highest value of R² and the lowest values of mean square error, standard deviation, and Akaike's information criterion. Mahanta and Borah [10] discussed some of the characteristics of the two-, three-, and four-parameter Weibull growth models and their application to forest trees. The parameters of these models were estimated using the Newton–Raphson iteration method for the mean diameter at breast height data and top height growth data from the Bowmont Norway spruce thinning experiment. The average height of 12 weeping Higan cherry trees planted in Washington, D.C. was also used. The study found that the four-parameter Weibull growth model was the best for the forestry growth data sets.

The aim of this study is to present the four-parameter Weibull growth model as a nonlinear model and to estimate its parameters. To this end, we used the generalized maximum entropy, Bayes, and maximum a posteriori methods and compared the resulting parameter estimates, using simulation to obtain the results.

To achieve the aim of the research, the paper is divided into six sections: the first presents the introduction, including the literature review and a clarification of the aim and methodology; the second presents the four-parameter Weibull growth model; the third presents the methods used to estimate the parameters of the Weibull model; the fourth describes the simulation experiment; the fifth discusses the results of the research; and the sixth presents the conclusions.

2. Materials and Methods

2.1. Weibull Growth Model

The four-parameter Weibull growth model is a nonlinear model, so working with it carries the complexity that accompanies such models. This model can take the following form [2]:

$$y_t = \beta_1 - \beta_2 \exp\left(-\beta_3 t^{\beta_4}\right) + \varepsilon_t, \tag{1}$$

where $y_t$ is the size of growth at time $t$, $\beta_1$ is the largest growth size as $t \to \infty$, $\beta_2$ is the scale parameter related to the initial value, $\beta_3$ is the relative growth rate, $\beta_4$ is the shape parameter, and $\varepsilon_t$ is the random error.

All parameters are greater than zero in the biological applications, in addition to .
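To make the shape of the curve concrete, the following minimal sketch evaluates the mean of the model, assuming the commonly used form b1 - b2·exp(-b3·t^b4), where b1 is the asymptote, b2 the scale parameter, b3 the growth rate, and b4 the shape parameter. The function name and the parameter values are hypothetical and chosen only for illustration; they are not the values used in the study.

```python
import math

def weibull_growth(t, b1, b2, b3, b4):
    """Mean growth at time t under the assumed form b1 - b2 * exp(-b3 * t**b4)."""
    return b1 - b2 * math.exp(-b3 * t ** b4)

# Hypothetical parameter values, for illustration only.
params = dict(b1=100.0, b2=90.0, b3=0.05, b4=1.5)

# Evaluate the curve at t = 0, 1, ..., 30.
curve = [weibull_growth(t, **params) for t in range(0, 31)]
```

With positive parameters the curve starts at b1 - b2 when t = 0 and rises monotonically toward the asymptote b1.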

The Weibull growth model is a sigmoidal symmetric growth curve at ; otherwise, it has no inflexion point [11, pp. 48‐49], and like a sigmoidal symmetric growth model, it has a part in which the shape of the curve increases exponentially and then inflects at a certain point to decline, creating a sigmoidal shape in which the two sides of the curve are symmetric, whereas in biological growth analyses, and are always positive [10].

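The Weibull growth model in equation (1) can be reformulated as a regression model as follows:

$$y_i = f(t_i, \beta) + \varepsilon_i, \tag{2}$$

where $f(t_i, \beta)$ represents the nonlinear response function given in equation (1), and $y_i$ is the response variable at $i$, where $i = 1, 2, \ldots, n$.

Assuming that the random errors are uncorrelated and follow a multivariate normal distribution, that is, $\varepsilon \sim N(0, \sigma^2 I_n)$, expanding the nonlinear response function in equation (2) in a Taylor series about an initial value $\beta^{0}$ and truncating after the first derivative gives [12, pp. 2281–2296]

$$f(t_i, \beta) \approx f(t_i, \beta^{0}) + \sum_{j=1}^{4} \left[\frac{\partial f(t_i, \beta)}{\partial \beta_j}\right]_{\beta = \beta^{0}} \left(\beta_j - \beta_j^{0}\right). \tag{3}$$

Assume $f_i^{0} = f(t_i, \beta^{0})$, $\gamma_j = \beta_j - \beta_j^{0}$, and $Z_{ij} = \left[\partial f(t_i, \beta) / \partial \beta_j\right]_{\beta = \beta^{0}}$.

Equation (3) can be substituted into model (2) to obtain:

$$y_i \approx f_i^{0} + \sum_{j=1}^{4} Z_{ij} \gamma_j + \varepsilon_i. \tag{4}$$

Also, the following equation can be obtained:

$$y_i - f_i^{0} \approx \sum_{j=1}^{4} Z_{ij} \gamma_j + \varepsilon_i. \tag{5}$$

Equation (5) is a linear formula and can be rewritten in matrix form as follows:

$$Y = Z\gamma + \varepsilon, \tag{6}$$

where $Y$ is the vector of order (n × 1) for the response variable $y_i - f_i^{0}$; $Z$ is the matrix of order (n × 4) for the explanatory variables; $\gamma$ is the vector of order (4 × 1) for the unknown parameters; and $\varepsilon$ is the vector of order (n × 1) for the random errors, with $\varepsilon \sim N(0, \sigma^2 I_n)$.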
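The n × 4 matrix of explanatory variables in the linearized model is built from the partial derivatives of the response function with respect to the four parameters. The sketch below assumes the form b1 - b2·exp(-b3·t^b4) for the mean function and computes these derivatives analytically, with a finite-difference helper as a cross-check. All function names and the starting values are hypothetical.

```python
import math

def weibull_growth(t, b):
    """Assumed mean function: b1 - b2 * exp(-b3 * t**b4)."""
    b1, b2, b3, b4 = b
    return b1 - b2 * math.exp(-b3 * t ** b4)

def jacobian_row(t, b):
    """Analytic partial derivatives with respect to b1..b4 (requires t > 0)."""
    b1, b2, b3, b4 = b
    e = math.exp(-b3 * t ** b4)
    return [
        1.0,                                   # d/d b1
        -e,                                    # d/d b2
        b2 * t ** b4 * e,                      # d/d b3
        b2 * b3 * t ** b4 * math.log(t) * e,   # d/d b4
    ]

def design_matrix(times, b):
    """n x 4 matrix of derivatives evaluated at the current estimate b."""
    return [jacobian_row(t, b) for t in times]

def numeric_row(t, b, h=1e-6):
    """Central finite differences, used only to check the analytic row."""
    row = []
    for j in range(4):
        up, dn = list(b), list(b)
        up[j] += h
        dn[j] -= h
        row.append((weibull_growth(t, up) - weibull_growth(t, dn)) / (2 * h))
    return row

# Hypothetical initial values, for illustration only.
b0 = [100.0, 90.0, 0.05, 1.5]
Z = design_matrix(range(1, 11), b0)
```

Each row of Z corresponds to one observation time, so a sample of size n gives an n × 4 matrix that is re-evaluated at every iteration.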

2.2. Bayes Method (BAY)

We can find the joint function of variables (likelihood function) depending on equation (6) as follows:

Taking the natural logarithm and then making the partial derivation of the parameter vector and solving the equations, we obtain the following estimation formula:

Equation (8) is the maximum likelihood estimator (MLE) of the parameters, which is equivalent to the least squares (LS) estimator. Adding the estimator in equation (8) to the initial estimate gives the MLE of the parameters of the nonlinear model in equation (2), and iterating this step yields the final MLE of the parameters.
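The iterative step described here can be written out as a worked equation. The following is a sketch under assumed notation rather than a reproduction of the paper's own equations: $\beta^{(m)}$ is the current estimate of the four model parameters, $Z^{(m)}$ is the n × 4 matrix of partial derivatives of the response function evaluated at $\beta^{(m)}$, and $r^{(m)}$ is the vector of current residuals.

```latex
\begin{aligned}
r^{(m)} &= y - f\left(t, \beta^{(m)}\right), \\
\hat{\gamma}^{(m)} &= \left(Z^{(m)\top} Z^{(m)}\right)^{-1} Z^{(m)\top} r^{(m)}, \\
\beta^{(m+1)} &= \beta^{(m)} + \hat{\gamma}^{(m)}, \qquad m = 0, 1, 2, \ldots
\end{aligned}
```

Iteration stops when successive estimates agree to within a chosen tolerance. This is the Gauss–Newton scheme for nonlinear least squares; under normally distributed errors the least squares and maximum likelihood increments coincide.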

Using the Bayes method, we can estimate the linear model parameters that are explained in Equation (6), depending on the likelihood function shown in Equation (7); accordingly, probability density function of can be written proportionally as follows:where

In order to obtain the function of the posterior probability density through which a Bayes estimator is extracted, it is necessary to know the function of the prior probability density. The natural conjugate prior probability density function has been adopted. This type of prior distributions has an appropriate probability density function assumed depending on available information about the parameters [13, p. 97].

Assuming the multiple linear regression model described in equation (6), and given simple theoretical information about the standard deviation parameter in the form of lower and upper limits, the second rule proposed by Jeffreys can be used to obtain the following distribution [14]:where the vector of parameters has a multivariate normal distribution:

We can obtain the joint prior probability density function of the parameters and from equations (10) and (11), that is,

Multiplying equation (12) with the likelihood function shown in equation (7), we obtain the joint posterior probability density function of the parameters and , that is,

Joint posterior probability density function of the parameters can be obtained by integrating equation (13) with respect to , that is,where

Equation (14) is a multivariate t distribution with n degrees of freedom whose mean is the Bayes estimator of the parameter vector under a squared-error loss function:where the values of the vector are determined from the default values of the parameters; moreover, the values of the matrix are as follows:

The estimate from equation (15) is added to the initial estimate to obtain the Bayes estimate of the parameters of the nonlinear model shown in equation (2); repeating this step iteratively gives the final estimate of the parameters.
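How a conjugate normal prior pulls the least squares estimate toward the prior mean can be seen in a one-parameter version of the linear model. This is a simplified sketch, not the paper's four-parameter estimator: it assumes a single regressor, a prior of the form N(prior_mean, σ²/prior_precision) on the coefficient, and a prior on σ proportional to 1/σ, in which case the posterior mean has the closed form used below. The data values and function names are hypothetical.

```python
def ls_estimate(x, y):
    """Least squares slope for the no-intercept model y = x * g + e."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return sxy / sxx

def bayes_estimate(x, y, prior_mean, prior_precision):
    """Posterior mean of g under the assumed conjugate normal prior.

    A precision-weighted average of the prior mean and the LS estimate;
    it is the Bayes estimator under squared-error loss.
    """
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    return (sxy + prior_precision * prior_mean) / (sxx + prior_precision)

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

g_ls = ls_estimate(x, y)
g_bayes = bayes_estimate(x, y, prior_mean=1.0, prior_precision=10.0)
```

With zero prior precision the Bayes estimate equals the least squares estimate, and as the precision grows it moves toward the prior mean.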

2.3. Maximum A Posteriori Method (MAP)

This method is a form of Bayes estimation based on the posterior distribution of the parameters for the specific model represented here in equation (6). This method uses the mode of the posterior distribution, rather than its mean, to find the estimate, and assuming that known, the following equation can be obtained:where , the prior distribution for with known , and it does not contain any value from and , likelihood function. is the probability distribution function of the observed data that is independent of and is extracted as follows:

The maximum a posteriori estimate is obtained by simplifying equation (17) as follows [8]:

Equation (19) is equivalent to the MLE with the prior distribution added.

The estimate from equation (19) is added to the initial estimate to obtain the maximum a posteriori estimate of the parameters of the nonlinear model shown in equation (2); repeating this step iteratively gives the final estimate of the parameters.
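As a worked form of the relationship between maximum a posteriori estimation and maximum likelihood, the estimate can be written as the maximizer of the log-likelihood plus the log-prior. The symbols are assumed for illustration and may differ from the paper's own: $\gamma$ is the parameter vector of the linear model (6), $L$ the likelihood, $\pi$ the prior, and $m(Y)$ the marginal distribution of the observed data.

```latex
\begin{aligned}
\hat{\gamma}_{\mathrm{MAP}}
  &= \arg\max_{\gamma} \frac{L(Y \mid \gamma)\, \pi(\gamma)}{m(Y)} \\
  &= \arg\max_{\gamma} \left[ \ln L(Y \mid \gamma) + \ln \pi(\gamma) \right]
\end{aligned}
```

The second line follows because $m(Y)$ does not depend on $\gamma$. With a flat prior the term $\ln \pi(\gamma)$ is constant and the estimate reduces to the MLE.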

Although maximum a posteriori estimation generally gives larger variances than Bayes estimation, it is easy to apply to nonlinear growth models: the posterior distribution need not be obtained in full form, only as an approximation, so no numerical integration is required.

2.4. Generalized Maximum Entropy Method (GME)

According to this method, we perform reparameterization and reformulation of the linear regression model (6) in the form of expected values by writing its parameters , j = 1, 2, 3, 4 and random errors , i = 1, 2, …, n in the form of convex combination with a finite support variable whose dimensions usually range from 2 to 7 according to Ciavolino [6]. Thus, five support variables were identified for each parameter dj = (dj1, dj2, dj3, dj4, dj5) and for each error term ri = (ri1, ri2, ri3, ri4, ri5). Note that the support variables of parameters are not necessarily equal to the random errors.

The convex combination of (6) can be defined as follows:where the probabilities and , and then according to maximization of Shannon’s entropy function, we can estimate .

Equation (20) can be rewritten in matrix form as follows:where D is the block-diagonal matrix of dimension 4 × 20 containing the support variables defined in equation (20), and P is the vector of dimension 20 × 1 containing the probabilities defined in equation (20).

Also,where c is a constant that bounds the support symmetrically around zero; since increasing c leads to an increase in the mean squared error (MSE) [5], c is chosen as 1.
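The reparameterization can be illustrated for a single parameter: its value is the expectation of a discrete random variable over five support points, and the quantity to be maximized is the Shannon entropy of the associated probabilities. The sketch below assumes five equally spaced support points on [-c, c] with c = 1; the equal spacing, the function names, and the example probability vectors are assumptions made for illustration.

```python
import math

def support_points(c, k=5):
    """k equally spaced support points on [-c, c] (equal spacing is assumed)."""
    return [-c + 2 * c * i / (k - 1) for i in range(k)]

def expected_value(support, probs):
    """Parameter value implied by the probabilities: sum of d_k * p_k."""
    return sum(d * p for d, p in zip(support, probs))

def shannon_entropy(probs):
    """Shannon entropy -sum p * ln(p), skipping zero probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)

d = support_points(1.0)                 # [-1.0, -0.5, 0.0, 0.5, 1.0]
uniform = [0.2] * 5                     # maximum-entropy probabilities
skewed = [0.05, 0.05, 0.1, 0.3, 0.5]    # hypothetical alternative

value_uniform = expected_value(d, uniform)
value_skewed = expected_value(d, skewed)
```

Without data constraints the entropy is maximized by uniform probabilities, which place the parameter at the centre of its support; the data constraints move the probabilities away from uniform only as far as needed.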

Similarly, the same steps can be applied for random errors as follows:where the probabilities and , and then according to maximization of Shannon’s entropy function, we can estimate .

Equation (23) can be rewritten in matrix form as follows:where R is the block-diagonal matrix of dimension n × 5n containing the support variables defined in equation (23), and M is the vector of dimension 5n × 1 containing the probabilities defined in equation (23). Also,

Using equations (21) and (24) in the Weibull growth model (6), we can rewrite it as follows:

Note that the vectors of probabilities for the parameters and random errors in model (26) are unknown, so they must be estimated (the parameter values are then obtained from the mathematical expectation of the corresponding discrete random variable). To estimate the vectors M and P, we maximize Shannon's entropy function using Lagrange multipliers, subject to the following constraints:where the objective is Shannon's entropy function.

Also,where , vector of one (1 1 1 1 1) and , defined as (20).

Using the Kronecker product of the matrix in equation (29), we get the final form of parameter constraints as follows:

As well as the random error constraints of the Weibull growth model using the previous steps (28)–(30), we obtain the final value of random error constraints as follows:

Also, the last constraint is the Weibull growth model constraint defined in (26).

Shannon’s entropy function can be written as a Lagrange function:where λ, Θ, and Γ are the Lagrange multipliers.

Taking the gradient of L, that is, differentiating (32) with respect to (Γ, Θ, λ, M, P) and setting the derivatives equal to zero, gives a system that must be solved numerically to compute the estimators [7]. Because working with these partial derivatives is difficult and complex, Ciavolino [6] deals directly with equation (27), rewriting it as follows:

Depending on the following function in Matlab,where lb and ub are, respectively, the lower and the upper bound and p0 is the vector of the initialization values, whose elements are between zero and one, and

The function (34) iterates until two successive estimates converge. After estimating the vectors M and P, which are the outputs of function (34), we substitute them into equation (26) to obtain the predicted values.

3. Results and Discussion

Using Matlab, we generated data for experiments of size 1000, with random errors drawn from a normal distribution with mean zero and standard deviation 0.5, to simulate oil production in Iraq for the period from 2003 to 2015 [15]. We ran six experiments, with initial values found by the method of Mahanta and Borah [10], as shown in Table 1.
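The data-generation step can be sketched as follows. This is an illustration in Python rather than the Matlab code used in the study, and it rests on several assumptions: the mean function has the form b1 - b2·exp(-b3·t^b4), observation times are t = 1, ..., n, the figure of 1000 is read as the number of replications, and the parameter values are hypothetical rather than those of Table 1.

```python
import math
import random

def weibull_growth(t, b1, b2, b3, b4):
    """Assumed mean function: b1 - b2 * exp(-b3 * t**b4)."""
    return b1 - b2 * math.exp(-b3 * t ** b4)

def simulate(n, params, sigma=0.5, replications=1000, seed=0):
    """Generate `replications` samples of size n with N(0, sigma^2) errors."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    times = range(1, n + 1)            # assumed observation times 1..n
    mean = [weibull_growth(t, *params) for t in times]
    return [[m + rng.gauss(0.0, sigma) for m in mean]
            for _ in range(replications)]

# Hypothetical parameter values, for illustration only.
params = (100.0, 90.0, 0.05, 1.5)
samples = simulate(10, params)
```

Each estimation method would then be applied to every replication, and the resulting estimates summarized over the replications.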

The parameter estimates of the Weibull growth model obtained by the Bayes (BAY), maximum a posteriori (MAP), and generalized maximum entropy (GME) methods were compared. Table 2 shows the values of the estimated parameters, and Tables 3 and 4 show the comparison of the estimation methods based on the mean squared error (MSE) and total deviation (TD) of the estimated parameters, that is,where β is the default parameter and $\hat{\beta}$ is the parameter estimated by one of the methods used in this research.
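The two comparison criteria can be sketched in code. The definitions below are commonly used forms and are assumptions here, since they are not confirmed by the text: the MSE averages the squared difference between the estimates and the default value over replications, and the TD sums the absolute relative deviations over the parameters. The function names and example values are hypothetical.

```python
def mse(estimates, true_value):
    """Mean squared error of one parameter over replications (assumed form)."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

def total_deviation(estimated, true_params):
    """Sum of absolute relative deviations over the parameters (assumed form)."""
    return sum(abs((e - b) / b) for e, b in zip(estimated, true_params))

# Hypothetical values, for illustration only.
example_mse = mse([1.0, 3.0], 2.0)
example_td = total_deviation([110.0, 45.0], [100.0, 50.0])
```

Smaller values of both criteria indicate estimates closer to the default parameter values.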

Through Tables 3 and 4, it can be seen that Bayes is the best compared with the other methods, as it gave the minimum mean squares error and total deviation for most experiments; nevertheless, we also note that the generalized maximum entropy showed better results when the sample size was 10 and 12 only.

4. Conclusions

The empirical results showed that the Bayes method gave the best estimates of the parameters of the four-parameter Weibull growth model compared with the other methods, based on the mean squared error and total deviation. The results also showed that the maximum a posteriori method is easy to apply to multiparameter nonlinear growth models, since the posterior distribution need not be obtained in full form, only as an approximation. The generalized maximum entropy method gave better estimates of the four-parameter Weibull growth model than the other methods for very small sample sizes, but its estimates deteriorate as the sample size increases.

Data Availability

The data used to support this study are available on the website of the Organization of Arab Petroleum Exporting Countries OAPEC, (http://oapecdbsys.oapecorg.org:8080/apex/f?p=101:23:::NO:::). The six experiments performed in this study were based on the method of finding initial values assumed in Ref. [10] and are shown in Table 1.

Conflicts of Interest

The authors declare that they have no conflicts of interest.