Analytical Methods to Model Nature
Approximate Maximum Likelihood Estimations for the Parameters of the Generalized Gudermannian Distribution and Its Characterizations
The generalized Gudermannian distribution is a symmetric distribution with location and scale parameters that offers an alternative to well-known symmetric distributions such as the normal, Laplace, and Cauchy distributions. It has simple closed forms for the probability density function (pdf) and cumulative distribution function (cdf) and, by the kurtosis criterion, is more flexible than the normal distribution. Certain characterizations of this distribution are presented. Because the likelihood equations are nonlinear, the maximum likelihood estimators (MLEs) of the location and scale parameters have no closed form and must be computed by a numerical method with suitable starting values. A simple method of deriving explicit estimators by approximating the likelihood equations is presented. The bias and variance of these estimators are examined numerically, and the estimators are shown to be almost as efficient as the MLEs. Some pivotal quantities based on asymptotic normality are proposed for constructing confidence intervals for the location and scale parameters. The simulated coverage probabilities show that the normal approximation does not work well, especially for small sample sizes; thus, to improve the coverage probability, simulated percentiles based on the Monte Carlo method are used. Finally, a real data set is analyzed to illustrate the suggested method and the related inference.
The well-known symmetric distributions (normal, Student's t) underlie many types of modelling, such as regression and reliability analysis. A drawback of building a new model on these distributions is that their cumulative distribution functions do not have a closed form. For example, a multicomponent stress-strength model for a real-valued data set cannot be created based on these distributions for this reason. Rasekhi et al. [1] worked on this problem using the cdf of another symmetric distribution (the generalized logistic distribution), which has a closed form.
The standard generalized Gudermannian distribution is symmetric about zero, with the following pdf and the corresponding cdf
Altun [2] extended this distribution by adding a skew parameter. In the future, researchers can use this distribution in many areas of statistics, such as regression analysis and Bayesian statistics. In this work, a version of (1) with location and scale parameters is considered. For and , the random variable is considered where Y has the generalized Gudermannian distribution with pdf where and are the location and scale parameters, respectively. Note that
In this paper, the estimation of the location and scale parameters of the generalized Gudermannian (GG) distribution is considered. The paper is organized as follows. In Section 2, we discuss MLEs of the location and scale parameters of the GG distribution and obtain explicit estimators by a suitable approximation of the likelihood equations. The expressions of the observed Fisher information are calculated in Section 3. In Section 4, coverage probabilities for asymptotic normal pivotal quantities are simulated and the simulated percentiles for these pivotal quantities are obtained. Results of a simulation study for illustrating the performance of the approximate MLEs and the MLEs are presented in Section 5. Section 6 provides a real data example to illustrate all the methods of inference on this data set. Certain characterizations of this distribution are presented in the last section.
2. Likelihood Function
In this section, the GG distribution is considered and the estimation of its location and scale parameters is discussed.
2.1. MLEs of the Parameters
Suppose is a random sample of size n from (4) and are the corresponding order statistics. Based on the order statistics, the likelihood function is as follows:
In view of (5), the likelihood function can be rewritten as follows: where . So, the log-likelihood function of the GG distribution can be expressed as follows:
Taking partial derivatives of the log-likelihood function with respect to the location and scale parameters, we obtain the following equations, which have to be solved simultaneously:
The system of equations (9) does not have a closed-form solution for the two parameters and must therefore be solved numerically to obtain the maximum likelihood estimates. The Newton–Raphson method can be used for this purpose, but it needs suitable starting values for the parameters.
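The role of the starting values can be seen in a small numerical sketch. The GG density is not reproduced in this text, so the example below uses the logistic distribution as a stand-in location-scale model; the function names, the moment-based starting values, and the step-halving safeguard are all illustrative assumptions and not the authors' implementation.

```python
import math
import random
import statistics


def score_and_hessian(mu, sigma, xs):
    """Gradient and Hessian of the log-likelihood for the logistic STAND-IN model."""
    n = len(xs)
    s_t = s_zt = h0 = h1 = h2 = 0.0
    for x in xs:
        z = (x - mu) / sigma
        t = math.tanh(z / 2.0)        # minus the derivative of log f(z) for the logistic pdf
        d2 = -0.5 * (1.0 - t * t)     # second derivative of log f(z)
        s_t += t
        s_zt += z * t
        h0 += d2
        h1 += z * d2
        h2 += z * z * d2
    grad = (s_t / sigma, (-n + s_zt) / sigma)
    hess = (h0 / sigma ** 2,
            (-s_t + h1) / sigma ** 2,
            (n - 2.0 * s_zt + h2) / sigma ** 2)
    return grad, hess


def newton_raphson_mle(xs, mu0, sigma0, tol=1e-10, max_iter=100):
    """Newton-Raphson iterations for the two likelihood equations."""
    mu, sigma = mu0, sigma0
    for _ in range(max_iter):
        (g1, g2), (a, b, c) = score_and_hessian(mu, sigma, xs)
        det = a * c - b * b
        d_mu = -(c * g1 - b * g2) / det       # Newton step: solve H d = -g
        d_sigma = -(a * g2 - b * g1) / det
        step = 1.0
        while sigma + step * d_sigma <= 0.0:  # safeguard: keep the scale positive
            step /= 2.0
        mu += step * d_mu
        sigma += step * d_sigma
        if abs(d_mu) + abs(d_sigma) < tol:
            break
    return mu, sigma


def logistic_sample(n, mu, sigma, rng):
    """Inverse-transform sampling from the logistic stand-in model."""
    out = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        out.append(mu + sigma * math.log(u / (1.0 - u)))
    return out


if __name__ == "__main__":
    rng = random.Random(1)
    xs = logistic_sample(200, 2.0, 1.5, rng)
    # Moment-based starting values (an assumption made for this sketch).
    mu0 = statistics.median(xs)
    sigma0 = statistics.stdev(xs) * math.sqrt(3.0) / math.pi
    print(newton_raphson_mle(xs, mu0, sigma0))
```

With a poor starting point the same iteration may diverge or wander to a non-positive scale, which is the practical motivation for explicit estimators that can serve as starting values.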
2.2. Approximate MLEs of the Parameters
Since the system of equations (9) is nonlinear, its solutions do not have a closed form. In what follows, we set . By approximating the system of equations (9), approximate MLEs (AMLEs) of the location and scale parameters are obtained. The AMLE method does not need any starting values. Based on the methods suggested by David [4] and Arnold and Balakrishnan [5], the function is approximated by a Taylor series expansion around . This method has been used for other distributions by Balakrishnan and Asgharzadeh [6], Asgharzadeh [7], Balakrishnan and Hossain [8], and Asgharzadeh et al. [9].
Let be as in (2), then we have the following: where is the order statistic from a sample of size n from the uniform distribution and is equality in distribution. So, we conclude that where . So, keeping the first two terms of the Taylor series expansion of around and denoting the resulting coefficients by and , we have the following: where
First, we see that is negative for all values of since
From the first equation in (15), we have the following: and the estimator of is as follows:
The second equation in (15) can be rewritten as follows:
After some computations, the last three terms are removed and the following quadratic equation is obtained: where and .
The two roots of the quadratic equation (20) are as follows:
Since , one of the roots is negative and is therefore discarded. So, the estimator of is obtained as follows:
The advantages of the AMLE method over the MLE method are the explicit mathematical expression of estimators and the fact that this method does not need any starting values for the parameters.
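The construction in this subsection can be sketched for a distribution whose formulas are available here. Because the GG density and equations (15)–(22) are not reproduced in this text, the example below applies the same recipe to the logistic distribution as a stand-in: the nonlinear term g(z) = tanh(z/2) of the logistic likelihood equations is linearized around the logistic quantile at i/(n+1), the first equation gives the location estimator, and the second reduces to a quadratic with a single positive root. The function name and the coefficient names (alpha, beta, b, c, d, e) are hypothetical.

```python
import math
import random


def amle_logistic(xs):
    """Explicit approximate MLEs (mu, sigma) for the logistic STAND-IN model."""
    n = len(xs)
    ordered = sorted(xs)
    alphas, betas = [], []
    for i in range(1, n + 1):
        p = i / (n + 1.0)                    # mean of the i-th uniform order statistic
        nu = math.log(p / (1.0 - p))         # expansion point: logistic quantile at p
        beta = 2.0 * p * (1.0 - p)           # g'(nu) for g(z) = tanh(z/2) = 2F(z) - 1
        alpha = (2.0 * p - 1.0) - nu * beta  # g(nu) - nu * g'(nu)
        alphas.append(alpha)
        betas.append(beta)
    sum_beta = sum(betas)
    # Linearized first equation: mu = b + c * sigma.
    b = sum(bt * x for bt, x in zip(betas, ordered)) / sum_beta
    c = sum(alphas) / sum_beta
    # Linearized second equation: n * sigma^2 - d * sigma - e = 0.
    d = sum(al * (x - b) for al, x in zip(alphas, ordered))
    e = sum(bt * (x - b) ** 2 for bt, x in zip(betas, ordered))
    # e > 0 makes the product of the roots negative, so exactly one root is positive.
    sigma_hat = (d + math.sqrt(d * d + 4.0 * n * e)) / (2.0 * n)
    mu_hat = b + c * sigma_hat
    return mu_hat, sigma_hat


if __name__ == "__main__":
    rng = random.Random(3)
    sample = []
    for _ in range(100):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        sample.append(5.0 + 2.0 * math.log(u / (1.0 - u)))
    print(amle_logistic(sample))
```

No iteration or starting value is involved, so the result can be passed directly to a numerical optimizer as its initial point.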
3. Observed Fisher Information
In this section, the observed Fisher information matrix is computed based on the likelihood and the approximate likelihood equations. Then, some pivotal quantities based on the limiting normal distribution are presented and the behavior of these quantities is examined based on a Monte Carlo simulation study. By differentiating (9), the observed Fisher information matrix is obtained as follows:
The approximate asymptotic variance-covariance matrix of the estimators is obtained by inverting the observed information matrix. Thus, based on equations (23)–(25) and the ML estimators, we have where
If asymptotic normality holds, then the approximate asymptotic variance-covariance matrices are valid. For this, certain regularity conditions ([10], page 121) must be satisfied.
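For a two-parameter model the inversion step is short enough to write out. The sketch below inverts a symmetric 2×2 observed information matrix; the function and argument names are hypothetical, and the entries would come from the second derivatives of the log-likelihood evaluated at the estimates.

```python
def invert_information(i11, i12, i22):
    """Invert a symmetric 2x2 observed information matrix.

    Returns (v11, v12, v22): the approximate variances of the two
    estimators and their covariance.
    """
    det = i11 * i22 - i12 * i12
    if i11 <= 0.0 or det <= 0.0:
        raise ValueError("observed information matrix is not positive definite")
    return i22 / det, -i12 / det, i11 / det


if __name__ == "__main__":
    print(invert_information(4.0, 2.0, 3.0))
```

The positive-definiteness check matters in practice: if it fails, the point at which the information was evaluated is not a proper maximum and the resulting "variances" would be meaningless.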
4. Coverage Probability and Percentage Point
To construct a confidence interval or perform a hypothesis test, pivotal quantities are needed. Since the MLE of the parameter vector has an asymptotic normal distribution, when , we have
The first distribution in (32) is correct since converges in distribution to and, by the consistency of the MLE, tends to in probability. When is known, the second distribution in (32) is correct via convergence of to in distribution. Also, the third distribution in (32) is correct since converges in distribution to and, by the consistency of the MLE, tends to in probability. Since the distributions for all of the in (32) do not depend on and , these functions are pivotal quantities for and . The coverage probabilities of the confidence intervals are computed for the following: via a simulation study. Similarly, for the pivotal quantities based on the AMLEs,
Confidence intervals for the parameters and can be obtained by finding asymptotic percentiles of and . It is worth mentioning that the percentiles of the pivotal quantities have very complicated forms for finite sample sizes; hence, a Monte Carlo simulation method is used. For example, if percentile of is , then and we have
A 95% confidence interval of is
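Turning percentiles of a pivotal quantity into an interval is a one-line inversion. The sketch below assumes a pivot of the form P = (estimate − θ) / std_error, which is the usual studentized form; the exact pivots of this section are not reproduced in this text, so the form, the function name, and the argument names are assumptions.

```python
def percentile_interval(estimate, std_error, q_low, q_high):
    """Confidence interval for theta from a pivot P = (estimate - theta) / std_error.

    q_low and q_high are the lower and upper percentiles of P, for example
    the 2.5 and 97.5 percentiles for a 95% interval.
    """
    # q_low <= (estimate - theta) / std_error <= q_high, solved for theta:
    return estimate - q_high * std_error, estimate - q_low * std_error


if __name__ == "__main__":
    # Normal-theory percentiles give the familiar symmetric interval.
    print(percentile_interval(10.0, 2.0, -1.96, 1.96))
    # Simulated percentiles need not be symmetric about zero.
    print(percentile_interval(10.0, 2.0, -3.0, 1.0))
```

Note that the upper percentile determines the lower endpoint and vice versa, which is easy to get wrong when the percentiles are asymmetric.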
5. Simulation Study
In this section, the performance of the MLEs and the AMLEs of the generalized Gudermannian parameters is compared. Random samples are generated from the GG distribution via the inverse method, i.e.,
Throughout this section, , , and sample sizes are considered. The AMLEs are computed from (17) and (22). The MLEs are calculated by solving the system of equations (9) numerically with the “optim” function in R. The optim function needs starting values for the parameters, and the AMLEs are used for this purpose. Based on 1000 replications, the average values of the estimates, their variances, and their covariances are presented in Table 1. The variances and covariances are obtained by inverting the observed Fisher information matrix. In Table 1, the bias and variance of the AMLEs and the MLEs are approximately equal for all sample sizes; thus, the AMLEs are almost as efficient as the MLEs. The bias and variance of the estimators decrease as the sample size n increases. Figure 1 illustrates that the MSEs of the MLEs and AMLEs are approximately equal and decrease to zero as the sample size increases.
In Table 2, coverage probabilities of are computed based on 1000 replications when data are simulated from the GG distribution. These values are determined under the assumption of normality. When is unknown and the sample size is small, the coverage probabilities are very unsatisfactory, but for known , i.e., the coverage probabilities are close to 0.95 for all sample sizes. In practice, is unknown; thus, the normality assumption is not suitable for the MLE when the sample size is small. To rectify this problem, simulated percentiles of the pivotal quantities are used. Table 3 gives the percentiles of the pivotal quantities obtained via a simulation study with the following steps:
(1) Simulate 1000 replicates of the pivotal quantities ( and ) for each and sample size.
(2) Compute the average of the 2.5 and 97.5 percentiles of the simulated samples based on 1000 replications for each and sample size.
It is clear that the normal assumption is not justified for this process.
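The simulation steps above can be sketched as follows. Several simplifications are assumed and should not be read as the authors' procedure: the logistic distribution stands in for the GG distribution (its quantile function is used for inverse-transform sampling), the sample mean with its usual standard error stands in for the MLE-based pivot, and a single batch of replicates is used instead of averaging percentiles over repeated batches. All function names are hypothetical.

```python
import math
import random
import statistics


def logistic_quantile(u):
    """Quantile function of the standard logistic STAND-IN distribution."""
    return math.log(u / (1.0 - u))


def draw_sample(n, mu, sigma, rng):
    """Inverse-transform sampling: x = mu + sigma * Q(u) with u uniform on (0, 1)."""
    sample = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        sample.append(mu + sigma * logistic_quantile(u))
    return sample


def simulated_percentiles(n, replicates=1000, seed=0):
    """Simulated 2.5 and 97.5 percentiles of a studentized location pivot."""
    rng = random.Random(seed)
    pivots = []
    for _ in range(replicates):
        # The pivot does not depend on the true parameters, so mu = 0 and
        # sigma = 1 can be used without loss of generality.
        xs = draw_sample(n, 0.0, 1.0, rng)
        estimate = statistics.fmean(xs)
        std_error = statistics.stdev(xs) / math.sqrt(n)
        pivots.append((estimate - 0.0) / std_error)
    cuts = statistics.quantiles(pivots, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return cuts[0], cuts[-1]


if __name__ == "__main__":
    for size in (5, 20, 100):
        print(size, simulated_percentiles(size))
```

For small sample sizes the simulated percentiles in this stand-in setting lie well outside ±1.96, which is the same qualitative effect that motivates replacing the normal percentiles by simulated ones.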
In Table 3, especially for small sample sizes, the percentiles are very different from the percentiles of the normal distribution in (33). These simulated values appear to be a good alternative to the normal distribution percentiles.
6. Real Data Example
In this section, a real data set is presented to illustrate the methods described in the previous sections. The data are the breaking strengths of 100 yarns reported by Duncan [11]. The data are as follows: 66, 117, 132, 111, 107, 85, 89, 79, 91, 97, 138, 103, 111, 86, 78, 96, 93, 101, 102, 110, 95, 96, 88, 122, 115, 92, 137, 91, 84, 96, 97, 100, 105, 104, 137, 80, 104, 104, 106, 84, 92, 86, 104, 132, 94, 99, 102, 101, 104, 107, 99, 85, 95, 89, 102, 100, 98, 97, 104, 114, 111, 98, 99, 102, 91, 95, 111, 104, 97, 98, 102, 109, 88, 91, 103, 94, 105, 103, 96, 100, 101, 98, 97, 97, 101, 102, 98, 94, 100, 98, 99, 92, 102, 87, 99, 62, 92, 100, 96, and 98. These data have been used previously by Alizadeh Noughabi [12, 13].
Now, the AMLEs of the location and scale parameters are computed as follows:
The MLEs of the parameters are obtained by solving the system of equations (9) with an iterative numerical procedure, using the “optim” function in R: and . The AMLEs are used as starting values for the “optim” function. It is clear that the AMLEs and MLEs are approximately equal. The percentiles of confidence intervals for and are , respectively. The confidence intervals for and based on MLEs are and , respectively. Also, the percentiles of confidence intervals are for and , respectively. The confidence intervals for and via AMLEs are and , respectively.
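For readers who want to reproduce the analysis, the data listed above can be entered as follows. The summary statistics computed here are only descriptive checks (and possible crude starting values for a numerical optimizer); they are not the AMLEs or MLEs of the GG parameters, and the variable name is hypothetical.

```python
import statistics

# Breaking strengths of 100 yarns, in the order listed in the text.
breaking_strength = [
    66, 117, 132, 111, 107, 85, 89, 79, 91, 97,
    138, 103, 111, 86, 78, 96, 93, 101, 102, 110,
    95, 96, 88, 122, 115, 92, 137, 91, 84, 96,
    97, 100, 105, 104, 137, 80, 104, 104, 106, 84,
    92, 86, 104, 132, 94, 99, 102, 101, 104, 107,
    99, 85, 95, 89, 102, 100, 98, 97, 104, 114,
    111, 98, 99, 102, 91, 95, 111, 104, 97, 98,
    102, 109, 88, 91, 103, 94, 105, 103, 96, 100,
    101, 98, 97, 97, 101, 102, 98, 94, 100, 98,
    99, 92, 102, 87, 99, 62, 92, 100, 96, 98,
]

if __name__ == "__main__":
    print("n      =", len(breaking_strength))
    print("range  =", min(breaking_strength), max(breaking_strength))
    print("mean   =", statistics.mean(breaking_strength))
    print("median =", statistics.median(breaking_strength))
    print("stdev  =", statistics.stdev(breaking_strength))
```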
6.1. Existence and Uniqueness of MLEs
Figure 2 shows the behavior of the log-likelihood function around the AMLEs for the breaking strength data set. As we can see in Figure 2, the two plots are unimodal and the mode of each corresponds to the AMLE of the respective parameter. We can therefore conclude that the AMLEs correspond to the global maximum of the log-likelihood. A graphical description of the existence and uniqueness of the estimates for the breaking strength data set is presented in Figure 3. Figure 2 indicates that the roots are global maxima, and Figure 3 shows that the derivative of the log-likelihood function is decreasing and intersects the X-axis at exactly one point, which ensures that the root is unique and is a maximum; this supports the existence and uniqueness of the MLEs.
7. Characterization Results
In this section, we present characterizations of the GG distribution in terms of a simple relationship between two truncated moments. The first characterization result employs a theorem due to Glänzel [14]; see Theorem 1 below. Note that the result also holds when the interval is not closed. Moreover, it can also be applied when the cdf does not have a closed form. As shown in [15], this characterization is stable in the sense of weak convergence.
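Theorem 1. Let $(\Omega, \mathcal{F}, \mathbf{P})$ be a given probability space and let $H = [a, b]$ be an interval for some $a < b$ ($a = -\infty$, $b = \infty$ might as well be allowed). Let $X : \Omega \to H$ be a continuous random variable with the distribution function $F$ and let $q_1$ and $q_2$ be two real functions defined on $H$ such that the equation $\mathbf{E}[q_2(X) \mid X \ge x] = \mathbf{E}[q_1(X) \mid X \ge x]\,\xi(x)$, $x \in H$, is defined with some real function $\xi$. Assume that $q_1, q_2 \in C^1(H)$, $\xi \in C^2(H)$, and $F$ is a twice continuously differentiable and strictly monotone function on the set $H$. Finally, we assume that the equation $\xi q_1 = q_2$ has no real solution in the interior of $H$. Then, $F$ is uniquely determined by the functions $q_1$, $q_2$, and $\xi$, particularly, $$F(x) = \int_a^x C \left| \frac{\xi'(u)}{\xi(u)\,q_1(u) - q_2(u)} \right| \exp(-s(u))\,du,$$ where the function $s$ is a solution of the differential equation $s' = \dfrac{\xi' q_1}{\xi q_1 - q_2}$ and $C$ is the normalization constant, such that $\int_H dF = 1$.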
Corollary 2. The general solution of the differential equation in Corollary 1 is as follows: where is a constant.
Note that a set of functions satisfying the above differential equation is given in Proposition 1 with . However, it should be also noted that there are other triplets satisfying the conditions of Theorem 1.
In this paper, we suggest approximate maximum likelihood estimators (AMLEs) for the generalized Gudermannian (GG) distribution. This distribution is a new symmetric probability distribution with closed-form pdf and cdf, but the maximum likelihood estimators of its parameters (location and scale) do not have closed forms and require numerical optimization methods (such as the “optim” function in R). Such numerical optimization methods also need a good choice of starting values for the parameters. Our suggested AMLEs of the GG distribution parameters are a good choice for this purpose.
Linear and nonlinear regression models and related studies (such as change-point and outlier analysis) based on the GG distribution can be carried out in future work using the suggested AMLEs of the GG distribution parameters. For further reading, see references [16–20].
All data are available in the paper.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
References
[1] M. Rasekhi, M. M. Saber, and H. M. Yousof, “Bayesian and classical inference of reliability in multicomponent stress-strength under the generalized logistic model,” Communications in Statistics-Theory and Methods, vol. 50, no. 21, pp. 5114–5125, 2021.
[2] E. Altun, “The generalized Gudermannian distribution: inference and volatility modelling,” Statistics, vol. 53, no. 2, pp. 364–386, 2019.
[3] J. S. Robertson, “Gudermann and the simple pendulum,” The College Mathematics Journal, vol. 28, no. 4, pp. 271–276, 1997.
[4] H. A. David, Order Statistics, John Wiley and Sons, New York, NY, USA, 1981.
[5] B. Arnold and N. Balakrishnan, Relations, Bounds and Approximations for Order Statistics, Springer, New York, NY, USA, 1989.
[6] N. Balakrishnan and A. Asgharzadeh, “Inference for the scaled half-logistic distribution based on progressively Type II censored samples,” Communications in Statistics-Theory and Methods, vol. 34, no. 1, pp. 73–87, 2005.
[7] A. Asgharzadeh, “Point and interval estimation for a generalized logistic distribution,” Communications in Statistics-Theory and Methods, vol. 35, no. 9, pp. 1685–1702, 2006.
[9] A. Asgharzadeh, L. Esmaily, and S. Nadarajah, “Approximate MLEs for the location and scale parameters of the skew logistic distribution,” Statistical Papers, vol. 54, pp. 391–411, 2013.
[10] T. S. Ferguson, A Course in Large Sample Theory, Chapman and Hall, London, UK, 1996.
[11] A. J. Duncan, Quality Control and Industrial Statistics, Irwin, Homewood, IL, USA, 1974.
[12] H. Alizadeh Noughabi, “Testing the validity of the logistic model based on the empirical distribution function,” Communications in Statistics-Simulation and Computation, vol. 46, no. 7, pp. 5531–5540, 2017.
[13] H. Alizadeh Noughabi, “A comprehensive study on power of tests for normality,” Journal of Statistical Theory and Applications, vol. 17, no. 4, pp. 647–660, 2018.
[14] W. Glänzel, “A characterization theorem based on truncated moments and its application to some distribution families,” Mathematical Statistics and Probability Theory, Reidel, Dordrecht, vol. B, pp. 75–84, 1987.
[15] W. Glänzel, “Some consequences of a characterization theorem based on truncated moments,” Statistics, vol. 21, no. 4, pp. 613–618, 1990.
[16] M. E. Bakr, A. A. Al-Babtain, Z. Mahmood et al., “Statistical modelling for a new family of generalized distributions with real data applications,” Mathematical Biosciences and Engineering, vol. 19, no. 9, pp. 8705–8740, 2022.
[17] M. Bouhadjar, A. M. Gemeay, E. M. Almetwally et al., “The power XLindley distribution: statistical inference, fuzzy reliability, and COVID-19 application,” Journal of Function Spaces, vol. 2022, 2022.
[18] A. Z. Afify, A. M. Gemeay, N. M. Alfaer, G. M. Cordeiro, and E. H. Hafez, “Power-modified Kies-exponential distribution: properties, classical and Bayesian inference with an application to engineering data,” Entropy, vol. 24, no. 7, p. 883, 2022.
[19] A. A. Al-Babtain, D. Kumar, A. M. Gemeay, S. Dey, and A. Z. Afify, “Modeling engineering data using extended power-Lindley distribution: properties and estimation methods,” Journal of King Saud University-Science, vol. 33, no. 8, Article ID 101582, 2021.
[20] F. H. Riad, E. Hussam, A. M. Gemeay, R. A. Aldallal, and A. Z. Afify, “Classical and Bayesian inference of the weighted-exponential distribution with an application to insurance data,” Mathematical Biosciences and Engineering, vol. 19, no. 7, pp. 6551–6581, 2022.