Abstract
In applied work, the two-parameter exponential distribution gives useful representations of many physical situations. Confidence intervals for the scale parameter and predictive intervals for a future independent observation have been studied by many authors, including Petropoulos (2011) and Lawless (1977), respectively. However, interval estimates for the threshold parameter have not been widely examined in the statistical literature. The aim of this paper is, first, to obtain the exact significance function of the scale parameter by renormalizing the $p^*$-formula. The approximate Studentization method is then applied to obtain the significance function of the threshold parameter. Finally, a predictive density function of the two-parameter exponential distribution is derived. A real-life data set is used to show the implementation of the method, and simulation studies are carried out to illustrate the accuracy of the proposed methods.
1. Introduction
The two-parameter exponential distribution with density
$$f(x; \mu, \sigma) = \frac{1}{\sigma} \exp\left\{-\frac{x - \mu}{\sigma}\right\}, \quad x > \mu, \tag{1.1}$$
where $\mu$ is the threshold parameter and $\sigma > 0$ is the scale parameter, is widely used in applied statistics. For example, Lawless [1] applied the two-parameter exponential distribution to analyze lifetime data, and Baten and Kamil [2] applied the distribution to analyze inventory management systems with hazardous items. Petropoulos [3] proposed two new classes of confidence intervals for the scale parameter $\sigma$. Lawless [1] obtained a prediction interval for a future observation from the two-parameter exponential distribution.
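As a small illustration of the model, the following sketch evaluates the density and distribution function and draws samples by inverting the distribution function. It assumes the usual parameterization with threshold `mu` and scale `sigma`; the function names are illustrative and do not come from the paper.

```python
import math
import random


def pdf(x, mu, sigma):
    """Density exp(-(x - mu) / sigma) / sigma for x > mu, and 0 otherwise."""
    if x <= mu:
        return 0.0
    return math.exp(-(x - mu) / sigma) / sigma


def cdf(x, mu, sigma):
    """P(X <= x) = 1 - exp(-(x - mu) / sigma) for x > mu, and 0 otherwise."""
    if x <= mu:
        return 0.0
    return 1.0 - math.exp(-(x - mu) / sigma)


def draw_sample(n, mu, sigma, rng):
    """Inverse-cdf sampling: X = mu - sigma * log(1 - U), U uniform on [0, 1)."""
    return [mu - sigma * math.log(1.0 - rng.random()) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    xs = draw_sample(5, mu=10.0, sigma=2.0, rng=rng)
    # No draw can fall below the threshold parameter.
    print(min(xs) >= 10.0)
```

The threshold shifts the support and the scale stretches it, so the sample minimum carries the information about the threshold while the spread above the minimum carries the information about the scale.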
In this paper, we consider a sample $(x_1, \ldots, x_n)$ from the two-parameter exponential distribution with density (1.1). In Section 2, we show that by renormalizing the $p^*$-formula, the exact significance function of the scale parameter $\sigma$ can be obtained. In Section 3, the approximate Studentization method, based on the significance function of $\sigma$ obtained in Section 2, is applied to obtain the significance function of the threshold parameter $\mu$. In Section 4, we combine the results of the previous two sections and derive a predictive density for a future observation from the two-parameter exponential distribution. Some concluding remarks are given in Section 5. Throughout this paper, a real-life data set is used to show the implementation of the proposed methods, and simulation results are presented to illustrate their accuracy.
2. Confidence Interval for the Scale Parameter
For the two-parameter exponential distribution with density (1.1), it can be shown that the marginal density of is With an observed sample , the log conditional likelihood function that depends only on , can be written as where is an additive constant and . Without loss of generality, in this paper, is set as 0. Moreover, denote and , then the log conditional likelihood function can be rewritten as Note that (2.3) has the same form as a log likelihood function of an exponential family model with canonical parameter . The maximum likelihood estimate of , is obtained by solving . Furthermore, an estimate of the variance of is , where is the observed information.
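Under regularity conditions as stated in DiCiccio et al. [4], both the standardized maximum likelihood estimate
$$q = (\hat{\theta} - \theta)\, j(\hat{\theta})^{1/2} \tag{2.4}$$
and the signed log likelihood ratio statistic
$$r = \operatorname{sgn}(\hat{\theta} - \theta) \left\{ 2\left[\ell(\hat{\theta}) - \ell(\theta)\right] \right\}^{1/2} \tag{2.5}$$
have limiting standard normal distributions with rate of convergence $O(n^{-1/2})$. Here $\theta$ denotes the canonical parameter, $\hat{\theta}$ its maximum likelihood estimate, and $j(\hat{\theta})$ the observed information. Hence the significance function of $\theta$ can be approximated by $\Phi(q)$ or $\Phi(r)$, where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.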
Since the conditional log likelihood function given in (2.3) is in exponential family form with $\theta$ being the canonical parameter, the modified signed log likelihood ratio statistic of Barndorff-Nielsen [5, 6] simplifies to
$$r^* = r - \frac{1}{r} \log\frac{r}{q},$$
where $q$ and $r$ are defined in (2.4) and (2.5), respectively, and $r^*$ has a limiting standard normal distribution with rate of convergence $O(n^{-3/2})$. Hence the significance function of $\theta$ can be approximated by $\Phi(r^*)$.
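To see how the three normal approximations compare, the sketch below evaluates them in a setting where the exact answer is available. It assumes that the statistic $T = \sum (x_i - x_{(1)})$ follows a gamma distribution with shape $k = n - 1$ and scale $\sigma$ (a standard property of this model) and works with the canonical parameter $-1/\sigma$; that parameterization and all function names are assumptions made for illustration, not the paper's notation. With $z = t/\sigma$, the quantities reduce to $q = (z - k)/\sqrt{k}$ and $r^2 = 2[(z - k) - k \log(z/k)]$.

```python
import math
from statistics import NormalDist


def erlang_cdf(z, k):
    """Exact P(G <= z) for G ~ Gamma(shape=k, scale=1) with integer k >= 1."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= z / j
        total += term
    return 1.0 - math.exp(-z) * total


def q_r_rstar(z, k):
    """Return (q, r, r*) at z = t / sigma; requires z != k, where r = 0."""
    q = (z - k) / math.sqrt(k)
    r_abs = math.sqrt(2.0 * ((z - k) - k * math.log(z / k)))
    r = math.copysign(r_abs, z - k)
    r_star = r + math.log(q / r) / r  # same as r - (1/r) * log(r/q)
    return q, r, r_star


def tail_approximations(z, k):
    """Normal approximations to P(T <= t) based on q, r and r*."""
    phi = NormalDist().cdf
    q, r, r_star = q_r_rstar(z, k)
    return phi(q), phi(r), phi(r_star)


if __name__ == "__main__":
    k, z = 1, 2.0  # k = 1 corresponds to a sample of size n = 2
    print("exact:", erlang_cdf(z, k))
    print("q, r, r* approximations:", tail_approximations(z, k))
```

Even with a single degree of freedom in the shape, the $r^*$ approximation stays close to the exact gamma probability, while the first-order approximations based on $q$ and $r$ are visibly off.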
Barndorff-Nielsen [7] derived the $p^*$-formula, an approximate density for the maximum likelihood estimator. In this case, we have where is the renormalizing constant. Since , by change of variable and renormalization, we have which is the exact density of , and it is free of . Therefore, inference concerning can be based on the distribution of . Thus, the exact confidence interval for can be obtained using and is
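For readers who want to compute an exact interval for the scale parameter directly, the sketch below uses the classical pivot $T/\sigma$, where $T = \sum (x_i - x_{(1)})$ follows a gamma distribution with shape $n - 1$ and unit scale after division by $\sigma$. This is a standard construction and is offered only as an illustration; it is not claimed to be written in the same form as the interval derived above. The function names and the bisection-based quantile routine are assumptions of this sketch.

```python
import math


def erlang_cdf(z, k):
    """Exact P(G <= z) for G ~ Gamma(shape=k, scale=1) with integer k >= 1."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= z / j
        total += term
    return 1.0 - math.exp(-z) * total


def erlang_quantile(p, k):
    """Quantile of Gamma(shape=k, scale=1) found by bisection on the cdf."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, k) < p:  # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def scale_interval(xs, level=0.95):
    """Equal-tailed interval for sigma from the pivot T / sigma."""
    n = len(xs)
    t = sum(xs) - n * min(xs)  # T = sum of (x_i - x_(1))
    alpha = 1.0 - level
    k = n - 1
    lower = t / erlang_quantile(1.0 - alpha / 2.0, k)
    upper = t / erlang_quantile(alpha / 2.0, k)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical data, used only to show the call.
    print(scale_interval([1.2, 3.4, 2.2, 5.9, 1.7], level=0.95))
```

Because the pivot does not involve the threshold parameter, the interval can be computed without any knowledge of it.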
Grubbs [8] reported the following data set (see Table 1).
We have
Table 2 reports the 90%, 95%, and 99% confidence intervals for $\sigma$. The proposed method and the exact method (exact) give approximately the same confidence intervals, whereas the results obtained by the standardized maximum likelihood estimate method (mle) and the signed log likelihood ratio method ($r$) are quite different. Moreover, from Table 2, it is clear that $r$ gives the narrowest confidence intervals and mle gives the widest. The significance functions of $\sigma$ obtained from the four methods are plotted in Figure 1.
To compare the accuracy of the four methods, Monte Carlo simulation studies with 10,000 replicates are performed. For each simulation study, we generate samples of size $n$ from the two-parameter exponential distribution with scale parameter $\sigma$ and threshold parameter $\mu$. Then, for each sample, the 95% confidence interval for $\sigma$ is calculated using each of the four methods discussed in this section.
Table 3 records the results of the simulation studies for some combinations of $n$, $\mu$, and $\sigma$. The "Lower Error" is the proportion of samples in which the true $\sigma$ falls below the lower limit of the 95% confidence interval, the "Upper Error" is the proportion of samples in which the true $\sigma$ falls above the upper limit, and (1 - Lower Error - Upper Error) is recorded as "Central Coverage". The nominal values for the "Lower Error", "Upper Error", and "Central Coverage" are 0.025, 0.025, and 0.95, respectively. Moreover, for 10,000 Monte Carlo simulations, the standard errors for the "Lower Error" and the "Upper Error" are the same and are $\sqrt{0.025(0.975)/10000} \approx 0.0016$. Similarly, the corresponding standard error for the "Central Coverage" is 0.0022.
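The simulation design described above can be sketched as follows. For brevity the sketch evaluates only one interval, the classical one based on the pivot $T/\sigma$ with $T = \sum (x_i - x_{(1)})$, which is an assumption of this example rather than a reproduction of the four methods compared in Table 3. The function names, the seed, and the parameter values in the usage lines are hypothetical.

```python
import math
import random


def erlang_cdf(z, k):
    """Exact P(G <= z) for G ~ Gamma(shape=k, scale=1) with integer k >= 1."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= z / j
        total += term
    return 1.0 - math.exp(-z) * total


def erlang_quantile(p, k):
    """Quantile of Gamma(shape=k, scale=1) found by bisection on the cdf."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, k) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def coverage(n, mu, sigma, reps=10_000, level=0.95, seed=1):
    """Return (lower error, upper error, central coverage) for sigma."""
    rng = random.Random(seed)
    alpha = 1.0 - level
    g_lo = erlang_quantile(alpha / 2.0, n - 1)
    g_hi = erlang_quantile(1.0 - alpha / 2.0, n - 1)
    lower_err = upper_err = 0
    for _ in range(reps):
        xs = [mu + rng.expovariate(1.0 / sigma) for _ in range(n)]
        t = sum(xs) - n * min(xs)
        lower, upper = t / g_hi, t / g_lo
        if sigma < lower:      # true value below the lower limit
            lower_err += 1
        elif sigma > upper:    # true value above the upper limit
            upper_err += 1
    return (lower_err / reps, upper_err / reps,
            1.0 - (lower_err + upper_err) / reps)


def binomial_se(p, reps):
    """Standard error of a simulated proportion with nominal value p."""
    return math.sqrt(p * (1.0 - p) / reps)


if __name__ == "__main__":
    print(coverage(n=5, mu=1.0, sigma=2.0))
    print(binomial_se(0.025, 10_000), binomial_se(0.95, 10_000))
```

The helper `binomial_se` reproduces the standard errors quoted in the text, which give a yardstick for judging whether a simulated error rate differs meaningfully from its nominal value.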
From Table 3, even for the smallest possible sample size ($n = 2$), the proposed method and the exact method give almost identical results and have excellent coverage properties. The results obtained by the other two methods are not as satisfactory, especially when the sample size is small.
Since the mle method gives the poorest coverage, we excluded it from further investigation. Table 4 records the average width of the confidence interval for $\sigma$ for the simulation study with . Note that even though $r$ has the shortest average width of the confidence interval for $\sigma$, it also has the poorest coverage properties, as demonstrated in Table 3. For inference purposes, coverage properties are more important than the width of the confidence interval. Hence the proposed method is recommended for this problem.
3. Confidence Interval for the Threshold Parameter
For the two-parameter exponential distribution, Petropoulos [3] showed that $\left(X_{(1)}, \sum_{i=1}^{n} (X_i - X_{(1)})\right)$ is a sufficient statistic, where $X_{(1)}$ is the smallest observation. Note that $\mu$ is completely contained in the marginal density of $X_{(1)}$, but this density also depends on $\sigma$. In Section 2, the significance function of $\sigma$ was obtained. Hence, we can apply the approximate Studentization method, which is discussed in Fraser and Wong [9], to eliminate the dependence on $\sigma$ from the marginal density of $X_{(1)}$. More specifically, the approximate Studentized marginal density of $X_{(1)}$ is where is the normalizing constant. Hence, we have and the corresponding significance function of $\mu$ is Thus, the explicit confidence interval for $\mu$ obtained by the approximate Studentization method is
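As a point of comparison for the threshold parameter, the sketch below computes an interval from the classical pivot $W = n(X_{(1)} - \mu)/T$, with $T = \sum (x_i - x_{(1)})$. Since $n(X_{(1)} - \mu)/\sigma$ is standard exponential and independent of $T/\sigma$, which is gamma with shape $n - 1$, one has $P(W \le w) = 1 - (1 + w)^{-(n-1)}$. This is a standard result used here for illustration; it is not asserted that the interval from the approximate Studentization method takes exactly this form. The function names and data are hypothetical.

```python
import random


def pivot_quantile(p, n):
    """Quantile of W = n (X_(1) - mu) / T, from P(W <= w) = 1 - (1 + w)^-(n-1)."""
    return (1.0 - p) ** (-1.0 / (n - 1)) - 1.0


def threshold_interval(xs, level=0.95):
    """Equal-tailed interval for mu based on the pivot W."""
    n = len(xs)
    x_min = min(xs)
    t = sum(xs) - n * x_min  # T = sum of (x_i - x_(1))
    alpha = 1.0 - level
    lower = x_min - pivot_quantile(1.0 - alpha / 2.0, n) * t / n
    upper = x_min - pivot_quantile(alpha / 2.0, n) * t / n
    return lower, upper


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical sample with threshold 10 and scale 2.
    xs = [10.0 + rng.expovariate(1.0 / 2.0) for _ in range(8)]
    print(threshold_interval(xs, level=0.95))
```

The upper limit always lies just below the sample minimum, reflecting the fact that the threshold cannot exceed any observation.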
Applying the approximate Studentization method to the Grubbs [8] data set, the 90%, 95%, and 99% confidence intervals for the threshold parameter are , , and , respectively.
To illustrate the accuracy of the proposed method, we performed a Monte Carlo simulation study. Table 5 records the results from this study. The proposed approximate Studentization method gives extremely good coverage properties even when the sample size is extremely small.
4. Prediction Interval for a Future Observation
The density of a future observation from the two-parameter exponential distribution is given in (1.1), which depends on both parameters $\mu$ and $\sigma$. With the observed data, or equivalently the observed sufficient statistic, the significance function of $\sigma$ was obtained by the method in Section 2, and it gives us as much information about $\sigma$ as we can extract from the observed data in the absence of knowledge of $\mu$. Moreover, the significance function of $\mu$ obtained by the approximate Studentization method in Section 3 gives us as much information about $\mu$ as we can extract from the observed data after averaging out the effect of $\sigma$. Therefore, to eliminate $\mu$ and $\sigma$ from the density of a future observation, we apply the approximate Studentization method to obtain a predictive density of : Hence, the corresponding predictive cumulative distribution function is Although the explicit form of the predictive interval is not available, it can be obtained numerically using software such as Maple or Matlab.
Applying the proposed method to the Grubbs [8] data set, the predictive cumulative distribution function obtained in (4.2) is plotted in Figure 2, and the corresponding 90%, 95%, and 99% predictive intervals are (161, 2980), (128, 3714), and (47, 5530), respectively. The corresponding intervals obtained by the method discussed in Lawless [1] are (161.00, 2982.23), (129.21, 3715.96), and (48.02, 5532.63), respectively. The two methods give almost identical results. Lawless's method is easy to apply but the derivation is more difficult. The derivation of the proposed method is easy to follow but it requires good numerical integration methods to carry out the calculation.
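A pivot-based prediction interval of the kind associated with Lawless [1] can be sketched in closed form. The sketch uses the pivot $W = (Y - X_{(1)})/T$, where $Y$ is the future observation and $T = \sum (x_i - x_{(1)})$. The tail probabilities used below, $P(W \le w) = (1 - nw)^{-(n-1)}/(n+1)$ for $w \le 0$ and $P(W > w) = n(1 + w)^{-(n-1)}/(n+1)$ for $w \ge 0$, were worked out for this illustration from the independence of $Y$, $X_{(1)}$, and $T$; they are not quoted from [1], and the function names and data are hypothetical.

```python
import random


def pivot_quantile(p, n):
    """Quantile of W = (Y - X_(1)) / T for 0 < p < 1 and n >= 2."""
    if p <= 1.0 / (n + 1):
        # Lower branch: the future value falls below the sample minimum.
        return (1.0 - ((n + 1) * p) ** (-1.0 / (n - 1))) / n
    # Upper branch: the future value falls above the sample minimum.
    return (n / ((n + 1) * (1.0 - p))) ** (1.0 / (n - 1)) - 1.0


def prediction_interval(xs, level=0.95):
    """Equal-tailed prediction interval for one future observation."""
    n = len(xs)
    x_min = min(xs)
    t = sum(xs) - n * x_min  # T = sum of (x_i - x_(1))
    alpha = 1.0 - level
    lower = x_min + pivot_quantile(alpha / 2.0, n) * t
    upper = x_min + pivot_quantile(1.0 - alpha / 2.0, n) * t
    return lower, upper


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical sample with threshold 100 and scale 800.
    xs = [100.0 + rng.expovariate(1.0 / 800.0) for _ in range(19)]
    print(prediction_interval(xs, level=0.95))
```

Because both quantiles are available in closed form, no numerical integration is needed, which is the practical convenience of the pivot-based approach noted above.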
5. Conclusion
In this paper, by renormalizing the $p^*$-formula, the exact significance function of the scale parameter of the two-parameter exponential distribution is obtained. This significance function is then used in the approximate Studentization method to obtain the significance function of the threshold parameter. Simulation results illustrate that these two significance functions have excellent coverage properties even when the sample size is extremely small. Finally, these two significance functions are used in the approximate Studentization method to obtain a predictive density, and hence a predictive cumulative distribution function, of a future observation from the two-parameter exponential distribution.