Abstract

The type III discrete Weibull distribution can be used in reliability analysis for modeling failure data such as the number of shocks, cycles, or runs a component or a structure can overcome before failing. This paper describes three methods for estimating its parameters: two customary techniques and a technique particularly suitable for discrete distributions, which, in contrast to the two other techniques, provides analytical estimates, whose derivation is detailed here. The techniques’ peculiarities and practical limits are outlined. A Monte Carlo simulation study has been performed to assess the statistical performance of these methods for different parameter combinations and sample sizes and then give some indication for their mindful use. Two applications of real data are provided with the aim of showing how the type III discrete Weibull distribution can fit real data, even better than other popular discrete models, and how the inferential procedures work. A software implementation of the model is also provided.

1. Introduction

Most reliability studies assume that time is continuous, and continuous probability distributions such as the exponential, gamma, Weibull, normal, and lognormal are commonly used to model the lifetime of a component or a structure. These distributions and the methods for estimating their parameters are well known. In many practical situations, however, lifetime is not measured in calendar time: for example, when a machine works in cycles or on demand and the number of cycles or demands before failure is observed; or when the regular operation of a system is monitored once per period, and the number of time periods successfully completed is observed. Moreover, reliability data are often grouped into classes or truncated according to some censoring criterion. In all these cases, lifetime is modeled as a discrete random variable (r.v.). Nevertheless, not much work has been done in discrete reliability. Generally, most reliability concepts for continuous lifetimes have been adapted to the discrete case; in particular, discrete analogues of continuous distributions have been introduced [1]. In this context, the geometric and negative binomial distributions are the discrete alternatives to the exponential and gamma distributions, respectively. Yet, discrete lifetime distributions can also be defined without any reference to a continuous counterpart. Bracquemond and Gaudoin [2] provided an exhaustive survey of discrete lifetime distributions.

The (two-parameter) continuous Weibull distribution is one of the most widely used stochastic distributions for modeling the life of a component; it is very flexible since, for different chosen shape parameters, it models either increasing or decreasing failure rates [3]. As a discrete alternative to the continuous Weibull distribution, three main forms have been proposed. The first one was introduced in Nakagawa and Osaki [4] and is referred to as type I discrete Weibull; it mimics the cumulative distribution function of the continuous Weibull distribution. The second one (type II) was proposed and studied in Stein and Dattero [5]; it mimics the hazard rate of its continuous counterpart. The third form, which is referred to as type III and is the one this paper considers, was introduced in Padgett and Spurrier [6]: their approach does not start from the continuous Weibull distribution but tries to generalize the notions of hazard rate and mean residual life to the discrete case [7]. From a different perspective and with a different objective, Roy and Dasgupta [8] proposed a discretization method for continuous random variables for computing reliability in complex stress-strength models, with a specific application to the Weibull r.v.; however, the support of the discrete r.v. thus obtained is not the set of nonnegative integers.

The type III discrete Weibull (henceforth simply discrete Weibull) r.v. is not similar in functional form to any of the functions describing a continuous Weibull distribution. It is defined by the following cumulative distribution function:

$$F(x; c, \beta) = 1 - \exp\Bigl(-c \sum_{j=1}^{x+1} j^{\beta}\Bigr), \quad x = 0, 1, 2, \dots \quad (1)$$

with $c > 0$ and $\beta \geq -1$ (the latter condition, which ensures that $F$ tends to $1$, corrects the parameter space stated in Padgett and Spurrier [6]), or equivalently by the following probability mass function:

$$p(x; c, \beta) = \exp\Bigl(-c \sum_{j=1}^{x} j^{\beta}\Bigr) - \exp\Bigl(-c \sum_{j=1}^{x+1} j^{\beta}\Bigr), \quad x = 0, 1, 2, \dots \quad (2)$$

which can be more compactly rewritten as follows:

$$p(x; c, \beta) = \exp\Bigl(-c \sum_{j=1}^{x} j^{\beta}\Bigr)\Bigl[1 - e^{-c(x+1)^{\beta}}\Bigr], \quad (3)$$

letting $\sum_{j=1}^{x} j^{\beta} = 0$ if $x = 0$. The corresponding survival function is

$$S(x; c, \beta) = P(X \geq x) = \exp\Bigl(-c \sum_{j=1}^{x} j^{\beta}\Bigr), \quad (4)$$

and the hazard rate (or failure rate) function is

$$r(x; c, \beta) = \frac{p(x; c, \beta)}{S(x; c, \beta)} = 1 - e^{-c(x+1)^{\beta}}. \quad (5)$$
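For readers who want to experiment with the model, the distribution functions can be sketched in a few lines of Python. This is only an illustrative translation, assuming the standard Padgett–Spurrier type III form $F(x) = 1 - \exp(-c\sum_{j=1}^{x+1} j^{\beta})$; the paper's own implementation is the R package DiscreteWeibull, and the function names below are ad hoc.

```python
import math

def dw3_cdf(x, c, beta):
    """Cdf of the type III discrete Weibull: F(x) = 1 - exp(-c * sum_{j=1}^{x+1} j^beta)."""
    if x < 0:
        return 0.0
    s = sum(j ** beta for j in range(1, x + 2))
    return 1.0 - math.exp(-c * s)

def dw3_pmf(x, c, beta):
    """Pmf in the compact form: p(x) = exp(-c * sum_{j=1}^{x} j^beta) * (1 - exp(-c*(x+1)^beta))."""
    s = sum(j ** beta for j in range(1, x + 1))  # empty sum = 0 when x = 0
    return math.exp(-c * s) * (1.0 - math.exp(-c * (x + 1) ** beta))

def dw3_hazard(x, c, beta):
    """Hazard (failure) rate: r(x) = 1 - exp(-c * (x+1)^beta)."""
    return 1.0 - math.exp(-c * (x + 1) ** beta)
```

With `beta = 0` the pmf collapses to the geometric form $q^x(1-q)$ with $q = e^{-c}$, and the hazard is constant, as stated above.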

The complexity of the expression of the probability mass function (3) has somehow hindered the use and diffusion of this discrete model. A plot of the probability mass function of the discrete Weibull for three combinations of its parameters is given in Figure 1. Note that for $\beta = 0$, the discrete Weibull reduces to a geometric r.v. with parameter $1 - e^{-c}$, characterized then by a constant failure rate; for $\beta > 0$, the distribution has an increasing failure rate, whereas for $-1 \leq \beta < 0$, it has a decreasing failure rate. The first and second moments cannot in general be expressed in a closed form but can only be expressed as infinite series:

$$E(X) = \sum_{x=1}^{\infty} \exp\Bigl(-c \sum_{j=1}^{x} j^{\beta}\Bigr), \quad (6)$$

$$E(X^2) = \sum_{x=1}^{\infty} (2x - 1) \exp\Bigl(-c \sum_{j=1}^{x} j^{\beta}\Bigr). \quad (7)$$

The first moment (6) is finite for $\beta > -1$, or for $\beta = -1$ and $c > 1$; the second moment (7) is finite for $\beta > -1$, or for $\beta = -1$ and $c > 2$ (see the appendix).

This study first describes and discusses procedures for estimating the parameters $c$ and $\beta$ of the discrete Weibull r.v., one of which is an original adaptation of a technique used for some discrete distributions, emphasizing their practical limits (Section 2). An extensive Monte Carlo study, implemented through an ad hoc package developed under the R environment, assesses and compares the performance of these estimators, in terms of bias and variability, for different combinations of the parameters and sample sizes (Section 3). The estimation procedures are applied to two datasets taken from the literature (Section 4), and, finally, some summarizing remarks conclude the paper (Section 5).

2. Point Estimation of the Parameters

Focusing on the point estimation of the parameters of the discrete Weibull r.v., based on an observed simple random sample $x_1, \dots, x_n$ of size $n$, three techniques are now described: the method of proportion, which is strictly related to the specific features of the distribution function of the discrete Weibull r.v.; the maximum likelihood method; and the method of moments. Below, we assume that both parameters are unknown. Special attention will be devoted to detecting samples leading to implausible estimates, or to no estimate at all, under each technique.

Method of Proportion. This method was originally introduced in Khan et al. [9] for the type I discrete Weibull and was extended by Jazi et al. [10] to the discrete inverse Weibull. It relies upon the estimation of a probability (or two or more probabilities, according to the number of parameters involved) using the corresponding sample proportion(s). For the present model, we have $P(X = 0) = 1 - e^{-c}$, which can be estimated through the proportion of 0's in the sample, $\hat p_0 = n_0/n$, where $n_0 = \sum_{i=1}^{n} 1(x_i = 0)$ denotes the number of 0's in the sample and $1(\cdot)$ the indicator function, which equals $1$ if its argument is true and $0$ if it is false. Thus, an estimate of $c$ is

$$\hat c = -\ln(1 - \hat p_0). \quad (8)$$

Similarly, the probability $P(X = 1) = e^{-c} - e^{-c(1 + 2^{\beta})}$ can be estimated by the proportion of 1's in the sample, $\hat p_1 = n_1/n$, with $n_1 = \sum_{i=1}^{n} 1(x_i = 1)$. Consequently, also substituting $c$ for its estimate (8) (see the appendix), $\beta$ is estimated by

$$\hat\beta = \log_2\Bigl[\frac{\ln(1 - \hat p_0 - \hat p_1)}{\ln(1 - \hat p_0)} - 1\Bigr]. \quad (9)$$

Though the method of proportion easily provides an analytical expression for both estimates, an apparent weakness is that it does not exploit all the information contained in the sample, since the estimates involve only the rates of 0's and 1's. Moreover, it fails to provide feasible estimates if there are no 0's in the sample. In this case, $\hat c = 0$, which is a boundary value for $c$, and $\hat\beta$ in (9) cannot be computed. This is particularly common for small samples and for small values of $c$. If in the sample there are no 1's ($\hat p_1 = 0$), then by formula (9) the estimate $\hat\beta$ is again not available. The method also fails in computing an estimate for $\beta$ if the sample contains only 0's and 1's ($\hat p_0 + \hat p_1 = 1$).
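Assuming the standard type III form, with $P(X=0) = 1 - e^{-c}$ and $P(X=1) = e^{-c} - e^{-c(1+2^{\beta})}$, the method of proportion can be sketched as follows. This is an illustrative Python translation; `proportion_estimates` is a hypothetical helper name, not part of the DiscreteWeibull package.

```python
import math

def proportion_estimates(sample):
    """Method-of-proportion estimates of (c, beta), mirroring (8) and (9)."""
    n = len(sample)
    p0 = sum(1 for x in sample if x == 0) / n
    p1 = sum(1 for x in sample if x == 1) / n
    # The method is not applicable without 0's, without 1's,
    # or when the sample contains only 0's and 1's.
    if p0 == 0.0 or p1 == 0.0 or p0 + p1 == 1.0:
        raise ValueError("method of proportion not applicable to this sample")
    c_hat = -math.log(1 - p0)  # inverts P(X = 0) = 1 - exp(-c)
    beta_hat = math.log2(math.log(1 - p0 - p1) / math.log(1 - p0) - 1)
    return c_hat, beta_hat
```

By construction the estimates reproduce the observed proportions exactly: plugging $(\hat c, \hat\beta)$ back into the model probabilities of 0 and 1 returns $\hat p_0$ and $\hat p_1$.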

The method of proportion can also lead to "implausible" estimates of $\beta$, that is, estimates that do not belong to its parameter space: $\hat\beta < -1$. This happens when

$$\log_2\Bigl[\frac{\ln(1 - \hat p_0 - \hat p_1)}{\ln(1 - \hat p_0)} - 1\Bigr] < -1,$$

which is equivalent to saying $\ln(1 - \hat p_0 - \hat p_1) > \frac{3}{2}\ln(1 - \hat p_0)$ or, in terms of $\hat p_1$,

$$\hat p_1 < (1 - \hat p_0) - (1 - \hat p_0)^{3/2}.$$

The probability of an "implausible" estimate could be theoretically derived from the last inequality, remembering that $n_0$ and $n_1$ are the first two marginal components of a trinomial r.v. with parameters $(n; p(0), p(1), 1 - p(0) - p(1))$. Figure 2 shows the combinations of $\hat p_0$ and $\hat p_1$ leading to such implausible estimates as a subset of the triangular region ($\hat p_0 + \hat p_1 \leq 1$; $\hat p_0, \hat p_1 \geq 0$). For example, in a sample of size $n = 10$ with $\hat p_0 = 0.4$ and $\hat p_1 = 0.1$ (illustrative values), the method of proportion provides, through (8) and (9), the estimates $\hat c = -\ln 0.6 \approx 0.511$ and $\hat\beta = \log_2[\ln 0.5 / \ln 0.6 - 1] \approx -1.486 < -1$.
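The implausibility region can be checked numerically. Under the reconstruction of (9) used here, the boundary $\hat\beta = -1$ corresponds exactly to $\hat p_1 = (1 - \hat p_0) - (1 - \hat p_0)^{3/2}$; the following Python sketch (with hypothetical helper names) verifies this.

```python
import math

def beta_hat(p0, p1):
    """Method-of-proportion estimate of beta as a function of the two proportions, cf. (9)."""
    return math.log2(math.log(1 - p0 - p1) / math.log(1 - p0) - 1)

def implausible(p0, p1):
    """True when the estimate of beta falls below -1, i.e., outside the parameter space."""
    return p1 < (1 - p0) - (1 - p0) ** 1.5
```

On the boundary curve the ratio of logarithms equals $3/2$, so the argument of $\log_2$ is exactly $1/2$ and the estimate is exactly $-1$.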

Maximum Likelihood Method. Having defined the log-likelihood function as $\ell(c, \beta) = \sum_{i=1}^{n} \ln p(x_i; c, \beta)$, we obtain

$$\ell(c, \beta) = \sum_{i=1}^{n} \Bigl[-c \sum_{j=1}^{x_i} j^{\beta} + \ln\bigl(1 - e^{-c(x_i+1)^{\beta}}\bigr)\Bigr].$$

The maximum likelihood estimates of $c$ and $\beta$ are defined as the values that maximize the log-likelihood function:

$$(\hat c, \hat\beta) = \arg\max_{c, \beta} \ell(c, \beta).$$

The two first derivatives of $\ell$,

$$\frac{\partial \ell}{\partial c} = \sum_{i=1}^{n} \Bigl[-\sum_{j=1}^{x_i} j^{\beta} + \frac{(x_i+1)^{\beta}\, e^{-c(x_i+1)^{\beta}}}{1 - e^{-c(x_i+1)^{\beta}}}\Bigr], \qquad \frac{\partial \ell}{\partial \beta} = \sum_{i=1}^{n} \Bigl[-c \sum_{j=1}^{x_i} j^{\beta} \ln j + \frac{c\,(x_i+1)^{\beta} \ln(x_i+1)\, e^{-c(x_i+1)^{\beta}}}{1 - e^{-c(x_i+1)^{\beta}}}\Bigr], \quad (15)$$

are quite easily computed, but, as already noted in Padgett and Spurrier [6], the solution to the maximization of $\ell$ (with the constraints that $c$ and $\beta$ belong to their natural parameter spaces) cannot be derived in a closed form by equating the expressions in (15) to zero; it can be obtained only numerically, for example, using one of the functions nlm, optim, and solnp (package Rsolnp) in the R programming environment, which allow the user to solve nonlinear constrained or unconstrained minimization/maximization problems. Even this method cannot be successfully applied to every possible sample; in particular, it fails in providing a solution if $\hat p_0 + \hat p_1 = 1$ (i.e., if the sample contains only 0's and 1's). In this case, in fact, it can be easily proved that the first-order derivative of the log-likelihood is never null; the log-likelihood does not have an absolute maximum in the parameter space. Let us consider, for example, a sample consisting of $n_0$ 0's and $n_1 = n - n_0$ 1's. The corresponding likelihood function is given by

$$L(c, \beta) = (1 - e^{-c})^{n_0} \bigl(e^{-c} - e^{-c(1+2^{\beta})}\bigr)^{n_1} \quad (16)$$

and the log-likelihood

$$\ell(c, \beta) = n_0 \ln(1 - e^{-c}) + n_1 \ln\bigl(e^{-c} - e^{-c(1+2^{\beta})}\bigr). \quad (17)$$

The first-order derivatives are

$$\frac{\partial \ell}{\partial c} = \frac{n_0\, e^{-c}}{1 - e^{-c}} + n_1\, \frac{-e^{-c} + (1 + 2^{\beta})\, e^{-c(1+2^{\beta})}}{e^{-c} - e^{-c(1+2^{\beta})}}, \quad (18a)$$

$$\frac{\partial \ell}{\partial \beta} = n_1\, \frac{c\, 2^{\beta} \ln 2\; e^{-c(1+2^{\beta})}}{e^{-c} - e^{-c(1+2^{\beta})}}. \quad (18b)$$

The derivative (18b) is never null but tends to zero for $\beta \to +\infty$; when $\beta \to +\infty$, the addends involving $e^{-c(1+2^{\beta})}$ in the derivative (18a) go to zero, and solving the resulting equation $n_0 e^{-c}/(1 - e^{-c}) - n_1 = 0$ one can obtain $\hat c = \ln(n/n_1) = -\ln(1 - \hat p_0)$. Note that this is the same estimate the method of proportion supplies.
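As a minimal stand-in for the numerical maximization (the paper relies on R's nlm/optim/solnp; a practical Python analogue would be a general-purpose optimizer), the sketch below uses a crude grid search over $(c, \beta)$, assuming the pmf form $p(x) = \exp(-c\sum_{j\le x} j^{\beta})\,(1 - e^{-c(x+1)^{\beta}})$. Function names and grids are ad hoc.

```python
import math

def loglik(c, beta, sample):
    """Log-likelihood under p(x) = exp(-c*sum_{j<=x} j^beta) * (1 - exp(-c*(x+1)^beta))."""
    ll = 0.0
    for x in sample:
        s = sum(j ** beta for j in range(1, x + 1))  # empty sum = 0 when x = 0
        ll += -c * s + math.log(1.0 - math.exp(-c * (x + 1) ** beta))
    return ll

def ml_grid(sample, c_grid, beta_grid):
    """Crude grid search for the maximizer; a stand-in for a real numerical optimizer."""
    return max(((c, b) for c in c_grid for b in beta_grid),
               key=lambda cb: loglik(cb[0], cb[1], sample))
```

The grid search is only for illustration; its resolution limits the accuracy of the estimates, which is why the paper resorts to proper nonlinear optimization routines.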

Method of Moments. Through this method, the parameter estimates are obtained solving, in terms of $c$ and $\beta$, the equations $E(X) = m_1$ and $E(X^2) = m_2$, where $m_1$ and $m_2$ are the first and second sample moments: $m_1 = \frac{1}{n}\sum_{i=1}^{n} x_i$, $m_2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2$. Since these equations cannot be solved analytically, as suggested in Khan et al. [9], one can numerically minimize, with respect to $c$ and $\beta$, the following quadratic loss function:

$$w(c, \beta) = \bigl[E(X; c, \beta) - m_1\bigr]^2 + \bigl[E(X^2; c, \beta) - m_2\bigr]^2.$$

The task can be carried out, for example, again using the functions nlm, optim, or solnp in the R environment [11]. The solution is the couple $(\hat c, \hat\beta)$, which should correspond to the value $w(\hat c, \hat\beta) = 0$. Note that the minimization should be subject to the natural constraints $c > 0$ and $\beta \geq -1$. Without these constraints, the minimization algorithm may get stuck in implausible intermediate-step solutions. Indeed, the optim function under the R environment seems to provide excellent results in terms of convergence to the optimal solution, even without setting the constraints on $c$ and $\beta$. As to the launch values for $c$ and $\beta$, required by any minimization function, one can set $\beta^{(0)} = 0$ and $c^{(0)} = \ln(1 + 1/m_1)$. This pair of values is always computable and feasible (unless the sample contains all 0's) and ensures at the first iteration that $E(X) = m_1$: in fact, recalling (6), $E(X; c^{(0)}, 0) = \sum_{x=1}^{\infty} e^{-c^{(0)} x} = \frac{e^{-c^{(0)}}}{1 - e^{-c^{(0)}}} = m_1$.
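The moment series and the quadratic loss can be sketched as follows (illustrative Python; the truncation rule and the function names are ad hoc choices, and the sketch assumes a parameter region where the series converge, e.g., $\beta \ge 0$ as used in the check below).

```python
import math

def dw3_moments(c, beta, tol=1e-12):
    """E(X) and E(X^2) via the series E(X) = sum_x P(X>=x), E(X^2) = sum_x (2x-1) P(X>=x).

    Truncates once the current term of the second series drops below `tol`
    (an ad hoc rule; see the appendix for the convergence conditions)."""
    m1 = m2 = 0.0
    s = 0.0
    x = 0
    while True:
        x += 1
        s += x ** beta            # s = sum_{j=1}^{x} j^beta
        a = math.exp(-c * s)      # P(X >= x)
        m1 += a
        m2 += (2 * x - 1) * a
        if (2 * x - 1) * a < tol:
            break
    return m1, m2

def mm_loss(c, beta, m1_sample, m2_sample):
    """Quadratic loss w(c, beta) to be minimized for the method of moments."""
    m1, m2 = dw3_moments(c, beta)
    return (m1 - m1_sample) ** 2 + (m2 - m2_sample) ** 2
```

With the launch values $\beta^{(0)} = 0$ and $c^{(0)} = \ln(1 + 1/m_1)$, the first moment of the implied geometric distribution matches the sample mean exactly, as noted above.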

When the sample contains only 0's and 1's, the method of moments is not applicable. To see this, first consider the probability mass function of the discrete Weibull r.v., and note that by letting $\beta$ tend to $+\infty$ in (3), it degenerates into a r.v. that takes only two values: $0$ with probability $1 - e^{-c}$ and $1$ with probability $e^{-c}$. It is then clear that if the sample is made up of a fraction $\hat p_0$ of 0's and a fraction $\hat p_1 = 1 - \hat p_0$ of 1's, the equality of both first and second moments computed on the sample and on the original r.v. holds only in the limit $\beta \to +\infty$ with $c = -\ln \hat p_1$; that is, $m_1 = m_2 = \hat p_1 = e^{-c}$. In this case, the loss function does not admit an absolute minimum; it only admits an infimum.

For some samples, the numerical minimization procedure can require a huge computation time, much larger than that required by the maximum likelihood method. This is due to the iterated calculation of the first and second moments, which is itself numerical and particularly time consuming for negative values of $\beta$ (in this case, in fact, the convergence of the series in (6) and (7) is slower).

Given the complexity of the estimators derived through the methods listed in this section (only the method of proportion provides an analytical expression for them), not much can be analytically derived about their statistical properties for finite sample size, that is, bias and variability. When the method of proportion can be applied, it provides a consistent estimator of $c$; the estimator of $\beta$ is consistent as well, but nothing can be said about their unbiasedness (see the analogous discussion in Khan et al. [9] for the type I discrete Weibull). For large samples, the general asymptotic properties of the estimators derived from the maximum likelihood method and the method of moments can be recalled. In the next section, an extensive simulation study is presented, which was performed to investigate the performance of the estimation methods and outline some practical advice for their use.

3. Simulation Study

The estimators presented in the previous section were investigated through an extensive Monte Carlo study; they were compared in terms of bias, defined as $E(\hat\theta) - \theta$, and root mean square error, defined as $\mathrm{RMSE} = \sqrt{E[(\hat\theta - \theta)^2]}$, where $\theta$ denotes one of the two parameters ($c$ or $\beta$) and $\hat\theta$ denotes one of the three corresponding estimators, according to the method indicated by the subscript (method of proportion, $P$; maximum likelihood method, $ML$; and method of moments, $M$).

3.1. Simulation Design

In this study, several parameter combinations and sample sizes were considered. The values were chosen in order to explore a large spectrum of the discrete Weibull distribution, in particular to comprise increasing, constant, and decreasing failure rates. At the same time, the parameters were set in order to keep the discrete nature of the distribution reasonable: values entailing a nonnegligible probability for a very large number of integer values were deliberately excluded (in this case, a continuous r.v. would be preferred for modeling failure data); parameter values assigning a nonnegligible probability to just the first few integers were avoided as well (one assumes the component that is monitored is likely to last for more than 1 or 2 cycles). Table 1 shows the combinations of values of $c$ and $\beta$ examined in the simulation study along with the corresponding expected value, standard deviation, and 99% quantile. A note about the computation of the expected value and standard deviation is due: they are calculated numerically (see formulas (6) and (7)) by summing a finite number of terms of the series, large enough to make the truncation error negligible, which proved to be a satisfactory practical choice for the considered scenarios.
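A miniature version of such a Monte Carlo experiment can be assembled from inverse-cdf sampling plus the proportion estimators (8) and (9). This is an illustrative Python sketch only: the actual study used 5,000 replications and the R package DiscreteWeibull, and the parameter values, sample size, and replication count below are arbitrary.

```python
import math
import random

def dw3_rvs(n, c, beta, rng):
    """Inverse-cdf sampling from F(x) = 1 - exp(-c * sum_{j=1}^{x+1} j^beta)."""
    out = []
    for _ in range(n):
        u = rng.random()
        x, s = 0, 1.0                       # s = sum_{j=1}^{x+1} j^beta, starting at 1^beta
        while 1.0 - math.exp(-c * s) < u:   # advance x until F(x) >= u
            x += 1
            s += (x + 1) ** beta
        out.append(x)
    return out

rng = random.Random(12345)
true_c, true_beta, n, reps = 0.5, 0.0, 1000, 200
c_hats, b_hats = [], []
for _ in range(reps):
    sample = dw3_rvs(n, true_c, true_beta, rng)
    p0 = sum(x == 0 for x in sample) / n
    p1 = sum(x == 1 for x in sample) / n
    c_hats.append(-math.log(1 - p0))                                        # estimate (8)
    b_hats.append(math.log2(math.log(1 - p0 - p1) / math.log(1 - p0) - 1))  # estimate (9)

bias_c = sum(c_hats) / reps - true_c
rmse_c = math.sqrt(sum((v - true_c) ** 2 for v in c_hats) / reps)
```

For this large sample size, infeasible samples (no 0's or no 1's) are virtually impossible, so the feasibility checks discussed in Section 2 are omitted here; a full implementation would need them.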

3.2. Software Implementation

The simulation study was based on 5,000 Monte Carlo replications and carried out under the R environment [11]. In particular, the necessary code was developed to implement the probability mass function, the cumulative distribution function and its inverse, and the random generation for the discrete Weibull model; to compute the first and second moments; and to realize the algorithms corresponding to each estimation method. This code, structured as an R package, DiscreteWeibull, is freely available in the CRAN repository [12].

3.3. Simulation Results

Table 2 shows the bias and root mean square error for both estimators derived from each of the three methods, under each combination of the two parameters and for each sample size. The results take into account the applicability limits of the three methods, computing the Monte Carlo averages over the feasible samples only. For the smallest sample size and, more rarely, for the intermediate one, the method of proportion met a certain percentage of infeasible samples under several scenarios, the highest rates occurring for the smallest values of $c$. The maximum likelihood method and the method of moments encountered only a few infeasible samples under some scenarios, whose rate was in any case always negligible.

Note that, strictly speaking, to compare the behavior of the estimators under different scenarios, one should use their relative bias and relative root mean square error as performance indexes, that is, the bias and the root mean square error divided by the MC expected value of the estimator. This way of proceeding would provide a more correct "double" reading of the results, assessing the performance of a single estimator moving through different scenarios and comparing the performance of the three estimators under a fixed scenario. Our choice of using "absolute" instead of "relative" indexes was partially dictated by the necessity of including negative and null values of $\beta$, which would make the computation of the relative indexes troublesome (the denominator could take very small values, dramatically amplifying the corresponding absolute value of the index and making the reading arduous).

Trying to synthesize the results, let us start with the largest sample size and the estimators of parameter $c$. It is quite evident that all three estimators show a modest bias, regardless of the values of $c$ and $\beta$. The maximum likelihood method looks the best in this sense, while the method of moments shows the largest bias in absolute value for high values of $c$ and negative values of $\beta$, and it is often negatively biased. As to the root mean square error, the method of moments and the maximum likelihood method present very similar values under each scenario; the method of proportion shows a larger value, especially for small values of $c$. The magnitude of the RMSE seems to be much more affected by the value of $c$ than by that of $\beta$ for all the estimators.

As to the estimators of $\beta$, on average, the method of proportion provides the least biased estimator, unless $c$ is too small, while the maximum likelihood method and, to a larger extent, the method of moments are significantly biased. While the $M$ and $ML$ estimators always overestimate the true value of the parameter, the $P$ estimator seems to underestimate it for lower values of $\beta$. In absolute value, the bias of the $ML$ estimator increases as $c$ increases for a fixed $\beta$ and as $\beta$ decreases for a fixed $c$; the bias of the $M$ estimator follows the same pattern, with a larger magnitude. In terms of variability, the $M$ and $ML$ estimators perform in a similar way for each parameter combination, while the $P$ estimator is much more variable and presents very different values moving through the scenarios: its RMSE is from 2 to 6 times larger than the corresponding value of its competitors. The worst scenarios for the $P$ estimator, in this sense, are those associated with the smallest values of $c$, and this is quite reasonable. In fact, in this case, the probability $P(X = 0) = 1 - e^{-c}$, which corresponds to the expected fraction of 0's in the sample, is small, and this small value leads to a modest performance of the method, which uses only the proportions of 0's and 1's in the sample and discards the other information.

Decreasing the sample size, the bias in absolute value and the root mean square error of the estimators under each scenario obviously increase. However, the behaviors and trends described for the largest sample size still hold. The bias of the estimators of $\beta$ derived by the method of moments and the maximum likelihood method becomes substantial especially for large values of $c$ and negative values of $\beta$; in these cases, the method of proportion is less biased, but its RMSE is still much larger than those of its competitors. For high values of $c$ and the smallest sample size, the bias in absolute value of the estimators of $\beta$ for the method of proportion and the method of moments tends to become much more substantial than that for the maximum likelihood method.

Looking at the overall values of RMSE for the estimators of $c$ and $\beta$ obtained through the three methods, it is also evident that much more uncertainty is associated with the point estimation of the second parameter, $\beta$.

Figure 3 shows the MC distributions of the estimators of $c$ and $\beta$ for a fixed sample size and three combinations of the parameters. From the analysis, it emerges that the positive bias of $\hat\beta_{ML}$ and $\hat\beta_{M}$ is due to the presence of a certain number of samples providing estimates much larger than the true value of $\beta$, while the MC medians of both estimators are very close to it. These boxplots also emphasize the presence of implausible values for the estimates of $\beta$ yielded by the method of proportion ($\hat\beta_P < -1$).

Trying to synthesize all the results presented here (which are not, however, exhaustive), the method of proportion, despite its straightforward analytical derivation of the estimators, performs worse overall (especially in terms of variability) than the method of moments and the maximum likelihood method, and it is competitive only for some specific scenarios, namely, for medium/high values of $c$, where the expected proportions of 0's and 1's are larger and the method can better exploit the information contained in the sample.

4. Applications to Real Data

The methods that have been illustrated and empirically investigated in the previous sections are applied to two datasets. The first one, called J1 [13], contains the number of failures of software observed over 62 weeks, and its frequency distribution is reported in Table 3. Assuming that the statistical distribution underlying the data is a type III discrete Weibull, it is possible to compute the estimates yielded by the three methods described in Section 2, which are reported in Table 4. Note that the method of moments and the maximum likelihood method provide very similar estimates for $c$ and $\beta$; the method of proportion supplies an estimate of $\beta$ that is quite different from those of the other two methods (moreover, it is negative), while its estimate of $c$ is quite close to $\hat c_{ML}$ and $\hat c_{M}$. If we want to test the goodness of fit of the discrete Weibull model for the data, we can use the chi-square statistic

$$X^2 = \sum_{i=1}^{K} \frac{(O_i - n p_i)^2}{n p_i},$$

where $O_i$ denotes the observed frequency of the $i$th category and $p_i$ denotes the probability of an observation falling into the $i$th category under the study model, such that $\sum_{i=1}^{K} O_i = n$ and $\sum_{i=1}^{K} p_i = 1$. If the model holds, $X^2$ is asymptotically distributed as a chi-square with $K - k - 1$ degrees of freedom, where $k$ is the number of parameters to be estimated (in this case, $k = 2$). Before computing the empirical value of $X^2$, under each model we have to group the categories in such a way that all the expected frequencies $n p_i$ are not smaller than 5. Table 5 reports these groupings. The goodness-of-fit chi-square test accepts the type III discrete Weibull model with the ML and M parameter estimates, while it rejects the model with the P parameter estimates at the usual nominal level of significance. Again, there is a notable difference between the results for the method of proportion, on the one hand, and the maximum likelihood method and the method of moments, on the other.
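The grouped chi-square computation described above is standard and easy to script. The following Python helper is hypothetical, and the counts and probabilities in the check below are toy numbers, not those of Table 5.

```python
def chi_square_gof(observed, probs, n_params):
    """Pearson chi-square statistic and degrees of freedom for grouped counts.

    `observed`: list of observed frequencies O_i (categories already grouped
    so that each expected count n*p_i is at least 5).
    `probs`: model probabilities p_i of the K categories (summing to 1).
    `n_params`: number k of parameters estimated from the data.
    Returns (statistic, degrees of freedom K - k - 1)."""
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - n_params - 1
    return stat, df
```

The statistic would then be compared with the chi-square quantile on `df` degrees of freedom (e.g., via `qchisq` in R or `scipy.stats.chi2` in Python) to obtain the p-value.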

The second example considers data regarding accidents to 647 women working on H. E. Shells during five weeks [14], whose frequency distribution is reported in Table 6. Even if these are not strictly failure data, in the meaning explained in Section 1, they are nevertheless count data, and the discrete Weibull r.v. can be used to model them. The parameter estimates yielded by the three methods are reported in Table 7. The estimates of $c$ are again very close to each other; all the estimates of $\beta$ are negative and quite small in absolute value. Note that all the pairs $(\hat c, \hat\beta)$ fall outside the scenarios explored in the simulation study; nevertheless, the empirical distribution of the data (many 0's and 1's) is favorable to the method of proportion, and the large sample size ($n = 647$) should ensure that all three methods are reliable. Testing the goodness of fit of the type III discrete Weibull model with the ML parameter estimates, which requires grouping the last two categories, yields a nonsignificant value of the chi-square statistic. Note that this model fits the data better than the natural negative binomial model considered by Greenwood and Yule [14] and the generalized Poisson model proposed by Consul and Jain [15].

5. Conclusion

This paper examined three estimators for the parameters of the type III discrete Weibull random variable, which represents an alternative to the geometric and negative binomial distributions for modeling discrete reliability data and can accommodate both increasing and decreasing failure rates. One of the methods presented (the method of proportion) has recently been introduced for discrete models and is newly phrased here; it provides a closed form for the estimates of both parameters. The other two methods (the maximum likelihood method and the method of moments) are standard approaches for estimating parameters, but, due to the complex expression of the probability mass function, they provide the estimates only as numerical solutions to a maximization/minimization problem. It follows that little can be said analytically about the statistical properties of the estimators for a finite sample size; an extensive Monte Carlo simulation study was therefore carried out to assess their behavior. Far from giving a definitive solution to the problem, the study highlighted that the method of proportion, when applicable, can provide reliable estimates even for small sample sizes only under specific parameter configurations, whereas under other configurations it may provide poor results, especially in terms of the accuracy of the estimator of the second parameter. The other two standard methods can be usefully adopted under most scenarios; caution is nevertheless necessary, since some parameter combinations may lead to nonnegligible bias of the corresponding estimators. The paper stressed the practical applicability limits of each method, also in terms of computational burden, and illustrated their use through two examples with real data.

Appendix

A. Existence of First- and Second-Order Moments

Let us denote with $a_x$, $x = 1, 2, \dots$, the general element of the series in (6), $a_x = \exp(-c \sum_{j=1}^{x} j^{\beta})$. $E(X)$ exists finite if and only if the series $\sum_{x=1}^{\infty} a_x$ is convergent. If $\beta \geq 0$, the series is convergent according to the ratio criterion, since the limit of $a_{x+1}/a_x = e^{-c(x+1)^{\beta}}$ is $0$ if $\beta > 0$ and is $e^{-c} < 1$ if $\beta = 0$.

If $-1 < \beta < 0$, note that $\sum_{j=1}^{x} j^{\beta} \geq \int_{1}^{x+1} t^{\beta}\,dt = \frac{(x+1)^{\beta+1} - 1}{\beta + 1}$, and then

$$a_x \leq \exp\Bigl(-c\,\frac{(x+1)^{\beta+1} - 1}{\beta + 1}\Bigr).$$

The series with the majorant term is convergent (e.g., using the comparison criterion with a convergent generalized harmonic series) and thus the original series in (6) is, too.

If $\beta = -1$, note that $\sum_{j=1}^{x} 1/j \geq \ln(x+1)$, and then

$$a_x \leq e^{-c \ln(x+1)} = (x+1)^{-c}.$$

The series with the majorant term is convergent for $c > 1$ and thus the original series in (6) is, too. Moreover, since $\sum_{j=1}^{x} 1/j \leq 1 + \ln x$ for $x \geq 1$,

$$a_x \geq e^{-c(1 + \ln x)} = e^{-c} x^{-c}.$$

The series with the minorant term is divergent for $c \leq 1$ and thus the original series in (6) is, too.

If $\beta < -1$, then $\lim_{x \to \infty} a_x = \exp(-c \sum_{j=1}^{\infty} j^{\beta})$ is greater than zero, and the series is divergent.

Let us now denote with $b_x$, $x = 1, 2, \dots$, the general element of the series in (7); then $b_x = (2x - 1) a_x$. If $\beta \geq 0$, the series is convergent according to the ratio criterion, since the limit of $b_{x+1}/b_x = \frac{2x+1}{2x-1} e^{-c(x+1)^{\beta}}$ is $0$ if $\beta > 0$ and is $e^{-c} < 1$ if $\beta = 0$; thus $E(X^2)$ is finite.

If $-1 < \beta < 0$, $b_x \leq (2x - 1) \exp\bigl(-c\,\frac{(x+1)^{\beta+1} - 1}{\beta + 1}\bigr)$. The series with the majorant term is convergent (again by comparison with a convergent generalized harmonic series) and thus the original series in (7) is, too. Then $E(X^2)$ is finite.

If $\beta = -1$, $b_x \leq (2x - 1)(x+1)^{-c}$. The series with the majorant term is convergent for $c > 2$ and thus the original series in (7) is, too. Then $E(X^2)$ is finite. Moreover, for $x \geq 1$,

$$b_x \geq (2x - 1)\, e^{-c} x^{-c} \geq e^{-c} x^{1-c}.$$

The series with the minorant term is divergent for $c \leq 2$, and thus the original series in (7) is, too. $E(X^2)$ is not finite.

For $\beta < -1$, $E(X^2)$ is not finite, since $E(X)$ is not finite and $E(X^2) \geq E(X)$.

B. Derivation of $\hat\beta$

Equating the probability $P(X = 1) = e^{-c} - e^{-c(1+2^{\beta})}$ to the sample proportion $\hat p_1$, we get

$$e^{-c} - e^{-c(1+2^{\beta})} = \hat p_1, \quad (B.1)$$

and after taking the natural logarithm of both sides of the equation twice, the second time after changing signs, we obtain

$$\beta = \log_2\Bigl[-\frac{\ln(e^{-c} - \hat p_1)}{c} - 1\Bigr]. \quad (B.2)$$

Substituting $c$ with its estimate (8), $\hat c = -\ln(1 - \hat p_0)$, in (B.2), and noting that $e^{-\hat c} - \hat p_1 = 1 - \hat p_0 - \hat p_1$, we get

$$\hat\beta = \log_2\Bigl[-\frac{\ln(1 - \hat p_0 - \hat p_1)}{\hat c} - 1\Bigr], \quad (B.3)$$

and, further simplifying it,

$$\hat\beta = \log_2\Bigl[\frac{\ln(1 - \hat p_0 - \hat p_1)}{\ln(1 - \hat p_0)} - 1\Bigr], \quad (B.4)$$

which represents the estimate (9).
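The algebra above can be verified numerically: feeding the exact model probabilities of 0 and 1 into the estimators (8) and (9) must return the original parameters. Illustrative Python sketch, with arbitrary admissible parameter values.

```python
import math

# Arbitrary admissible parameters (c > 0, beta >= -1).
c, beta = 0.8, -0.4

# Model probabilities of 0 and 1 implied by F(x) = 1 - exp(-c * sum_{j<=x+1} j^beta).
p0 = 1 - math.exp(-c)
p1 = math.exp(-c) - math.exp(-c * (1 + 2 ** beta))

# Plugging the exact probabilities into (8) and (9) inverts the mapping exactly.
c_hat = -math.log(1 - p0)
beta_hat = math.log2(math.log(1 - p0 - p1) / math.log(1 - p0) - 1)
```

Up to floating-point rounding, `c_hat` and `beta_hat` coincide with `c` and `beta`, confirming that (B.4) inverts the pair of probability equations.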

Acknowledgments

The author would like to thank the editor and an anonymous reviewer for their suggestions on the early version of the paper.