Abstract

We propose an asymptotic nonparametric confidence interval for quantile-based process capability indices (PCIs) based on the superstructure $C_{Np}(u,v)$, modified from Vännman's $C_p(u,v)$, which contains the four basic PCIs $C_p$, $C_{pk}$, $C_{pm}$, and $C_{pmk}$ as special cases. Since the asymptotic variance of the estimator for quantile-based PCIs involves the density function of the underlying process, the existing asymptotic results cannot be used directly to construct confidence limits for PCIs. To obtain a consistent estimator of the asymptotic variance of the estimated quantile-based PCIs, we propose to use the kernel density estimator for the underlying process. The confidence limits for PCIs are then established based on the consistent estimates. A real-life example from manufacturing engineering is used to illustrate the implementation of the proposed methods. Simulation studies are also presented to compare the two quantile estimators that are used in the definition of PCIs.

1. Introduction

The process capability index (PCI), as a quality monitoring tool, has been playing an increasingly important role in analyzing process quality and productivity. It has been widely applied in industrial engineering, particularly in quality monitoring and improvement programs in manufacturing systems, as well as in reliability engineering and environmental engineering. Many PCIs have been proposed since Juran et al. [1] proposed the first PCI, $C_p$. Let USL and LSL be the upper and lower specification limits, $d = (\mathrm{USL} - \mathrm{LSL})/2$, $m = (\mathrm{USL} + \mathrm{LSL})/2$, and $T$ be the target value. The process mean and standard deviation are denoted by $\mu$ and $\sigma$. Vännman [2] proposed the following unified superstructure, which includes the four commonly used basic PCIs $C_p$, $C_{pk}$, $C_{pm}$, and $C_{pmk}$:

$$C_p(u,v) = \frac{d - u\,|\mu - m|}{3\sqrt{\sigma^2 + v(\mu - T)^2}}, \tag{1}$$

where $u$ and $v$ are nonnegative constants. We can see from (1) that $C_p(0,0) = C_p$, $C_p(1,0) = C_{pk}$, $C_p(0,1) = C_{pm}$, and $C_p(1,1) = C_{pmk}$. Since PCIs based on the process mean and standard deviation implicitly assume normality of the underlying process, applying such PCIs to skewed processes may give invalid results. Let $\xi_p$ be the $p$th quantile of the process, that is, $F(\xi_p) = p$, where $F$ is the distribution function of the process. Define $\xi = (\xi_{p_1}, \xi_{p_2}, \xi_{p_3})$ to be the vector of quantiles to be used in the definition of PCIs. Chen and Pearn [3] modified Vännman’s [2] $C_p(u,v)$ and proposed a quantile-based PCI superstructure that does not implicitly assume normality of the underlying process:

$$C_{Np}(u,v) = \frac{d - u\,|\xi_{p_2} - m|}{3\sqrt{\left(\dfrac{\xi_{p_3} - \xi_{p_1}}{6}\right)^2 + v(\xi_{p_2} - T)^2}}, \tag{2}$$

where $p_1 = 0.00135$, $p_2 = 0.5$, and $p_3 = 0.99865$. The choice of $p_1$ and $p_3$ guarantees that the proportion of nonconforming parts of the process is at most 0.27% if the corresponding four PCIs take values greater than 1 for on-target normal processes. These values will be used throughout this paper. Since two different types of quantile estimators will be used in this paper, we will index the superstructure by the quantile vector to indicate that the PCI superstructure is defined based on the specified quantiles.
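To make the two superstructures concrete, the following is a minimal Python sketch, assuming the standard form of Vännman's index and a quantile-based analogue in which one sixth of the spread between the upper and lower extreme quantiles replaces the standard deviation and the median replaces the mean. The function names and argument names are illustrative and not part of the original paper.

```python
import math

def cp_uv(usl, lsl, target, mu, sigma, u=0.0, v=0.0):
    """Mean/standard-deviation-based superstructure C_p(u, v)."""
    d = (usl - lsl) / 2.0   # half-width of the specification interval
    m = (usl + lsl) / 2.0   # midpoint of the specification interval
    return (d - u * abs(mu - m)) / (3.0 * math.sqrt(sigma ** 2 + v * (mu - target) ** 2))

def cnp_uv(usl, lsl, target, q_low, q_med, q_high, u=0.0, v=0.0):
    """Quantile-based analogue (assumed form): (q_high - q_low)/6 plays the
    role of sigma and the median q_med plays the role of the mean."""
    d = (usl - lsl) / 2.0
    m = (usl + lsl) / 2.0
    spread = (q_high - q_low) / 6.0
    return (d - u * abs(q_med - m)) / (3.0 * math.sqrt(spread ** 2 + v * (q_med - target) ** 2))

# The four basic indices correspond to these (u, v) pairs.
BASIC_INDICES = {"Cp": (0, 0), "Cpk": (1, 0), "Cpm": (0, 1), "Cpmk": (1, 1)}
```

For a normal process the 0.135% and 99.865% quantiles sit three standard deviations from the mean, so `cnp_uv` called with `q_low = mu - 3*sigma`, `q_med = mu`, and `q_high = mu + 3*sigma` reproduces `cp_uv` for every `(u, v)`.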

Chen and Hsu [4] studied the asymptotic properties of $C_{Np}(u,v)$ and proved that the point estimators of the quantile-based superstructure using different quantile estimators are asymptotically normally distributed. Since the variance of the estimator involves the density function of the underlying process, the asymptotic results cannot be used directly to construct confidence limits for $C_{Np}(u,v)$ or to test hypotheses on PCIs.

In this paper, we propose a consistent nonparametric estimator of the asymptotic variance of the quantile-based PCIs using the kernel density estimator of the underlying process. Based on the consistent variance estimator, we can construct confidence limits for the desired PCIs within the framework of the superstructure.

The rest of this paper is organized as follows. Two point estimators of quantile-based PCIs are proposed in Section 2, where the distributional and inferential properties of the consistent estimators of $C_{Np}(u,v)$ based on different quantile estimators are also addressed. Nonparametric asymptotic confidence limits are detailed in Section 3. In Section 4, we present an illustrative example showing the steps of constructing confidence intervals using the proposed methods, together with some comparative analysis based on simulations. Section 5 concludes with a discussion and summary. The proof of Theorem 2 is given in the appendix.

2. Asymptotic Property of the Quantile-Based Estimators

Since the quantile-based PCIs are functions of process quantiles and some preset numerical characteristics regarding process capability such as process target and specification limits, different estimators of process quantiles give different point estimators of PCIs. In this section, we introduce two point estimators of process quantile functions, namely, (raw) sample quantiles and interpolated quantiles based on which the consistent PCI estimators are proposed.

2.1. Notations and Definitions

Let be the vector of (raw) sample quantiles. For ease of developing asymptotic results of estimators of the quantile-based PCIs, we reexpress (2) as follows: Let Furthermore, we define vectors, for , where

Finally, denote where is an indicator function taking value 1 if the argument is true and 0 otherwise. That is, is a linear combination of , , and . Note also that depends only on the form of .

2.2. Two Estimators of Quantile-Based PCIs

Among all available quantile estimators in the literature, the simplest consistent estimator of a process quantile is the sample quantile. We consider point estimators of the quantile-based PCIs using sample quantiles and using a refined quantile estimator based on the linear interpolation of sample quantiles.
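The difference between the two kinds of quantile estimator can be sketched in a few lines of Python. The raw estimator below is the inverse empirical distribution function. The interpolated version uses the position rule `h = (n - 1) * p + 1`, which is one common convention and is only an assumption here; the Pearn and Chen definition may place the interpolation point differently.

```python
import math

def raw_sample_quantile(data, p):
    """Inverse-ECDF sample quantile: the ceil(n * p)-th order statistic."""
    xs = sorted(data)
    k = max(1, math.ceil(len(xs) * p))   # 1-based rank, never below 1
    return xs[k - 1]

def interpolated_quantile(data, p):
    """Linear interpolation between two adjacent order statistics.

    Assumption: fractional rank h = (n - 1) * p + 1; other position
    rules exist and lead to slightly different estimates.
    """
    xs = sorted(data)
    n = len(xs)
    h = (n - 1) * p + 1                  # fractional 1-based rank in [1, n]
    lo = math.floor(h)                   # rank of the lower order statistic
    if lo >= n:                          # p = 1 maps to the sample maximum
        return xs[-1]
    return xs[lo - 1] + (h - lo) * (xs[lo] - xs[lo - 1])
```

Because the interpolated value is a weighted average of two neighbouring order statistics, it inherits consistency from the raw sample quantiles, which is the property the text relies on.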

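Let $\hat{\xi} = (\hat{\xi}_{p_1}, \hat{\xi}_{p_2}, \hat{\xi}_{p_3})$ be the vector of sample quantiles and $f$ be the density function of the underlying process. It is well known that $\hat{\xi}$ is a strongly consistent estimator of the quantile vector $\xi = (\xi_{p_1}, \xi_{p_2}, \xi_{p_3})$. Furthermore,

$$\sqrt{n}\,(\hat{\xi} - \xi) \xrightarrow{d} N(0, \Sigma), \qquad \Sigma = (\sigma_{ij}), \qquad \sigma_{ij} = \frac{p_i(1 - p_j)}{f(\xi_{p_i})\, f(\xi_{p_j})} \quad \text{for } i \le j,$$

where $\sigma_{ij}$ is the asymptotic covariance of $\hat{\xi}_{p_i}$ and $\hat{\xi}_{p_j}$ (see Serfling [5], Section 2, for details). The first point estimator of (2) is obtained by replacing the process quantiles in (2) with the corresponding sample quantiles.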

Pearn and Chen [6] generalized Chang and Lu’s [7] interpolation-based quantile estimators and proposed the following estimators (referred to hereafter as interpolated quantiles), for , where and is the sample size and is the floor function which returns the largest integer less than or equal to . Let . Since the interpolation-based quantile, , is a linear combination of some (raw) sample quantiles, is a consistent estimator of . The second point estimator of PCI based on interpolated quantiles is proposed with the following form: For , define , , and . Denote and , where is the th quantile. The sample version of is given by , where is the th sample quantile. With this notation, we have the following linear relationship between and :

Using a similar argument on (8) gives the variance-covariance matrix of as follows: Therefore, the variance-covariance matrix of expressed in sample quantiles and the density function of underlying process is given by

2.3. Asymptotic Normality of the Two Estimators

Chen and Hsu [4] proposed several estimators of $C_{Np}(u,v)$ based on several quantile estimators available in the literature. They also proved the corresponding central limit theorems associated with the proposed estimators. Here we prove a version of the central limit theorem for the two estimators introduced above in a similar way but with a simpler form of the asymptotic variance. Similar to that of Chen and Hsu [4], our variance depends on the explicit expression of the density of the process.

Lemma 1. Let be the density function of the underlying process. Assume further that is positive and continuous in a neighborhood of for and for , respectively. Let and be the vectors of sample quantiles of and , respectively. The vector is defined as in previous subsection. Then where and are specified in (8) and (15).

The proof of this Lemma follows immediately from Theorem B in Serfling [5, page 80]. The following theorem gives the asymptotic distributions of the sample quantile-based estimators and . The proof will be provided in the appendix.

Theorem 2. Under the assumption in Lemma 1, one has where , , and is specified in (7).

Remark 3. The point estimators and are asymptotically unbiased estimators of .

The primary application of Theorem 2 is to construct asymptotic confidence limits for $C_{Np}(u,v)$ and to test hypotheses on PCIs. We address how to construct confidence limits for PCIs in the following section.

3. Asymptotic Confidence Intervals

If the density function of the underlying process is known, we can use Theorem 2 to construct the asymptotic confidence limits for $C_{Np}(u,v)$. If the underlying distribution is unknown, we cannot use Theorem 2 directly to construct confidence limits for $C_{Np}(u,v)$. In this section, we first propose a consistent kernel estimator for the density function, $f$, of the underlying process. We can then estimate the asymptotic variances in Theorem 2 using the sample information.

3.1. A Consistent Variance Estimator for the Estimated PCIs

Suppose that $X_1, \ldots, X_n$ is a random sample collected from the underlying process with density $f$. The kernel density estimator of $f$, denoted by $\hat{f}_h$, is given by

$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - X_i}{h}\right),$$

where $K$ is the kernel function, which is nonnegative, unimodal, symmetric about the vertical axis, and integrates to unity, and $h$ is the bandwidth, which controls the degree of smoothing of the estimate. Among the commonly used kernel functions, the Gaussian kernel is the one used most often. The quality of a kernel estimate depends less on the shape of $K$ than on the value of its bandwidth $h$. In this paper, we use the Gaussian kernel and the rule of thumb of Silverman [8] for choosing the bandwidth, which, for a Gaussian kernel density estimator, yields a mean integrated squared error within 10% of the optimum for all the t distributions.
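As an illustration of the estimator just described, here is a minimal pure-Python sketch of a Gaussian kernel density estimator. It assumes that the rule of thumb referred to is the widely used form `0.9 * min(sd, IQR / 1.34) * n^(-1/5)`; the function names are illustrative.

```python
import math
import statistics

def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth 0.9 * min(sd, IQR / 1.34) * n^(-1/5) (assumed form)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)   # sample quartiles
    iqr = q3 - q1
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    return 0.9 * spread * n ** (-0.2)

def gaussian_kde(data, h=None):
    """Return a function x -> (1 / (n h)) * sum_i K((x - X_i) / h), K standard normal."""
    if h is None:
        h = silverman_bandwidth(data)
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return f_hat
```

The returned closure can be evaluated at any point, in particular at the estimated quantiles, which is all that the variance estimators in this section require.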

The kernel density estimator has many desirable properties. It is asymptotically unbiased, consistent in the mean-square sense, and uniformly consistent in probability. Silverman [8] and Wand and Jones [9] are excellent monographs on the subject, and Bowman and Azzalini [10] is a good nontechnical introduction to kernel density estimation.

Using the kernel estimator of the density together with the quantile estimators of the underlying process, we propose the following variance estimators for the estimated quantile-based PCIs:

Since both variances are continuous functions of the estimated density function and the estimated process quantiles, the consistency of the two kernel density estimator-based variance estimators follows immediately from the fact that the kernel density estimator is (strongly) consistent for the density of the underlying process.
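The plug-in idea can be sketched as follows for the raw sample quantile case. This is a delta-method illustration under stated assumptions, not the paper's closed-form estimator: it assumes the quantile levels 0.00135, 0.5, and 0.99865, the usual asymptotic covariance `p_i (1 - p_j) / (f(xi_i) f(xi_j))` of sample quantiles, a Gaussian kernel with a rule-of-thumb bandwidth, and a numerical gradient in place of the analytic one. All names are illustrative.

```python
import math
import statistics

PROBS = (0.00135, 0.5, 0.99865)   # assumed quantile levels

def make_kde(data):
    """Gaussian KDE with bandwidth 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(sd, (q3 - q1) / 1.34) if q3 > q1 else sd
    h = 0.9 * spread * n ** (-0.2)
    c = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return lambda x: c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

def cnp(q, usl, lsl, target, u, v):
    """Quantile-based superstructure evaluated at q = (low, median, high)."""
    d, m = (usl - lsl) / 2.0, (usl + lsl) / 2.0
    spread = (q[2] - q[0]) / 6.0
    return (d - u * abs(q[1] - m)) / (3.0 * math.sqrt(spread ** 2 + v * (q[1] - target) ** 2))

def plug_in_std_error(data, usl, lsl, target, u, v, eps=1e-6):
    """Return (estimate, asymptotic variance, standard error) by the delta method."""
    xs, n = sorted(data), len(data)
    q = [xs[max(1, math.ceil(n * p)) - 1] for p in PROBS]   # raw sample quantiles
    f_hat = make_kde(data)
    dens = [f_hat(x) for x in q]                            # estimated density at each quantile
    sigma = [[min(PROBS[i], PROBS[j]) * (1.0 - max(PROBS[i], PROBS[j])) / (dens[i] * dens[j])
              for j in range(3)] for i in range(3)]
    grad = []
    for k in range(3):                                      # central-difference gradient
        up, dn = list(q), list(q)
        up[k] += eps
        dn[k] -= eps
        grad.append((cnp(up, usl, lsl, target, u, v) - cnp(dn, usl, lsl, target, u, v)) / (2.0 * eps))
    avar = sum(grad[i] * sigma[i][j] * grad[j] for i in range(3) for j in range(3))
    avar = max(avar, 0.0)                                   # guard against rounding below zero
    return cnp(q, usl, lsl, target, u, v), avar, math.sqrt(avar / n)
```

With small samples the extreme quantiles coincide with the sample minimum and maximum, where the estimated density is low, so the resulting standard error can be large. The numerical gradient is also unreliable when `u > 0` and the sample median falls exactly on the specification midpoint, where the absolute value is not differentiable.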

3.2. Asymptotic Confidence Interval

With the kernel estimator of the density of the underlying process, we define the asymptotic confidence interval for the quantile-based PCIs. Let be the th sample percentile and define

The lower confidence limits of using different quantile-based estimators are given by

Since the above standard errors are explicitly expressed in data measurements and the estimated density function, the confidence limits can be calculated through simple programming.
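As a small illustration of the final step, the sketch below turns a point estimate and its standard error into a lower confidence limit. It assumes a one-sided normal-approximation limit of the form estimate minus z times standard error; the function name is illustrative.

```python
from statistics import NormalDist

def lower_confidence_limit(estimate, std_error, level=0.95):
    """One-sided lower limit based on the asymptotic normal approximation."""
    z = NormalDist().inv_cdf(level)   # standard normal quantile at the confidence level
    return estimate - z * std_error
```

A process would then be judged capable at the chosen level when this lower limit exceeds the required capability value, for example 1.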

4. Numerical Results

We now present some numerical results in this section to apply the proposed procedures. The first example is based on real-life data. We also provide a simulation study to compare the accuracy of the two estimators of quantile functions and the impact on the estimators of PCIs.

4.1. An Illustrative Example

The rubber edge is one of the key components in audio speaker manufacturing that reflect the sound quality of the driver units and the clarity of the sound. As an illustrative example, we use the data given in Chen and Pearn [3]. One characteristic of the rubber edge is the weight. The upper and lower specification limits, USL and LSL, of the weight for a particular model of rubber edge are set at 8.94 and 8.46 (in grams). The target value is the midpoint between LSL and USL, which is 8.70 grams.

In order to assess the capability of the process, Chen and Pearn [3] collected the following random sample of 100 measurements (see also Table IV of Chen and Pearn [3]).

8.61 8.64 8.53 8.76 8.68 8.55 8.66 8.67 8.76 8.71 8.81 8.68 8.74 8.71 8.55
8.71 8.72 8.71 8.64 8.81 8.72 8.98 8.59 8.71 8.73 8.74 8.81 8.63 8.54 8.60
8.69 8.70 8.69 8.67 8.67 8.70 8.63 8.74 8.71 8.64 8.65 8.74 8.70 8.67 8.71
8.62 8.78 8.67 8.69 8.71 8.64 8.75 9.03 8.68 8.73 8.61 8.64 8.69 8.80 8.75
8.68 8.66 8.83 8.69 8.67 8.79 8.66 8.69 8.70 8.67 8.74 9.00 8.87 8.74 8.68
8.69 8.63 8.68 8.59 8.73 8.68 8.64 8.79 8.80 8.69 8.68 8.71 8.70 8.53 8.61
8.67 8.70 8.68 8.59 8.74 8.77 8.99 8.81 8.74 8.84

There are four measurements which are above the upper specification limit USL = 8.94 (see Figure 1) and the data also failed to pass the normality test. Chen and Pearn [3] used quantile-based PCI values and claimed that the process was incapable at the time of data collection.

After some adjustments were performed on the process, another 100 data measurements were sampled (see also Table V of Chen and Pearn [3]).

8.70 8.65 8.72 8.80 8.94 8.65 8.63 8.67 8.66 8.69 8.69 8.66 8.80 8.70 8.68
8.74 8.70 8.68 8.69 8.65 8.71 8.69 8.72 8.52 8.62 8.75 8.92 8.69 8.68 8.93
8.70 8.71 8.94 8.65 8.70 8.69 8.71 8.67 8.69 8.64 8.66 8.69 8.81 8.73 8.69
8.70 8.67 8.69 8.68 8.67 8.67 8.71 8.67 8.70 8.66 8.70 8.62 8.69 8.73 8.64
8.68 8.68 8.74 8.55 8.70 8.56 8.68 8.52 8.73 8.68 8.73 8.87 8.71 8.76 8.81
8.67 8.70 8.65 8.67 8.77 8.66 8.70 8.75 8.73 8.69 8.71 8.64 8.70 8.83 8.64
8.72 8.69 8.73 8.71 8.72 8.64 8.70 8.69 8.71 8.81

There are still two measurements which are equal to the upper specification limit. Furthermore, since the underlying process again failed to pass normality tests, Chen and Pearn [3] reported the quantile-based PCIs again and found that the PCIs are greater than 1, which implies that the capability of the underlying process was improved.

In this analysis, we report the point estimates of quantile-based PCIs and the corresponding standard error (s.e.) and 95% lower confidence limit (LCL). The results are summarized in Table 1.

We can see from Table 1 that the standard errors of the point estimates of PCIs based on sample quantiles are uniformly smaller than those of PCIs based on interpolated quantiles. The point estimates of PCIs based on sample and interpolated quantiles happened to be the same after the process was adjusted.

4.2. A Monte Carlo Simulation Study

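In this simulation, we assume that the underlying process follows a Weibull distribution with density function

$$f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} \exp\!\left[-\left(\frac{x}{\lambda}\right)^{k}\right], \quad x > 0,$$

where $\lambda$ is the scale parameter and $k$ is the shape parameter. The theoretical $p$th quantile of this Weibull process, denoted by $\xi_p$, is implicitly determined by $1 - \exp[-(\xi_p/\lambda)^k] = p$; that is, the $p$th quantile of the Weibull process is explicitly expressed by

$$\xi_p = \lambda\,[-\ln(1 - p)]^{1/k}.$$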

The fundamental goal of this simulation is to compare, in terms of mean square error (MSE), the estimators of PCIs and the corresponding kernel density-based asymptotic variances defined using sample quantiles with those defined using interpolated quantiles, and to examine how the sample size and the shape of the process distribution affect the accuracy of the PCI estimators and their associated variance estimators. To this end, we first choose fixed values of shape and scale parameter, say and . The specification limits and the target value are chosen to be . With these choices of quality characteristics, the true PCIs are , and . The sample sizes we used are , and 1000. For each sample size, we simulated 1000 random samples from the Weibull population and evaluated the true PCIs and the corresponding asymptotic variance (in Theorem 2) according to the sample size. The simulated mean square errors (MSE) of the point estimates of PCIs and their corresponding asymptotic variances based on raw sample quantiles () and interpolated quantiles (), respectively, are reported in Table 2.
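The structure of such a simulation can be sketched in Python as follows. Everything numeric in the usage note (shape, scale, specification limits, target, sample size, number of replications) is hypothetical and chosen only for illustration, not taken from the study. The interpolated quantile uses the position rule `(n - 1) * p + 1`, which is an assumption, and only the MSE of the point estimators is computed.

```python
import math
import random

PROBS = (0.00135, 0.5, 0.99865)   # assumed quantile levels

def weibull_quantile(p, shape, scale):
    """Closed-form Weibull quantile: scale * (-ln(1 - p))^(1 / shape)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

def cnp(q, usl, lsl, target, u, v):
    """Quantile-based superstructure evaluated at q = (low, median, high)."""
    d, m = (usl - lsl) / 2.0, (usl + lsl) / 2.0
    spread = (q[2] - q[0]) / 6.0
    return (d - u * abs(q[1] - m)) / (3.0 * math.sqrt(spread ** 2 + v * (q[1] - target) ** 2))

def raw_q(xs, p):
    """Raw sample quantile of an already sorted sample."""
    return xs[max(1, math.ceil(len(xs) * p)) - 1]

def interp_q(xs, p):
    """Linearly interpolated quantile of an already sorted sample (assumed rule)."""
    h = (len(xs) - 1) * p + 1
    lo = min(math.floor(h), len(xs) - 1)
    return xs[lo - 1] + (h - lo) * (xs[lo] - xs[lo - 1])

def simulate_mse(n, reps, shape, scale, usl, lsl, target, u, v, seed=0):
    """Return (true PCI, MSE with raw quantiles, MSE with interpolated quantiles)."""
    rng = random.Random(seed)
    true_val = cnp([weibull_quantile(p, shape, scale) for p in PROBS], usl, lsl, target, u, v)
    sq_raw = sq_int = 0.0
    for _ in range(reps):
        # random.weibullvariate takes (scale, shape) in that order
        xs = sorted(rng.weibullvariate(scale, shape) for _ in range(n))
        sq_raw += (cnp([raw_q(xs, p) for p in PROBS], usl, lsl, target, u, v) - true_val) ** 2
        sq_int += (cnp([interp_q(xs, p) for p in PROBS], usl, lsl, target, u, v) - true_val) ** 2
    return true_val, sq_raw / reps, sq_int / reps
```

For instance, `simulate_mse(50, 1000, 3.0, 1.0, 2.0, 0.0, 1.0, 1, 1)` runs 1000 replications of size 50 for a hypothetical Weibull process with shape 3 and scale 1, specification limits 0 and 2, and target 1. Fixing the seed makes the comparison between the two estimators use the same simulated samples.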

From Table 2, we can see that the (true value) asymptotic variance of PCIs decreases as the sample size increases; the MSEs of and are uniformly greater than those of and , respectively; and the MSEs of the (raw) sample and interpolated quantile-based point estimators of PCIs and of their corresponding asymptotic variances decrease as the sample size increases. Based on these observations, it turns out that the raw sample quantile-based PCIs have better asymptotic performance than the interpolated quantile-based ones.

Figure 2 gives the simulated density curves of quantile-based PCIs under different sample sizes. All curves, even with a sample size as small as 50, are bell shaped and similar to the density curves of normal random variables, which supports the validity of the asymptotic inference.

Next we examine how a change of the shape parameter (related to skewness) in the Weibull process affects the biases of the point estimators of PCIs defined using sample quantiles and interpolated quantiles. We use the same LSL, USL, and process target as in the previous simulation. The values of the shape parameter are chosen in the sequence from 2.0 to 5.0 with increment 0.2. Since Figure 2 indicates that the asymptotic normal approximation is reasonably good with a sample size as small as 50, we fix the sample size at 50 to simulate the biases for the values of the shape parameter specified in the above sequence.

It turns out, from Figure 3, that biases of the estimators of PCIs based on interpolated quantiles are greater than those defined based on raw sample quantiles across the various values of the shape parameter. But the significance of the differences of bias needs to be rigorously tested.

The analysis conducted in this section is carried out using the free open source statistical package R. The code can also be run in the commercial package S-PLUS. The program is available on request.

5. Discussion and Summary

The formulation of $C_p$, $C_{pk}$, $C_{pm}$, and $C_{pmk}$ using the underlying process mean and standard deviation implicitly assumes a certain degree of normality, which limits the application of PCIs in many skewed processes. The quantile-based basic PCIs of Chen and Pearn [3] eliminate the implicit normality assumption and therefore can be used in essentially any process. The asymptotic properties of quantile-based PCIs within the superstructure have been studied by Chen and Hsu [4]. Since the asymptotic variance of the quantile-based superstructure contains the density function of the underlying process, the lower confidence limits of quantile-based PCIs cannot be constructed without knowing the density function of the underlying process.

In this paper, we use a nonparametric kernel density estimation approach to estimate the underlying density function, which allows us to propose a consistent estimator of the asymptotic variance of the quantile-based PCIs. We also use two different quantile estimators, namely, the raw sample quantile and the interpolated quantile estimator of Pearn and Chen [6]. A simulation study based on a Weibull process indicates that, in terms of mean square error, the raw sample quantile-based PCIs perform uniformly better than the interpolated quantile-based PCIs.

There are at least 9 different quantile estimators available in different software packages. The detailed definitions of these quantile estimators can be found in Hyndman and Fan [11]. It is of interest to examine the performance of quantile-based PCIs defined using these different quantile estimators.

Another interesting issue is to compare, via simulation using both normal and nonnormal processes, the performance of the process mean and standard deviation-based superstructure of Vännman [2] with that of the quantile-based superstructure of Chen and Pearn [3].

Appendix

Proof of Theorem 2. For the asymptotic normality of , we have the following approximation using a Taylor expansion where is the Euclidean distance. The $\sqrt{n}$-consistency of from Lemma 1 implies that . Therefore, A direct application of Slutsky’s theorem gives where is specified in (8). For the asymptotic normality of , we have Note that where is specified in (14). Similar to the arguments used in proving the normality of , we have where is specified in (15). This completes the proof of Theorem 2.

Acknowledgments

The authors are grateful for the valuable comments from an anonymous reviewer and the Editor, which improved the presentation of this paper. The research of J. Xu was supported by the Natural Science Foundation of the City of Ningbo, China, under Contract 2011A610166.