Abstract

Shewhart or control charts are important Statistical Process Control (SPC) techniques used for the prompt detection of failures in a manufacturing process and for the minimization of production costs, which are modelled with nonlinear functions (cost functions). Heuristic methods have been used to find the chart's parameters, integrated within the cost function, that best comply with economic and statistical restrictions. However, heuristic estimation is highly dependent on the size of the search space, the set of initial solutions, and the exploration operators. In this paper a 3D analysis of the cost function is presented to more accurately identify the search space associated with each parameter of the control chart and to improve estimation. The parameters estimated with this approach were more accurate than those estimated with Hooke and Jeeves (HJ) and Genetic Algorithms (GAs) under different failure distributions. The results presented in this work can be used as a benchmark to evaluate and improve the performance of other heuristic methods.

1. Introduction

Shewhart or control charts are tools used to determine whether or not a production process is in a state of statistical control and thus whether the entities being produced are within quality requirements. These requirements are set by the Upper and Lower Control Limits (UCL, LCL). If the quality attribute or feature measured on sampled entities (e.g., weight, length) is not within these limits, then the process is in an out-of-control state (nonconforming entities are being produced). In this case the "assignable cause" of this state (e.g., a failure) must be found and corrected [1].

A control chart requires three main parameters: n, the size of the sample; h, the length of the interval between samples; and k, the coefficient of the control limits [1]. These parameters are estimated based on economic and statistical restrictions because there are costs and times associated with sampling and with searching for assignable causes. Setting arbitrary values for these parameters without considering the economic and statistical factors may lead to unnecessary expenses in the manufacturing process.
(i) Small intervals between samples (small h) would increase the sampling frequency and the process cycle time, with the associated production costs. Depending on the nature of the entities being produced, unnecessary sampling may lead to product loss.
(ii) Close control limits (small k) would increase the frequency of (possibly false) failure alarms and the associated costs of interrupting the manufacturing process. A false alarm may lead to rejection of products that do not necessarily have low quality (rejection of conforming entities).

The Economic Statistical Design (ESD) of control charts considers statistical requirements such as the probabilities of Type I error (α, detecting an out-of-control state when the process is fine) and Type II error (β, not detecting an out-of-control state when the process is not fine) for the estimation of n, h, and k. These parameters are integrated within a cost function which includes the time and money costs associated with sampling and with searching/repairing assignable causes [2–10]. Finding the most suitable values of n, h, and k that minimize the cost function while considering the economic and statistical restrictions is not an easy task. This is because of the number of variables and the complexity of the cost function model. The ESD of control charts is considered a combinatorial optimization problem [11].

Among the approaches, methods, or algorithms proposed to find a suitable solution for the ESD of control charts are Hooke and Jeeves (HJ) [4, 12], Genetic Algorithms (GAs) [2, 5, 6, 11, 13, 14], Combinatorial Methods (CB) [7], Tabu Search (TS) [15], and Teaching-Learning Based Optimization (TLBO) [16]. Heuristic methods, however, do not guarantee finding the optimal values of n, h, and k that minimize the multivariable cost function. This is because these methods or algorithms select the better solutions in every iteration step knowing only the solutions from previous iteration steps [17]. In particular, the search space defined by the initial set of solutions of these methods may differ; thus convergence to feasible solutions may be achieved at different times.

Achieving efficient algorithms is important because the effectiveness of the ESD and of cost minimization depends on the accuracy with which the n, h, and k parameters are determined [16]. In the absence of the optimal results the assessment of these algorithms is performed by comparing them with other algorithms for solving the same problem or case study. Execution time (speed), computational and space efficiency, accuracy, and precision are criteria for evaluating the performance of algorithms [18]. In order to evaluate the performance of algorithms considering these criteria the following approaches have been proposed.
(i) Development of benchmarks: these may consist of sets of optimal results, or sets of specifically designed problems to test the abilities (search strategies, operators) of the algorithms (e.g., benchmark problems) [17, 19].
(ii) Algorithm visualization: it enables the externalization of the space exploration strategies of the algorithm for analysis and improvement [20].
(iii) Characterization of the type of problems for which the algorithms have good performance [19].

The contribution of this paper is the characterization and visualization of the cost function for the development of benchmarks for the ESD of X̄ control charts. The analysis and 3D visualization of the cost function are presented to provide more specific intervals for n, h, and k, to identify the most suitable search space for the global optimum, and to provide insight into the behavior of the cost function for different failure mechanisms. This information is important to provide a benchmark to evaluate and improve the performance of algorithms for the ESD of X̄ control charts.

The cost function presented by Rahim and Banerjee [21] was used for this work as it considers a general failure distribution, in contrast to the well-known cost function of Lorenzen and Vance [22] which considers an exponential distribution. The 3D visualization was achieved with the support of reductions of the nonlinear factors associated with the cost function (such as the failure probability distributions) to make computations faster. In order to test the outcome of the 3D visualization, a direct search was performed on the specific 3D search space identified with this approach. The parameters estimated within this space were statistically more accurate than those obtained with the HJ and GA heuristic methods.

The structure of this paper is as follows: in Section 2 the details of the cost function and the failure probability distributions are presented. In Section 3 the details of the method used to create the 3D visualization models are presented, while in Section 4 sets of these visualizations (libraries) are presented. In Section 5 the application of the 3D visualizations to improve the estimation of parameters is discussed, and a comparison of performance with other methods such as HJ and GA is presented. Finally, in Section 6 the conclusions and future work are discussed.

2. Cost Function

The cost function models the interaction between the costs, times, failure patterns, and restrictions associated with a production cycle which represents the interval from the starting production time (in-control state) to the time when a failure caused by an assignable cause occurs. This cycle includes the time required to detect and repair the assignable cause [21].

The objective of the ESD is to minimize the average cost per unit of time of a production cycle, expressed as the ratio of the Expected Cost per Cycle, E(C), to the Expected Cycle Length, E(T) [4, 21, 23]. In the following sections the expressions for E(C) and E(T) with general failure distributions (which represent the failure patterns) are presented.

2.1. Expected Cost per Cycle and Cycle Length

The cost function considered in this work is an extension of the function presented by Lorenzen and Vance [22], which has been used and adapted by many researchers in the SPC field [24–30]. The ratio of (2) and (3) represents this cost function as presented by Rahim and Banerjee [21] for processes with a general failure distribution and constant sampling intervals. In Table 1 the elements of (2) and (3) are described.

While n and h are explicitly present in the cost function, k is implicitly present in α and β, which are defined by the following standard expressions for the X̄ control chart assuming normality [26, 31]:

α = 2Φ(−k),  (4)
β = Φ(k − δ√n) − Φ(−k − δ√n),  (5)

where δ is the magnitude of the change of the mean of the process to be detected. The mean of the process changes from μ₀ to μ₁ = μ₀ + δσ, where σ is the standard deviation, and Φ denotes the distribution function of a standard normal variable Z.
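As a quick illustration of (4) and (5), the following minimal MATLAB sketch evaluates both error probabilities for an arbitrary design point; the numerical values of k, n, and δ are illustrative only.

```matlab
% Minimal sketch: Type I and Type II error probabilities of an X-bar chart
% assuming normality, following (4) and (5). normcdf (Statistics Toolbox)
% plays the role of the standard normal distribution function Phi.
k     = 3;     % control-limit coefficient (illustrative)
n     = 5;     % sample size (illustrative)
delta = 1;     % magnitude of the mean shift to be detected (illustrative)

alpha = 2 * normcdf(-k);                                            % Eq. (4)
beta  = normcdf(k - delta*sqrt(n)) - normcdf(-k - delta*sqrt(n));   % Eq. (5)
fprintf('alpha = %.4f, beta = %.4f\n', alpha, beta);
```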

2.2. Failure Probability Distributions

Failure density functions (f(t)) are important elements for reliability prediction [32]. These functions are used to model the failure rate patterns of a process and must be considered for the estimation of production costs. For any density function f(t), the associated cumulative probability distribution function F(t) represents the probability that a unit, randomly taken from a population, will fail at most at time t [33].

If, instead of taking one unit, n units are taken at the end of each time interval of length h, the quantity of interest becomes p_j, the probability of the process failing (changing to an out-of-control state) at the end of the jth sampling interval [34]. Considering a failure density function f(t), the failure probability p_j can be estimated with expression (6).
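The exact form of (6) is given above; as a rough numerical sketch, the snippet below computes p_j under the common assumption that it is the probability of failing during the jth interval conditional on having survived its start, which for the exponential density reduces to the memoryless value 1 − e^(−λh).

```matlab
% Sketch (assumption): p_j taken as the probability of failing in ((j-1)h, jh]
% given survival up to (j-1)h, computed by numerical integration of f(t).
% This is only an illustration; the paper's exact expression is (6).
lambda = 0.05;                               % example failure rate
h      = 2;                                  % sampling interval
j      = 3;                                  % interval index
f      = @(t) lambda .* exp(-lambda .* t);   % example density: exponential

pj = integral(f, (j-1)*h, j*h) / integral(f, (j-1)*h, Inf);

% Memoryless check for the exponential case: p_j = 1 - exp(-lambda*h) for every j.
fprintf('p_%d = %.4f (memoryless value: %.4f)\n', j, pj, 1 - exp(-lambda*h));
```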

In this work the exponential and Weibull functions are considered. The exponential function is commonly used for the estimation of failure patterns and is defined by

f(t) = λe^(−λt), t ≥ 0,  (7)

where λ is the main parameter of the density function and represents the known number of failures per unit of time.

The Weibull density function is used for processes where the causes of failures are assumed to be of a different nature [35]. This function is defined by expression (8) [36], where t represents the time when the process is likely to fail, identifying in this way the life expectancy of the process; λ is known as the scale parameter and ν as the form (shape) parameter. When ν = 1 the Weibull density function reduces to the exponential function. Failure patterns which conform to a Weibull probability distribution with ν > 1 are considered to be age-related failures, while those conforming to a Weibull distribution with ν = 1 are considered to be random failures [35].
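For reference, the sketch below plots a Weibull density under one common parameterization (an assumption; the paper's exact form is (8)) and shows how ν = 1 recovers the exponential density.

```matlab
% Sketch of a Weibull failure density under the (assumed) parameterization
% f(t) = lambda*nu*t^(nu-1)*exp(-lambda*t^nu); the paper's exact form is (8).
% With nu = 1 this collapses to the exponential density lambda*exp(-lambda*t).
weibullPdf = @(t, lambda, nu) lambda .* nu .* t.^(nu-1) .* exp(-lambda .* t.^nu);

t      = linspace(0.01, 50, 500);
lambda = 0.05;

plot(t, weibullPdf(t, lambda, 1), ...          % nu = 1: random failures
     t, weibullPdf(t, lambda, 2), ...          % nu = 2: age-related failures
     t, lambda .* exp(-lambda .* t), '--');    % exponential density for comparison
legend('Weibull, \nu = 1', 'Weibull, \nu = 2', 'Exponential');
xlabel('t'); ylabel('f(t)');
```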

In order to compute the cost function defined by the ratio of (2) and (3), p_j must be computed for the densities (7) and (8). Particularly for the Weibull function, the estimation of the element τ_j involves some complex integrals. In Section 3.3 the methodology used for the computation of p_j and τ_j is presented.

3. Building the 3D Benchmark

The ESD problem consists of finding the set of parameters n, h, and k that minimizes the cost function. In this case there are four variables: three to be estimated and one which is dependent on these variables (the associated cost value). Because of this, a direct 3D visualization is not possible. Hence, the 3D benchmark consists of the integration of different 2D and 3D models. In this section the steps followed to build these model sets are presented.

In order to perform the computations required to build the 2D and 3D models for the 3D benchmarks, the following equipment was used: a DELL Inspiron 15R with an Intel Core i7-4500U CPU at 2.40 GHz and 8 GB of RAM. Programming was performed with MATLAB R2008a (version 7.6.0.324).

3.1. Finding Optimal Intervals for n, k, and δ

An initial link between n, k, δ, and the error probabilities is defined by (4) and (5), where α ≤ α₀ and β ≤ β₀; α₀ and β₀ are the maximum admissible error levels, and commonly α₀ = 0.05 and β₀ = 0.20. At this point there is no influence from other elements or costs of the cost function.

The first step is to find an optimal interval for n. For this purpose the approximation presented in [37] for n with a specific value of β, given by expression (11), is used.

While β depends on n and k, α depends only on k. From (4), k is expressed as

k = Φ⁻¹(1 − α/2).  (12)

By replacing (12) into (11), the estimation of n now depends on α and β. For any α and β the interval for n can be obtained with expression (13).

The optimal interval for n therefore depends on the required intervals for α and β. In this case the interval from 0.01 (minimum) to 0.20 (maximum) was considered for both error probabilities (i.e., 0.01 ≤ α ≤ 0.20 and 0.01 ≤ β ≤ 0.20).

The second step is to estimate an optimal interval for k. This is performed directly from the α values considered in (13), with k defined by (12). In Table 2 the values estimated for k are presented, where an inverse correlation between k and α is observed.
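The sketch below reproduces this step with norminv (Statistics Toolbox); the grid of α values is illustrative, and the resulting k values show the inverse correlation reported in Table 2.

```matlab
% Sketch: control-limit coefficient k obtained from alpha as in (12),
% i.e., the standard normal quantile of 1 - alpha/2.
alpha = [0.01 0.05 0.10 0.15 0.20];     % illustrative grid of Type I error levels
k     = norminv(1 - alpha/2);           % k = Phi^{-1}(1 - alpha/2)

disp(table(alpha', k', 'VariableNames', {'alpha', 'k'}));
% alpha = 0.01 gives k ~ 2.58 and alpha = 0.20 gives k ~ 1.28 (inverse correlation).
```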

In Figure 1 the 3D visualizations of (13), considering the optimal intervals for α and β and three levels of δ, are presented.

For α within this interval, from (12) it is estimated that k lies approximately between 1.28 and 2.58 (an inverse interval: the largest α gives the smallest k). While the overall 3D pattern does not change through Figures 1(a)–1(c), the difference is mainly observed in the magnitude of n, which is highly dependent on δ. In Figure 2 the 2D visualization of the intervals (minimum and maximum values) for n, considering different values of δ across α and β, is presented. The intervals presented in Figures 1 and 2 can be used for any cost function for X̄ control charts assuming normality.

Although in practice δ is often considered to be equal to 0.5, the overall effect of δ on n has not been presented in graphical form as in Figure 2. For this work three scenarios are considered for δ based on the levels defined in [2, 4, 31, 38]: δ = 0.25, 0.50, and 1.00. This leads to the following optimal intervals for n based on the data presented in Figure 2: n = 73–570, 19–143, and 5–36, respectively. In Table 3 the associated intervals for δ, n, and k are summarized.

3.2. Setting the Constant Elements of the Cost Function

The cost function defined by (2) and (3) has many elements besides the n, h, k, and δ parameters; these elements were described in Table 1. In the literature this cost function has been studied with a set of standard values for the constant cost and time elements for specific δ values [2, 4, 31, 38]. In Table 4 these sets are presented for each level of δ.

3.3. Estimating the Probability Distributions p_j and the Elements τ_j

The probability p_j for the exponential and Weibull density functions is obtained by evaluating (6) with the corresponding density.

An important element of (2) is τ_j, which is defined as the point of time within the jth sampling interval at which the process changes from the in-control state to the out-of-control state. This element is estimated from the failure density function f(t).

Considering the exponential distribution, τ_j has the closed-form expression given in (16).
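As a sanity check of this kind of expression, the sketch below computes τ numerically for an exponential density under the common assumption that it is the expected elapsed time into the sampling interval, conditional on the failure occurring there, and compares it with the closed form usually reported in the literature; the paper's exact expressions should be used for actual designs.

```matlab
% Sketch (assumption): tau computed numerically as the expected elapsed time into
% the sampling interval ((j-1)h, jh], conditional on the failure occurring there.
% For the exponential density this matches the closed form widely reported in the
% literature.
lambda = 0.05;  h = 2;  j = 1;
f = @(t) lambda .* exp(-lambda .* t);

a   = (j-1)*h;  b = j*h;
tau = integral(@(t) (t - a) .* f(t), a, b) / integral(f, a, b);

% Closed form for the exponential case (first interval):
tauExp = (1 - (1 + lambda*h) * exp(-lambda*h)) / (lambda * (1 - exp(-lambda*h)));
fprintf('tau (numeric) = %.4f, tau (closed form) = %.4f\n', tau, tauExp);
```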

Expression (16) is widely presented in the literature [5, 11, 26, 30]; however, for the Weibull distribution an expression for τ_j valid for any λ and ν has not been found. In this work such an expression, (17), was obtained with the support of the symbolic toolbox of MATLAB and additional algebraic reductions.

Equation (17) contains a complex term which is defined by (18) and (19).

The elements appearing in (19) are the Whittaker functions M and W, which are defined in (20) in terms of Kummer's confluent hypergeometric function [39].

The hypergeometric functions from (20) were computed with the MATLAB function hypergeom [40, 41].
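Since the original listing is not reproduced here, the following is a minimal sketch of how a Whittaker M value can be evaluated with hypergeom through the standard identity linking it to Kummer's ₁F₁ function; the helper name whittakerM_1F1 is hypothetical, and the W function can be handled analogously through Tricomi's confluent function.

```matlab
% Hypothetical helper: evaluates the Whittaker function M_{kappa,mu}(z) via the
% standard identity M_{kappa,mu}(z) = exp(-z/2) * z^(mu+1/2) * 1F1(mu-kappa+1/2; 1+2*mu; z).
% hypergeom (Symbolic Math Toolbox) evaluates the confluent hypergeometric 1F1.
function M = whittakerM_1F1(kappa, mu, z)
    M = exp(-z/2) .* z.^(mu + 0.5) .* hypergeom(mu - kappa + 0.5, 1 + 2*mu, z);
end
```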

3.4. Building the 3D Visualization of the Cost Function

Although the n and k values defined within the intervals presented in Table 3 produce α and β values that comply with the statistical restrictions, some values may actually produce undefined elements in the cost function. Also, the suitable interval for h has yet to be identified. Thus, a second set of intervals within those presented in Table 3 must be identified, and this is the objective of the 3D visualization of the cost function.

In Figure 3 the structure and pseudocode of the 3D plotting algorithm are presented. Initially the cost and time parameters of the cost function must be provided, in addition to δ and the type and parameters of the failure probability distribution (f(t)). Then evaluation vectors are created to specify the values of n, h, and k to be evaluated in the cost function. These values must be within the associated intervals presented in Table 3. For a given δ, the first element of the vector for n is the lower bound of n while the last element is its upper bound; as presented in Table 3, these bounds are dependent on δ.

The sets of n, h, and k values are defined in such a way that the three evaluation vectors have the same number of elements. Each combination of n, h, and k values is then evaluated in the cost function and a 3D matrix is created whose (i, j, l) entry contains the cost value for the ith element of the n vector, the jth element of the h vector, and the lth element of the k vector. This 3D space is then divided into gaps (ranges of cost) to visualize the patterns of the cost magnitudes. In general five gaps were considered and each was identified with a specific color. The colored points of these gaps were plotted into the 3D space in MATLAB.
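A minimal sketch of this procedure is shown below, assuming a hypothetical cost-function handle costFun(n, h, k) that returns the expected cost per time unit (the ratio of (2) and (3)); the evaluation intervals are illustrative, and the five cost bands are taken as equal-width for simplicity, whereas the paper defines Gap1 as the lowest 1% of the cost range.

```matlab
% Sketch of the 3D plotting procedure (names and intervals are illustrative).
% costFun(n, h, k) is an assumed handle returning the expected cost per time
% unit for the chosen failure distribution and constant cost/time elements.
Vn = round(linspace(19, 143, 25));   % evaluation vector for n
Vh = linspace(0.5, 8, 25);           % evaluation vector for h (same length)
Vk = linspace(1.28, 2.58, 25);       % evaluation vector for k (same length)

[N, H, K] = ndgrid(Vn, Vh, Vk);                        % all (n, h, k) combinations
C = arrayfun(@(n, h, k) costFun(n, h, k), N, H, K);    % cost of every combination

edges  = linspace(min(C(:)), max(C(:)), 6);            % five equal-width cost bands
gapIdx = discretize(C(:), edges);                      % gap index of every point
scatter3(N(:), H(:), K(:), 8, gapIdx, 'filled');       % color points by gap
xlabel('n'); ylabel('h'); zlabel('k'); colorbar;
```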

In Figure 4 the 3D visualizations of the cost function with δ = 0.25 and an exponential failure distribution with two different values of λ are presented. In this case Gap1 concentrates the minimum cost values, those within 1.0% of the distance between the total minimum and maximum costs in the evaluated space. This gap is delimited by specific intervals for n, h, and k.
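Continuing the previous sketch, the intervals that delimit Gap1 can be extracted as the bounding box of the points whose cost falls within the lowest 1% of the cost range; variable names carry over from the sketch above.

```matlab
% Sketch: bounding box of Gap1, i.e., of all (n, h, k) points whose cost lies
% within 1.0% of the distance between the minimum and maximum evaluated costs.
cMin   = min(C(:));  cMax = max(C(:));
inGap1 = C <= cMin + 0.01 * (cMax - cMin);

nInterval = [min(N(inGap1)), max(N(inGap1))];
hInterval = [min(H(inGap1)), max(H(inGap1))];
kInterval = [min(K(inGap1)), max(K(inGap1))];
fprintf('Gap1: n in [%g, %g], h in [%g, %g], k in [%g, %g]\n', ...
        nInterval, hInterval, kInterval);
```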

It is important to note that the total minimum found in this process is not necessarily the optimal (or near-optimal) minimum of the whole function. This is because the evaluation vectors do not contain all continuous values within the search space, given the condition that the vectors for n, h, and k must have the same size. Hence only some points of the search space are actually plotted.

In this case the 3D visualization serves as a guide to identify the regions where the optimal parameters are likely to be found by a more specific search algorithm, instead of searching across the whole solution space. Hence, the 3D visualization can support the explorative part of the search process, which can then be focused on exploitative search.

For more specific values of k (where k is given by Table 2 and (12)) and δ, the minimum values may be present in a different gap. In such a case Figures 4(a) and 4(b) can be used in the following way to obtain an estimate of the parameters' intervals:
(i) for a specific value of the required α, a lower and upper bound for k can be obtained;
(ii) then, an interval for h can be estimated based on the region delimited by the associated bounds of n and k. Each gap's color can be used to guide the delimitation of the region.

In Section 4, sets of 3D visualizations with different settings of δ and different failure distributions are presented. Then, in Section 5, examples of how to use these 3D visualizations to improve the estimation of parameters are presented and discussed.

4. Libraries of 3D Visualizations

In Table 5 the parameters considered for the 3D visualizations with different failure distributions are presented. In Figure 5 the selected 3D visualizations for the exponential distribution are presented while in Figures 6, 7, and 8 the visualizations for the Weibull distribution are presented.

5. Experiments and Results

Experiments were performed in order to evaluate the performance of the 3D visualizations in improving the estimation of the control chart's parameters. For these experiments the exponential and Weibull failure distributions were considered. A direct search across the regions identified with the 3D visualizations was performed to evaluate the performance of this technique. Hooke and Jeeves (HJ) and Genetic Algorithms (GA) were considered for comparison purposes. For this, the schemes presented in [12] for HJ and in [6] for GA were used.

5.1. Example of Estimation Process

In this case the ESD of an X̄ control chart with a Weibull failure distribution is considered; the values of δ and of the Weibull parameters λ and ν are those used for the 3D model of Figure 6(b), which is the model employed in this example. As presented in Figure 9, two main views (front and left side) of the 3D model are used to estimate the intervals for each parameter.

The first step is to identify the gap where the minimum of the cost function is likely to be found. For this purpose the lower bound for k is estimated from the restriction on α (for instance, α ≤ 0.05, which from (12) implies k ≥ 1.96). As observed in the front view of the 3D model, the region delimited by Gap1 lies within the space defined by this bound on k. Hence Gap1 is the region where the overall minimum of the cost function is likely to be found for this example.

From the same front view the upper bound for k is estimated and the interval for k is identified. The bounds for the remaining two parameters are obtained in the same way: the front view provides the lower and upper bounds of one of them and the left-side view those of the other, so that intervals for n and h are also identified.

From this point an estimate of n, h, and k can be given in terms of intervals. However, to provide a specific value of the cost and the corresponding set of parameters, a search method must be applied. In this case a direct search over all possible values within the (more specific) region delimited by these intervals can be applied, leading to a more exploitative search.
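A minimal sketch of such a direct search is given below; the grids are placeholders standing in for the intervals read off the 3D model, and costFun(n, h, k) is the same assumed cost-function handle used in the earlier sketches.

```matlab
% Sketch: exhaustive (direct) search over the region delimited by the intervals
% identified from the 3D model. Grid endpoints and step sizes are placeholders.
nGrid = 19:1:60;            % candidate sample sizes
hGrid = 0.5:0.1:4.0;        % candidate sampling intervals
kGrid = 1.96:0.01:2.58;     % candidate control-limit coefficients

best = struct('cost', Inf, 'n', NaN, 'h', NaN, 'k', NaN);
for n = nGrid
    for h = hGrid
        for k = kGrid
            c = costFun(n, h, k);            % assumed cost-function handle
            if c < best.cost
                best = struct('cost', c, 'n', n, 'h', h, 'k', k);
            end
        end
    end
end
fprintf('Minimum cost %.3f at n = %d, h = %.2f, k = %.2f\n', ...
        best.cost, best.n, best.h, best.k);
```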

5.2. Results for Different Cases

HJ and GA were used to find the optimal parameters for each case presented in Figures 5, 6, 7, and 8. For the 3D visualization method (3D Vis) the intervals identified for each case are presented in Table 6. The parameters estimated with HJ, GA, and 3D Vis are presented in Tables 7, 8, 9, and 10 for the exponential and Weibull distributions.

A factorial analysis was performed on the data presented in Tables 7 to 10 to assess the effect of the proposed method on the task of minimizing the cost. In Figure 10 the "Main Effects" plots on the mean cost across the following factors and levels are presented.
(i) The failure probability distribution (f(t)): exponential (Exp) and Weibull (Weib) with three levels of the shape parameter ν.
(ii) The magnitude of the change of the mean of the process to be detected (δ): 0.25, 0.50, and 1.00.
(iii) The estimation method for the control chart's parameters: HJ, GA, and 3D Vis.
(iv) The failure rate of the distributions (determined by λ for the exponential distribution and by the scale parameter for the Weibull distribution): 0.05, 0.25, and 0.50.

The mean value of the cost changes according to these factors, with a positive correlation observed between the cost and the failure rate given by the exponential and Weibull parameters. Regarding the estimation method, GA presents the highest mean cost with 347.705, while HJ presents an average of 346.050. In contrast, the parameters estimated with the 3D Vis method provided an average of 341.147. This difference was tested for statistical significance by means of a "2-sample t-test" performed with Minitab Ver. 16.2.3.0.

In Figures 11 and 12 the results of the statistical tests for the hypotheses of equal mean costs (GA versus 3D Vis and HJ versus 3D Vis) are presented. In both cases the test concluded that the results obtained with GA and HJ were significantly different from the results obtained with the proposed 3D method, thus rejecting the null hypothesis.
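An equivalent check could be run directly in MATLAB with ttest2, assuming vectors costGA, costHJ, and cost3D hold the per-case cost values from Tables 7 to 10 (these variable names are hypothetical).

```matlab
% Sketch: two-sample t-tests comparing the heuristics against the 3D Vis results.
% costGA, costHJ, and cost3D are assumed vectors of per-case costs (Tables 7-10).
[hGA, pGA] = ttest2(costGA, cost3D);   % H0: equal mean cost for GA and 3D Vis
[hHJ, pHJ] = ttest2(costHJ, cost3D);   % H0: equal mean cost for HJ and 3D Vis
fprintf('GA vs 3D Vis: h = %d, p = %.4f\n', hGA, pGA);
fprintf('HJ vs 3D Vis: h = %d, p = %.4f\n', hHJ, pHJ);
```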

Because the results presented in Tables 7 to 10 for the 3D Vis method were obtained with a direct search within the specific regions delimited by Gap1, it can be assumed that these are optimal (or at least near-optimal) results. As presented in Figures 11 and 12, these results were statistically better than those obtained with well-established heuristics such as HJ and GA. Thus, these results can be used as a benchmark to evaluate the performance of other algorithms or methods.

6. Conclusions and Future Work

In this work an important cost function considered for the ESD of X̄ control charts was analyzed to provide benchmark (optimal) values for the estimation of the parameters n, h, and k. This was also performed to provide the means to evaluate heuristic algorithms for the estimation of the chart's parameters. In particular, the analyses presented for n as a function of α, β, and δ ((13), Figure 2) and for k as a function of α ((12), Table 2) are valid for the general ESD when normality is assumed, regardless of the cost function considered.

Among the applications of the present work the following can be mentioned.
(i) The 3D plotting algorithm presented in Figure 3 can be adapted to visualize other multivariable cost functions, and (18) and (19) can be used to explore the patterns of cost functions with different Weibull parameters.
(ii) The libraries presented in Figures 5 to 8 provide the means to further analyze and/or understand the effect of the failure distribution and its associated parameters on the minimum and maximum values of the cost function.
(iii) If alternative values for the constant parameters of the cost function (Table 4) do not change the behavior of the 3D visualization (e.g., the gaps' regions), the libraries can be used to obtain a graphical and fast estimation of the chart's parameters as described in Section 5.1.
(iv) The intervals identified with the 3D visualizations can be used to guide the search strategies of heuristic algorithms.

The concept of 3D visualization is important to understand the dynamics of different processes, and thus it should be explored for other cases of ESD. Future directions of this work are the following.
(i) To perform the 3D visualizations of other cost functions and failure distributions for other control charts such as EWMA [11, 30] and the charts considered in [13] and [42, 43].
(ii) To estimate the valid intervals for n, h, and k considering nonnormality [2, 44] and also to analyze the effect of the assumption of nonnormality on the behavior of the optimal regions of the cost function.
(iii) To perform the 3D visualization of cost functions with variable sampling intervals.
(iv) To develop the mathematical foundations to analyze the effect of alternative constant parameters on the behavior of the cost function. In such a case, the number of variables in the cost function will increase and thus alternative higher-dimensional visualization models will have to be developed.
(v) To develop the mathematical model of the gaps' patterns observed in the 3D visualizations to reduce the complexity of the cost function.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.