Abstract

Multivariate statistical process control is the continuation and development of univariate statistical process control. Multivariate statistical quality control charts are widely used in manufacturing and service industries to determine whether a process is performing as intended or whether unnatural causes of variation are affecting an overall statistic. Once a control chart detects an out-of-control signal, one difficulty with multivariate control charts is the interpretation of that signal: we have to determine whether one variable, several variables, or a combination of variables is responsible for it. A novel approach for diagnosing out-of-control signals in a multivariate process is described in this paper. The proposed methodology uses optimized support vector machines (support vector machine classification based on a genetic algorithm) to recognize a set of subclasses of multivariate abnormal patterns and to identify the variable(s) responsible for the occurrence of an abnormal pattern. Multiple sets of experiments are used to verify this model. The results demonstrate that the model can accurately classify the source(s) of an out-of-control signal and even outperforms the conventional multivariate control scheme.

1. Introduction

Statistical process control (SPC) is one of the most effective tools in total quality management (TQM) and is used to monitor variation in manufacturing processes. Control charts are the most widely applied SPC tools for revealing abnormal variations of monitored measurements and for locating their assignable causes [1, 2]. The state of the control chart shows whether the manufacturing process is in control; when it is not, quality practitioners or engineers search for the assignable causes and make the necessary corrections and adjustments to bring the process back to the in-control state [3].

In multivariate quality diagnosis, two or more quality characteristics often have to be monitored at the same time. A natural idea is to maintain a separate control chart for each characteristic. Traditional SPC assumes that the process observations are independent and identically distributed. However, many manufacturing processes do not meet this assumption, and separate charts produce a high rate of misjudgment when the characteristics are highly correlated. Hotelling was the first to recognize the defects of simply extending the univariate control chart to a multivariate process; thus the concept of multivariate quality control was proposed in 1947. Hotelling's $T^2$ statistic [4-7] might be the most common tool in multivariate analysis for identifying whether the whole process is out of control. The primary possible causes of mean shifts are the introduction of new workers, machines, materials, or methods, a change in the measurement method or standard, and so forth. The test statistic for the mean vector accounts not only for the volatility of the mean but also for the correlation among the variables; it is the optimal test statistic for detecting a general shift. Besides the control chart, Mason et al. [8] proposed a cause-selecting procedure based on the decomposition of the statistic. By decomposing the statistic, the user can see the contribution of each variable and observe which variable(s) with a significant contribution is (are) the cause of the abnormal process. The drawbacks of this method are the additional computation and its sensitivity to the number of variables. Other multivariate control charts, such as the multivariate exponentially weighted moving average (MEWMA) control chart and the multivariate cumulative sum (MCUSUM) control chart, have been proposed to detect mean shifts. Lowry et al. [9] proposed a multivariate extension of the exponentially weighted moving average (EWMA) control chart. They compared their method with the MCUSUM control chart in terms of average run length (ARL) performance, and their results showed that the MEWMA chart was similar to the MCUSUM chart in detecting mean shifts of a multivariate normal distribution.

Besides traditional multivariate statistical process control (MSPC) technology, many scholars have tried to diagnose abnormal multivariate processes via data mining and artificial intelligence, especially neural networks [10, 11]. Neural networks (NNs) have excellent noise tolerance in real time and require no hypothesis on the statistical distribution of the monitored measurements. These important features make NNs promising and effective tools for improving data analysis in manufacturing quality control applications. In the past two decades, various NNs with different structures and learning algorithms have been adopted widely in quality control [12]. The multilayer perceptron (MLP) NN [13], learning vector quantization (LVQ) NN [14], probabilistic NN [15], adaptive resonance theory (ART) NN [16], modular NN [17], backpropagation NN (BPN) [18], and so forth, have been used to detect abnormal signals and/or to identify basic abnormal patterns such as shift, trend, cycle, and mixture patterns. These NN-based models have shown their efficiency and effectiveness in SPC.

Summarizing the multiple variables of interest in a single statistic does not reveal the source(s) of the out-of-control signals; that is, it cannot tell which variable or combination of variables has caused them. Knowing which variables are out of control helps to narrow the set of possible assignable causes, which leads to more rapid identification of the particular causes and reduces the adjustment cost.

A few multivariate studies using neural networks can be found in the literature, dealing with the detection of out-of-control signals or the identification of the source(s) of out-of-control signals caused by mean shifts in multivariate processes. El-Midany et al. [19] developed artificial neural networks for control chart pattern recognition in multivariate processes. Yu et al. [1, 20] proposed a selective neural network (NN) ensemble approach based on discrete particle swarm optimization (DPSOEN) to monitor and diagnose out-of-control signals in bivariate processes; extensive experiments demonstrated that this method is effective. Salehi et al. [21] presented a hybrid learning-based model for on-line analysis of out-of-control signals in multivariate manufacturing processes. The main contributions of that work are the recognition of the type of unnatural pattern and the classification of the major parameters for shift, trend, and cycle, for each variable simultaneously, by the proposed hybrid model. Kim et al. [22] attempted to integrate state-of-the-art data mining algorithms, including artificial neural networks, support vector regression, and multivariate adaptive regression splines, with SPC techniques to achieve efficient monitoring in multivariate and autocorrelated processes. Simulation results from various scenarios indicated that data mining model-based control charts perform better than traditional time-series model-based control charts. Cheng et al. [23] used support vector regression (SVR) to predict the magnitude of a process shift, with feed-forward neural networks with different training algorithms and a CUSUM-based estimator as benchmarks for comparison.

Research indicates that neural networks are an effective and useful tool for identifying the source(s) of out-of-control signals in multivariate manufacturing processes, with better estimation capabilities than CUSUM and MEWMA. However, an ANN classifier acquires its judgment and classification ability only through training. This training is based on the principle of empirical risk minimization, which leads to a long training time, poor generalization ability, and a tendency to fall into local minima. It is therefore not suitable for quality diagnosis when the process is dynamic and changeable. Although support vector machines (SVMs) have shown excellent generalization performance in a number of applications, one problem that faces the user of an SVM is how to choose a kernel and the specific parameters for that kernel. Applications of an SVM therefore require a search for the optimum settings for the particular application. The kernel function maps the original data into a higher-dimensional space in which the input data become linearly separable. The choice of kernel function is highly problem dependent and is the most important factor in support vector machine applications. In this work, the RBF kernel is used as the kernel function of the SVM because it tends to give better performance.

In this study, considering the robust recognition power of the support vector machine and the global search capability of the genetic algorithm, an optimized SVM approach named GA-SVM (support vector machine classification based on genetic algorithms) is developed, and a framework for control chart pattern recognition on multivariate observation data is proposed for classifying the source(s) of out-of-control signals in multivariate processes. The SVM classifier recognizes selected subclasses of multivariate abnormal patterns and identifies the variable(s) responsible for the occurrence of the abnormal pattern, while the GA optimizes SVM parameters such as the kernel function parameters and the penalty factor.

The remainder of this paper is organized as follows: the pattern generation of the multivariate process and support vector machine algorithms are presented in Section 2; Section 3 describes the GA-based SVM classifier for diagnosing the process mean; in Section 4 we use a numerical example and a series of tests to verify the proposed method; conclusions are presented in Section 5.

2. Methodology

This section provides a practical overview of the pattern generation of the multivariate process and of support vector machine algorithms. It prepares for the diagnosis model proposed in Section 3.

2.1. The Patterns Generation of Multivariate Process

According to multivariate quality control theory [24-26], the $T^2$ statistic reflects the state of the correlation structure and the mean vector of the multivariate process data. The overall quality of a multivariate process can be monitored by comparing this statistic against a positive upper control limit (UCL). These charts are easy to construct as long as the process parameters (i.e., the mean vector $\mu$ and the covariance matrix $\Sigma$) are known. However, the covariance matrix is usually unknown in practice and cannot be assumed arbitrarily; in this case, it should be estimated from a limited sample. The mean vector and the covariance matrix are sufficient to characterize any multivariate normal distribution. Thus, random multivariate vectors, such as $X_t = (x_{1t}, x_{2t}, \ldots, x_{pt})^T$, represent a vector of observations from $p$ quality characteristics of independent normal distributions at time $t$. In multivariate statistical process control, the statistic for the mean vector is obtained in the following form:

$$T^2 = n(\bar{X} - \mu)^T \Sigma^{-1} (\bar{X} - \mu). \tag{1}$$

Random samples of size $n$ are collected regularly from the process, and $\bar{X}$ represents the mean vector of each sample. The term $(\bar{X} - \mu)$ holds the $p$ quality characteristics as a $p \times 1$ column vector, and $(\bar{X} - \mu)^T$ is its transpose, that is, a $1 \times p$ row vector. Finally, $\Sigma^{-1}$, which is a $p \times p$ matrix, is the inverse of the covariance matrix. Therefore, the $T^2$ statistic is a $1 \times 1$ matrix (i.e., it is a scalar value). Each normal multivariate mean vector must result in a value of $T^2$ that is less than or equal to a UCL given by $\chi^2_{\alpha, p}$, the upper $\alpha$ percentage point of the $\chi^2$ distribution with $p$ degrees of freedom, where $\alpha$ acts as the type I error and the confidence level equals $100(1 - \alpha)$ percent. Any vector that plots above this limit is discarded from the simulated in-control data set. The sample size $n$ is kept to 1, so that each simulated multivariate point is used directly as one sample observation.
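The computation described above can be sketched in a few lines of Python. This is a minimal sketch for the bivariate case only ($p = 2$), where the inverse of the covariance matrix can be written out explicitly and the upper $\alpha$ point of the $\chi^2$ distribution with 2 degrees of freedom has the closed form $-2\ln\alpha$. The function names are illustrative assumptions, not taken from the paper.

```python
import math


def t2_statistic(xbar, mu, sigma, n=1):
    """Return n * (xbar - mu)^T Sigma^{-1} (xbar - mu) for two variables."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if det <= 0:
        raise ValueError("sigma must have a positive determinant")
    d1 = xbar[0] - mu[0]
    d2 = xbar[1] - mu[1]
    # Quadratic form with the explicit inverse (1/det) * [[d, -b], [-c, a]].
    quad = (d * d1 * d1 - (b + c) * d1 * d2 + a * d2 * d2) / det
    return n * quad


def chi2_ucl_2dof(alpha):
    """Upper-alpha point of chi-square with 2 d.o.f., where P(X > x) = exp(-x / 2)."""
    return -2.0 * math.log(alpha)


def in_control(xbar, mu, sigma, alpha, n=1):
    """True when the point plots on or below the UCL."""
    return t2_statistic(xbar, mu, sigma, n) <= chi2_ucl_2dof(alpha)
```

For more than two variables the same quadratic form applies, but the inverse and the $\chi^2$ quantile would normally come from a numerical library.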

2.2. A Brief Introduction to Support Vector Machine Algorithms

The support vector machine (SVM) method is a promising classification and regression technology [27-31]. The basic idea of an SVM can be stated briefly as follows. An SVM initially maps the input vectors into a high-dimensional feature space, either linearly or nonlinearly, depending on the selection of the kernel function. The feature vectors in the feature space are then classified linearly by a numerically optimized hyperplane that separates the two classes (this can be extended to the multiclass case). SVM training always seeks a globally optimized solution, and the hyperplane depends only on a subset of the training examples [32-35].

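Let $(x_i, y_i)$, $x_i \in R^d$, $y_i \in \{-1, +1\}$, $i = 1, \ldots, n$, be the training set with input vectors $x_i$ and outputs $y_i$. Here, $n$ is the number of sample observations, $d$ is the dimension of each observation, and $y_i$ is the known target. The algorithm seeks the hyperplane $w^T x + b = 0$, where $w$ is the normal vector of the hyperplane and $b$ is a bias term, that separates the data of the two classes with maximal margin width $2/\|w\|$; the points lying on the margin boundaries are called support vectors. In order to optimize the hyperplane, the SVM solves the following optimization problem:

$$\min_{w, b} \ \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i(w^T x_i + b) \ge 1, \quad i = 1, \ldots, n. \tag{2}$$

It is difficult to solve (2) directly. Thus, the SVM transforms the optimization problem into its dual problem by the Lagrange method. The Lagrange multipliers $\alpha_i$ must be nonnegative real coefficients. Equation (2) is transformed into the following constrained form:

$$\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j \quad \text{subject to} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, \ldots, n. \tag{3}$$

In (3), $C$ is the penalty factor and determines the degree of penalty assigned to an error. It can be viewed as a tuning parameter that controls the trade-off between maximizing the margin and minimizing the classification error. To handle data that cannot be separated exactly, slack variables $\xi_i$ are added to the Lagrange formulation so that the constraints become $y_i(w^T x_i + b) \ge 1 - \xi_i$, $\xi_i \ge 0$, $i = 1, \ldots, n$. The purpose of the slack variables is to give the boundary a flexible buffer.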
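To make the role of the penalty factor and the slack variables concrete, the following sketch evaluates the standard soft-margin primal objective $\frac{1}{2}\|w\|^2 + C\sum_i \xi_i$, with each slack taken at its smallest feasible value $\xi_i = \max(0, 1 - y_i(w^T x_i + b))$. It only evaluates the objective for a given hyperplane; it is not a solver, and the function name is an illustrative assumption.

```python
def soft_margin_objective(w, b, C, X, y):
    """Return (objective, slacks) for a given hyperplane (w, b).

    Each slack is max(0, 1 - y_i * (w . x_i + b)): zero for points on or
    outside the margin, positive for points inside it or misclassified.
    """
    slacks = []
    for xi, yi in zip(X, y):
        decision = sum(wj * xj for wj, xj in zip(w, xi)) + b
        slacks.append(max(0.0, 1.0 - yi * decision))
    margin_term = 0.5 * sum(wj * wj for wj in w)
    return margin_term + C * sum(slacks), slacks


# Three 2-D points; the third lies inside the margin of the hyperplane x1 = 0.
X = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
y = [1, -1, 1]
objective, slacks = soft_margin_objective((1.0, 0.0), 0.0, 2.0, X, y)
```

A larger `C` makes the same margin violation more expensive, which is the trade-off that the parameter search later in the paper has to balance.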

In general, a linear separating hyperplane cannot be found for all application data. For problems that cannot be linearly separated in the input space, the SVM uses the kernel method to transform the original input space into a high-dimensional feature space, where an optimal linear separating hyperplane can be found. The common kernel functions are linear, polynomial, radial basis function (RBF), and sigmoid. Although several choices for the kernel function are available, the most widely used kernel function is the RBF, which is defined as [36]

$$K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right),$$

where $\gamma$ is the parameter of the kernel function. Consequently, the RBF is employed in this study. Also, in this study, we used a multiclass SVM method to build the diagnosis model.
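As a small illustration, the RBF kernel can be computed as follows. The sketch assumes the common parameterization $K(x, z) = \exp(-\gamma\|x - z\|^2)$; the parameter name `gamma` and the helper names are assumptions made for the example.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel value exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def gram_matrix(points, gamma):
    """Symmetric matrix of kernel values between all pairs of points."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]
```

The kernel equals 1 for identical points and decays toward 0 as the points move apart; a larger `gamma` makes the decay faster, so each training point influences a smaller neighbourhood.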

3. Diagnosis Model Based on Optimized Support Vector Machine in the Process

The parameters that must be determined include the kernel parameter $\gamma$ and the regularization parameter $C$. The kernel parameter defines the structure of the high-dimensional feature space and is selected through a genetic algorithm (GA). The regularization parameter should be chosen with caution to avoid overfitting.

In this paper, in order to diagnose the mean vector and to identify and locate the abnormal variables in a multivariate process, we first assume that the abnormal factors in the process change only the mean vector and not the covariance matrix. Under this premise, a method for diagnosing out-of-control signals in the multivariate process based on the Hotelling $T^2$ chart and optimized support vector machines is proposed. The structure of the diagnosis model is briefly illustrated in Figure 1.

The scheme consists of two parts: detection and diagnosis. When an out-of-control signal is detected by the $T^2$ statistic, more observations are collected from the process; they are regarded as a chromosome, which is then binary coded. A multiclass support vector machine classifier is used to recognize a random chromosome, and the recognition accuracy serves as the fitness function for evaluating each individual. Through the operations of selection, crossover, and mutation, the GA self-adaptively optimizes the penalty parameter and the kernel parameter, yielding the optimal model, which is finally used to identify the source(s) of the out-of-control signal, that is, to identify and locate the abnormal variables.

In Figure 1, the abnormal variables are identified and located by translating the task into a pattern recognition problem. For a $p$-dimensional manufacturing process, there are two cases for each variable: its mean is either normal or abnormal. Under these circumstances, the $p$-dimensional mean vector has $2^p$ normal or abnormal states in total; in other words, there is only one normal state and there are $2^p - 1$ abnormal states. Once the control chart signals that the process is out of control, the process must be in one of the $2^p - 1$ abnormal states. In order to distinguish between these states, we define each abnormal state as a pattern to be identified, and we use the optimized SVM model, obtained via the genetic algorithm, to identify the abnormal signal.

Taking the bivariate process as an example, given the reference mean vector and covariance matrix, the abnormal state of the mean vector has only three circumstances to be identified: the first variable is out of control and the second variable is in control; the first variable is in control and the second variable is out of control; both variables are out of control. The target vectors for the four cases are (0,0,0,0), (0,1,0,0), (0,0,1,0), and (0,0,0,1), which indicate, respectively, that the process is normal, that the first quality characteristic is out of control, that the second is out of control, or that both are out of control; that is, (0,0,0,0) represents the normal pattern and (0,1,0,0), (0,0,1,0), and (0,0,0,1) represent the three abnormal patterns. When an out-of-control signal is detected by the $T^2$ statistic and the optimal SVM model recognizes the (0,1,0,0) pattern, the first variable is out of control while the second variable is in control; if the optimal SVM model recognizes the (0,0,0,1) pattern, both variables are out of control, and so on. In this scheme, the optimal SVM model not only serves as a substitute for the Hotelling control chart but also identifies and locates the abnormal variables when an out-of-control signal is detected by the control chart.
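The counting argument for the process states can be illustrated with a short enumeration. In this sketch each state is a tuple of flags (1 = out of control, 0 = in control), and the integer class labels are a hypothetical encoding chosen for the example; they are not the target vectors used in the paper.

```python
from itertools import product


def enumerate_states(p):
    """All 2**p combinations of in-control (0) / out-of-control (1) flags."""
    return list(product((0, 1), repeat=p))


def state_to_class(state):
    """Integer label for a state; 0 means every variable is in control.

    The first variable is the least significant bit, so for p = 2 the
    labels are 1 (first only), 2 (second only), and 3 (both).
    """
    return sum(flag << position for position, flag in enumerate(state))


def abnormal_states(p):
    """The 2**p - 1 states in which at least one variable is out of control."""
    return [state for state in enumerate_states(p) if any(state)]
```

For the bivariate process this yields one normal state and three abnormal states, matching the three circumstances listed above.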

4. Simulation and Analysis

In this section, an example derived from a bivariate instance of the chemical process of Montgomery is given to illustrate the use of the proposed approach. The parameters of this process represent process target values of and ; the reference mean vector and the reference covariance matrix are obtained from the first fifteen observations. Thus, the is and the is

In the following, we compare the classification capabilities of different approaches, namely, BPNN, SVM, and optimized SVM.

4.1. Examples Development and Parameter Setting

Training data are critical in applications of SVM, since they determine the recognition efficiency of the classifier. In this study, the Monte Carlo simulation method was applied to generate the required data sets of normal and abnormal examples for training and testing. Multivariate simulation can provide only a simplified picture of reality; the best alternative is, where possible, to collect data from real-world manufacturing systems.

In a bivariate SPC application, an input to the neural system should consist of a time-series window of bivariate vectors. In this study, five distinct types of shift (i.e., , and ) associated with the th quality characteristic are considered, which evenly cover the whole range of process shifts. To generate the training data sets, we shift the mean vector as , , and in this bivariate process. In order to verify the good ability of the SVM with small samples, five hundred input vectors shifted by 1.0 were generated for each abnormal pattern using the $T^2$ statistic and were used as the training data set of the optimized SVM.

The number of SVM in the output layer is determined by the dimension of quality characteristics. The output vector consists of four elements and uses the largest value to identify the source(s) of the whole signals; in this study, the number of the output nodes is 4.
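The rule of taking the largest of the four output values can be sketched as follows. The label strings and the function name are illustrative assumptions; the sketch only shows the arg-max step, not how the four scores are produced.

```python
# Hypothetical labels for the four outputs of the bivariate diagnosis model.
LABELS = (
    "normal",
    "variable 1 out of control",
    "variable 2 out of control",
    "variables 1 and 2 out of control",
)


def decode_output(scores):
    """Return (index, label) of the largest of the four output values."""
    if len(scores) != len(LABELS):
        raise ValueError("expected one score per pattern")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, LABELS[best]
```

When two scores tie, `max` returns the first of them, so a real implementation might want an explicit tie-breaking rule.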

In order to evaluate the performance of the optimized SVM, different sample vectors shifted by , , , , and were generated. As with the training data sets, we applied the $T^2$ statistic to detect the overall out-of-control signal for each sample vector, with the type I error equal to 0.5. We continued until 500 examples were obtained for each abnormal case; these were used as the testing data sets of the optimized SVM.
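The generation procedure described in this subsection (simulate shifted observations and keep those that trigger the chart) can be sketched as below for the bivariate case. All numerical settings in the usage line are placeholders, the shift is assumed to be expressed in units of each variable's standard deviation, and the function names are hypothetical.

```python
import math
import random


def t2(x, mu, sigma):
    """Quadratic form (x - mu)^T Sigma^{-1} (x - mu) for a symmetric 2x2 sigma."""
    (a, b), (_, d) = sigma
    det = a * d - b * b
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    return (d * d1 * d1 - 2.0 * b * d1 * d2 + a * d2 * d2) / det


def sample_bivariate(mu, sigma, rng):
    """Draw one bivariate normal vector using a 2x2 Cholesky factor."""
    (a, b), (_, d) = sigma
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2)


def generate_signals(mu0, sigma, shift, n_examples, ucl, seed=0, max_draws=1_000_000):
    """Collect n_examples shifted observations whose T^2 exceeds the UCL."""
    rng = random.Random(seed)
    # Assumption: shift magnitudes are multiples of each standard deviation.
    mu1 = (mu0[0] + shift[0] * math.sqrt(sigma[0][0]),
           mu0[1] + shift[1] * math.sqrt(sigma[1][1]))
    signals = []
    for _ in range(max_draws):
        x = sample_bivariate(mu1, sigma, rng)
        if t2(x, mu0, sigma) > ucl:  # keep only points that signal
            signals.append(x)
            if len(signals) == n_examples:
                return signals
    raise RuntimeError("too few out-of-control points within max_draws")


# Placeholder settings: first variable shifted by one standard deviation.
examples = generate_signals((0.0, 0.0), ((1.0, 0.5), (0.5, 1.0)), (1.0, 0.0), 20, 5.99)
```

Each abnormal pattern would get its own call with the corresponding shift vector, and the returned vectors would be labelled with that pattern before training or testing.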

4.2. Optimization Using Genetic Algorithm (GA)

SVM is a powerful machine learning and data mining tool that has already been shown to possess strong classification power. However, the performance of SVM is greatly affected by the penalty (cost) parameter $C$ and the RBF kernel function parameter $\gamma$.

Global optimization attempts to locate the absolutely best set of optimum conditions that results in the highest objective value. It is usually a very difficult problem. Traditional optimization methods such as gradient ascent/descent search in the direction of the local gradient vector and thus easily get stuck in a problem with a multimodal objective function. The optimization method used in our study is the genetic algorithm (GA), which is one of the most popular and widely used techniques for global optimization. Figure 2 presents the structure of the parameter optimization using genetic algorithm (GA).
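A binary-coded genetic algorithm of the kind described here can be sketched as follows. This is a generic sketch under stated assumptions, not the authors' Matlab implementation: the chromosome is assumed to encode the pair (penalty parameter, kernel parameter), tournament selection with elitism is one choice among many, and `toy_fitness` is a hypothetical stand-in for the cross-validated recognition accuracy of an SVM.

```python
import random


def decode(bits, lo, hi):
    """Map a list of 0/1 bits to a real value in [lo, hi]."""
    value = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)


def run_ga(fitness, bounds, n_bits=10, pop_size=20, generations=50,
           p_cross=0.9, p_mut=0.01, seed=0):
    """Maximize fitness(*params); return (best_params, best_fitness_history)."""
    rng = random.Random(seed)
    length = n_bits * len(bounds)

    def params(chrom):
        return tuple(decode(chrom[i * n_bits:(i + 1) * n_bits], lo, hi)
                     for i, (lo, hi) in enumerate(bounds))

    def score(chrom):
        return fitness(*params(chrom))

    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=score)
        history.append(score(elite))
        new_pop = [elite[:]]  # elitism: the best individual survives unchanged
        while len(new_pop) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            c1, c2 = p1[:], p2[:]
            if rng.random() < p_cross:  # single-point crossover
                cut = rng.randint(1, length - 1)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (c1, c2):
                for i in range(length):  # bit-flip mutation
                    if rng.random() < p_mut:
                        child[i] = 1 - child[i]
                new_pop.append(child)
        pop = new_pop[:pop_size]
    best = max(pop, key=score)
    history.append(score(best))
    return params(best), history


def toy_fitness(C, gamma):
    """Hypothetical smooth surrogate with its maximum at C = 1.5, gamma = 1.0."""
    return -((C - 1.5) ** 2 + (gamma - 1.0) ** 2)


best_params, history = run_ga(toy_fitness, [(0.1, 100.0), (0.01, 10.0)])
```

In practice the fitness would train and validate a multiclass SVM for each candidate pair, which is far more expensive than the surrogate used here; the population size of 20 and mutation probability of 0.01 mirror the settings reported later in the paper, while the crossover probability is an assumption.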

4.3. Performance Analysis for the Bivariate Process

In this section, the influences of the key factors of optimized SVM model upon their generalization performance are analyzed by implementing the following experiments. The analysis can help us to obtain the suitable parameter setting to improve the generalization performance of the proposed model.

4.3.1. Test 1: The Selection Process of the Optimal Parameters

In this study, the free parameters of the SVM were selected by a genetic algorithm optimization experiment. The whole process was implemented in Matlab 2009b. In order to seek the optimal parameters, we chose as types of shift for the three different volatilities; each type generates 100 samples. The relevant parameters of the genetic operations were set as follows: the population size is 20, the maximum number of iterations is 200, the generation gap and the mutation probability are 0.9 and 0.01, respectively, and the single is 0.01. The chosen kernel was the RBF with $\gamma$ equal to 1.0834, and the regularization parameter $C$ was set to 1.5136. Figures 3(a) and 3(b) show an example of the grid search result, where the $x$-axis and the $y$-axis represent the two searched parameters and the $z$-axis is the accuracy performance. The findings of this experiment were that the SVM is quite robust against parameter selections.

4.3.2. Test 2: Sensitivity Analysis to Different Model

Table 1 presents a comparison of the performance of the BPNN model, the SVM model, and the optimized SVM model, all trained and tested using raw data. The overall percentages of correct recognition of the three models are 92.11, 94.23, and 97.96, respectively. The optimized SVM shows better identification performance in most cases than the BPNN and SVM models. This indicates that the type of classifier and the choice of the kernel parameter and the penalty (cost) parameter affect the recognition performance. Thus, the genetic algorithm can strengthen the pattern feature of the out-of-control signals.

From Table 1, it can be observed that the SVM shows performance similar to that of the BPNN model. Thus, the optimized SVM can be applied to multivariate processes that have many quality characteristics (e.g., more than three), where it avoids the structural complexity of NNs and allows a classifier with a simpler structure to be constructed. Moreover, the optimized SVM shows good performance in identifying the source(s) of out-of-control signals.

4.3.3. Test 3: Sensitivity Analysis to the Number of the Training Examples

Training sets of different sizes were generated to train the optimized SVM. Table 2 presents the test results of the optimized SVM trained with different numbers of examples. From these results, the following conclusions are drawn.

(1) Increasing the number of training examples improves the performance of GA-SVM up to a certain level of accuracy at any given shift magnitude. This can be explained by the fact that sufficiently large training sets have a better chance of truly representing the problem space. However, any further increase of the training size after reaching this limit does not improve the performance of GA-SVM. Moreover, a larger training set results in a higher training time cost.

(2) Abnormal patterns with small shift magnitudes need larger training sets to represent their density distribution.

(3) Abnormal patterns with large shift magnitudes require smaller training data sets, as they are easier to identify. This is because abnormal patterns with large shift magnitudes have stronger features that separate them more easily from other abnormal patterns.

5. Conclusion and Further Work

In this paper, we have proposed an optimized SVM as an approach to diagnosing out-of-control signals in multivariate processes. A genetic algorithm is used to self-adaptively optimize the penalty parameter and the kernel parameter; the optimized SVM model is then used to identify the variable(s) responsible for the occurrence of an abnormal pattern. The performance of the proposed approach was evaluated by estimating the recognition accuracy for the various patterns using simulation. The proposed optimized SVM approach has been compared with BPNN and SVM. The comparisons show that the proposed approach offers a competitive alternative to existing control procedures and that the optimized SVM achieves the best performance of the three models. The results indicate that the optimized SVM is a promising tool for identifying the source(s) of mean shifts. Future research can be performed in a number of areas. A further improvement would be to combine several other optimization algorithms with the SVM, which could further improve its generalization performance. A second area would be the construction of the input vector based on features extracted from the original input observations.

Acknowledgments

This work was supported by the National Science Foundation of China (no. 51075418), the National Science Foundation of China (no. 61174015), and Chongqing CMEC Foundations of China (no. CSTC2010BB2285).