Abstract

The monitoring of multivariate processes with multivariate statistical process control (MSPC) charts has received considerable attention. In practice, however, the use of an MSPC chart typically encounters a difficulty: identifying which quality variable, or which set of quality variables, is responsible for the signal. This study proposes a hybrid scheme composed of independent component analysis (ICA) and a support vector machine (SVM) to determine the quality variables at fault when a step-change disturbance exists in a multivariate process. The proposed hybrid ICA-SVM scheme first applies ICA to the Hotelling $T^2$ statistics of the MSPC chart to generate independent components (ICs). The hidden information about the quality variables at fault can be identified in these ICs. The ICs then serve as the input variables of the SVM classifier. The performance under various process designs is investigated and compared with that of the typical classification method. With the proposed approach, the quality variables at fault in a multivariate process can be accurately and reliably determined.

1. Introduction

In recent years, considerable attention has been paid to multivariate statistical process control (MSPC) charts for monitoring multivariate processes [1–6]. The MSPC chart is one of the most effective techniques for detecting a multivariate process disturbance. An out-of-control signal implies that disturbances have occurred in the process. When a signal is triggered by the MSPC chart, the process personnel should begin to search for the root causes of the underlying disturbance. Once the root causes have been determined, the process personnel can substantially reduce the effects of the disturbance and bring the underlying process back into a state of statistical control.

When the root causes have been determined, the necessary remedial actions can be taken to compensate for the effects of the underlying disturbance. Identifying and fixing the root causes, in turn, depends mainly on accurately identifying the quality variables at fault. Consequently, the identification of the quality variables at fault in a multivariate process is a very important research issue.

However, the use of MSPC charts typically encounters a major problem in the interpretation of the signal. Although the MSPC chart's signal indicates that the underlying process is out of control, the quality variables at fault are very difficult to determine. The degree of difficulty increases with the number of quality variables ($p$) in the multivariate process. Typically, there are $2^p - 1$ possible sets of quality variables at fault in an out-of-control multivariate process with $p$ quality variables. For example, there are $2^5 - 1 = 31$ possible sets of quality variables at fault in a multivariate process with 5 quality variables. When an MSPC signal is triggered, it is not straightforward to determine which one of the 31 possible combinations is responsible for the signal.

Runger et al. [1] introduced a decomposition method to overcome this problem. They computed an approximate chi-square statistic to determine which of the monitored quality variables invoked the MSPC signal. However, their method has limitations in certain situations [2]. Specifically, their approach may not offer a satisfactory accurate identification rate (AIR) when the process disturbance in a multivariate process is small in magnitude. Some classification techniques have therefore been developed to overcome this drawback [2, 3]. Shao and Hsu [2] used artificial neural network (ANN) and support vector machine (SVM) approaches to determine the quality variables at fault in the case of process mean shifts. C. S. Cheng and H. P. Cheng [3] also studied the ANN and SVM techniques to determine the quality variables at fault in the case of process variance shifts.

Huang et al. [4] demonstrated that a hierarchical support vector machine technique performs better than the traditional SVM. Also, Shao et al. [5] proposed decomposition schemes and developed useful statistics to estimate the quality variables at fault when variance shifts have occurred in a multivariate process. However, their approach required a very large sample size, which may differ from what is encountered in practice.

Many studies have used one-shot, or one-step, classifier approaches [1–4, 6]. However, very little is known about hybrid schemes for determining the quality variables at fault in a manufacturing process [7, 8]. In this paper, we present a hybrid mechanism that integrates independent component analysis (ICA) and SVM to improve the determination of the quality variables at fault in an out-of-control multivariate process. The basic concept of the proposed hybrid approach is that the most useful information for determining the quality variables at fault may be embedded in the monitoring statistics, for example, the Hotelling $T^2$ statistics in the Hotelling $T^2$ control chart. The AIR could therefore be enhanced by decomposing the monitoring statistics and feeding the decomposed factors to the classifier.

Because of its frequent use in real applications [2, 9, 10], the Hotelling $T^2$ control chart is used in this study to detect process mean shifts in a multivariate process. In addition, since ICA has been reported to have good distinguishing capability [11–19], this study uses ICA as the first-step technique to extract the independent components (ICs) from the Hotelling $T^2$ statistics. The hidden useful information about the quality variables at fault would be embedded in these ICs. In the second step, classification, those ICs are used as input variables of the classifier. This study adopts the SVM as the classifier because of its great potential and superior performance in practical applications [20–27].

This paper is organized as follows. Section 2 discusses the individual components of the proposed hybrid mechanism. Section 3 addresses the appropriate models for determining the quality variables at fault when process mean shifts are introduced in a multivariate process; the experimental settings and the simulation results are also discussed in that section. The final section summarizes the research findings and presents our conclusions.

2. Methodologies

The proposed hybrid scheme has two components: independent component analysis and the support vector machine. The following subsections describe these two techniques and their use.

2.1. Independent Component Analysis

The present study employs ICA to enhance the accurate identification rate (AIR) of the proposed hybrid scheme. ICA has already been applied to process monitoring. Lu et al. [11] successfully combined ICA and SVM to identify control chart patterns. Kano et al. [12] monitored a process using the ICs instead of the original measurements; in their study, a set of statistical process control charts was developed for each IC. Lee et al. [13] used kernel density estimation to define the control limits of ICs that do not follow a Gaussian distribution. Lee et al. [14] extended their original method to multiway ICA, combining independent component analysis and kernel estimation in order to monitor batch processes. Xia and Howell [15] developed a spectral ICA approach to transform the process measurements from the time domain to the frequency domain and to identify major oscillations.

Let $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_m]^T$ be a matrix of size $m \times n$, $m \le n$, consisting of observed mixture signals $\mathbf{x}_i$ of size $1 \times n$, $i = 1, 2, \ldots, m$. In the basic ICA model, the matrix $\mathbf{X}$ can be modeled as follows:

$$\mathbf{X} = \mathbf{A}\mathbf{S} = \sum_{i=1}^{m} \mathbf{a}_i \mathbf{s}_i, \qquad (2.1)$$

where $\mathbf{a}_i$ is the $i$th column of the $m \times m$ unknown mixing matrix $\mathbf{A}$, and $\mathbf{s}_i$ is the $i$th row of the $m \times n$ source matrix $\mathbf{S}$. The vectors $\mathbf{s}_i$ are latent source signals that cannot be directly observed from the observed mixture signals $\mathbf{x}_i$. The ICA model aims at finding an $m \times m$ demixing matrix $\mathbf{B}$ such that

$$\mathbf{Y} = [\mathbf{y}_i] = \mathbf{B}\mathbf{X} = [\mathbf{b}_i \mathbf{X}], \qquad (2.2)$$

where $\mathbf{y}_i$ is the $i$th row of the matrix $\mathbf{Y}$, $i = 1, 2, \ldots, m$. The vectors $\mathbf{y}_i$ must be as statistically independent as possible and are called independent components (ICs). The ICs are used to estimate the latent source signals $\mathbf{s}_i$. The vector $\mathbf{b}_i$ in (2.2) is the $i$th row of the demixing matrix $\mathbf{B}$, $i = 1, 2, \ldots, m$. It is used to filter the observed signals $\mathbf{X}$ to generate the corresponding independent component $\mathbf{y}_i$, that is, $\mathbf{y}_i = \mathbf{b}_i \mathbf{X}$, $i = 1, 2, \ldots, m$.
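The relation between (2.1) and (2.2) can be illustrated with a toy example. The sketch below assumes a known 2×2 mixing matrix, which is never available in practice, so the ideal demixing matrix is simply its inverse; all numerical values and helper names are hypothetical.

```python
# Toy illustration of X = A S and Y = B X with m = 2 signals and n = 4 samples.
# The mixing matrix A is known here only for illustration; in ICA it is unknown
# and B has to be estimated from X alone.

def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

S = [[1.0, -1.0, 1.0, -1.0],    # latent source signal s_1 (first row of S)
     [0.5, 0.5, -0.5, -0.5]]    # latent source signal s_2 (second row of S)
A = [[2.0, 1.0],
     [1.0, 3.0]]                # hypothetical mixing matrix

X = matmul(A, S)                # observed mixtures, as in (2.1)
B = inverse_2x2(A)              # ideal demixing matrix
Y = matmul(B, X)                # recovered components, as in (2.2)
```

With the ideal demixing matrix, the rows of `Y` reproduce the rows of `S` up to rounding error. An estimated demixing matrix can recover the sources only up to scaling and ordering.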

ICA modeling is formulated as an optimization problem: a measure of the independence of the ICs is set up as the objective function, and optimization techniques are used to solve for the demixing matrix $\mathbf{B}$ [28, 29]. Non-Gaussianity of the ICs is taken as an indication of statistical independence [28, 29], and the non-Gaussianity of the ICs can be measured by the negentropy [28]:

$$J(\mathbf{y}) = H(\mathbf{y}_{\text{gauss}}) - H(\mathbf{y}), \qquad (2.3)$$

where $\mathbf{y}_{\text{gauss}}$ is a Gaussian random vector having the same covariance matrix as $\mathbf{y}$, and $H$ is the entropy of a random vector $\mathbf{y}$ with density $p(\mathbf{y})$, defined as $H(\mathbf{y}) = -\int p(\mathbf{y}) \log p(\mathbf{y}) \, d\mathbf{y}$.

The negentropy is always nonnegative and is zero if and only if $\mathbf{y}$ has a Gaussian distribution. Since computing the negentropy directly is very difficult, the following approximation has been proposed [28]:

$$J(y) \approx \left[ E\{G(y)\} - E\{G(v)\} \right]^2, \qquad (2.4)$$

where $v$ is a Gaussian variable of zero mean and unit variance, and $y$ is a random variable with zero mean and unit variance. $G$ is a nonquadratic function and is given by $G(y) = \log(\cosh y)$ in this study. The FastICA algorithm proposed in [28] is adopted in this paper to solve for the demixing matrix $\mathbf{B}$. Two preprocessing steps are common in ICA modeling: centering and whitening [28]. First, the input matrix $\mathbf{X}$ is centered by subtracting the row means of the input matrix, that is, $\mathbf{x}_i \leftarrow \mathbf{x}_i - E(\mathbf{x}_i)$. The zero-mean matrix $\mathbf{X}$ is then passed through the whitening matrix $\mathbf{V}$ to remove the second-order statistics of the input matrix, that is, $\mathbf{Z} = \mathbf{V}\mathbf{X}$. The whitening matrix $\mathbf{V}$ is the inverse square root of the covariance matrix of the input matrix, that is, $\mathbf{V} = (\mathbf{C}_{\mathbf{X}})^{-1/2}$, where $\mathbf{C}_{\mathbf{X}} = E(\mathbf{x}\mathbf{x}^T)$ is the covariance matrix of $\mathbf{X}$. The rows of the whitened input matrix $\mathbf{Z}$, denoted by $\mathbf{z}$, are uncorrelated and have unit variance, that is, $E(\mathbf{z}\mathbf{z}^T) = \mathbf{I}$. In this study, it is assumed that the training and testing process datasets are centered and whitened.
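The approximation in (2.4) can be tried on simulated data. The following is a minimal sketch, not the FastICA algorithm itself: it standardizes a sample, evaluates $E\{G(y)\}$ with $G(y) = \log(\cosh y)$, and compares it with $E\{G(v)\}$ obtained by numerical integration over the standard normal density. The function names, sample sizes, and integration limits are assumptions made for illustration.

```python
import math
import random
import statistics

def log_cosh(u):
    """Contrast function G(u) = log(cosh(u))."""
    return math.log(math.cosh(u))

def gaussian_expectation(g, lo=-10.0, hi=10.0, steps=20000):
    """E{g(v)} for v ~ N(0, 1), computed by trapezoidal integration."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += weight * g(x) * density
    return total * h

def negentropy_approx(sample):
    """Approximation (2.4) after scaling the sample to zero mean, unit variance."""
    mu = statistics.fmean(sample)
    sd = statistics.pstdev(sample)
    mean_g = statistics.fmean([log_cosh((s - mu) / sd) for s in sample])
    return (mean_g - gaussian_expectation(log_cosh)) ** 2

rng = random.Random(0)
gauss_sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]
uniform_sample = [rng.uniform(-1.0, 1.0) for _ in range(20000)]

j_gauss = negentropy_approx(gauss_sample)      # close to zero
j_uniform = negentropy_approx(uniform_sample)  # clearly larger
```

The Gaussian sample yields a value near zero, while the uniform sample yields a larger one, which is the property that makes the approximation usable as a measure of non-Gaussianity.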

2.2. Support Vector Machine

The SVM algorithm can be described as follows. Let $\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, $\mathbf{x}_i \in R^d$, $y_i \in \{-1, 1\}$, be the training set of input vectors and labels. Here, $N$ is the number of sample observations, $d$ is the dimension of each observation, and $y_i$ is the known target. The algorithm seeks the hyperplane $\mathbf{w} \cdot \mathbf{x}_i + q = 0$, where $\mathbf{w}$ is the normal vector of the hyperplane and $q$ is a bias term, that separates the data of the two classes with maximal margin width $2/\|\mathbf{w}\|_2$; the points lying on the margin boundaries are called support vectors. To obtain the optimal hyperplane, the SVM solves the following optimization problem [30]:

$$\min \; \Phi(\mathbf{w}) = \frac{1}{2}\|\mathbf{w}\|^2 \quad \text{s.t.} \quad y_i(\mathbf{w}^T\mathbf{x}_i + q) \ge 1, \quad i = 1, 2, \ldots, N. \qquad (2.5)$$
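The constraint and the margin width in (2.5) can be checked numerically for a candidate hyperplane. The sketch below assumes a small, linearly separable toy data set in $R^2$ and a hand-picked hyperplane; it does not solve the optimization problem, and all names and values are hypothetical.

```python
import math

def margin_width(w):
    """Width of the margin, 2 / ||w||, for a hyperplane w.x + q = 0."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

def satisfies_constraints(w, q, points, labels):
    """Check y_i (w^T x_i + q) >= 1 for every training point, as in (2.5)."""
    return all(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + q) >= 1.0
        for x, y in zip(points, labels)
    )

# Hypothetical, linearly separable toy data in R^2.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]

# Candidate hyperplane 0.5*x1 + 0.5*x2 - 1 = 0 (chosen by hand).
w, q = (0.5, 0.5), -1.0
feasible = satisfies_constraints(w, q, points, labels)
width = margin_width(w)
```

Here the points (2, 2) and (0, 0) satisfy the constraint with equality, so they lie on the margin boundaries and play the role of support vectors for this candidate.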

Because (2.5) is difficult to solve directly, the optimization problem is transformed into its dual problem by the Lagrange method. The Lagrange multipliers $\alpha_i$ must be nonnegative real coefficients. Equation (2.5) is transformed into the following constrained form [30]:

$$\max \; \Phi(\mathbf{w}, q, \xi, \alpha, \beta) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j \quad \text{s.t.} \quad \sum_{j=1}^{N} \alpha_j y_j = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, N. \qquad (2.6)$$

In (2.6), $C$ is the penalty factor and determines the degree of penalty assigned to an error. It can be viewed as a tuning parameter that controls the tradeoff between maximizing the margin and minimizing the classification error.

In general, a linear separating hyperplane cannot be found for all application data. For problems that are not linearly separable in the input space, the SVM uses the kernel method to transform the original input space into a high-dimensional feature space in which an optimal linear separating hyperplane can be found. Common kernel functions are the linear, polynomial, radial basis function (RBF), and sigmoid kernels. In this study, we used the multiclass SVM method proposed by Hsu and Lin [31].

3. The Proposed Approach and the Example

3.1. The ICA-SVM Scheme

This study integrates ICA and SVM to determine the quality variables at fault in an out-of-control multivariate process. In the training phase, the aim of the proposed scheme is to obtain a proper parameter setting for the SVM model. Since the RBF kernel function is adopted in this study, the performance of the SVM is primarily affected by the setting of the parameters $C$ and $\gamma$. There are no general rules for choosing these two parameters. This study uses the grid search proposed by Hsu et al. [32] to set them. The trained SVM model with the proper parameter setting is preserved and employed in the testing phase.
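To make the roles of $C$ and $\gamma$ concrete, the sketch below shows the standard RBF kernel and a generic grid search over powers of two. It is an illustration only: the exponent ranges are an assumption patterned on commonly used log2 grids, and `toy_score` is a hypothetical placeholder for the cross-validated accuracy of an SVM trained with a given parameter pair.

```python
import math
from itertools import product

def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def grid_search(score, c_exponents, gamma_exponents):
    """Return the (C, gamma) pair with the highest score on a log2 grid."""
    candidates = [(2.0 ** a, 2.0 ** b)
                  for a, b in product(c_exponents, gamma_exponents)]
    return max(candidates, key=lambda pair: score(*pair))

def toy_score(C, gamma):
    """Placeholder score (assumption); a real run would train and validate an SVM."""
    return -((math.log2(C) - 3) ** 2 + (math.log2(gamma) + 1) ** 2)

# Assumed exponent ranges: C in 2^-5 ... 2^15, gamma in 2^-15 ... 2^3.
best_C, best_gamma = grid_search(toy_score, range(-5, 16, 2), range(-15, 4, 2))
```

With the placeholder score, the search returns the pair at which the score peaks. Replacing `toy_score` with a cross-validation routine turns the sketch into an actual parameter search.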

The proposed model first collects two sets of Hotelling $T^2$ statistics from an out-of-control process. The ICA model is used to generate two estimated ICs from the observed Hotelling $T^2$ statistics. Subsequently, the proposed approach uses those two ICs together with the 3, 4, or 5 averaged quality variables as the inputs for the SVM when the process has 3, 4, or 5 quality characteristics, respectively.

3.2. The Simulated Example

This study employs a simulated example to demonstrate the use of our proposed approach. In our simulation, we assume that a multivariate process is initially in control, and the sample observations come from a multivariate normal distribution with known mean vector $\boldsymbol{\mu}_0$ and covariance matrix $\boldsymbol{\Sigma}_0$. This study assumes that a disturbance has intruded into the underlying process at time $t$. It results in a mean vector change, shifting the mean vector from $\boldsymbol{\mu}_0$ to $\boldsymbol{\mu}_1$.

This study applies the Hotelling $T^2$ control chart to monitor a multivariate process in the cases of 3, 4, and 5 quality characteristics. For each type of process, this study considers the following types of correlation, $\rho$, between any two quality variables: (1) no correlation ($\rho = 0$), (2) moderate correlation ($\rho = 0.6$), and (3) high correlation ($\rho = 0.9$). Now, consider an out-of-control multivariate normal process with 3 quality characteristics. Since the process has 3 quality characteristics ($p = 3$), the number of possible sets of quality variables at fault is $2^p - 1 = 7$. In our study, we use the notations (1,0,0), (0,1,0), (0,0,1), (1,1,0), (1,0,1), (0,1,1), and (1,1,1) to represent the 7 possible sets, in which "0" stands for the "in-control" state and "1" stands for the "out-of-control" state. For example, (1,1,0) means that the first and second quality variables ($X_1$ and $X_2$) are at fault while the third quality variable ($X_3$) is not.
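The fault-set notation can be enumerated directly. The short sketch below lists every nonzero 0/1 pattern of length $p$; the function name is hypothetical.

```python
from itertools import product

def fault_sets(p):
    """All nonzero 0/1 tuples of length p (1 = variable at fault)."""
    return [bits for bits in product((0, 1), repeat=p) if any(bits)]

sets_p3 = fault_sets(3)          # the 7 patterns for p = 3
count_p5 = len(fault_sets(5))    # number of patterns for p = 5
```

The all-zero pattern is excluded because it corresponds to the in-control state, which is why the count is $2^p - 1$ rather than $2^p$.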

Without loss of generality, we assume that each quality characteristic of an in-control process is sampled from a normal distribution with zero mean and unit standard deviation. We also assume that the out-of-control process has a mean shift of 1 standard deviation; thus, each quality characteristic at fault is sampled from a normal distribution with a mean of one and unit standard deviation. The sample size ($n$) is assumed to be 5.

The sample averages ($\bar{X}_i$, $i = 1, 2, 3$) are used to calculate the Hotelling $T^2$ statistics. The Hotelling $T^2$ statistics are computed as follows:

$$T^2 = n\left(\bar{\mathbf{X}} - \bar{\bar{\mathbf{X}}}\right)' \mathbf{S}^{-1} \left(\bar{\mathbf{X}} - \bar{\bar{\mathbf{X}}}\right), \qquad (3.1)$$

where $n$ is the sample size, $\bar{\mathbf{X}}$ is the sample mean vector at time $t$, $\bar{\bar{\mathbf{X}}}$ is the grand mean vector of the quality characteristics, and $\mathbf{S}^{-1}$ is the inverse of the variance-covariance matrix.
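Equation (3.1) can be evaluated without forming the inverse matrix explicitly. The sketch below solves the linear system $\mathbf{S}\mathbf{z} = \mathbf{d}$ by Gaussian elimination and then computes $n\,\mathbf{d}'\mathbf{z}$; the helper names and the example inputs are assumptions for illustration.

```python
def solve(S, d):
    """Solve S z = d by Gaussian elimination with partial pivoting."""
    p = len(d)
    M = [list(S[i]) + [d[i]] for i in range(p)]   # augmented matrix [S | d]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, p):
            factor = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * p
    for i in reversed(range(p)):                  # back substitution
        tail = sum(M[i][j] * z[j] for j in range(i + 1, p))
        z[i] = (M[i][p] - tail) / M[i][i]
    return z

def hotelling_t2(sample_mean, grand_mean, S, n):
    """T^2 = n (xbar - grand)' S^{-1} (xbar - grand), as in (3.1)."""
    d = [a - b for a, b in zip(sample_mean, grand_mean)]
    z = solve(S, d)                               # z = S^{-1} d
    return n * sum(di * zi for di, zi in zip(d, z))

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t2_shift = hotelling_t2([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], identity, n=5)
```

For uncorrelated variables with unit variance, a one-unit deviation in a single sample mean with $n = 5$ gives $T^2 = 5$; with correlated variables the same deviation yields a larger statistic.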

This study generates 100 data sets of observations (each of sample size 5) for every possible fault set. Since there are 7 possible sets of quality variables at fault in the case of $p = 3$, we have 700 data sets in a simulation run. Those 700 data sets serve as the training data. This study generates another 700 data sets for testing. Figure 1 displays the 700 data sets of $X_1$, $X_2$, and $X_3$ in the cases of $\rho = 0$, $\rho = 0.6$, and $\rho = 0.9$, respectively. In the first step of classification, we also use the data set of out-of-control Hotelling $T^2$ statistics, which is shown in Figure 2. Figure 3 displays the two ICs generated by the ICA technique.
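The data-generation step can be sketched as follows. The text does not state how the correlated observations were produced, so the common-factor construction below is an assumption: each variable is $\sqrt{\rho}\,z_0 + \sqrt{1-\rho}\,z_i$ with independent standard normal $z_0$ and $z_i$, which gives unit variance and pairwise correlation $\rho$. Function names, the seed, and default arguments are likewise hypothetical.

```python
import math
import random
from itertools import product

def simulate_sample_means(fault, rho, n=5, shift=1.0, rng=random):
    """Sample-mean vector of n observations with pairwise correlation rho.

    Each variable is sqrt(rho)*z0 + sqrt(1-rho)*z_i (valid for 0 <= rho <= 1);
    variables flagged in `fault` receive a mean shift.
    """
    p = len(fault)
    totals = [0.0] * p
    for _ in range(n):
        common = rng.gauss(0.0, 1.0)
        for i in range(p):
            own = rng.gauss(0.0, 1.0)
            value = math.sqrt(rho) * common + math.sqrt(1.0 - rho) * own
            totals[i] += value + shift * fault[i]   # shift only faulty variables
    return [t / n for t in totals]

def build_dataset(p, rho, per_set=100, seed=0):
    """Return (features, labels) with per_set sample-mean vectors per fault set."""
    rng = random.Random(seed)
    features, labels = [], []
    for fault in product((0, 1), repeat=p):
        if not any(fault):
            continue                                # skip the in-control pattern
        for _ in range(per_set):
            features.append(simulate_sample_means(fault, rho, rng=rng))
            labels.append(fault)
    return features, labels

train_X, train_y = build_dataset(p=3, rho=0.6)
```

Calling `build_dataset` a second time with a different seed would give an independent testing set of the same size.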

3.3. The Results

Consider the case of a multivariate process with three quality characteristics ($p = 3$). The typical approach directly uses four variables, $X_1$, $X_2$, $X_3$, and the Hotelling $T^2$ statistic, as inputs for the SVM. In contrast, the proposed approach first decomposes the Hotelling $T^2$ statistics into two ICs and then uses those two ICs as inputs for the SVM classifier. Therefore, the proposed approach employs five variables, $X_1$, $X_2$, $X_3$, and the two ICs, as the inputs for the SVM classifier. Tables 1, 2, and 3 report the accurate identification rates (AIRs) of the typical and proposed approaches for the multivariate process with $p = 3$, $p = 4$, and $p = 5$, respectively. In Table 1, in the case of $\rho = 0$, the AIRs are 79.6% and 78.2% for the typical and proposed approaches, respectively. The same interpretation applies to the remaining conditions in Tables 1, 2, and 3.
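The AIR values in the tables can be read as the share of testing data sets whose fault set is identified exactly. The sketch below computes that share under this assumed definition; the function name and the example labels are hypothetical and are not taken from the tables.

```python
def accurate_identification_rate(predicted, actual):
    """Share of data sets whose predicted fault set equals the true one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

# Hypothetical labels in the (X1, X2, X3) fault-set notation.
actual = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 1, 1)]
predicted = [(1, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
air = accurate_identification_rate(predicted, actual)
```

In this toy case three of the four fault sets are matched exactly, so a partially correct prediction such as (1,0,0) for (1,1,0) counts as a miss.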

Table 1 shows that the AIR of the proposed approach is larger than that of the typical approach in every case except $\rho = 0$. This implies that the proposed approach generally performs better. Moreover, in the case of $\rho = 0$, the difference in performance between the two approaches is small. Those findings are displayed in Figure 4.

Tables 2 and 3, for the cases of $p = 4$ and $p = 5$, respectively, show that the proposed approach outperforms the typical approach: the AIR values of the proposed approach are always larger. In addition, the AIR values become larger as $\rho$ becomes larger, and they become smaller as the number of quality characteristics increases. Those findings are shown in Figures 5 and 6.

4. Conclusion

Determining the quality variables at fault in an out-of-control multivariate process is very important in practice. While most studies use a single classification step, this study proposes a hybrid, two-step approach, ICA-SVM, to enhance the performance of the typical approach. The proposed approach supplies two extra inputs, the two ICs, to the SVM classifier. Those two ICs are obtained by running the ICA model as the first step of the proposed scheme, and they then serve as inputs for the second step. The proposed ICA-SVM hybrid mechanism is able to enhance the accurate identification rate for determining the quality variables at fault in a multivariate process.

In this study, multivariate processes with 3, 4, and 5 quality variables and various correlation structures are considered in comparing the performance of the typical one-step approach and the proposed hybrid approach. The experimental results indicate that the proposed hybrid ICA-SVM scheme produces a better accurate identification rate for the testing datasets. Based on the experimental results, we conclude that the proposed hybrid approach is able to effectively determine the quality variables at fault in a multivariate process.

Our approach requires several steps and is, in total, quite complicated; therefore, we have not attempted an analytic evaluation. However, we believe that our simulation example is generally applicable to monitoring real manufacturing processes when the circumstances of the processes resemble the simulation conditions of this study. To make the proposed method more widely applicable, multivariate processes with 6 to 10 quality characteristics and different sets of correlations between quality characteristics will be examined in future research.

Acknowledgment

This work is partially supported by the National Science Council of the Republic of China, Grant nos. NSC 99-2221-E-030-014-MY3 and NSC 101-2221-E-231-006.