Abstract

Due to the recent rapid growth of advanced sensing and production technologies, the monitoring and diagnosis of multivariate process operating performance have drawn increasing interest in process industries. The multivariate statistical process control (MSPC) chart is one of the most commonly used tools for detecting process faults. However, an out-of-control MSPC signal only indicates that process faults have intruded the underlying process. Identifying which of the monitored quality variables is responsible for the MSPC signal is fairly difficult. Pinpointing the responsible variable is vital for process improvement because it effectively determines the root causes of the process faults. Accordingly, this identification has become an important research issue concerning recent multivariate process applications. In contrast with the traditional single classifier approach, the present study proposes hybrid modeling schemes to address problems that involve a large number of quality variables in a multivariate normal process. The proposed scheme includes multivariate adaptive regression splines (MARS), logistic regression (LR), and artificial neural network (ANN). By applying MARS and LR techniques, we may obtain fewer but more significant quality variables, which can serve as inputs to the ANN classifier. The performance of our proposed approaches was evaluated by conducting a series of experiments.

1. Introduction

A multivariate process monitors two or more quality variables. When a signal is triggered by the multivariate statistical process control (MSPC) chart, process personnel are typically only aware that the underlying process is in an unstable state. Identifying which of the monitored quality characteristics (or variables) is responsible for this MSPC signal is challenging. Accordingly, effective determination of the source of process faults becomes an important and challenging issue in MSPC applications, because these sources are associated with specific assignable causes that adversely affect the process.

The literature offers several kinds of approaches to identifying the source of faults in a multivariate process. The first type of approach uses various graphical techniques, such as polygonal charts [1], line charts [2], multivariate profile charts [3], and boxplot charts [4], to assist in determining the quality variables at fault in a process. However, these graphical approaches are tedious to operate and subjective to interpret.

The second type of approach uses statistical decomposition techniques to interpret the contributors to an MSPC signal. Mason et al. [5] proposed a method to decompose the $T^2$ statistic into independent parts, each of which reflects the contribution of an individual quality variable. Since the decomposition of the $T^2$ statistic into independent components is not unique, Mason et al. [6] provided a computing scheme that can reduce the computational effort. The same concept of decomposing the $T^2$ statistic was proposed in the studies [7, 8]. However, these approaches have not been analyzed in terms of the percentage of success in classifying the variables that have actually shifted in the process [9, 10]. The study [11] investigated the use of principal components analysis (PCA) to determine the quality variables at fault in a multivariate process. The $T^2$ statistic is expressed in terms of normalized principal component scores of the multinormal variables. Normalized scores with high values are detected when an MSPC signal is triggered. Contribution plots can then be used to determine the variables that are responsible for the signal. Contribution plots were also used in the studies [12, 13]. However, it can be argued that a linear transformation may not reduce the dimensionality of the data efficiently. Moreover, the directions that maximize variance do not always maximize information. More recently, the study [14] developed a statistical decomposition method to estimate the sources of process variance shifts in a multivariate normal process. Although the performance of the approach was acceptable, the decomposition method requires a large sample size, which may not be feasible for some practical applications.

The third type of approach employs machine learning (ML) mechanisms, such as artificial neural networks (ANN) and support vector machines (SVM), to identify the quality variables that are responsible for the MSPC signal. Comparative studies were conducted in [9, 10]. While the study [9] compared neural network approaches with the method of Mason et al. [5], the study [10] compared ANN and SVM with the method of Runger et al. [8]. Both studies [9, 10] concluded that the results of ML methods are in general better than those obtained using the decomposition approach. The study [15] proposed a backpropagation-network-based model that can identify the group of quality variables at fault and classify the magnitude of the process shifts. The study [16] developed a two-level model that uses a control chart to detect the signals and an ANN to identify their sources. The study [17] proposed an ANN-based model to identify and quantify the mean shifts in bivariate processes. The authors [18] developed a neural-network-based identifier to detect the mean shifts and simultaneously identify the sources of the shifts for multivariate autocorrelated processes. They benchmarked the run-length performance of the proposed method against the Hotelling $T^2$, the MEWMA, and the Z control charts. The authors [19] investigated the sources of process variance faults with ANN and SVM; however, the variance shifts they considered were large. The authors [20] proposed a hybrid model for online analysis of MSPC signals in multivariate manufacturing processes. Their model consisted of two modules: the first used an SVM to recognize the unnatural pattern, and the second used neural network models to determine the magnitude of the different shifts. The authors [21] also proposed a hybrid model for online analysis of MSPC signals in multivariate manufacturing processes. They likewise used an SVM to recognize the mean and variance shifts in the first module. In the second module, they employed two neural network models to recognize the magnitude of shifts for each variable simultaneously. The study [22] proposed a hybrid scheme composed of independent component analysis (ICA) and SVM to determine the faulty quality variables when a step-change disturbance exists in a multivariate process.

The literature review has shown that most existing studies determine which variable or group of variables has caused the signal through single-step modeling. However, one difficulty may not yet have been addressed: when the number of quality characteristics is large, the existing decomposition methods and/or machine learning methods may lack the capability to handle the situation. In addition, because process faults are typically attributed to mean shifts and the multivariate normal process is one of the most widely used applications, the present study addresses mean shift faults for a multivariate normal process with a large number of quality variables. A review of relevant literature also indicates that the application of ANN for process fault determination is promising; however, it suffers from the requirement of a large number of controlling parameters and the risk of model overfitting [23–25]. Consequently, in contrast to the existing approaches, the present study proposes two-stage hybrid schemes to identify which quality variable or group of variables is responsible for process mean shift faults. The proposed schemes integrate multivariate adaptive regression splines (MARS), logistic regression (LR), and artificial neural networks; they are referred to as the MARS-ANN and LR-ANN schemes, respectively. The performance of the proposed approaches was examined by a series of computer simulations.

The rest of this paper is organized as follows. Section 2 provides brief overviews of process models and the proposed schemes. The various experimental conditions are addressed in Section 3. This study is concluded in Section 4.

2. Process Models and Methodologies

The structure of the process model is addressed. The proposed hybrid schemes are also described in this section.

2.1. Structure of the Process and the Mean Shift

This study considers process mean shifts and assumes that the multivariate process is initially in a normal state, with sample observations drawn from a $p$-dimensional multivariate normal distribution $N_p(\boldsymbol{\mu}_0, \boldsymbol{\Sigma}_0)$, where $\boldsymbol{\mu}_0$ is the in-control mean vector and $\boldsymbol{\Sigma}_0$ is the in-control covariance matrix. After a certain length of time, the mean vector is assumed to change from $\boldsymbol{\mu}_0$ to $\boldsymbol{\mu}_1$. Let $\mathbf{X}_{ij}$ be a $p \times 1$ vector that represents the $p$ characteristics on the $j$th observation in subgroup $i$, with $j = 1, 2, \ldots, n$. The resulting sample mean vector is as follows:

$$\bar{\mathbf{X}}_i = \frac{1}{n} \sum_{j=1}^{n} \mathbf{X}_{ij}.$$

To detect a multivariate process mean shift, Hotelling [26] proposed the following chi-square statistic:

$$\chi_i^2 = n \left( \bar{\mathbf{X}}_i - \boldsymbol{\mu}_0 \right)^{T} \boldsymbol{\Sigma}_0^{-1} \left( \bar{\mathbf{X}}_i - \boldsymbol{\mu}_0 \right). \tag{6}$$

This statistic is asymptotically distributed as a chi-square distribution with $p$ degrees of freedom. The control chart that uses $\chi_i^2$ in (6) as a monitoring statistic has the upper control limit

$$\mathrm{UCL} = \chi^2_{\alpha, p},$$

where $\chi^2_{\alpha, p}$ is the upper $100\alpha$th percentile of the chi-square distribution with $p$ degrees of freedom. If the plotted statistic falls outside the UCL, the process is considered to be in an abnormal state, and our proposed method can be applied to identify the source of mean shifts.

The proposed two-stage hybrid methods integrate the framework of MARS, LR, and ANN. In the initial stage, influencing variables are selected using multivariate adaptive regression splines or logistic regression. In the second stage, the selected significant variables are taken as the input variables of the ANN. The following sections address these three components.
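Returning to the monitoring statistic described above, the following is a minimal Python sketch of how the chi-square statistic $n(\bar{\mathbf{X}}_i - \boldsymbol{\mu}_0)^{T} \boldsymbol{\Sigma}_0^{-1} (\bar{\mathbf{X}}_i - \boldsymbol{\mu}_0)$ might be computed for one subgroup and compared with a control limit. The function names, the small Gaussian-elimination solver, and the requirement that the caller supply the UCL (for example, from a chi-square table) are illustrative assumptions and not part of the original study.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def chi_square_statistic(subgroup, mu0, sigma0):
    """Return n * (xbar - mu0)' Sigma0^{-1} (xbar - mu0) for one subgroup.

    subgroup: list of n observations, each a list of p values.
    """
    n = len(subgroup)
    p = len(mu0)
    xbar = [sum(obs[k] for obs in subgroup) / n for k in range(p)]
    d = [xbar[k] - mu0[k] for k in range(p)]
    w = solve_linear(sigma0, d)  # w = Sigma0^{-1} d, without forming the inverse
    return n * sum(dk * wk for dk, wk in zip(d, w))


def is_out_of_control(statistic, ucl):
    """Signal when the plotted statistic exceeds the upper control limit."""
    return statistic > ucl
```

A signal from `is_out_of_control` is the point at which the source-identification step would be invoked; the sketch itself says nothing about which variable shifted.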

2.2. Logistic Regression

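The purpose of performing logistic regression modeling in stage I was to identify important influencing variables and refine the entire set of input variables. The structure of the logistic regression model can be briefly described as follows. Let $Y$ represent the dependent variable ($Y = 1$ denotes “the abnormal state” and $Y = 0$ denotes “the normal state”) and let $\pi(\mathbf{x}) = P(Y = 1 \mid \mathbf{x})$ be the conditional probability of the event $Y = 1$ given a series of independent variables $\mathbf{x} = (x_1, x_2, \ldots, x_p)$, where $x_k$ is the sample mean of the $k$th characteristic. The logistic regression model is then defined as follows:

$$\pi(\mathbf{x}) = \frac{\exp\left( \beta_0 + \sum_{k=1}^{p} \beta_k x_k \right)}{1 + \exp\left( \beta_0 + \sum_{k=1}^{p} \beta_k x_k \right)},$$

where $\beta_0$ is the intercept and $\beta_1, \ldots, \beta_p$ are the regression coefficients.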

Before screening significant independent variables, we performed the collinearity diagnosis procedure to exclude variables that exhibited high collinearity. After this diagnosis, the remaining variables served as independent variables for logistic regression modeling and testing. The Wald forward method was applied to identify independent variables with significant influence on an abnormal state probability. These significant independent variables and the dependent variable were then substituted into the ANN to construct a two-stage model.
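The logistic model and the screening idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients and p-values passed in are hypothetical, and `selected_inputs` is a simplified stand-in that filters on a significance level rather than the stepwise Wald forward procedure described above.

```python
import math


def abnormal_probability(x, beta0, beta):
    """Logistic model: pi(x) = exp(eta) / (1 + exp(eta)), eta = beta0 + sum(beta_k * x_k)."""
    eta = beta0 + sum(b * xk for b, xk in zip(beta, x))
    # Use the algebraically equivalent form that avoids overflow for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def selected_inputs(p_values, alpha=0.05):
    """Return the indices of variables whose coefficient p-value is below alpha.

    Simplified screening only; a forward stepwise method would add variables
    one at a time and refit the model at each step.
    """
    return [k for k, pv in enumerate(p_values) if pv < alpha]
```

The indices returned by `selected_inputs` correspond to the reduced set of sample means that would be passed on to the second-stage classifier.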

2.3. Multivariate Adaptive Regression Splines

The superior performance of MARS has been reported in many applications [27–32]. MARS is typically capable of revealing important data patterns and relationships that are often concealed in complex, high-dimensional data [28, 31]. The MARS model can be represented as [33]

$$\hat{f}(\mathbf{x}) = a_0 + \sum_{m=1}^{M} a_m \prod_{k=1}^{K_m} \left[ s_{km} \left( x_{v(k,m)} - t_{km} \right) \right]_{+},$$

where $a_0$ and $a_m$ are the parameters, $M$ is the number of basis functions (BF), $K_m$ is the number of knots, $s_{km}$ takes on values of either 1 or −1 and indicates the right or left sense of the associated step function, $v(k, m)$ is the label of the independent variable, and $t_{km}$ is the knot location. The optimal MARS model is obtained in two steps. The purpose of the first step is to construct a large number of basis functions that initially fit the data. The purpose of the second step is to delete basis functions in order of least contribution using the generalized cross-validation (GCV) criterion. The variable importance measure was obtained by observing the decrease in the calculated GCV values when a variable was removed from the model. The GCV is described as

$$\mathrm{GCV}(M) = \frac{\frac{1}{N} \sum_{i=1}^{N} \left[ y_i - \hat{f}_M(\mathbf{x}_i) \right]^2}{\left[ 1 - \frac{C(M)}{N} \right]^2},$$

where $N$ is the number of observations, $y_i$ is the observed response, $\hat{f}_M(\mathbf{x}_i)$ is the fitted value, and $C(M)$ is the cost penalty measure of a model containing $M$ basis functions.
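To make the basis functions and the GCV criterion concrete, the following Python sketch evaluates a MARS-style model built from truncated linear (hinge) terms and computes the GCV from observed and fitted values. The data layout for `terms`, the function names, and the way the cost penalty is passed in are assumptions made for illustration; the forward construction and backward deletion steps themselves are not implemented here.

```python
def hinge(x, knot, sign):
    """Truncated linear basis [sign * (x - knot)]_+ with sign in {+1, -1}."""
    return max(0.0, sign * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a0 + sum_m a_m * prod_k hinge(x[v], t, s).

    terms: list of (coefficient, [(variable_index, knot, sign), ...]).
    """
    total = intercept
    for coef, factors in terms:
        basis = 1.0
        for var, knot, sign in factors:
            basis *= hinge(x[var], knot, sign)
        total += coef * basis
    return total


def gcv(y, y_hat, cost):
    """GCV = (mean squared residual) / (1 - cost / N)^2."""
    n = len(y)
    if cost >= n:
        raise ValueError("cost penalty must be smaller than the number of observations")
    mse = sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n
    return mse / (1.0 - cost / n) ** 2
```

Comparing `gcv` values for a model with and without a given variable's basis functions mirrors the variable importance measure described above.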

2.4. The Artificial Neural Network

The ANN has been widely used in many SPC applications [34, 35]. The ANN is a parallel system comprised of highly interconnected processing elements that are based on neurobiological models. The ANN processes information through the interactions of a large number of simple processing elements called neurons.

Figure 1 illustrates that neurons in a network take inputs from the previous layer and send outputs to the next layer. ANN nodes are typically organized into three layers: the input, hidden, and output layers. The nodes in the input layer receive input signals from an external source, and the nodes in the output layer generate the target output signals. The output of each neuron in the input layer is the same as the input to that neuron. For each neuron $j$ in the hidden layer and neuron $k$ in the output layer, the net inputs are given by

$$net_j = \sum_{i} w_{ji} o_i, \qquad net_k = \sum_{j} w_{kj} o_j,$$

where $i$ ($j$) is a neuron in the previous layer, $o_i$ ($o_j$) is the output of node $i$ ($j$), and $w_{ji}$ ($w_{kj}$) is the connection weight from neuron $i$ ($j$) to neuron $j$ ($k$). The neuron outputs are given by

$$o_i = net_i,$$

$$o_j = \frac{1}{1 + \exp\left( -(net_j + \theta_j) \right)} = f_j(net_j, \theta_j),$$

$$o_k = \frac{1}{1 + \exp\left( -(net_k + \theta_k) \right)} = f_k(net_k, \theta_k), \tag{14}$$

where $net_i$ is the input signal from the external source to node $i$ in the input layer and $\theta_j$ ($\theta_k$) is a bias. The transformation function shown in (14) is called a sigmoid function and is the most commonly utilized function to date. As a result, this study used the sigmoid function.
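The forward computation just described can be sketched in Python as follows. The function names and the weight layout (`weights[j][i]` as the weight from neuron `i` to neuron `j`) are assumptions chosen for illustration, and training by backpropagation is not shown.

```python
import math


def sigmoid(net, bias=0.0):
    """Sigmoid transformation 1 / (1 + exp(-(net + bias)))."""
    return 1.0 / (1.0 + math.exp(-(net + bias)))


def layer_output(inputs, weights, biases):
    """Compute the outputs of one layer.

    weights[j][i] is the connection weight from neuron i in the previous
    layer to neuron j in this layer; biases[j] is the bias of neuron j.
    """
    outputs = []
    for w_j, theta_j in zip(weights, biases):
        net_j = sum(w * o for w, o in zip(w_j, inputs))
        outputs.append(sigmoid(net_j, theta_j))
    return outputs


def forward(x, hidden_w, hidden_b, output_w, output_b):
    """Three-layer forward pass; the input layer passes x through unchanged."""
    hidden = layer_output(x, hidden_w, hidden_b)
    return layer_output(hidden, output_w, output_b)
```

With 20 inputs and a single output node, `hidden_w` would have one row of 20 weights per hidden neuron and `output_w` a single row with one weight per hidden neuron.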

3. Experiments and Analysis

3.1. The Parameter Settings

To evaluate the performance of the proposed approach, a series of simulations was conducted. Without loss of generality, this study assumed that each quality characteristic was initially sampled from a normal distribution with zero mean and unit standard deviation. In addition, we assumed that twenty quality characteristics were monitored simultaneously (i.e., $p = 20$), and the covariance matrix was defined as in (2).

Because we considered 20 quality characteristics for the multivariate normal process, there are $2^{20} - 1$ possible types of mean shifts. Each type can be represented by a 20-element indicator vector, where 1 denotes a quality characteristic that is at fault and 0 denotes a quality characteristic that is not at fault. For an abnormal mean vector structure, we considered three types of mean shifts for demonstration. This study also considered three different values of the correlation $\rho$: 0.1, 0.5, and 0.9. The sample size was assumed to be 10. Two values of the shift magnitude were considered: 0.5 and 1.0. We repeated the simulation 500 times for each data structure.

The structure of the ANN was established as follows. When applying the ANN in a single stage, we had 20 input nodes and one output node in the ANN structure. The number of hidden nodes was varied from $n_i - 2$ to $n_i + 2$, where $n_i$ is the number of input variables. Thus, in the initial phase, the candidate numbers of hidden nodes were 18, 19, 20, 21, and 22.
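A simulation of this kind might be set up along the following lines in Python. The sketch rests on assumptions that are not confirmed by the text above: it uses an equicorrelated covariance structure (unit variances and a common correlation `rho`), generated with a shared random factor, and it adds the shift directly to the faulty characteristics in standard deviation units. All function and parameter names are hypothetical.

```python
import math
import random


def simulate_subgroup(p, n, rho, shift, faulty, rng):
    """Draw n observations of p unit-variance normal characteristics.

    Assumption: every pair of characteristics has the same correlation rho,
    obtained as sqrt(rho) * common + sqrt(1 - rho) * individual noise.
    The characteristics listed in `faulty` have their mean raised by `shift`.
    """
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    subgroup = []
    for _ in range(n):
        common = rng.gauss(0.0, 1.0)
        obs = [a * common + b * rng.gauss(0.0, 1.0) for _ in range(p)]
        for k in faulty:
            obs[k] += shift
        subgroup.append(obs)
    return subgroup


def fault_pattern_count(p):
    """Number of non-empty subsets of p characteristics that can be at fault."""
    return 2 ** p - 1


def hidden_node_candidates(n_inputs):
    """Candidate hidden-layer sizes from n_inputs - 2 to n_inputs + 2."""
    return list(range(n_inputs - 2, n_inputs + 3))
```

For example, `simulate_subgroup(20, 10, 0.5, 1.0, [0, 1], random.Random(0))` would produce one subgroup of size 10 in which the first two of 20 characteristics are shifted; repeating the call 500 times per setting would mirror the replication count stated above.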

Following the suggestions of the study [36], the learning rates were set to 0.01, 0.005, and 0.001. After performing ANN modeling, we selected the topology that, with a learning rate of 0.01, provided the best result in terms of the minimum test RMSE. A topology is written as $\{n_i\text{-}n_h\text{-}n_o\}$, where $n_i$, $n_h$, and $n_o$ denote the number of neurons in the input layer, the hidden layer, and the output layer, respectively.

3.2. The Results

For the hybrid LR-ANN model, this study calculated the variance inflation factor (VIF) to examine the presence of collinearity, used a 0.05 significance level, and employed logistic regression analysis to select important influencing variables in the initial stage. VIF values greater than 10 were considered large enough to suspect serious multicollinearity [37–39]. As shown in Table 1, all of the VIFs are less than 10. Consequently, collinearity among the independent variables was not a serious concern. The analysis results of LR modeling are summarized in Table 2. The significant variables selected in this stage served as the input variables of the ANN.

For the hybrid MARS-ANN model, we obtained the selection results of the variables after performing the MARS procedure. Tables 3, 4, 5, 6, 7, and 8 list the selection results for the MARS models for the six different combinations of the correlation $\rho$ and the shift magnitude. In this selection procedure, the important explanatory variables were chosen; their relative importance indicators are listed in the last column of Tables 3 to 8.

When the first stage of hybrid modeling was completed, the ANN topology settings were established. Table 9 displays the corresponding ANN topologies for various types of hybrid models. The network topology with the minimum test RMSE was again considered as the optimal network topology. The learning rate of 0.01 was used for all of those models.

This study used the classical single-stage ANN model and the proposed two-stage MARS-ANN and LR-ANN models to determine the source of mean shift faults in a multivariate process. The experimental results are displayed in Table 10.

Table 10 reveals that the two-stage MARS-ANN and LR-ANN approaches exhibit better performance than the classical single-stage ANN method in many situations. Based on the results shown in Table 10, it is noted that when the type of mean shift is , the LR-ANN approach exhibits the best performance in terms of accurate identification rates (AIR) for all combinations. The MARS-ANN approach was preferable to the single stage of the ANN in almost every case. The last two rows of Table 10 list the average and standard errors of the accurate identification rates. The proposed hybrid approaches, LR-ANN and MARS-ANN, outperformed the classical method, which is the single stage of the ANN. The proposed MARS-ANN approach had the smallest standard error, which implies the robustness of the mechanism. After comparing the performances of the LR-ANN and MARS-ANN approaches, we determined that the MARS-ANN approach is superior. The reason may be that population stratification in logistic regression analysis can lead to bias in estimates and test statistics. As a result, the results of the LR-ANN approach were somewhat unstable.

Table 11 summarizes the AIR for three different correlations, namely, the low, the moderate, and the strong correlations. The standard deviations of those AIR values are listed in parentheses. Table 11 shows that the proposed hybrid models outperform the classical single-stage ANN model in almost all cases. In particular, the proposed MARS-ANN has the best and the most robust performance among the three modeling approaches.

Table 12 shows the overall improvement percentage of the proposed models in comparison with the classical single-stage model. The AIR improvements of the proposed LR-ANN model over the classical ANN model for the three types of correlations are 18.73%, 10.67%, and −2.50%, respectively. Although there is no improvement for the case of $\rho = 0.9$, the average AIR improvement is 8.97%. In addition, the AIR improvements of the proposed MARS-ANN model over the classical ANN model for the three types of correlations are 14.39%, 15.85%, and 6.96%, respectively. Accordingly, the average AIR improvement reaches 12.73%.

One important result is that our proposed approach is useful in dealing with the difficulty of smaller shifts in a multivariate process. The case of the smaller shift value (i.e., a shift magnitude of 0.5) draws particular attention from industries because it is very difficult to identify the sources of small mean shifts. Considering all the cases with a shift magnitude of 0.5, Table 10 illustrates that improvements of 21.12% and 17.00% in identification can be achieved when the proposed LR-ANN and MARS-ANN schemes are used, respectively. These improvements in identification are substantial.

4. Conclusions

The ANN has been criticized for its long training process; however, the combination of LR or MARS with an ANN is a good alternative for performing classification tasks. Accordingly, the proposed LR-ANN and MARS-ANN schemes were shown to be useful for determining the mean shift faults in a multivariate process.

The rationale behind the proposed schemes was initially to obtain fewer important explanatory variables by performing LR or MARS modeling. The resulting significant variables served as inputs to the designed ANN models. The proposed LR-ANN and MARS-ANN models not only have fewer input variables but also possess better classification capabilities.

The proposed hybrid two-stage models in this study are not the only combination techniques; other artificial intelligence techniques, such as decision tree or genetic algorithms, can be integrated with neural networks or a support vector machine to further refine the structure of the classifiers and improve classification accuracy. The applications of other process faults, such as variance shift faults, for a multivariate process should be further investigated.

Data-driven methods of multivariate statistical process control have attracted considerable interest from both the academic community and industry as an important tool in the process monitoring area. Since practical systems are becoming more and more complicated and physical models are becoming extremely hard to obtain, considering the related topics within a data-driven framework seems more meaningful in current and future work to achieve more industry-oriented results [40–44]. In addition, real-time implementation of a fault-tolerant control system with performance optimization is an important issue in modern industries [45]. Extensions of the proposed procedures to data-driven design or to real-time implementation of a fault-tolerant control system are possible. Such work deserves further research and is our future concern.

Acknowledgment

This work is partially supported by the National Science Council of the Republic of China, Grant no. NSC 102-2221-E-030-019 and Grant no. NSC 102-2118-M-030-001.