Discrete Dynamics in Nature and Society

Volume 2015, Article ID 892740, 7 pages

http://dx.doi.org/10.1155/2015/892740

## Change Point Determination for an Attribute Process Using an Artificial Neural Network-Based Approach

Department of Statistics and Information Science, Fu Jen Catholic University, New Taipei City 24205, Taiwan

Received 28 January 2015; Revised 6 May 2015; Accepted 6 May 2015

Academic Editor: Carlo Piccardi

Copyright © 2015 Yuehjen E. Shao and Ke-Shan Lin. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Change point identification plays a vital role in improving an attribute process. Knowing when a disturbance began helps process personnel determine the corresponding root causes quickly and improve the underlying process. Although many studies have focused on identifying the change point of a process, a generic identification approach has not been developed. The typical maximum likelihood estimator (MLE) approach has two limitations in particular: it requires the process distribution to be known in advance, and its derivation can be mathematically difficult. These deficiencies are commonly encountered in practice. Accordingly, this study proposes an artificial neural network (ANN) mechanism to overcome the difficulties of the typical MLE approach in determining the change point of an attribute process. Specifically, the performance of the statistical process control (SPC) chart alone, the typical MLE approach, and the proposed ANN mechanism is investigated for the following cases: (1) a known attribute process distribution for which the associated MLE is available, (2) an unknown attribute process distribution for which the MLE cannot be used, and (3) an unknown attribute process distribution for which the MLE is misused. The results and the superior performance of the proposed approach are reported and discussed.

#### 1. Introduction

Statistical process control (SPC) charts have been widely reported to be successful in monitoring manufacturing processes. An SPC signal is triggered when evidence suggests that a disturbance has entered the underlying process. The signal implies that process personnel need to search for the root causes of the disturbance. The sooner the root causes are correctly identified, the greater the process improvement that can be achieved. Typically, the search for the root causes depends mainly on identifying the change point, that is, the starting time of the disturbance. The change point carries the most relevant information about the disturbance, and process personnel can determine the root causes much more easily when that information is available. As a consequence, the identification of the change point has become a promising research topic.

In recent years, many studies on change point determination have been reported [1–8]. For example, the MLE approach was combined with $\bar{X}$ and $S$ control charts, respectively, to monitor normal processes [1, 2]. The combination of EWMA and CUSUM control charts with the MLE was also investigated to determine the change point of a normal process [3]. In addition to the univariate applications, change point determination for multivariate processes has been reported [4, 5]. Those studies share the same assumption; that is, the process distribution is known. Although they have reported the effectiveness of the MLE approach [1–4], the MLE has one major drawback for change point estimation [5]: the distribution of the process must be assumed in advance. If the process distribution cannot be confirmed in advance, which is typical in practice, the MLE approach tends to underestimate the true change point.

While most research has investigated the MLE approach for determining the change point of variable processes [1–4], fewer studies have focused on the MLE for attribute processes [6–8]. In addition, a generic identification approach for change point determination has not been developed. Accordingly, we propose a generic approach to overcome the difficulties of the MLE in determining the change point of an attribute process. Our proposed approach involves the integrated use of the artificial neural network (ANN) and the binomial cumulative probability. Using the proposed approach, the change point of an attribute process can be accurately and reliably determined.

This study considers three general cases to evaluate the performance of the proposed and typical approaches. Case 1 assumes that the process distribution is known and the corresponding MLE can be derived. Case 2 involves the situation where the process distribution may be either known or unknown and the corresponding MLE cannot be derived. Case 3 considers the situation where the process distribution is unknown but the MLE is misused.

This study is organized as follows. The following section discusses the concept of the proposed generic approach for determining the change point of an attribute process. Section 3 discusses three cases in which the typical and the proposed approaches are used, and their performance is demonstrated and discussed. The final section concludes this study.

#### 2. The Proposed Approach

In contrast to typical change point applications, this study proposes a generic approach to change point determination for an attribute process. Since the MLE method may not be usable for estimating the change point in this setting, the underlying process is treated as distribution-free. As a consequence, the proposed approach is applicable to any type of attribute process. Our proposed approach involves the integrated use of the ANN and the binomial cumulative probability.

##### 2.1. The Identification Strategy

Suppose that an attribute process is monitored by an attribute control chart and an out-of-control signal is triggered at time $T$. Unless the signal is a false alarm, it implies that a disturbance entered the process at (or before) time $T$. By observing only the SPC signal, process personnel may conclude that the change point occurred at time $T$. However, a disturbance has typically entered the process at time $T-1$, $T-2$, $T-3$, or much earlier than time $T$. As a consequence, the change point should not be judged by the SPC signal alone.

Some researchers use the MLE method to estimate the change point. For example, consider an attribute process that follows a binomial distribution. Assume that the binomial process is initially in control and the observations come from a binomial distribution with known parameters $n$ and $p_0$, where $n$ is the sample size and $p_0$ stands for the probability of obtaining a nonconforming product in a state of statistical control. At an unknown time $\tau$, the process parameter changes from $p_0$ to $p_1$, where the magnitude of the change is unknown and $p_1$ stands for the probability of obtaining a nonconforming product in an out-of-control state. Let $X_i$ be the observation (i.e., the number of nonconforming products) at time $i$. We then have

$$X_i \sim \begin{cases} B(n, p_0), & i = 1, 2, \ldots, \tau - 1, \\ B(n, p_1), & i = \tau, \tau + 1, \ldots, T, \end{cases} \tag{1}$$

where the notation "$\sim B(n, p)$" stands for "has a binomial distribution with parameters $n$ and $p$," $\tau$ is the change point of the process, $T$ is the signal time at which a sample point exceeds the attribute control chart's limits, and $\tau \le T$.

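Once a disturbance has entered the process at time $\tau$, the in-control binomial process, which followed a $B(n, p_0)$ distribution, shifts to an out-of-control state (i.e., $B(n, p_1)$). As a consequence, the likelihood function is

$$L(\tau, p_1 \mid x_1, \ldots, x_T) = \prod_{i=1}^{\tau-1} \binom{n}{x_i} p_0^{x_i} (1-p_0)^{n-x_i} \prod_{i=\tau}^{T} \binom{n}{x_i} p_1^{x_i} (1-p_1)^{n-x_i}. \tag{2}$$

It can be shown that an MLE of the true change point can be obtained as follows [6]:

$$\hat{\tau} = \arg\max_{1 \le t \le T} \sum_{i=t}^{T} \left[ x_i \ln\frac{\hat{p}_{1,t}}{p_0} + (n - x_i) \ln\frac{1-\hat{p}_{1,t}}{1-p_0} \right], \tag{3}$$

where

$$\hat{p}_{1,t} = \frac{\sum_{i=t}^{T} x_i}{n\,(T - t + 1)}. \tag{4}$$

Although the performance of the MLE is acceptable, the MLE cannot be obtained when the underlying process distribution is unknown, which is common in practice. Therefore, a generic approach that does not require the process distribution is developed.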
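As a rough illustration of how a binomial change point MLE of this kind can be computed, the following Python sketch profiles the log-likelihood over every candidate change point. It assumes that the change point is the first sample drawn with the shifted parameter, and the names `counts`, `n`, and `p0` are hypothetical: they stand for the nonconforming counts observed up to the signal time, the sample size, and the in-control fraction. It is a sketch under these assumptions, not the authors' implementation.

```python
import math


def xlogy(x, y):
    """Return x * ln(y), using the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_change_point_mle(counts, n, p0):
    """Estimate the change point of a binomial process.

    counts[i - 1] is the nonconforming count at time i = 1..T, where T is
    the signal time. A candidate t means that samples t..T are treated as
    out of control and samples 1..t-1 as in control.
    """
    T = len(counts)
    best_t, best_value = None, -math.inf
    tail_sum = 0  # running sum of counts at times t..T
    for t in range(T, 0, -1):
        tail_sum += counts[t - 1]
        trials = n * (T - t + 1)
        p1_hat = tail_sum / trials  # MLE of the shifted fraction given t
        # Log-likelihood ratio of "shift at t" versus "no shift at all".
        value = (xlogy(tail_sum, p1_hat) - xlogy(tail_sum, p0)
                 + xlogy(trials - tail_sum, 1.0 - p1_hat)
                 - xlogy(trials - tail_sum, 1.0 - p0))
        if value > best_value:
            best_t, best_value = t, value
    return best_t
```

The backward loop keeps a running tail sum, so each candidate costs constant time. Ties are resolved in favor of the latest candidate, which is an arbitrary choice made for this sketch.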

In this study, we first apply the ANN classifier to predict the value of the output variable at each time from $T$ back to 1, in a backward sequence. The output variable is a binary digit, either 1 or 0. When the output is predicted as 0, we assume that the process is in control; that is, the process disturbance has not yet been introduced. When the output is classified as 1, the process is taken to be out of control and the disturbance has already entered. Thus, if the outputs obtained consecutively from time $T$ back to time $t$ (where $t \le T$) all equal 1, and the output at time $t-1$ equals 0, we should be able to conclude that the change point is time $t$.
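The backward reading of the classifier outputs can be sketched in a few lines of Python. The function name and the list `outputs` are hypothetical, and the sketch covers only the ideal situation in which the run of 1s is unbroken.

```python
def backward_run_start(outputs):
    """Return the earliest time t such that the outputs at t..T are all 1.

    outputs[i - 1] is the classifier output (0 or 1) at time i, and
    T = len(outputs) is the signal time. Returns None when the output at
    time T itself is not 1.
    """
    start = None
    for time in range(len(outputs), 0, -1):  # T, T-1, ..., 1
        if outputs[time - 1] != 1:
            break  # the run of 1s ends here
        start = time
    return start
```

With misclassified outputs the run may break too early, which is why a single 0 or 1 should not settle the decision on its own.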

However, what should we conclude when the outcomes are mixed, that is, when the outcome is 0 at one time and 1 at another? The answer is not straightforward. The values 1 and 0 can be viewed as the success and failure of a binomial experiment, respectively. A binomial experiment possesses the following properties.

(1) There are two types of outcomes, success or failure, in each Bernoulli trial.

(2) The success rate of each trial is $p$ and the failure rate of each trial is $1-p$.

(3) The trials are mutually independent. That is, the outcome of one trial does not influence the outcome of any other trial.

Since the decision about the change point of a process should not be made on the basis of a single outcome of the output variable, we can use the cumulative probability distribution of a binomial experiment to determine the change point. Suppose that an SPC signal is triggered at time $T$ and the change point occurred at time $\tau$. If the ANN has perfect classification capability (i.e., a 100% accurate identification rate (AIR)), the values of the output variable should be classified as 0 from time 1 to $\tau-1$ and as 1 from time $\tau$ to $T$. Accordingly, if we regard the accumulation of the perfect ANN output values, in a backward sequence, as a binomial random variable $Y$ with $m = T - \tau + 1$ trials, the corresponding cumulative probability would be

$$P(Y \le m) = \sum_{y=0}^{m} \binom{m}{y} p^{y} (1-p)^{m-y} = 1. \tag{5}$$

In general, if the proposed ANN has good classification capability, most of the output values from time $\tau$ to $T$ will be classified as 1. Equivalently, the cumulative probability of the binomial distribution will be near 1. Since no classifier is perfect in practice, some ANN outputs will be misclassified, and the cumulative probability will fall short of 1. The decision therefore has to rely on a threshold value that is less than 1: if the cumulative probability is greater than the threshold at a certain time $t$, we can conclude that the change point has occurred at time $t$.

However, there seems to be no theoretical threshold value. Based on our experience and numerous simulation results, the thresholds can be determined in the following steps.

(1) During training and testing in the ANN modeling phase (referred to as phase I), we obtain an accurate identification rate (AIR) for the classification task. This AIR is equivalent to the success rate ($p$) of the binomial experiment. Since the number of successes must be an integer, the following relationship holds:

$$k = [mp], \tag{6}$$

where $k$ stands for the number of successes in $m$ binomial trials and $[x]$ is the smallest integer that is greater than or equal to $x$. This $[mp]$ can serve as a standard, and the corresponding cumulative probability is deemed the threshold. Accordingly, the threshold is calculated as follows:

$$\text{threshold} = P(Y \le k) = \sum_{y=0}^{k} \binom{m}{y} p^{y} (1-p)^{m-y}, \tag{7}$$

where $Y$ is the accumulation of the binomial trial outputs.

(2) When phase I is completed, the ANN parameters are all set. To perform the confirmation test, we simulate new process data vectors and use the phase I ANN model to classify them. This confirmation test is referred to as phase II. The accumulation of the ANN outputs in phase II, defined as $Z$, would also follow a binomial distribution. This study defines the number of successes among the ANN outputs in phase II as $z$. At time $t$, with $m = T - t + 1$ outputs accumulated backward from time $T$, we can compute the cumulative probability of this binomial random variable; that is,

$$P(Z \le z) = \sum_{y=0}^{z} \binom{m}{y} p^{y} (1-p)^{m-y}. \tag{8}$$

(3) As a consequence, our decision rule is described as follows:

$$\text{time } t \text{ is } \begin{cases} \text{identified as the change point}, & \text{if } P(Z \le z) > \text{threshold}, \\ \text{not identified as the change point}, & \text{otherwise}. \end{cases} \tag{9}$$

Notice that more than one time period may be identified as the change point. Since the change point occurs at a single time only, we take the first appearance of an identified change point as the estimate of the change point.
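The threshold steps above can be sketched in Python as follows. The sketch assumes that the AIR plays the role of the binomial success rate, that the threshold is the binomial cumulative probability evaluated at the smallest integer not below `m * air`, and that a time is flagged when its phase II cumulative probability is greater than that threshold. The function and parameter names (`binomial_cdf`, `threshold`, `flags_change_point`, `m`, `air`, `z`) are hypothetical and chosen only for illustration.

```python
import math


def binomial_cdf(k, m, p):
    """P(Y <= k) for Y ~ Binomial(m, p)."""
    return sum(math.comb(m, j) * p ** j * (1.0 - p) ** (m - j)
               for j in range(0, k + 1))


def threshold(m, air):
    """Cumulative probability at k = ceil(m * air), used as the threshold."""
    # Rounding first guards against floating-point noise such as 7.000000001.
    k = math.ceil(round(m * air, 9))
    return binomial_cdf(k, m, air)


def flags_change_point(z, m, air):
    """True when z successes in m outputs give a probability above the threshold."""
    return binomial_cdf(z, m, air) > threshold(m, air)


# Example: 20 backward outputs and a phase I AIR of 0.9.
limit = threshold(20, 0.9)
print(round(limit, 4), flags_change_point(19, 20, 0.9))
```

Under the strict inequality used here, a time is flagged only when the number of 1s exceeds the rounded-up expected count, so short backward runs for which that count equals `m` can never be flagged. Whether the comparison should be strict is an assumption of this sketch.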

##### 2.2. The ANN Modeling

In recent years, a large number of studies on ANN applications have been reported [9–16]. An ANN is a massively parallel system composed of highly interconnected, interacting processing elements, or units, that are based on neurobiological models. ANNs process information through the interactions of a large number of these simple processing elements, also known as neurons. Knowledge is not stored within individual processing units but is represented by the strength of the connections between units [9].

To utilize the ANN, we need to design its structure. Each training data set includes 1000 data vectors: the first 500 data vectors come from an in-control state (i.e., no disturbance is involved), and the last 500 data vectors come from an out-of-control state. The testing data sets have the same structure as the training data sets; that is, each involves 1000 data vectors, the first 500 from an in-control state and the last 500 from an out-of-control state.
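A data set with this layout could be simulated along the following lines. The sketch assumes, purely for illustration, that each data vector is a single binomial count paired with its class label and that the in-control and out-of-control fractions are passed in as `p0` and `p1`; the function name `make_dataset` and the seed handling are hypothetical.

```python
import random


def make_dataset(n, p0, p1, size_per_state=500, seed=0):
    """Return a list of (count, label) pairs.

    The first `size_per_state` pairs are drawn in control (label 0) and
    the remaining `size_per_state` pairs out of control (label 1).
    """
    rng = random.Random(seed)

    def draw(p):
        # One binomial count: number of nonconforming items among n.
        return sum(rng.random() < p for _ in range(n))

    data = [(draw(p0), 0) for _ in range(size_per_state)]
    data += [(draw(p1), 1) for _ in range(size_per_state)]
    return data
```

Separate seeds would normally be used for the training and testing sets so that the two sets are independent draws.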

The ANN nodes can be divided into three layers: the input layer, the output layer, and one or more hidden layers. The nodes in the input layer receive input signals from an external source and the nodes in the output layer provide the target output signals. The output of each neuron in the input layer is the same as the input to that neuron. For each neuron $j$ in the hidden layer and neuron $k$ in the output layer, the net inputs are given by [17]

$$net_j = \sum_{i} w_{ji}\, o_i, \tag{10}$$

$$net_k = \sum_{j} w_{kj}\, o_j, \tag{11}$$

where $i$ ($j$) is a neuron in the previous layer, $o_i$ ($o_j$) is the output of node $i$ ($j$), and $w_{ji}$ ($w_{kj}$) is the connection weight from neuron $i$ ($j$) to neuron $j$ ($k$). The neuron outputs are given by

$$o_i = net_i, \qquad o_j = \frac{1}{1 + \exp\left[-(net_j + \theta_j)\right]}, \qquad o_k = \frac{1}{1 + \exp\left[-(net_k + \theta_k)\right]}, \tag{12}$$

where $net_i$ is the input signal from the external source to node $i$ in the input layer and $\theta$ is a bias. The transformation function shown in (12) is called the sigmoid function and is the one most commonly utilized to date. Consequently, the sigmoid function is used in this study.
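The forward computation described above can be sketched as follows for a network with one hidden layer. The function names and the nested-list weight layout (`w_hidden[j][i]` for the weight from input node `i` to hidden node `j`, and `w_out[k][j]` for the weight from hidden node `j` to output node `k`) are assumptions made for this illustration.

```python
import math


def sigmoid(net, theta):
    """Sigmoid transformation of a net input with bias theta."""
    return 1.0 / (1.0 + math.exp(-(net + theta)))


def forward(x, w_hidden, theta_hidden, w_out, theta_out):
    """Return (hidden outputs, output-layer outputs) for input vector x."""
    o_in = list(x)  # input neurons pass their inputs through unchanged
    o_hidden = [
        sigmoid(sum(w * o for w, o in zip(row, o_in)), theta)
        for row, theta in zip(w_hidden, theta_hidden)
    ]
    o_out = [
        sigmoid(sum(w * o for w, o in zip(row, o_hidden)), theta)
        for row, theta in zip(w_out, theta_out)
    ]
    return o_hidden, o_out
```

For a single output node, the returned value lies strictly between 0 and 1; one plausible way to obtain a binary class from it is to compare it with 0.5, although that cutoff is an assumption of this sketch.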

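The generalized delta rule is the conventional technique used to derive the connection weights of the feedforward network. Initially, a set of random numbers is assigned to the connection weights. Then, for a presentation of a pattern $p$ with target output vector $t_p = [t_{p1}, t_{p2}, \ldots, t_{pM}]^{T}$, the sum of squared error to be minimized is given by

$$E_p = \frac{1}{2} \sum_{j=1}^{M} \left(t_{pj} - o_{pj}\right)^2, \tag{13}$$

where $M$ is the number of output nodes. By minimizing the error $E_p$ using the technique of gradient descent, the connection weights can be updated by using the following equations:

$$\Delta_p w_{ji} = \eta\, \delta_{pj}\, o_{pi}, \tag{14}$$

where $\eta$ is the learning rate, for output nodes

$$\delta_{pj} = \left(t_{pj} - o_{pj}\right) o_{pj} \left(1 - o_{pj}\right), \tag{15}$$

and for other nodes

$$\delta_{pj} = \left(\sum_{k} \delta_{pk}\, w_{kj}\right) o_{pj} \left(1 - o_{pj}\right). \tag{16}$$

Note that the learning rate affects the network's generalization and the learning speed to a great extent.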
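The error terms and the weight change of the generalized delta rule can be written out directly. The sketch below uses the plain gradient-descent form with sigmoid units and no momentum term, which is an assumption of this illustration; the function names are hypothetical.

```python
def output_delta(target, output):
    """Error term of an output node with a sigmoid transformation."""
    return (target - output) * output * (1.0 - output)


def hidden_delta(output, downstream_deltas, downstream_weights):
    """Error term of a hidden node.

    downstream_deltas[k] is the error term of node k in the next layer and
    downstream_weights[k] is the weight from this hidden node to node k.
    """
    back = sum(d * w for d, w in zip(downstream_deltas, downstream_weights))
    return back * output * (1.0 - output)


def weight_step(learning_rate, delta, upstream_output):
    """Change applied to the weight that feeds `upstream_output` into a node."""
    return learning_rate * delta * upstream_output
```

The factor `output * (1 - output)` is the derivative of the sigmoid, so each step moves a weight along the negative gradient of the squared error. A larger learning rate speeds up training but makes overshooting more likely.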

The inputs to the ANN are the values of the process outputs. The ANN output consists of one node, which indicates the classification of the process status. A value of 0 indicates that the process is in control, and a value of 1 indicates that the process is out of control.

#### 3. The Experimental Examples

In this section, we consider three general cases to compare the performance of the SPC chart alone, the MLE method, and our proposed approach. Case 1 involves a known binomial process distribution for which the corresponding MLE can be obtained. This study uses a typical *np* control chart to monitor this process, and the MLE is obtained as shown in (3). In case 2, this study assumes that the underlying process distribution is known; however, the MLE is not available. In this case, we apply the control chart of [18] to monitor a process that follows a negative binomial distribution. In case 3, the underlying process distribution is unknown to the process personnel, who nevertheless apply the binomial MLE to it. Specifically, we assume that the underlying process follows a discrete uniform distribution but that the process personnel mistake it for a binomial distribution. Thus, the process personnel use the binomial MLE to estimate the change point of a uniform process.

##### 3.1. Case 1: A Known Process Distribution with a Known MLE

Suppose that an attribute process follows the binomial distribution with parameters ($n$, $p$). This binomial process operates in statistical control during sample periods 1 to 100. The in-control parameters are arbitrarily chosen as $n = 100$ and $p_0 = 0.1$. After sample period 100, the out-of-control parameter is shifted to $p_1 = 0.12$, 0.14, 0.16, 0.18, and 0.20, respectively. Under this condition, the change point is sample period 101.

Three methods, namely, the control chart alone, the MLE, and the proposed approach, are employed to determine the change point. Since no extra information is available to assist the control chart alone, we can only take the SPC signal time as its estimate of the change point. Consider a simulation with the following conditions. Suppose that an in-control binomial process has the initial parameter setting ($n = 100$, $p_0 = 0.1$). A disturbance occurs at time 101. Consequently, the out-of-control binomial process is denoted by ($n = 100$, $p_1$). Figure 1 depicts the *np* control chart for monitoring the binomial process described above. The upper control limit (UCL), center line (CL), and lower control limit (LCL) are computed as follows:

$$\mathrm{UCL} = np_0 + 3\sqrt{np_0(1-p_0)} = 10 + 9, \qquad \mathrm{CL} = np_0 = 10, \qquad \mathrm{LCL} = np_0 - 3\sqrt{np_0(1-p_0)} = 10 - 9. \tag{17}$$

Therefore, UCL = 19, CL = 10, and LCL = 1. Observing Figure 1, it is apparent that the change point (i.e., time 101) is not equal to the chart's signal time (i.e., time 138).
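The three-sigma limits quoted above, and the way a signal time is read off the chart, can be reproduced with a short Python sketch. The setting `n = 100` and `p0 = 0.1` is the one consistent with the stated limits of 19, 10, and 1; the function names are hypothetical, and a point is treated as a signal when it falls strictly outside the limits.

```python
import math


def np_chart_limits(n, p0):
    """Return (LCL, CL, UCL) of a three-sigma np control chart."""
    center = n * p0
    half_width = 3.0 * math.sqrt(n * p0 * (1.0 - p0))
    return center - half_width, center, center + half_width


def first_signal_time(counts, lcl, ucl):
    """Return the first time (1-based) at which a count falls outside the limits."""
    for time, count in enumerate(counts, start=1):
        if count > ucl or count < lcl:
            return time
    return None  # no signal within the observed counts


lcl, cl, ucl = np_chart_limits(100, 0.1)
print(round(lcl, 6), round(cl, 6), round(ucl, 6))
```

Applied to a simulated sequence, `first_signal_time` gives the estimate used by the control chart alone, which can lag the true change point by many samples, as the gap between times 101 and 138 in Figure 1 shows.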