Abstract

This paper proposes a new method to classify multiscale data from diversified processes and to establish a monitoring model for them. The advantages of the proposed approach are as follows. (1) The multiscale nature of diversified process data is taken into account, which enhances fault monitoring. (2) From a new perspective, common and specific characteristic subspaces are extracted, which helps simplify the structure of the monitoring model. (3) The correlation between the common subspace and the input-output dataset of each mode is made as close as possible. The effectiveness of the proposed method is demonstrated in the Experiment Results section.

1. Introduction

With the advance of data-driven technology, feature extraction has been widely applied in complex industrial processes [1–5]. Statistical multivariate monitoring algorithms [5–9] such as principal component analysis (PCA) [10, 11], partial least squares (PLS) [12], and independent component analysis (ICA) [13–15] have been extensively used to analyze process data and uncover the relationships between variables [16].

The electrofused magnesium furnace (EFMF) is a type of equipment for producing magnesia. Developing a proper method for monitoring process performance is key to ensuring safety and product quality [17–19]. The methods above play an important role in addressing monitoring problems. However, the EFMF operates under intricate conditions such as frequent changes, strong nonlinearity, and diversified modes [20]. These features make monitoring difficult. Modeling and monitoring diversified processes is therefore an interesting and challenging problem [3].

For monitoring diversified processes, a variety of strategies have been presented [4, 21–23], including the subPLS modeling algorithm [24–26], recursive or adaptive PCA [27], model library based methods [28], localized discriminant analysis [29], multiblock PLS (MBPLS), discriminant analysis [26], the Gaussian mixture model [30], and diversified statistical analysis methods [20, 31, 32]. Among the existing nonlinear methods, kernel-based techniques have been successfully developed in recent years to tackle the nonlinear problem [33, 34]. Multiway PLS (MPLS) uses the variables of all modes, which captures the time correlations throughout the cycle and shows the cumulative effects in the production process. Nevertheless, when the data for multiple phases or diversified processes are handled in a single matrix, the connections between them are lost. It is widely believed that more latent knowledge can be mined by dividing the data into meaningful blocks and building multiple specific relationships for the whole dataset [35]. The impact of each block can then be understood explicitly, and the subsequent analysis can proceed [17, 35–40]. MBPLS is one such method; it builds the variable correlation model within each mode under the influence of the other modes. Compared to MPLS, the MBPLS algorithm is applied to monitor large-scale continuous processes. Zhang et al. [41] have made a new comprehensive evaluation of multiblock methods and shown that the super scores of MBPLS are consistent with the scores of regular MPLS. The diversified processes model could therefore be established in a way that is entirely different from the traditional methods.

In this paper, a new method is proposed for modeling and monitoring diversified processes with multiscale, such as the EFMF. From a philosophical perspective, although the processes are diversified, they share common characteristics because they belong to the same industrial production. Based on this view, the common and specific characteristic subspaces are extracted. The common characteristic subspaces represent the essential attributes shared by the different process data spaces. The specific characteristic subspaces represent the nonessential attributes, which are nevertheless an important part of the database. The fundamental purpose of extracting the two kinds of subspaces is to discover what is invariant across the diversified processes and to gain more useful knowledge about the process behaviors. Compared with traditional data-driven approaches for diversified processes, the main difference is that both modeling and monitoring are carried out in the common and specific characteristic subspaces. Furthermore, this approach captures the relation between input and output, which plays a key role in industrial production. It makes the correlation between the common subspace and the input and output datasets of each mode as close as possible. Experiment results show that the proposed method is effective.

The paper is organized as follows. Section 2 presents the theory, which covers the diversified classification method and the new modeling method for diversified processes with unequal scale, and also gives the monitoring approach in the separated subspaces. Section 3 describes the process and presents the experiment results and discussion. Finally, conclusions are given in Section 4.

2. Modeling and Monitoring of Diversified Processes

2.1. Diversified Processes Modeling Based on Subspace Separation Method

In this section, we consider the part of the potential mode variation that stays consistent between two modes and reveals the same process characteristics. A key point is how to separate the two different types of process variation. Based on the underlying process characteristics, the EFMF diversified processes can be classified into two modes, where input dataset and output dataset are obtained, where the subscript denotes different operation modes, denotes the number of samples for different modes, and the scale of in each mode may be different, denotes the number of variables for the industrial process monitoring, and is the total scale of two modes. The datasets of the two modes are standardized. We first map into a feature space via a nonlinear mapping (feature space), where , . Then subspace is extracted, which explores the common structure hidden in the high-dimensional dataset of each mode. In this section, a new modeling method based on subspace separation is proposed for diversified processes with unequal scale . Some of the variable correlations are shared between the two modes. In the space of the two modes we can find a common subspace, which shows the common contribution to the diversified processes. Multiple-model methods obtain better monitoring performance for each mode, but they neglect the correlations between modes. In the proposed approach, the diversified processes are separated correctly because the correlations between the two modes are considered. The common subspace of diversified processes with unequal scale should be transformed into the equivalent approximate common space, which has the same size as the input dataset to ensure that the specific subspace can be separated. The notations used in this paper are listed in the Notations section.

To extract the common knowledge, kernel locally linear embedding (KLLE) [42] is used, which is a modification of the original locally linear embedding (LLE) algorithm [43]. For each sample point and its neighbor points , the following reconstruction error of the weight matrix is minimized: where the weight represents the size of the contribution of the sample points to the reconstruction of and . The approach in this paper requires that and its neighbors reflect the structural weight relation after minimizing the following function: where . The method requires the constraints shown in (2):

Because the scale of varies between modes and the dimension of is not equal to , the common subspace should be transformed into the equivalent approximate common characteristic space, which has the same dimension as the input dataset to ensure that the specific characteristic subspace can be separated. To build the correlation model between the common space , , and the approximate combination coefficients matrix is introduced. The equivalent approximate common space of each mode is denoted as . Let be the score vector of and . Here, is the load vector of , which is a unit vector; that is, . Proceeding similarly for , let be the score vectors of , , and let be the load vectors of , . The common space should make the correlation between itself and each mode as close as possible. A constant scalar is introduced, and the objective function and constraints are as follows:

The model of the output-input dataset is built by maximizing the covariance of the output datasets , , and after extracting the common space of .

Let be the score vectors of , , and let be the load vectors of , . The common space should make the correlation between itself and each mode as close as possible, that is, it should maximize the polynomial . The differences between and should be very small. A constant scalar is introduced, and the optimization problem is expressed as

The common characteristic subspace in diversified processes should explore the common characteristic structure hidden in the high-dimensional dataset of each mode and make the correlation between itself and each mode and as close as possible. The three terms are considered approximately equally important, and thus the weights of each mode can be determined. According to (4)-(5), the approximate equivalent common space can be obtained:

The extraction of the common subspace is summarized as follows.

Step 1. Choose a proper nonlinear mapping ; map the input data into the feature space .

Step 2. Search the nearest neighbors of each sample point using the Euclidean distance to get the weight value matrix .

Step 3. The approximate combination coefficients matrix and constant scalars are introduced. The equivalent approximate common space of each mode is denoted as . Extract the common subspace according to the invariant weight matrix and make the correlation between itself and each mode and as close as possible.

Step 4. The common subspace dataset is extracted.
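As an illustration of the neighbor search and weight computation in Step 2, the following is a minimal sketch of standard LLE reconstruction weights computed in the input space. It is not the kernel variant used in this paper: the nonlinear mapping is omitted, and the names `lle_weights`, `solve_linear`, and the regularization parameter `reg` are hypothetical choices made only for this example.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def lle_weights(points, k, reg=1e-3):
    """For each point, return (neighbor indices, reconstruction weights).

    The weights minimize the squared reconstruction error of the point
    from its k nearest neighbors (Euclidean distance) and sum to one.
    """
    n = len(points)
    result = []
    for i in range(n):
        xi = points[i]
        # k nearest neighbors of point i, excluding the point itself
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(xi, points[j]))
        nbrs = order[:k]
        # local Gram matrix of the differences between xi and its neighbors
        diffs = [[a - b for a, b in zip(xi, points[j])] for j in nbrs]
        gram = [[sum(p * q for p, q in zip(d1, d2)) for d2 in diffs]
                for d1 in diffs]
        # regularize, since the Gram matrix is singular when k > dimension
        trace = sum(gram[j][j] for j in range(k))
        eps = reg * trace if trace > 0 else reg
        for j in range(k):
            gram[j][j] += eps
        w = solve_linear(gram, [1.0] * k)
        total = sum(w)
        result.append((nbrs, [v / total for v in w]))  # enforce sum(w) == 1
    return result
```

The final normalization enforces the usual LLE constraint that the weights of each sample sum to one. In a kernel version, the Gram matrix entries would be computed from kernel evaluations instead of input-space inner products.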

Once is obtained, the specific characteristic subspace is computed by

It makes the correlation between itself and each mode and as close as possible. For the details of these steps, see the Appendix.

2.2. Diversified Processes Monitoring

Based on the characteristic subspace separation approach in this paper, the monitoring procedure is introduced below. Universality reflects the characteristics shared by the two processes, and individuality shows the differences between them. Consequently, the characteristic knowledge of the system can be obtained.

In this way, the input dataset is separated into a common characteristic subspace and a specific characteristic subspace, and monitoring is mainly carried out in the specific characteristic subspaces. Once a fault occurs in the processes, the other variables are affected as well, so the monitoring method can detect it easily. The specific knowledge is resolved as

Fault detection mainly consists of monitoring the two subspaces above. When a new sample is obtained, the score and residual are calculated by the following two equations:

Then statistics and statistics can be calculated by the following equations:where is the sample covariance of score . and are defined in (9).

The upper control limit for $T^2$ cannot be calculated using the $F$-distribution because the score does not obey a Gaussian distribution. In this paper, kernel density estimation is applied to determine the control limit for $T^2$. The control limit for SPE is computed according to the following weighted $\chi^2$ distribution [8]: $\mathrm{SPE} \sim g\chi^2_h$, where $g$ and $h$ are the parameters of the weighted $\chi^2$ distribution.
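The two kinds of control limit described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the kernel density estimate uses a Gaussian kernel with a rule-of-thumb bandwidth, the weighted distribution is taken to be a weighted chi-square whose two parameters are matched to the sample mean and variance, and the chi-square quantile is computed with the Wilson-Hilferty approximation. The function names `kde_limit` and `weighted_chi2_limit` are hypothetical.

```python
from statistics import NormalDist, mean, pvariance, stdev

_STD_NORMAL = NormalDist()


def kde_limit(values, confidence=0.99):
    """Control limit as the `confidence` quantile of a Gaussian KDE."""
    n = len(values)
    # rule-of-thumb bandwidth (an assumption; other choices are possible)
    h = 1.06 * stdev(values) * n ** (-0.2)

    def cdf(x):
        # CDF of the KDE: average of the kernel CDFs centered at each value
        return sum(_STD_NORMAL.cdf((x - v) / h) for v in values) / n

    # bisection search for the point where the estimated CDF hits the target
    lo, hi = min(values) - 10 * h, max(values) + 10 * h
    for _ in range(100):
        mid = (lo + hi) / 2
        if cdf(mid) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def weighted_chi2_limit(values, confidence=0.99):
    """Control limit g * chi2_h(confidence) with moment-matched g and h."""
    m = mean(values)
    v = pvariance(values)      # population variance of the training values
    g = v / (2 * m)            # weight
    h = 2 * m * m / v          # degrees of freedom (may be non-integer)
    z = _STD_NORMAL.inv_cdf(confidence)
    # Wilson-Hilferty approximation of the chi-square quantile
    chi2 = h * (1 - 2 / (9 * h) + z * (2 / (9 * h)) ** 0.5) ** 3
    return g * chi2
```

Both functions take the statistic values computed on normal training data and return a scalar limit; a new sample would be flagged when its statistic exceeds that limit.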

3. Experiment Results

The purpose of this section is to analyze two modes in diversified processes with unequal scale and to examine whether faults occur, and in which mode, while the electrodes are being controlled. Information on the EFMF can be found in [42]. A large amount of data, both normal and abnormal, can be collected from the EFMF process. We used the normal data to build the model and the abnormal data to test the approach. The average duration of the whole EFMF diversified process is 10 h. The current and voltage of the three phases and the furnace temperature can all be measured online, which provides abundant process knowledge.

In this experiment, the modeling dataset consists of 1500 × 4 sample points, comprising current and temperature values. Because two kinds of raw material can be melted in the furnace, the EFMF process can be separated into two modes using the proposed diversified classification method. The raw data obtained from the EFMF process are shown in Figure 1.

First, the standardized matrices are obtained from the input and output dataset matrices. Define a scale as the minimum scale of each mode; in this experiment . If the scale of a candidate “mode” is greater than , it is considered a real mode and is assigned to mode 1 or mode 2. Conversely, if the scale of a candidate “mode” is less than , it is marked as noise, which is abnormal and should be neglected. Moreover, the two modes have different characteristics and effects on the process, which are illustrated by their specific subspaces. With this diversified classification method, the diversified processes can be divided into two different modes. As shown in Figure 1, the first 600 sample points belong to mode 1, and the others belong to mode 2. In fact, the raw material of the EFMF is powdered caustic magnesite in mode 1 and magnesite ore in mode 2.
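The minimum-scale rule described above can be sketched as follows. The example assumes that each sample already carries a candidate mode label (how those labels are produced is not specified here), and it treats a run whose length equals the minimum scale as a real mode, which the text leaves open. The function name `filter_modes` and the parameter `min_scale` are hypothetical.

```python
from itertools import groupby


def filter_modes(labels, min_scale):
    """Split per-sample labels into runs and mark short runs as noise.

    Returns a list of (start, stop, mode) tuples with stop exclusive.
    Runs shorter than `min_scale` get mode None, meaning noise.
    """
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        mode = label if length >= min_scale else None
        segments.append((start, start + length, mode))
        start += length
    return segments


# Hypothetical labels: a long run, a short spurious run, another long run
labels = [1] * 6 + [3] * 2 + [2] * 9
print(filter_modes(labels, min_scale=5))
```

Segments marked `None` would be discarded before modeling, so that only the two real modes contribute to the common and specific subspaces.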

Second, the common knowledge of the two modes is extracted by the proposed subspace separation method. The specific knowledge is then obtained by separating the common subspace from the data of the two modes. The variations of the diversified processes are explained by the specific subspaces.

Two faults are studied in this section. Both faults are related to the output dataset and are used to test whether the proposed method can effectively monitor the diversified processes. Before the monitoring experiment, we choose 1500 normal samples for modeling using and statistics. statistics is used to monitor the common characteristic subspace in the two modes and statistics is used to monitor the specific characteristic subspace in the two modes, respectively. The modeling result is shown in Figure 2.

In the monitoring phase, if and statistics are both within their normal range, that is, below their control limits, we consider the diversified processes to be normal; otherwise, we consider that a fault has occurred. The monitoring results are shown in Figures 3 and 4. The figures show that fault 1 occurs in mode 1 and fault 2 occurs in mode 2. Moreover, both and detect the faults when they occur. In particular, shows good performance.
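The decision rule above amounts to a per-sample comparison of the two statistics with their control limits. A minimal sketch follows, assuming the statistic values, the limits, and a per-sample mode label are already available; all names (`detect_faults`, `common_stats`, `specific_stats`) are hypothetical and the numbers are made up for illustration.

```python
def detect_faults(common_stats, specific_stats, common_limit,
                  specific_limit, modes):
    """Return (sample index, mode) for every sample that raises an alarm.

    A sample is normal only if both statistics stay within their limits.
    """
    alarms = []
    for i, (c, s, mode) in enumerate(zip(common_stats, specific_stats, modes)):
        if c > common_limit or s > specific_limit:
            alarms.append((i, mode))
    return alarms


# Made-up values: sample 1 violates the specific-subspace limit in mode 1,
# sample 2 violates the common-subspace limit in mode 2
alarms = detect_faults([0.5, 0.7, 3.0, 0.4], [0.2, 1.5, 0.3, 0.1],
                       common_limit=2.0, specific_limit=1.0,
                       modes=[1, 1, 2, 2])
print(alarms)
```

Reporting the mode together with the sample index mirrors the way the monitoring results are read: an alarm indicates both that a fault occurred and in which mode to look for it.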

The experiment shows that the proposed diversified processes monitoring method can effectively determine whether faults have occurred. In addition, it provides a strong clue as to where to look for the cause of the out-of-control situation, that is, in which mode the fault occurred.

With the proposed diversified processes modeling and monitoring method, the underlying characteristic knowledge is decomposed into two subspaces. The new subspace separation method extracts the underlying characteristic knowledge of the two modes thoroughly. The proposed method also reflects the direct relationship between the input dataset and the output dataset, which improves the monitoring performance as well as the accuracy and stability of fault monitoring.

4. Conclusions

In this paper, a method for monitoring diversified processes with multiscale has been proposed. The approach extracts the common characteristic knowledge and the specific characteristic knowledge from different processes in the same industrial production. This knowledge may guide researchers in analyzing diversified processes. The direct relationship between the input dataset and the output dataset can also be obtained with the proposed method. The experimental results show how the proposed method performs and demonstrate a satisfactory improvement in modeling and monitoring large-scale processes with unequal length. The proposed method still has some shortcomings, and we hope that significant improvements can be made in future work.

Appendix

The Lagrange function is defined as follows in Steps 1–4:

Take the partial derivatives with respect to and set them equal to zero. Consider

The following can be obtained from the above equations:

The following can be obtained from the above equations:

To simplify the calculation, can be set as a sequence of values to ensure that there exist two orthogonal eigenvectors of ; that is, , , , and , are two orthogonal eigenvectors of and two corresponding eigenvalues. To obtain the equivalent approximate common space of each mode, and are introduced.

Notations

Important notations used in the proposed method are as follows.
Operation mode
:The number of samples
:The number of variables
:The dimension of common subspace
:Diversified input dataset
:Diversified output dataset
:Sample point of input dataset
:Sample point of output dataset
:Nonlinear mapping
:Nonlinear mapping of sample point
:Neighboring point
:Common subspace dataset
:Data point of common subspace
:Number of neighbor of
:Weight vector in the proposed method
:Error of construct weight matrix
:Loss function
:Score vector of
:Load vector of
:Score vector of
:Load vector of
:
:Combination coefficients matrix
:Equivalent approximate common space
:Score vector of
:Load vector of
:Specific subspace dataset.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by China’s National 973 program (2009 CB 320602 and 2009 CB 320604) and the NSF (60974057 and 61020106003).