Abstract

The polyvinyl chloride (PVC) polymerization process is a typical complex controlled object, characterized by nonlinearity, multiple variables, strong coupling, and large time delay. To meet the real-time fault diagnosis and optimized monitoring requirements of the large-scale key polymerization equipment in PVC production, a real-time fault diagnosis strategy is proposed based on rough set theory with an improved discernibility matrix and BP neural networks. The improved discernibility matrix is adopted to reduce the attributes of the rough set and thereby decrease the input dimensionality of the fault characteristics. A Levenberg-Marquardt BP neural network is trained on the reduced decision table to diagnose polymerization faults, which realizes the nonlinear mapping from the fault symptom set to the polymerization fault set. Simulation experiments based on industrial historical data show the effectiveness of the proposed rough set neural network fault diagnosis method. The proposed strategy increases the accuracy and efficiency of the polymerization fault diagnosis system.

1. Introduction

Polyvinyl chloride (PVC) is one of the five largest thermoplastic synthetic resins, and its production is second only to that of polyethylene (PE) and polypropylene (PP). PVC is a general-purpose resin of good quality that is widely used. It has good mechanical properties and chemical resistance, and it is corrosion-resistant and difficult to burn. The suspension method, which produces PVC resin from vinyl chloride monomer (VCM) as the raw material, is a typical batch chemical production process. The PVC polymerization process is a complex control system that is multivariable, uncertain, nonlinear, and strongly coupled.

The polymerization kettle is the key equipment of the PVC production process, in which vinyl chloride undergoes the polymerization reaction to generate polyvinyl chloride [1]. Whether the polymerization kettle runs steadily directly determines the working condition of the PVC production plant. The motor, reducer, and machine seal are in turn the key components that keep the polymerization kettle running normally. Once they fail, serious losses are incurred in the PVC polymerization process. Therefore, early diagnosis of the fault type and fault location of the polymerization kettle can avoid the large economic losses caused by a shutdown of the kettle, which is of practical significance for improving product quality and reducing production costs.

Rough set (RS) theory is a knowledge mining theory and a mathematical tool for describing imperfection and uncertainty [2]. Neural networks can effectively distinguish fault patterns caused by sensor failures, mismatch between process and model, noise, and disturbances [3–5]. The main goal of data-driven fault diagnosis is to realize fault diagnosis and isolation by finding the hidden fault patterns and the relationship between the data and the fault patterns. The hybrid fault diagnosis method that combines RS theory with a neural network takes advantage of both technologies: the rough set method serves as the front-end input system of the neural network, which reduces the complexity of the network and improves the precision and efficiency of fault diagnosis [6, 7], while the neural network serves as the back-end information recognition system, which makes up for the deficiency of the rough set. This combination is widely used in many industrial fields, such as electric power, metallurgy, and the chemical industry [8–11].

Therefore, a fault diagnosis method for the polymerization kettle based on the improved discernibility matrix and a BP neural network is put forward in this paper. The improved discernibility matrix is used to reduce the fault diagnosis attributes of the polymerization kettle and thus the input dimension of the fault characteristics. An LM-BP neural network is then used to diagnose the fault pattern of the polymerization kettle in order to improve the accuracy and efficiency of fault diagnosis.

The paper is organized as follows. In Section 2, the technique of the PVC polymerization process is introduced. In Section 3, the RS-NN fault diagnosis method for the polymerization kettle equipment is summarized. In Section 4, the improved LM-BP neural network algorithm is introduced in detail. The simulation experiments are discussed in detail in Section 5. Finally, Section 6 concludes the paper.

2. Technique of PVC Polymerization Process

A flowchart of the typical PVC polymerization kettle production process is shown in Figure 1 [12].

In the PVC polymerization process, all the raw materials and auxiliary agents are charged into the reaction kettle and dispersed fully and evenly by stirring. Cooling water is then circulated continuously through the jacket and the baffle plates of the reaction kettle to remove the heat of polymerization. When the conversion rate of VCM reaches a certain value, a corresponding pressure drop appears; the reaction is then terminated and the finished product is obtained. The polymerization of resin type SG-5 is taken as the example here: 26 tons of VCM are fed into the polymerization kettle, the conversion rate is about 85%, the reaction temperature is 55.4°C, and the heat released by the polymerization reaction is 1600 kJ/kg.
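As a rough check of the cooling duty implied by these figures, the total reaction heat of one batch can be estimated. The sketch below assumes that the 1600 kJ/kg figure refers to each kilogram of VCM that is actually converted; the text does not state the basis, and the variable names are illustrative only.

```python
# Rough batch heat estimate.
# Assumption: 1600 kJ is released per kg of VCM that is converted.
vcm_charge_kg = 26_000      # 26 t of VCM fed to the kettle
conversion = 0.85           # conversion rate of about 85%
heat_kj_per_kg = 1600       # reaction heat quoted in the text

converted_kg = vcm_charge_kg * conversion
total_heat_gj = converted_kg * heat_kj_per_kg / 1e6   # kJ -> GJ

print(f"converted VCM: {converted_kg:.0f} kg")
print(f"heat to remove: {total_heat_gj:.2f} GJ per batch")
```

Under this assumption one batch releases roughly 35 GJ, all of which has to be carried away by the cooling water in the jacket and baffles.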

3. RS-NN Fault Diagnosis Method of Polymerization Kettle

3.1. Structure of Polymerization Kettle Fault Diagnosis System

The structure of the proposed RS-NN fault diagnosis system of the polymerization kettle is shown in Figure 2. First, a group of fault samples is used to train the BP neural network and to determine the structure and parameters of the network. After training, the fault patterns are classified according to the given symptoms, which realizes the nonlinear mapping from the fault symptom set to the fault set.

3.2. Formation of Fault Information Table

A 70 m³ polymerization kettle is used to set up the RS-NN fault diagnosis system. The decision table has eight condition attributes: the reducer vibration value of the polymerization kettle (mm), the stirring current (A), the pressure of the mechanical seal (MPa), the operating pressure (MPa), the stirring speed (r/min), the reducer temperature (°C), the operating temperature of the polymerization kettle (°C), and the mechanical seal temperature (°C). Each attribute is represented by its own variable in the decision table.

The faults of the polymerization kettle include the motor fault, the reducer fault, and the machine seal fault. The main failure forms of the machine seal are gland-shaft damage and component damage. The decision attribute is defined according to the direct cause of the fault and takes five values, which stand for the normal working condition of the polymerization kettle, the motor fault, the reducer fault, the gland-shaft fault of the machine seal, and the component fault of the machine seal. The corresponding BPNN outputs of the five working conditions in the polymerization kettle production process are the normal working condition (0000), the motor fault (0001), the reducer fault (0010), the machine seal gland-shaft fault (0100), and the machine seal component fault (1000), respectively. The historical data of the polymerization kettle fault diagnosis decision system are shown in Table 1.
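The output coding above can be written down as a small lookup table. The following sketch maps the four network outputs back to a working condition; the 0.5 rounding threshold and the names `FAULT_CODES` and `decode` are assumptions made for illustration and are not taken from the paper.

```python
# Four-bit output codes of the five working conditions, as listed in the text.
FAULT_CODES = {
    "normal": (0, 0, 0, 0),
    "motor fault": (0, 0, 0, 1),
    "reducer fault": (0, 0, 1, 0),
    "machine seal gland-shaft fault": (0, 1, 0, 0),
    "machine seal component fault": (1, 0, 0, 0),
}

def decode(outputs, threshold=0.5):
    """Round raw network outputs to 0/1 (assumed threshold) and look up the condition."""
    bits = tuple(int(v >= threshold) for v in outputs)
    for name, code in FAULT_CODES.items():
        if code == bits:
            return name
    return "unknown"   # bit pattern that matches none of the five codes

print(decode([0.02, 0.10, 0.93, 0.04]))
```

A pattern with more than one active bit does not correspond to any listed condition, so the sketch reports it as unknown rather than guessing.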

3.3. Discrete Decision Table of Fault Diagnosis

Because rough set theory can only deal with discrete attribute values, the continuous data of the fault diagnosis decision system have to be discretized. In this paper, the continuous attributes are discretized according to expert experience: the value range of each condition attribute is divided into consecutive intervals, and each interval is coded by an integer starting from “1”. In the order of the eight condition attributes listed above, the attributes are divided into 2, 2, 2, 5, 4, 4, 5, and 2 intervals, respectively.

The discrete results are shown in Table 2.
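The interval coding can be sketched with NumPy. The cut points below are hypothetical placeholders and are not the expert thresholds used in the paper; only the mechanism (consecutive intervals coded from 1 upward) follows the text.

```python
import numpy as np

# Hypothetical cut points, for illustration only (NOT the paper's expert values).
cut_points = {
    "stirring_current_A": [120.0],                         # 2 intervals
    "operating_temperature_C": [50.0, 54.0, 57.0, 60.0],   # 5 intervals
}

def discretize(values, cuts):
    """Map continuous values to interval codes 1, 2, ..., len(cuts) + 1."""
    # np.digitize returns 0 for values below the first cut, so shift by 1.
    return np.digitize(values, cuts) + 1

temperatures = np.array([48.2, 55.4, 61.0])
codes = discretize(temperatures, cut_points["operating_temperature_C"])
print(codes)
```

With `np.digitize` in its default mode a value equal to a cut point falls into the upper interval; if the expert intervals are closed on the other side, `right=True` would be the appropriate setting.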

3.4. Attribute Reduction Based on Improved Discernibility Matrix

Attribute reduction is one of the key research topics in rough set theory. The attribute reduction method based on the discernibility matrix [13, 14] is an important approach, whose main idea is to first use the discernibility matrix to derive the discernibility function and then solve for its disjunctive normal form, each conjunction of which is a reduction of the rough set. The concept of the attribute 0-1 resolution matrix was put forward to transform the attribute reduction problem into a 0-1 matrix covering problem [15]. The concept and construction method of the traditional discernibility matrix are based on symbol-type information systems with complete attributes [16, 17]. A concept and construction method of a flexible discernibility matrix were proposed to reduce the attributes of incomplete information systems and Vague set attribute information systems [18]. To address the computational complexity of the traditional discernibility matrix, an attribute reduction algorithm based on an improved discernibility matrix is proposed here, which is advantageous when dealing with discernibility matrices of higher dimension.

3.4.1. Discernibility Matrix

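Definition 1. Assume $S = (U, C \cup D, V, f)$ is a decision table, where $U = \{x_1, x_2, \ldots, x_n\}$ is the set of samples, $C$ is the condition attribute set, $D$ is the decision attribute set, and $a(x_i)$ is the value of the sample $x_i$ on attribute $a$. The discernibility matrix $M = (m_{ij})_{n \times n}$ is then defined as follows:

$$m_{ij} = \begin{cases} \{a \in C : a(x_i) \neq a(x_j)\}, & D(x_i) \neq D(x_j), \\ 0, & D(x_i) = D(x_j). \end{cases}$$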

It can be seen from the definition of a discernibility matrix that each element, written $m_{ij}$ for the samples $x_i$ and $x_j$, is the set of all attributes that can distinguish the sample $x_i$ from $x_j$. For all samples, the elements also satisfy three basic properties, among them the symmetry discussed below.

The discernibility matrix is symmetric about the main diagonal, so only its upper or lower triangular part needs to be considered in the analysis. When the decision attributes of two samples are the same, the corresponding element of the discernibility matrix is 0. When the decision attributes of two samples are different and the samples can be distinguished by certain condition attributes, the corresponding element is the set of condition attributes on which the two samples differ.
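The construction can be illustrated on a toy decision table. The sketch below follows the two cases described above (empty entry for equal decisions, set of differing condition attributes otherwise); the table, the attribute names `a`, `b`, `c`, and the function name are hypothetical and do not come from Table 3.

```python
from itertools import combinations

def discernibility_matrix(rows, decisions):
    """Return the upper-triangular entries {(i, j): set of distinguishing attributes}."""
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            matrix[(i, j)] = frozenset()          # same decision: the entry is "0"
        else:
            matrix[(i, j)] = frozenset(
                a for a in rows[i] if rows[i][a] != rows[j][a]
            )
    return matrix

# Hypothetical discretized samples (condition attributes) and their decisions.
rows = [
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "b": 1, "c": 2},
    {"a": 2, "b": 2, "c": 2},
]
decisions = [0, 1, 1, 0]

matrix = discernibility_matrix(rows, decisions)
for pair, entry in matrix.items():
    print(pair, sorted(entry))
```

Only the pairs with `i < j` are stored, which uses the symmetry of the matrix noted in the text.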

3.4.2. Attribute Reduction Algorithm Based on Improved Discernibility Matrix

The attribute reduction algorithm based on the discernibility matrix and logic operations can be used to obtain all the possible attribute reductions of a decision table; it expresses the admissible attribute combinations as a logic formula and simplifies that formula. If an element of the discernibility matrix contains a single attribute, that attribute is indispensable, because it is the only attribute that can distinguish the two samples corresponding to the element. The set composed of all such single attributes in the discernibility matrix is the relative attribute core of the decision table. The attribute reduction algorithm based on the improved discernibility matrix is described as follows.

Step 1. Calculate the discernibility matrix of the decision table.

Step 2. Find each single-attribute element in the discernibility matrix, add its attribute to the core of the attribute reduction, and set all the elements containing that attribute to zero.

Step 3. For each element $m_{ij}$ of the discernibility matrix whose value is not 0 or 1, establish the corresponding disjunctive logic expression:

$$L_{ij} = \bigvee_{a \in m_{ij}} a.$$

Step 4. Combine all the disjunctive logic expressions by conjunction to obtain

$$L = \bigwedge_{i, j} L_{ij}.$$

Step 5. Convert the conjunctive normal form into the disjunctive normal form:

$$L' = \bigvee_{k} L_k,$$

where each $L_k$ is a conjunction of attributes.

Step 6. Output the attribute reduction results. Each conjunction of the disjunctive normal form, together with the core attributes retained in Step 2, corresponds to one attribute reduction, so each reduced condition attribute set is composed of the core attributes and all the attributes in one conjunction.
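The six steps can be sketched in a few lines for small tables. The function below takes the non-empty matrix entries as attribute sets, extracts the core from the single-attribute entries, drops every entry that contains a core attribute, and then enumerates the minimal attribute sets that intersect every remaining entry; these sets are exactly the conjunctions left after the conjunctive normal form is expanded and absorbed. The brute-force enumeration, the function name, and the example entries are assumptions for illustration, not the authors' implementation.

```python
from itertools import combinations

def reducts_from_clauses(clauses):
    """clauses: iterable of attribute sets, one per non-empty matrix entry."""
    clauses = {frozenset(c) for c in clauses if c}
    # Step 2: single-attribute entries give the core; entries containing a
    # core attribute are already satisfied and are dropped ("set to zero").
    core = frozenset(a for c in clauses if len(c) == 1 for a in c)
    remaining = [c for c in clauses if not (c & core)]
    # Steps 3-5: every minimal set that hits all remaining entries is one
    # conjunction of the disjunctive normal form (brute force, small cases only).
    attrs = sorted({a for c in remaining for a in c})
    minimal = []
    for size in range(len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = frozenset(combo)
            hits_all = all(candidate & c for c in remaining)
            if hits_all and not any(m <= candidate for m in minimal):
                minimal.append(candidate)
    # Step 6: each reduct is the core plus one minimal conjunction.
    return core, [core | m for m in minimal]

# Hypothetical matrix entries over attributes a, b, c.
core, reducts = reducts_from_clauses([{"b"}, {"a", "c"}, {"a", "b"}, {"b", "c"}])
print(sorted(core), [sorted(r) for r in reducts])
```

Removing the entries that contain a core attribute before the expansion is what keeps the logic formula short; in this toy case four entries shrink to a single two-attribute entry.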

3.4.3. Validation of the Improved Discernibility Matrix Algorithm

The validation experiments on the improved discernibility matrix algorithm are carried out on the rough set attribute reduction of the fault diagnosis system for the polymerization kettle. Four condition attributes are considered: the reducer vibration value of the polymerization kettle (mm), the stirring current (A), the pressure of the mechanical seal (MPa), and the operating pressure (MPa). The decision attribute is the fault type. The discrete decision table, with these condition attributes and this decision attribute, is shown in Table 3.

Based on the definition of the discernibility matrix, is described as follows:

Therefore, the final attributes reduction results according to the discernibility matrix are or , namely, or . Because the single element of the above matrix is , according to the improved discernibility matrix algorithm, the attribute kernel is . So all combinations containing are set as zero to get the simpler discernibility matrix:

The matrix obtained quickly is described as follows:

The improved algorithm thus yields a core attribute and two alternative attribute reductions that agree with those obtained from the original discernibility matrix. The agreement of the reduction results of the two methods shows the correctness of the proposed algorithm, which requires simpler calculation and is less error-prone. Its advantage is especially clear when processing discernibility matrices of higher dimension.

3.4.4. Attribute Reduction by Utilizing the Improved Discernibility Matrix

To facilitate the data reduction, Table 2 is divided into two separate decision tables, each of which contains a subset of the condition attributes. One shortcoming of rough set theory in the fault diagnosis of the polymerization kettle is that it is sensitive to noise, which produces inconsistent rules. Therefore, to extract the rules of a compatible system, the reliability and the support degree of each rule are calculated. Two thresholds are then set: a rule is kept when its reliability and its support degree are both larger than the corresponding thresholds; otherwise the rule is deleted. After the repeated and incompatible rows of the decision table are removed, the partitioned decision table (Table 4) is obtained.
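The rule filtering can be sketched as follows. The paper does not give formulas for the two measures, so the sketch assumes the usual definitions: reliability is the fraction of rows with a given condition pattern that carry the rule's decision, and support is the fraction of all rows that match the rule. The thresholds and the data are hypothetical.

```python
from collections import Counter

def filter_rules(rows, min_reliability, min_support):
    """rows: list of (condition_tuple, decision). Keep rules above both thresholds."""
    n = len(rows)
    rule_counts = Counter(rows)                       # (condition, decision) pairs
    cond_counts = Counter(cond for cond, _ in rows)   # condition patterns alone
    kept = {}
    for (cond, dec), count in rule_counts.items():
        reliability = count / cond_counts[cond]       # assumed definition
        support = count / n                           # assumed definition
        if reliability > min_reliability and support > min_support:
            kept[(cond, dec)] = (reliability, support)
    return kept

# Hypothetical discretized rows: the pattern (1, 2) is inconsistent because
# of one noisy sample with decision 1.
rows = [((1, 2), 0)] * 4 + [((1, 2), 1)] + [((2, 1), 1)] * 5
kept = filter_rules(rows, min_reliability=0.7, min_support=0.2)
for rule, scores in kept.items():
    print(rule, scores)
```

Counting identical rows once in `rule_counts` also removes the repeated rows mentioned in the text, while the noisy minority rule falls below both thresholds and is deleted.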

According to the improved discernibility matrix algorithm, it is concluded that the core attribute is and the reduction is or . Similarly, another block decision table is shown in Table 5. So the core attributes and the reduction are , namely, .

Finally, by using the attribute reduction method based on the improved discernibility matrix, the attributes reduction of the decision table is carried out to obtain the minimum attribute reduction or .

4. BP Neural Network

4.1. Improved BP Neural Network

The back propagation (BP) neural network is a kind of multilayer feedforward neural network. The BP algorithm has the merits of parallel processing, distributed storage, and adaptive learning, but it also suffers from slow convergence and from local minima [19, 20]. The many improved BP learning algorithms that have been proposed may be divided into two categories: (1) methods based on gradient descent, including the momentum BP algorithm (MOBP) and the variable learning-rate BP algorithm (VLBP); (2) methods based on numerical optimization, including the conjugate gradient BP algorithm (CGBP) and the Levenberg-Marquardt BP algorithm (LMBP).

4.1.1. Heuristic Improved BP Algorithm

(1) Momentum Back Propagation Algorithm (MOBP). The basic BP learning algorithm adjusts the weights only along the gradient descent direction of the error at the current moment and does not consider the direction at earlier moments. The learning process therefore often oscillates and converges slowly. To improve the training speed, a momentum component is added to the weight adjustment formula, which then involves the weight matrix, the output vector, the learning rate, and the momentum coefficient. The momentum term reflects the experience of prior adjustments. A momentum coefficient of 0 means that the weight adjustment depends only on the current negative gradient, whereas a momentum coefficient of 1 means that it depends only on the negative gradient of the last cycle. The momentum component can be regarded as a low-pass filter, which smooths the oscillation in the learning process and improves the convergence speed.

In addition, when the search reaches a local minimum where the current error gradient is zero, the nonzero momentum term can still carry the weights out of the local minimum and speed up the iterative convergence.
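A minimal sketch of a momentum update is given below on a simple quadratic error surface. It assumes the commonly used form in which the new adjustment is the momentum coefficient times the previous adjustment plus (1 minus the coefficient) times the learning rate times the negative gradient; this form reproduces the two limiting cases described above, but it is an assumption and may differ in detail from the paper's formula. The names `momentum_descent`, `lr`, and `mc` are illustrative.

```python
import numpy as np

def momentum_descent(grad, w0, lr=0.1, mc=0.9, steps=500):
    """Gradient descent with momentum: dw = mc * dw_prev - (1 - mc) * lr * grad(w)."""
    w = np.asarray(w0, dtype=float)
    dw = np.zeros_like(w)
    for _ in range(steps):
        dw = mc * dw - (1.0 - mc) * lr * grad(w)   # low-pass filtered step
        w = w + dw
    return w

# Quadratic test surface F(w) = 0.5 * w^T A w with an elongated valley.
A = np.diag([1.0, 10.0])
grad = lambda w: A @ w

w_final = momentum_descent(grad, [1.0, 1.0])
print(np.linalg.norm(w_final))
```

With `mc=0` the loop reduces to plain gradient descent, which makes it easy to compare the two behaviours on the same surface.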

(2) Variable Learning-Rate Back Propagation Algorithm (VLBP). In the gradient descent method, the learning rate has a great influence on the whole training process; whether the training succeeds depends mainly on its selection. If the learning rate is increased on relatively flat regions of the error surface and reduced where the slope increases, the convergence speed can be improved. The VLBP rule is described as follows:

(a) If the mean square error (over the whole training set) increases by more than a predefined percentage (typically 1–5%) after a weight update, the weight update is cancelled. The learning rate is then multiplied by a factor smaller than 1, and the momentum coefficient is set to zero.

(b) If the mean square error (over the whole training set) decreases after a weight update, the weight update is accepted. The learning rate is then multiplied by a factor larger than 1. If the momentum coefficient was previously set to zero, it is restored to its original value.

(c) If the mean square error increases by less than the predefined percentage, the weight update is accepted, but the learning rate remains unchanged. If the momentum coefficient was previously set to zero, it is restored to its original value.
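The three cases of the rule translate directly into a small decision function. In the sketch below the parameter names `zeta` (allowed relative error growth), `rho` (decrease factor), and `eta` (increase factor), as well as their default values, are assumptions chosen for illustration.

```python
def vlbp_update(prev_mse, new_mse, lr, mc_default, zeta=0.04, rho=0.7, eta=1.05):
    """Return (accept_update, new_learning_rate, momentum_to_use_next)."""
    if new_mse > prev_mse * (1.0 + zeta):
        # (a) error grew too much: cancel the update, shrink lr, switch momentum off
        return False, lr * rho, 0.0
    if new_mse < prev_mse:
        # (b) error decreased: accept, enlarge lr, restore momentum
        return True, lr * eta, mc_default
    # (c) small error growth: accept, keep lr, restore momentum
    return True, lr, mc_default

print(vlbp_update(prev_mse=1.00, new_mse=1.10, lr=0.1, mc_default=0.9))
print(vlbp_update(prev_mse=1.00, new_mse=0.95, lr=0.1, mc_default=0.9))
print(vlbp_update(prev_mse=1.00, new_mse=1.02, lr=0.1, mc_default=0.9))
```

A training loop would call this function after each tentative weight update and discard the new weights whenever the first return value is `False`.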

The heuristic BP learning algorithms improve the convergence speed for some problems, but they have two main shortcomings. First, they require more parameters to be set (for example, the momentum coefficient and the factors of the variable learning-rate rule), whereas the standard BP algorithm needs only one parameter (the learning rate); the performance of the algorithms is therefore sensitive to changes in these parameters. Second, these algorithms sometimes fail to converge on problems for which the standard BP algorithm eventually finds a solution.

4.1.2. Numerical Optimization Technique

The steepest descent method, the conjugate gradient method, and the Newton method are three widely used techniques in numerical optimization. The steepest descent method is the simplest algorithm, but its convergence is the slowest. The Newton method converges faster, but it needs to calculate the Hessian matrix and its inverse. The conjugate gradient method is a compromise: it does not need to calculate second derivatives and still has the quadratic convergence property.

(1) Conjugate Gradient Back Propagation Algorithm (CGBP). The conjugate gradient method is an improved gradient method that alleviates the large oscillation and poor convergence of the plain gradient method. Its basic idea is to combine the negative gradient direction with the previous search direction into a conjugate direction in order to speed up the training and improve the training accuracy. All conjugate gradient methods adopt the negative gradient direction as the initial search direction:

$$p_0 = -g_0,$$

where $g_k$ is the error gradient at iteration $k$; that is to say,

$$g_k = \nabla F(x) \big|_{x = x_k}.$$

Here $F(x)$ stands for the weight error surface in the form of a mean sum of squares, and $x$ collects the weights and threshold values. Then the learning rate $\alpha_k$ is selected to minimize the function along the search direction:

$$x_{k+1} = x_k + \alpha_k p_k.$$

Then the conjugate direction is used as the new search direction. Usually the previous search direction is added to the current negative gradient direction:

$$p_k = -g_k + \beta_k p_{k-1}.$$

If the algorithm has not converged, return to the line-search step. Different choices of $\beta_k$ give different conjugate gradient methods, such as the following. The conjugate gradient method of Fletcher-Reeves is defined as

$$\beta_k = \frac{g_k^{T} g_k}{g_{k-1}^{T} g_{k-1}}.$$

The conjugate gradient method of Polak-Ribiere is defined as

$$\beta_k = \frac{(g_k - g_{k-1})^{T} g_k}{g_{k-1}^{T} g_{k-1}}.$$
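The conjugate gradient iteration can be sketched on a quadratic error surface, where the line search along each direction has a closed form. The quadratic test function, the closed-form step length, and all names below are assumptions for illustration; in network training the line search would have to be carried out numerically.

```python
import numpy as np

def conjugate_gradient(A, b, x0, beta_rule="FR", tol=1e-12, max_iter=50):
    """Minimize F(x) = 0.5 x^T A x - b^T x (A symmetric positive definite)."""
    x = np.asarray(x0, dtype=float)
    g = A @ x - b                  # gradient of the quadratic
    p = -g                         # initial direction: negative gradient
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = -(g @ p) / (p @ A @ p)      # exact line search for a quadratic
        x = x + alpha * p
        g_new = A @ x - b
        if beta_rule == "FR":               # Fletcher-Reeves
            beta = (g_new @ g_new) / (g @ g)
        else:                               # Polak-Ribiere
            beta = ((g_new - g) @ g_new) / (g @ g)
        p = -g_new + beta * p               # new conjugate direction
        g = g_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_fr = conjugate_gradient(A, b, np.zeros(2), beta_rule="FR")
x_pr = conjugate_gradient(A, b, np.zeros(2), beta_rule="PR")
print(x_fr, x_pr)
```

On a quadratic surface both choices of the direction coefficient lead to the same minimizer, which is why they are usually compared on non-quadratic training problems instead.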

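(2) Levenberg-Marquardt Back Propagation Algorithm (LMBP). The principle of the Newton algorithm is to seek the stationary point of a quadratic approximation of $F(x)$, which represents the weight error surface in the form of a mean sum of squares, where $x$ collects the weights and threshold values. Assume

$$F(x_{k+1}) = F(x_k + \Delta x_k) \approx F(x_k) + g_k^{T} \Delta x_k + \frac{1}{2} \Delta x_k^{T} A_k \Delta x_k.$$

Then, the gradient of this function with respect to $\Delta x_k$ is

$$g_k + A_k \Delta x_k.$$

Setting this gradient to zero gives

$$\Delta x_k = -A_k^{-1} g_k.$$

The Newton algorithm is defined as follows:

$$x_{k+1} = x_k - A_k^{-1} g_k,$$

where $g_k$ is the gradient and $A_k$ is the second derivative (Hessian matrix) of the current weight error function.

The Newton method converges fast, but in each iteration it requires the second-derivative Hessian matrix of $F(x)$, which makes the amount of calculation very large. If $F(x)$ is represented as a sum of squares, namely,

$$F(x) = \sum_{i=1}^{N} v_i^{2}(x) = v^{T}(x) v(x),$$

then the $j$th gradient component is calculated by

$$[\nabla F(x)]_j = \frac{\partial F(x)}{\partial x_j} = 2 \sum_{i=1}^{N} v_i(x) \frac{\partial v_i(x)}{\partial x_j}.$$

Therefore, the gradient can be written in the following matrix form:

$$\nabla F(x) = 2 J^{T}(x) v(x),$$

where

$$J(x) = \left[ \frac{\partial v_i(x)}{\partial x_j} \right]$$

is the Jacobian matrix. So the Hessian matrix can approximately be replaced by the following matrix:

$$\nabla^{2} F(x) \approx 2 J^{T}(x) J(x).$$

With the error vector of the network training written as $e = v(x)$, the gradient is

$$\nabla F(x) = 2 J^{T} e,$$

where $J$ contains the first-order derivatives of the network training errors and is a function of the weights and threshold values. The training of the LMBP algorithm is then described as

$$x_{k+1} = x_k - \left[ J^{T} J + \mu I \right]^{-1} J^{T} e,$$

where the common factor 2 of the gradient and the approximate Hessian has been absorbed into the damping parameter $\mu$.

The characteristic of this algorithm is that, when $\mu$ increases, it approaches the steepest descent method with a small learning rate:

$$x_{k+1} \approx x_k - \frac{1}{\mu} J^{T} e = x_k - \frac{1}{2\mu} \nabla F(x).$$

When $\mu$ in the LMBP update decreases to zero, the update becomes the Newton method with an approximate Hessian matrix. In the iterative process of the LMBP algorithm, the value of $\mu$ is reduced if a training step is successful; otherwise, it is increased. When the sum of squared errors is reduced to a predefined value, the algorithm is considered to have converged.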
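The damping schedule described above can be sketched on a small least-squares problem. The example fits two parameters of an exponential curve instead of network weights, so that the Jacobian can be written by hand; the test problem, the factor of 10 used to change the damping parameter, and all names are assumptions for illustration.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, mu=0.01, factor=10.0,
                        tol=1e-12, max_iter=100):
    """Minimize the sum of squared residuals with a damped Gauss-Newton step."""
    x = np.asarray(x0, dtype=float)
    e = residual(x)
    sse = e @ e
    for _ in range(max_iter):
        if sse < tol:                      # error below the predefined value
            break
        J = jacobian(x)
        step = np.linalg.solve(J.T @ J + mu * np.eye(x.size), J.T @ e)
        x_new = x - step
        e_new = residual(x_new)
        sse_new = e_new @ e_new
        if sse_new < sse:                  # successful step: accept, reduce damping
            x, e, sse = x_new, e_new, sse_new
            mu /= factor
        else:                              # failed step: reject, increase damping
            mu *= factor
    return x, sse

# Noise-free data from y = 2 * exp(-0.5 * t); the fit should recover (2, -0.5).
t = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * np.exp(-0.5 * t)
residual = lambda x: x[0] * np.exp(x[1] * t) - y
jacobian = lambda x: np.column_stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)])

x_fit, sse_fit = levenberg_marquardt(residual, jacobian, [1.0, 0.0])
print(x_fit, sse_fit)
```

Because a step is accepted only when it lowers the squared error, the error sequence never increases, which is the practical reason the method trains reliably on small networks.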

4.2. Simulation of Improved BP Neural Network

Four improved BP training algorithms (MOBP, VLBP, CGBP, and LMBP) are used in simulation experiments on the polymerization data, and their performance is compared in detail in order to guide the selection of the BP learning algorithm for the fault diagnosis system of the polymerization kettle. The maximum number of training cycles is 5000 and the error limit is 0.001. The training samples are selected as follows: P = [1 4 2 2 5; 1 3 2 1 1; 1 3 1 2 1; 1 5 2 4 2; 1 4 2 2 2]′; T = [1 0 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 1; 0 1 0 0]′. The training processes of the four algorithms on these samples are shown in Figures 3(a)–3(d). The performance indexes of the four improved BP training algorithms are listed in Table 6.

It can be seen from the above simulation results that, for the provided training samples, all the improved BP learning algorithms except MOBP converge to the specified error within the designated 5000 iterations. Moreover, the LMBP learning algorithm has a great advantage in training speed, so it is adopted in the discussed polymerization fault diagnosis system.

5. Simulation Experiments

5.1. Network Training Simulation

In this paper, a typical three-layer BP neural network is used for the fault diagnosis system of the polymerization kettle. The input layer has 5 nodes, the output layer has 4 nodes, and the hidden layer has 9 nodes, a number chosen through comparative experiments. So the structure of the LMBP neural network of the fault diagnosis system is 5-9-4. The learning rate lr is 0.01 and the expected training error is 0.0001. To test the fault diagnosis system based on the rough set neural network, the data in Table 2 restricted to the first condition attribute set are selected as the training samples, and the data in Table 7 with the same condition attributes are selected as the test samples for training and testing the LMBP neural network. The network training steps for the test data are shown in Figure 4, and the fault diagnosis results are shown in Table 8.

Similarly, the data in Table 2 restricted to the second condition attribute set are selected as the training samples, and the data in Table 7 with the same condition attributes are selected as the test samples for training and testing the LMBP neural network. The network training steps for the test data are shown in Figure 5, and the fault diagnosis results are shown in Table 9. Finally, the data in Table 2 restricted to the third condition attribute set are selected as the training samples, and the data in Table 7 with the same condition attributes are selected as the test samples. The network training steps for the test data are shown in Figure 6, and the fault diagnosis results are shown in Table 10.
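The 5-9-4 structure can be made concrete with a forward pass in NumPy. The paper does not state the activation functions or the initialization, so the tanh hidden layer, the logistic output layer, and the random weights below are assumptions for illustration only; the input vector is the first training sample quoted in Section 4.2.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 5, 9, 4          # layer sizes given in the text

# Randomly initialized weights and zero thresholds (untrained, illustration only).
W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

def forward(x):
    """One forward pass: tanh hidden layer, logistic output layer (assumed)."""
    hidden = np.tanh(W1 @ x + b1)
    return 1.0 / (1.0 + np.exp(-(W2 @ hidden + b2)))

n_params = W1.size + b1.size + W2.size + b2.size
output = forward(np.array([1.0, 4.0, 2.0, 2.0, 5.0]))
print(n_params, output)
```

The network has 94 adjustable parameters, so the Jacobian used by the LMBP algorithm has 94 columns and one row per output error of each training sample.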

5.2. Performance Comparison of Fault Diagnosis

Many performance evaluation methods have been proposed, of which cross validation in its various forms is the most popular, such as 3-fold cross validation, 10-fold cross validation, and leave-one-out cross validation (LOOCV). Here, 10-fold cross validation is used to evaluate the performance of the three condition attribute sets. Table 11 lists the average training error, the training steps, and the average accuracy obtained by 10-fold cross validation on the samples described in Table 1.
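The 10-fold splitting can be sketched as follows. The number of samples, the random seed, and the function name are hypothetical; the sketch only shows how the samples are partitioned so that each one is used for testing exactly once.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, k)       # folds may differ in size by one
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Hypothetical sample count; the network would be retrained on every train split.
splits = list(k_fold_indices(23, k=10))
for train_idx, test_idx in splits[:2]:
    print(len(train_idx), sorted(test_idx.tolist()))
```

The figures reported for each attribute set are then the averages of the training error, training steps, and accuracy over the ten train/test rounds.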

It can be seen from Table 11 that, for the discussed samples, the two attribute sets obtained by attribute reduction with the improved discernibility matrix algorithm require significantly fewer network training steps than the third attribute set. The attribute reduction thus reduces the training time and increases the fault diagnosis accuracy. In conclusion, the attribute sets obtained after attribute reduction are more suitable for the polymerization fault diagnosis system, which reflects the efficiency of the rough set neural network and demonstrates the effectiveness and practicality of the proposed method.

6. Conclusions

An improved attribute reduction algorithm based on the discernibility matrix is proposed for rough sets. It simplifies the matrix by extracting the single-attribute elements and eliminating the elements that contain them, which reduces the amount of calculation and the probability of error. The fault diagnosis method that combines this attribute reduction algorithm with the LMBP neural network is applied to the fault diagnosis of the polymerization process. Simulation results show that the proposed method has higher diagnosis accuracy and shorter training time.

Acknowledgments

This work was supported by China Postdoctoral Science Foundation (no. 20110491510), Program for Liaoning Excellent Talents in University (no. LJQ2011027), and Special Research Foundation of University of Science and Technology of Liaoning (no. 2011zx10).