Abstract

Because the performance of a single fuzzy ARTMAP (FAM) is affected by the sequence of sample presentation in the offline mode of training, a FAM ensemble approach based on an improved Bayesian belief method is proposed to improve classification accuracy. The training samples are presented to a committee of FAMs in different orders, the outputs of these FAMs are combined, and the final decision is derived by the improved Bayesian belief method. The experimental results show that the proposed FAM ensemble can classify the different categories reliably and achieves better classification performance than a single FAM.

1. Introduction

Recently, artificial neural networks (ANNs) have been widely used as intelligent classifiers to identify different categories by learning patterns from empirical data in complex systems [1]. For example, the BP, RBF, and SVM models have developed rapidly and have been used to classify different fault classes of machine equipment [2–6]. However, these traditional neural network methods have limited generalization ability and tend to overfit the training samples. To address this problem, the fuzzy ARTMAP (FAM) neural network, an incremental and supervised network model designed in accordance with adaptive resonance theory, has been developed and applied to classification tasks [7–9]. Although FAM is able to overcome the stability-plasticity dilemma [10], in real-world applications the performance of FAM is affected by the sequence of sample presentation in the offline mode of training [11, 12].

To address this drawback, preprocessing procedures known as ordering algorithms, such as min-max clustering and the genetic algorithm [13, 14], have been proposed for FAM. Furthermore, a number of fusion techniques have been proposed for FAM to overcome this problem. Tang and Yan employed the voting algorithm of FAM to diagnose bearing faults [15], and Loo and Rao applied multiple FAMs based on the probabilistic plurality voting strategy to medical diagnosis and classification problems [16]. Since these voting algorithms do not consider the effect of the number of samples in each class, an improved Bayesian belief method (BBM) is used in this paper to combine multiple FAM classifiers that are trained offline with different orderings of the samples.

In view of the above principles, a novel ensemble of FAM classifiers is proposed to improve the classification performance of a single FAM. The identification scheme is shown in Figure 1. Firstly, several feature parameters are extracted from the raw signals using different feature extraction methods. Secondly, the optimal feature set is selected from the original feature set by a modified distance discriminant technique. Finally, an ensemble of multiple FAM classifiers based on the improved BBM is employed to produce the final classification results. The proposed method is applied to the fault diagnosis of a hydraulic pump, and the experimental results show the effectiveness of the proposed ensemble of FAM classifiers.

2. Fuzzy ARTMAP Ensemble Using the Improved Bayesian Belief Method

2.1. Fuzzy ARTMAP (FAM)

FAM consists of two ART modules, namely, the $ART_a$ and $ART_b$ modules, which are bridged via a map field $F^{ab}$ [10]. FAM is capable of forming associative maps between clusters of the input domain, in which the $ART_a$ module performs clustering, and clusters of the output domain, in which the $ART_b$ module performs clustering. Each ART module comprises three layers: the normalization layer $F_0$, the input layer $F_1$, and the recognition layer $F_2$. The structure of FAM is shown in Figure 2. When the output domain is a finite set of class labels, FAM can be utilized as a classifier. The FAM algorithm can be briefly described as follows.

The $ART_a$ module receives the input pattern: an $M$-dimensional input vector $\mathbf{a} = (a_1, \dots, a_M)$, normalized so that each $a_i \in [0,1]$, is complement-coded to the $2M$-dimensional vector
$$\mathbf{I} = (\mathbf{a}, \mathbf{a}^c) = (a_1, \dots, a_M, 1-a_1, \dots, 1-a_M).$$
Then, the norm of the input vector is kept constant:
$$|\mathbf{I}| = \sum_{i=1}^{M} a_i + \sum_{i=1}^{M} (1 - a_i) = M.$$
Afterward, the input sample selects a category node stored in the network by the category choice function (CCF):
$$T_j(\mathbf{I}) = \frac{|\mathbf{I} \wedge \mathbf{w}_j|}{\alpha + |\mathbf{w}_j|},$$
where $\wedge$ is the fuzzy AND (element-wise minimum) operator, $T_j$ is the choice function value, $\alpha > 0$ is the choice parameter, and $\mathbf{w}_j$ is the weight vector of the $j$th category node.
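For concreteness, the following is a minimal NumPy sketch of complement coding and the category choice function (function and variable names are illustrative, not taken from the original paper):

```python
import numpy as np

def complement_code(a):
    """Complement-code a normalized input vector a in [0, 1]^M to I = (a, 1 - a)."""
    a = np.asarray(a, dtype=float)
    return np.concatenate([a, 1.0 - a])

def category_choice(I, W, alpha=0.001):
    """Category choice function T_j = |I ^ w_j| / (alpha + |w_j|) for all nodes.

    I : complement-coded input, shape (2M,)
    W : weight matrix, one row per committed category node, shape (N, 2M)
    """
    fuzzy_and = np.minimum(I, W)                  # element-wise min (fuzzy AND)
    return fuzzy_and.sum(axis=1) / (alpha + W.sum(axis=1))
```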

When a winning category node $J = \arg\max_j T_j$ is selected, a vigilance test (VT), namely, a similarity check of the chosen category node against the vigilance parameter $\rho_a$, takes place:
$$\frac{|\mathbf{I} \wedge \mathbf{w}_J|}{|\mathbf{I}|} \ge \rho_a,$$
where $J$ is the index of the winning node. When the above category match function (CMF) satisfies the criterion, resonance occurs and learning takes place; namely, the weight vector is updated according to
$$\mathbf{w}_J^{(\mathrm{new})} = \beta \big(\mathbf{I} \wedge \mathbf{w}_J^{(\mathrm{old})}\big) + (1 - \beta)\, \mathbf{w}_J^{(\mathrm{old})},$$
where $\beta \in [0,1]$ is the learning rate. Otherwise, a new node that codes the input pattern is created in $F_2^a$. In the meantime, the same learning algorithm is carried out simultaneously in the $ART_b$ module using the target pattern.
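A corresponding sketch of the vigilance test and the learning update, continuing the previous snippet (names again illustrative):

```python
def vigilance_test(I, w_J, rho_a):
    """Category match function: |I ^ w_J| / |I| >= rho_a."""
    return np.minimum(I, w_J).sum() / I.sum() >= rho_a

def update_weight(I, w_J, beta=1.0):
    """Resonance learning: w_J <- beta * (I ^ w_J) + (1 - beta) * w_J.

    beta = 1.0 corresponds to the fast-learning mode used in this paper."""
    return beta * np.minimum(I, w_J) + (1.0 - beta) * w_J
```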

After resonance occurs in $ART_a$ and $ART_b$, the winning node in $ART_a$ sends a prediction to $ART_b$ via the map field $F^{ab}$. The map field vigilance test is used to check this prediction. If the test fails, it indicates that the winning node of $ART_a$ predicts an incorrect target class in $ART_b$, and a match tracking process is initiated. During match tracking, the value of $\rho_a$ is increased until it is slightly higher than the current match value $|\mathbf{I} \wedge \mathbf{w}_J| / |\mathbf{I}|$; then a new search for another winning node in $ART_a$ is carried out, and the process continues until the selected node makes a correct prediction in $ART_b$.
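The search-with-match-tracking loop can be sketched as follows. This is a simplified outline under stated assumptions rather than a full FAM implementation: `labels[j]` is a hypothetical array mapping each $ART_a$ node to its associated class, and `category_choice` is the helper from the earlier snippet.

```python
def predict_with_match_tracking(I, W, labels, target, rho_base=0.0, alpha=0.001, eps=1e-6):
    """Search ART_a nodes, raising the vigilance (match tracking) whenever the
    winning node predicts the wrong class, until a correct prediction is found
    or a new node must be committed (signalled by returning None)."""
    rho = rho_base
    candidates = set(range(W.shape[0]))
    while candidates:
        T = category_choice(I, W, alpha)
        J = max(candidates, key=lambda j: T[j])            # current winner
        match = np.minimum(I, W[J]).sum() / I.sum()
        if match < rho:                                    # fails the ART_a vigilance test
            candidates.discard(J)
            continue
        if labels[J] == target:                            # map field test passes
            return J
        rho = match + eps                                  # match tracking: raise rho_a
        candidates.discard(J)
    return None                                            # commit a new category node
```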

2.2. Decision Fusion Using Bayesian Belief Method

The Bayesian belief method was proposed in [17]. It is based on the assumption of mutual independence of the classifiers and takes the error behavior of each classifier into account. Assume that the pattern space contains $M$ classes $c_1, \dots, c_M$ and that $K$ classifiers $e_1, \dots, e_K$ are available. A classifier can be considered as a function
$$e_k(x) = j_k, \qquad j_k \in \{1, \dots, M\},$$
which signifies that the sample $x$ is assigned to class $c_{j_k}$ by the classifier $e_k$. Its two-dimensional confusion matrix can be represented as $\mathbf{N}^{(k)} = \big(n_{ij}^{(k)}\big)_{M \times M}$, which is obtained by executing $e_k$ on a test data set after $e_k$ is trained. Each row corresponds to a true class $c_i$ and each column corresponds to an assigned class $c_j$. The matrix entry $n_{ij}^{(k)}$ is the number of input samples from class $c_i$ that are assigned to class $c_j$ by classifier $e_k$. The number of samples in class $c_i$ is $n_{i\cdot}^{(k)} = \sum_{j=1}^{M} n_{ij}^{(k)}$, and the number of samples labeled as $c_j$ by $e_k$ is $n_{\cdot j}^{(k)} = \sum_{i=1}^{M} n_{ij}^{(k)}$. Considering the difference in the number of samples in each class, a belief measure of classification can be calculated for each classifier on the basis of its confusion matrix by the following belief function [18]:
$$\mathrm{bel}_k\big(x \in c_i \mid e_k(x) = j_k\big) = \frac{n_{i j_k}^{(k)} \big/ n_{i\cdot}^{(k)}}{\sum_{m=1}^{M} n_{m j_k}^{(k)} \big/ n_{m\cdot}^{(k)}}.$$
When multiple classifiers are developed, their corresponding beliefs are computed from the performance of the base classifiers, and combining the belief measures of all classifiers yields the final belief measure of the multiple-classifier system. In the case of equal a priori class probabilities, the combination rule can be written as
$$\mathrm{bel}(i) = \frac{\prod_{k=1}^{K} \mathrm{bel}_k\big(x \in c_i \mid e_k(x) = j_k\big)}{\sum_{m=1}^{M} \prod_{k=1}^{K} \mathrm{bel}_k\big(x \in c_m \mid e_k(x) = j_k\big)}, \qquad i = 1, \dots, M.$$
Thus, $x$ is classified into the class with the largest belief, which gives the final decision $e(x) = \arg\max_{i} \mathrm{bel}(i)$.
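A minimal NumPy sketch of this fusion rule, under the assumption that the class-size correction amounts to row-normalizing each confusion matrix before computing beliefs (function names are illustrative):

```python
import numpy as np

def belief_from_confusion(conf, predicted):
    """Belief over the true classes given that one classifier predicted `predicted`.

    conf : (M, M) confusion matrix, rows = true class, columns = assigned class.
    The row normalization compensates for different class sizes (assumption)."""
    rates = conf[:, predicted] / conf.sum(axis=1)   # per-class rates for this predicted column
    return rates / rates.sum()

def bbm_combine(conf_matrices, predictions):
    """Combine the decisions of K classifiers with the Bayesian belief method.

    conf_matrices : list of K (M, M) confusion matrices (one per classifier).
    predictions   : list of K predicted class indices for the same test sample.
    Returns the index of the class with the highest combined belief."""
    combined = np.ones(conf_matrices[0].shape[0])
    for conf, pred in zip(conf_matrices, predictions):
        combined *= belief_from_confusion(np.asarray(conf, dtype=float), pred)
    combined /= combined.sum()
    return int(np.argmax(combined))
```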

3. Case Study

In order to evaluate the effectiveness of the proposed FAM ensemble, the fault identification of a hydraulic pump is taken as an example. Figure 3 shows the schematic diagram of the experiment rig. Four accelerometers are attached to the housing with magnetic bases and mounted at positions P1, P2, P3, and P4. A pressure sensor is mounted at position P5. Considering its sensitivity to the fault conditions of the hydraulic pump, the vibration signal acquired by the accelerometer at position P2 is used to identify the fault categories. The vibration signals are acquired under the normal condition and under different fault conditions: inner plunger wear, inner race wear, ball wear, swashplate wear, port plate wear, and paraplunger wear.

3.1. Data Preparation

The data set contains 490 samples. These data samples are divided into 245 training and 245 test samples. The detailed descriptions of the three data sets are shown in Table 1. In order to identify the different fault categories, a seven-class classification problem needs to be solved.

3.2. Feature Extraction and Selection
3.2.1. Feature Extraction

Feature parameters are used to characterize the information relevant to the conditions of the hydraulic pump. To acquire more fault-related information, many features in different symptom domains are extracted from the measured signals.

The frequency domain provides another description of a signal. In [19], some novel features that give a much fuller picture of the frequency distribution in each frequency band are proposed. Suppose that the $N$ points of the normalized PSD $s_i$ ($i = 1, \dots, N$) of the vibration signal are divided into $L$ segments, where $L$ is 1 in this study. The four features based on the moment estimates of power in the $l$th segment can be obtained as follows:
$$P_{1l} = \frac{1}{N_l}\sum_{i \in l} s_i, \qquad P_{2l} = \frac{1}{N_l - 1}\sum_{i \in l} \big(s_i - P_{1l}\big)^2,$$
$$P_{3l} = \frac{\sum_{i \in l} \big(s_i - P_{1l}\big)^3}{N_l \big(\sqrt{P_{2l}}\big)^3}, \qquad P_{4l} = \frac{\sum_{i \in l} \big(s_i - P_{1l}\big)^4}{N_l\, P_{2l}^{2}},$$
where $N$ is the total number of data points and $N_l$ is the number of sample points in the $l$th segment.

In order to characterize the spectrum with higher accuracy, the moment estimates of frequency weighted by power are calculated by the following formulas:
$$F_{1l} = \frac{\sum_{i \in l} f_i s_i}{S_l}, \qquad F_{2l} = \sqrt{\frac{\sum_{i \in l} (f_i - F_{1l})^2 s_i}{S_l}},$$
$$F_{3l} = \frac{\sum_{i \in l} (f_i - F_{1l})^3 s_i}{S_l\, F_{2l}^{3}}, \qquad F_{4l} = \frac{\sum_{i \in l} (f_i - F_{1l})^4 s_i}{S_l\, F_{2l}^{4}},$$
where $f_i$ is the frequency corresponding to $s_i$ and $S_l = \sum_{i \in l} s_i$ is the total power in the segment. Then, the total number of features extracted for each spectrum is $1 \times 8$.
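A NumPy sketch of these eight spectral features follows; the normalizations implement the moment definitions written above, which are a standard reading of this type of feature rather than the verbatim formulas of [19]:

```python
import numpy as np

def spectral_moment_features(psd, freqs):
    """Eight frequency-domain features from one PSD segment: four moments of the
    power values and four power-weighted frequency moments."""
    s = np.asarray(psd, dtype=float)
    f = np.asarray(freqs, dtype=float)
    n = s.size
    # Moments of the power values
    p1 = s.mean()
    p2 = s.var(ddof=1)
    p3 = np.sum((s - p1) ** 3) / (n * p2 ** 1.5)         # skewness of power
    p4 = np.sum((s - p1) ** 4) / (n * p2 ** 2)           # kurtosis of power
    # Power-weighted frequency moments
    total = s.sum()
    f1 = np.sum(f * s) / total                           # mean frequency
    f2 = np.sqrt(np.sum((f - f1) ** 2 * s) / total)      # frequency standard deviation
    f3 = np.sum((f - f1) ** 3 * s) / (total * f2 ** 3)
    f4 = np.sum((f - f1) ** 4 * s) / (total * f2 ** 4)
    return np.array([p1, p2, p3, p4, f1, f2, f3, f4])
```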

To depict the fault-related information about the hydraulic pumps quantitatively, the first-order continuous wavelet grey moment (WGM) [20] of the vibration signal is extracted. Assume that $\mathbf{W} = (w_{ij})_{m \times n}$ is the wavelet coefficient matrix that can be displayed as the continuous wavelet transform (CWT) scalogram, where $m$ and $n$ correspond to the scale and time axes of the scalogram, respectively. The matrix is divided into $K$ equal parts along the scale axis, and the first-order wavelet grey moment of the $l$th part can be calculated by
$$G_l = \frac{K}{m \times n} \sum_{i \in \text{part } l} \sum_{j=1}^{n} \lvert w_{ij} \rvert,$$
where $w_{ij}$ is the element of matrix $\mathbf{W}$. In this paper, $K$ is set as 8 and the wavelet function is the Morlet wavelet.
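A sketch of this computation, given a CWT coefficient matrix (for example from a Morlet CWT); interpreting the first-order grey moment of each part as the mean absolute coefficient value is an assumption made here:

```python
import numpy as np

def wavelet_grey_moments(coeffs, n_parts=8):
    """First-order wavelet grey moments of a CWT coefficient matrix.

    coeffs  : (m, n) matrix of CWT coefficients (rows = scales, columns = time).
    n_parts : number of equal parts along the scale axis (8 in the paper)."""
    grey = np.abs(np.asarray(coeffs, dtype=float))       # scalogram "grey levels"
    parts = np.array_split(grey, n_parts, axis=0)        # split along the scale axis
    return np.array([part.mean() for part in parts])     # mean grey level of each part
```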

In addition, because the model parameters are sensitive to the shape of the vibration data, autoregressive (AR) model parameters are utilized to characterize the condition of the hydraulic pumps. The AR model is written as
$$x(t) = \sum_{i=1}^{p} a_i\, x(t - i) + e(t),$$
where $x(t-1), \dots, x(t-p)$ are the previous samples, $x(t)$ is the predicted sample of the signal, $e(t)$ is the residual error, and $a_1, \dots, a_p$ are the AR model parameters, which can be obtained by the least squares method [21] as
$$\hat{\mathbf{a}} = \big(\mathbf{X}^{\mathrm{T}} \mathbf{X}\big)^{-1} \mathbf{X}^{\mathrm{T}} \mathbf{x},$$
where $\mathbf{X}$ is the matrix of lagged samples and $\mathbf{x}$ is the vector of observed samples. In this study, the model order $p$ is set as 8.
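A minimal least-squares estimate of the AR(8) parameters in NumPy (a sketch of the standard least-squares setup, not necessarily the exact estimator used in [21]):

```python
import numpy as np

def ar_parameters(x, order=8):
    """Estimate AR parameters a_1..a_p of x(t) = sum_i a_i x(t-i) + e(t)
    by ordinary least squares."""
    x = np.asarray(x, dtype=float)
    # Row t of X contains the lagged samples [x(t-1), ..., x(t-p)]
    X = np.column_stack([x[order - i:len(x) - i] for i in range(1, order + 1)])
    y = x[order:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a
```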

Thus, 24 features constitute the original feature set.

3.2.2. Feature Selection

In order to improve the identification accuracy and reduce the computation burden, some sensitive features providing characteristic information for the classification system need to be selected, and irrelevant or redundant features must be removed. In this study, based on [22], a modified distance discriminant technique is employed to select the optimal features.

Suppose that a feature set of $C$ classes consists of $N$ samples and that the $c$th class contains $n_c$ samples, where $c = 1, \dots, C$ and $\sum_{c=1}^{C} n_c = N$. Each sample is represented by $D$ features, and the $j$th feature of the $i$th sample is written as $x_{ij}$. Then, the feature selection process can be described as follows.

Step 1. Calculate the mean and the standard deviation of all samples in the $j$th feature:
$$\mu_j = \frac{1}{N}\sum_{i=1}^{N} x_{ij}, \qquad \sigma_j = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} \big(x_{ij} - \mu_j\big)^2}.$$

Step 2. Calculate the mean and the standard deviation of the samples of the $c$th class in the $j$th feature:
$$\mu_{cj} = \frac{1}{n_c}\sum_{i \in c} x_{ij}, \qquad \sigma_{cj} = \sqrt{\frac{1}{n_c - 1}\sum_{i \in c} \big(x_{ij} - \mu_{cj}\big)^2}.$$

Step 3. Calculate the weighted standard deviation of the class centers in the $j$th feature:
$$\sigma_{bj} = \sqrt{\sum_{c=1}^{C} p_c\, \mu_{cj}^{2} - \Big(\sum_{c=1}^{C} p_c\, \mu_{cj}\Big)^{2}},$$
where $\mu_j$ is the center of all samples in the $j$th feature, $\mu_{cj}$ is the center of the samples of the $c$th class in the $j$th feature, $\sum_{c} p_c \mu_{cj}^{2}$ and $\sum_{c} p_c \mu_{cj}$ are the weighted means of the squared class centers and of the class centers in the $j$th feature, and $p_c = n_c / N$ is the prior probability of the $c$th class, with $\sum_{c=1}^{C} p_c = 1$.

Step 4. Calculate the distance discriminant factor of the $j$th feature:
$$d_j = \sigma_{bj} - \lambda\, \sigma_{wj}, \qquad \sigma_{wj} = \sum_{c=1}^{C} p_c\, \sigma_{cj},$$
where $\sigma_{bj}$ measures the distance of the $j$th feature between different classes, $\sigma_{wj}$ corresponds to the distance of the $j$th feature within classes, and $\lambda$ is used to control the impact of $\sigma_{wj}$, which is set as 2 in this paper.
Considering the degree of overlap among different classes, a compensation factor is calculated as follows. Firstly, define and calculate the variance factor $v_{bj}$ of $\sigma_{bj}$ in the $j$th feature. Secondly, define and calculate the variance factor $v_{wj}$ of $\sigma_{wj}$ in the $j$th feature. Then, the compensation factor $\gamma_j$ of the $j$th feature is defined and calculated from these two variance factors. Thus, the modified distance discriminant factor $\bar{d}_j$ is obtained by applying the compensation factor $\gamma_j$ to $d_j$.

Step 5. Rank the features in descending order according to the modified distance discriminant factors $\bar{d}_j$; then normalize the $\bar{d}_j$ values to obtain the distance discriminant criteria. Clearly, a bigger criterion value signifies that the corresponding feature separates the classes better.

Step 6. Set a threshold value and select from the feature set the sensitive features whose distance discriminant criteria exceed the threshold.
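A compact NumPy sketch of the basic selection procedure follows. It covers Steps 1 to 6 without the compensation factor, whose exact definition is not reproduced above; the weighting $\lambda = 2$ follows the text, while the threshold value and all names are illustrative:

```python
import numpy as np

def distance_discriminant_select(X, y, lam=2.0, threshold=0.5):
    """Basic distance discriminant feature selection (compensation factor omitted).

    X : (N, D) feature matrix, y : (N,) integer class labels.
    Returns the indices of the selected features."""
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / counts.sum()                                     # p_c = n_c / N
    centers = np.array([X[y == c].mean(axis=0) for c in classes])      # mu_cj
    stds = np.array([X[y == c].std(axis=0, ddof=1) for c in classes])  # sigma_cj
    # Between-class spread: weighted std of the class centers (Step 3)
    sigma_b = np.sqrt(priors @ centers**2 - (priors @ centers) ** 2)
    # Within-class spread: weighted mean of the class stds (Step 4)
    sigma_w = priors @ stds
    d = sigma_b - lam * sigma_w                                        # discriminant factor
    crit = d / d.max()                                                 # Step 5: normalize
    return np.where(crit >= threshold)[0]                              # Step 6: threshold
```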

3.3. Diagnosis Analysis

It is well known that the ordering of the training samples can affect the classification accuracy of a single FAM, and that using a single output to represent multiple classes may lead to lower classification accuracy. In order to assess how well the proposed FAM ensemble works, that is, how much the generalization ability is improved by using the improved Bayesian belief method to combine the classification results of a committee of single FAMs trained with different orderings of the training samples, the performance of a single FAM is also evaluated.

In the diagnosis phase, the single FAM and the FAM ensemble are all trained in the fast-learning and conservative mode (i.e., the learning rate $\beta$ is set to 1 and the choice parameter $\alpha$ is set close to 0). Besides, in order to balance stability and plasticity, the baseline vigilance parameter of FAM is fixed, and the ensemble size is set as 5.
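The overall ensemble procedure can be sketched as follows, reusing the helpers from the earlier snippets; a `FuzzyARTMAP` class with `fit`/`predict` methods is assumed (hypothetical, not shown), and `bbm_combine` is the fusion rule sketched in Section 2.2:

```python
import numpy as np

def train_fam_ensemble(X_train, y_train, X_val, y_val, n_members=5, n_classes=7, seed=0):
    """Train n_members FAMs on differently ordered copies of the training data and
    record each member's confusion matrix on a validation set for BBM fusion."""
    rng = np.random.default_rng(seed)
    members, conf_matrices = [], []
    for _ in range(n_members):
        order = rng.permutation(len(X_train))                 # a different presentation order
        fam = FuzzyARTMAP(rho=0.0, alpha=0.001, beta=1.0)     # hypothetical FAM classifier
        fam.fit(X_train[order], y_train[order])
        conf = np.zeros((n_classes, n_classes))
        for xi, yi in zip(X_val, y_val):
            conf[yi, fam.predict(xi)] += 1                    # rows: true class, cols: predicted
        members.append(fam)
        conf_matrices.append(conf)
    return members, conf_matrices

def ensemble_predict(x, members, conf_matrices):
    """Fuse the member decisions for one sample with the Bayesian belief method."""
    predictions = [fam.predict(x) for fam in members]
    return bbm_combine(conf_matrices, predictions)
```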

In order to improve the classification accuracy and reduce the computation time, salient features are selected from each feature set by the modified distance discriminant technique and then presented to the five single FAMs in different orders during training. Figure 4 shows the modified distance discriminant factor of all features in the feature sets. From the figure it can be seen that the threshold corresponding to the optimal features differs from case to case; that is to say, the number of salient features differs.

Figure 5 summarizes the classification results in terms of the test accuracy of the single FAMs and the FAM ensemble. From the figure, it can be seen that the FAM ensemble (0.988) outperforms the single FAMs in terms of accuracy, and the test accuracy increases as the number of single FAMs in the ensemble grows. These results indicate that the FAM ensemble can identify the different fault categories of the hydraulic pump well.

3.3.1. Effect of Different Threshold for Feature Selection

As shown in Figure 4, when the threshold value is set properly, some redundant and irrelevant features can be removed from the original feature set. To test the effect of the proposed feature selection method based on the modified distance discriminant technique, a series of experiments is carried out for different threshold values, in which the parameters of the single FAM are the same as above and the size of the FAM ensemble is set as 5.

Figure 6 shows the classification accuracy of the five individual FAMs and the FAM ensemble against the different thresholds. From the figure, it can be noticed that when no features are removed (the original feature set), the test accuracy of the single FAM and the FAM ensemble is 0.824 and 0.845, respectively. The highest test accuracies of the single FAM and the FAM ensemble (0.915 and 0.988) are reached simultaneously at the threshold where the optimal feature set is selected. However, when the threshold value continues to increase, the test accuracy of the single FAM and the FAM ensemble tends to decrease, and at the largest thresholds it falls below the accuracy obtained with all features. This is mainly because the drastic reduction of features discards useful information, which leads to a decrease in the test accuracy.

3.3.2. Classification Performance Comparison with Other Classification Methods

In order to test the superiority of the proposed FAM ensemble method, the test results produced by the FAM ensemble and the single FAM are compared with those produced by other classification methods. In this experiment, the parameters of the FAM ensemble and the single FAM are the same as above.

Table 2 shows the test results of the FAM ensemble versus other classification methods. From the table it can be seen that the average test accuracy of the single FAM is the lowest. However, the test accuracy produced by the two FAM ensemble methods is higher than that produced by any single classifier, and the test accuracy of the proposed FAM ensemble is the highest, exceeding that of the FAM ensemble with the voting algorithm. These results indicate that the proposed FAM ensemble has superior diagnosis performance.

4. Conclusions

The classification performance of FAM is affected by the ordering of the training samples. In this paper, a novel and reliable FAM ensemble based on the improved Bayesian belief method is proposed to improve the classification performance of FAM; it combines the outputs of a committee of FAMs fed with different orderings of the training samples and derives the combined decision.

The proposed FAM ensemble method is applied to the fault identification of a hydraulic pump. The experimental results show that the proposed FAM ensemble can diagnose the fault categories accurately and reliably and has better diagnosis performance than a single FAM. These results indicate that the proposed FAM ensemble holds good promise for engineering classification and decision-making applications.

Acknowledgments

This work is supported by the National Scientific and Technological Achievement Transformation Project of China (Grant no. 201255), Electronic Information Industry Development Fund of China (Grant no. 2012407), the National Natural Science Foundation of China (Grant no. 61374172), and the Fundamental Research Funds for the Central Universities, Hunan University, China.