Research Article  Open Access
Huaping Guo, Haiyan Liu, Changan Wu, Wei Liu, Wei She, "Ensemble of Rotation Trees for Imbalanced Medical Datasets", Journal of Healthcare Engineering, vol. 2018, Article ID 8902981, 9 pages, 2018. https://doi.org/10.1155/2018/8902981
Ensemble of Rotation Trees for Imbalanced Medical Datasets
Abstract
Medical datasets are often predominantly composed of “normal” examples with only a small percentage of “abnormal” ones, and correctly recognizing the abnormal examples is very meaningful. However, conventional classification learning methods pursue high accuracy by assuming that the classes are of similar size, which leads to abnormal class examples being ignored and misclassified as normal ones. In this paper, we propose a simple but effective ensemble method called ensemble of rotation trees (ERT) to handle this problem in imbalanced medical datasets. ERT learns an ensemble through the following four stages: (1) undersampling subsets from the normal class, (2) obtaining new balanced training sets by combining each subset with the abnormal class, (3) inducing a rotation matrix on a randomly sampled subset of each new balanced set, and, in each rotated space, (4) learning a decision tree on each balanced training set. Here, the rotation matrix serves mainly to improve the diversity between ensemble members, and the undersampling technique aims to improve the performance of the learned models on the abnormal class. Experimental results show that, compared with other state-of-the-art methods, ERT shows significantly better performance on imbalanced medical datasets.
1. Introduction
In the real world, medical data often exhibit class imbalance, where one class contains more examples than the others [1, 2]. For two classes, the examples are usually categorized into normal (negative or majority) and abnormal (positive or minority) classes. The cost of misclassifying abnormal class examples is often higher than that of misclassifying normal ones. For instance, the “mammography” dataset contains 10,923 “healthy” patients and 260 “cancerous” patients, and recognizing the “cancerous” patients is very meaningful. However, traditional learning methods try to achieve high accuracy by assuming that the classes are of similar size, which causes the abnormal class examples to be overlooked and incorrectly classified as normal [3, 4]. Therefore, many approaches have been proposed to tackle this problem.
Sampling techniques, including undersampling [5], oversampling [6], and SMOTE [7], are among the most popular methods for addressing the problem in imbalanced medical datasets. Undersampling learns models on a rebalanced dataset obtained by sampling a subset of the normal class; oversampling, in contrast, rebalances the training set by replicating abnormal class examples [1]. SMOTE [7] is a variant of oversampling that generates new synthetic abnormal class examples by randomly interpolating between pairs of close neighbors within the abnormal class.
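As an illustration, the two families of sampling strategies can be sketched in a few lines of plain Python. The function names and the tuple-based representation of examples are our own, and the neighbor search is a naive stand-in for the k-nearest-neighbor step used by SMOTE:

```python
import random

def undersample(majority, minority, seed=0):
    """Randomly draw |minority| majority examples (without replacement)
    to form a balanced training set."""
    rng = random.Random(seed)
    subset = rng.sample(majority, k=len(minority))
    return subset + minority

def smote_like(minority, n_new, k=2, seed=0):
    """Generate synthetic minority points by interpolating each example
    with one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours by squared Euclidean distance, excluding x
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: sum((a - b) ** 2
                                              for a, b in zip(x, p)))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolate somewhere between x and nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic
```

Undersampling shrinks the majority class to the minority's size, while the SMOTE-style routine grows the minority class with interpolated points that lie on segments between existing minority examples.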
Ensemble learning, which has often been used to solve challenging problems where traditional classification models are insufficient, such as image detection [8–11], is another popular method for dealing with imbalanced datasets. The proposed class-imbalance-oriented ensemble learning methods can mainly be grouped into three categories: (1) bagging-, (2) boosting-, and (3) hybrid-based approaches. Both bagging- and boosting-based approaches often apply a sampling technique within the ensemble learning process, such as OverBagging, UnderBagging, UnderOverBagging [12], SMOTEBoost [13], and RUSBoost [14]. The former three methods combine bagging with a sampling technique, and the latter two embed a sampling technique into the process of learning each member. EasyEnsemble and BalanceCascade are two specific examples of hybrid-based approaches [5]. EasyEnsemble undersamples several subsets from the normal class, trains a model on each of them, and combines the outputs of those models. The learning process of BalanceCascade is similar to that of EasyEnsemble, with the exception that at each step of training the models, the normal class examples that are correctly classified by the currently trained models are removed from further consideration.
In this paper, we propose a novel ensemble method called ensemble of rotation trees (ERT) that builds accurate and diverse classifiers to tackle class-imbalanced medical datasets. The main heuristics consist of (1) undersampling subsets from the normal class, (2) obtaining new balanced training sets by combining each subset with the abnormal class, (3) inducing a rotation matrix on a randomly sampled subset of each new balanced set, and, in each rotated space, (4) learning a decision tree on each balanced training set. Here, the rotation matrix improves ensemble diversity, and the undersampling technique mainly aims to improve the performance of the learned models on the abnormal class. The decision tree is selected as the base model because it is sensitive to the rotation of the feature axes, hence the name “rotation trees”. Compared with other state-of-the-art classification methods, ERT shows much better performance on class-imbalanced medical datasets.
This paper extends our previous work [15] in the following respects. First, it empirically compares a variety of ensemble methods on medical datasets, which has led to new conclusions, such as the fact that the proposed ensemble significantly outperforms other ensemble methods on imbalanced medical datasets. Second, the comparison is based on more medical datasets. Finally, this paper includes more discussion of why the proposed method works.
The rest of this paper is organized as follows: after presenting related work in Section 2, Section 3 describes the proposed learning method for medical datasets, Section 4 presents the experimental results, and finally, Section 5 concludes this work.
2. Strategies for Imbalanced Medical Datasets
In medical data analysis, it often happens that examples are categorized into an abnormal (minority or positive) group and a normal (majority or negative) group, and the cost of misclassifying an abnormal example as a normal one is very high. Take the “mammography” dataset as an example: it contains 10,923 “healthy” patients and 260 “cancerous” patients, so a naive approach that classifies every example as a “healthy” patient would achieve an accuracy of almost 97.68%. Although the naive approach achieves high accuracy, it incorrectly classifies all the “cancerous” patients.
Many techniques have been proposed to handle the imbalanced problem in medical datasets, where the efforts mainly focus on the methods of manipulating datasets and ensemble learning methods.
The methods of manipulating datasets rebalance the imbalanced medical data by manipulating the data distribution so that traditional methods are biased toward the abnormal class. Reported studies on manipulating datasets can be further subdivided into two types: resampling and weighting the data space. Resampling techniques aim to alleviate the effect of the class-imbalanced distribution by sampling the data space to rebalance the corresponding imbalanced dataset. Commonly used sampling techniques fall into the following three categories: oversampling, undersampling, and hybrid methods. Oversampling techniques create new minority class examples to eliminate the harm of the imbalanced problem; randomly duplicating the minority samples and the synthetic minority oversampling technique (SMOTE) [7] are the two most popular examples. Undersampling techniques, such as random undersampling (RUS) [5], the simplest yet most effective method, try to eliminate the harm of the class-imbalanced distribution by removing examples of the majority class. Hybrid methods combine oversampling and undersampling. The strategies of weighting the data space adopt information concerning the misclassification costs to adjust the training set distribution; examples include cost-sensitive methods [16] and an ensemble of SVMs with asymmetric misclassification costs [1].
Ensemble learning generally outperforms single classifiers on class-imbalanced problems [17], and decision trees are popular choices for the base classifiers in an ensemble [18]. According to Galar et al. [19], ensembles for the class-imbalanced problem can be grouped into three categories: (1) bagging-, (2) boosting-, and (3) hybrid-based approaches. Bagging-based ensemble methods, such as UnderBagging, OverBagging, and UnderOverBagging [12], integrate bagging with resampling techniques to improve a model’s performance on class-imbalanced problems: UnderBagging uses undersampling to preprocess the training set before learning each member; in contrast, OverBagging uses oversampling instead of undersampling; and UnderOverBagging uses both oversampling and undersampling to adjust the data distribution for training individual members. Boosting-based ensembles embed sampling techniques into the learning process of boosting algorithms: at every iteration, they alter the weight distribution to bias the training of the next classifier toward the abnormal class. For example, SMOTEBoost [13] uses SMOTE [7] to generate synthetic examples of the abnormal class to alter the data distribution, and RUSBoost [14], which performs similarly to SMOTEBoost, uses RUS [5] to remove examples from the normal class when training base classifiers. Hybrid-based ensembles, such as EasyEnsemble and BalanceCascade [5], combine bagging with boosting (together with a sampling technique). Both EasyEnsemble and BalanceCascade use bagging as the main ensemble learning method and AdaBoost as the base classifier learning method. The difference between them is the way they treat the normal class examples after each iteration: EasyEnsemble does not perform any operation after each AdaBoost iteration.
Unlike EasyEnsemble, after learning an AdaBoost classifier, BalanceCascade removes the normal class examples that are correctly classified with high confidence from further consideration.
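The control flow that distinguishes the two hybrid methods can be sketched as follows. This is a schematic outline, not the authors' implementation: `train` and `predict` are caller-supplied stand-ins for the AdaBoost learner and the ensemble vote, and the only BalanceCascade-specific part is the pruning of correctly classified normal examples after each round:

```python
import random

def balance_cascade(majority, minority, T, train, predict, seed=0):
    """Schematic BalanceCascade loop: after each round, majority examples
    the current ensemble already classifies correctly are dropped.
    (EasyEnsemble would simply skip the pruning step and keep the pool.)"""
    rng = random.Random(seed)
    pool = list(majority)
    models = []
    for _ in range(T):
        if len(pool) < len(minority):
            break  # not enough majority examples left for a balanced subset
        subset = rng.sample(pool, k=len(minority))
        model = train(subset + minority,
                      [0] * len(subset) + [1] * len(minority))
        models.append(model)
        # keep only majority examples the ensemble still gets wrong
        pool = [x for x in pool if predict(models, x) != 0]
    return models
```

With a strong learner, the majority pool shrinks quickly and the cascade terminates early; EasyEnsemble instead always runs all T rounds over the full majority class.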
Rotation forest, an ensemble learning approach, often performs better than bagging and boosting because it builds accurate and diverse classifiers by using subsets of features and rotating the feature space [20]. This method has also been applied to imbalanced problems. For example, Su et al. [21] employed a class-imbalance-oriented learner, namely, the Hellinger distance decision tree (HDDT), as the base classifier of rotation forest to handle the class-imbalanced problem, with each base classifier constructed on the whole training set. Hosseinzadeh and Eftekharia [22] learned a rotation forest on data obtained by preprocessing the training set with the synthetic minority oversampling technique (SMOTE) and fuzzy clustering. Fang et al. [23] learned the rotation matrices on datasets obtained by random undersampling or oversampling (SMOTE) of the training set, with each base classifier constructed on the whole training set.
This paper proposes a novel ensemble method for imbalanced medical datasets. Unlike bagging-, boosting-, and hybrid-based approaches, the proposed method learns each base classifier in a rotated feature space. Unlike conventional rotation-forest-based approaches, the proposed method learns both the rotation matrices and the base classifiers on diverse balanced datasets instead of on imbalanced data or on the same data. More details are discussed in Section 3.
3. Ensemble of Rotation Trees for Imbalanced Medical Datasets
3.1. Ensemble of Rotation Trees
The class-imbalanced problem often exists in medical datasets and causes traditional classifier learning methods to work poorly. This section proposes a novel ensemble method called ensemble of rotation trees (ERT) to handle imbalanced medical datasets. ERT learns an ensemble through the following two steps: (1) sampling subsets from the normal class and learning a rotation matrix on each subset and (2) training a tree on the balanced dataset obtained by combining each subset with the abnormal class set, in the new feature space defined by the current rotation matrix.
Let x = [x_{1}, x_{2}, …, x_{n}]^{T} be an example of a medical dataset described by n features, let X_{a} be the abnormal class set (an N_{a} × n matrix), and let X_{n} be the normal class set (an N_{n} × n matrix). Denote by h ∈ H a classifier in the ensemble H and by F the feature set. Like bagging, all classifiers can be trained in parallel. ERT constructs the current classifier h ∈ H using the following steps:
(i) D = D_{n} ∪ X_{a}, where D_{n} is a subset of X_{n} obtained by randomly undersampling X_{n} without replacement and |D_{n}| = |X_{a}|.
(ii) Split F randomly into disjoint subsets {F_{j} | j = 1, 2, …, n/L}, each of size L. The disjoint subsets are chosen to maximize the chance of high diversity.
(iii) For each F_{j}, draw a subset of size 50 percent from D and run a feature extraction method on F_{j} and this subset to obtain feature projection components, each of size L × 1.
(iv) Organize the components in a sparse “rotation” matrix R.
(v) Train the current classifier h using DR.
Pseudocode 1 shows the pseudocode of the ERT algorithm. The differences from rotation-forest-based class-imbalance methods (see Section 2) are mainly reflected in lines 4~5 and lines 14~15. Lines 4~5 construct a new balanced training set D_{i} by undersampling a subset D_{n} from the normal set X_{n} with size equal to that of the abnormal set X_{a}. Lines 14~15 learn the base classifier h_{i} on the balanced data D_{i} (obtained in steps 4~5) in the rotated space defined by R_{i}, projecting D_{i} with R_{i} to obtain a new balanced training set D_{i,train} = D_{i}R_{i}. Therefore, both the rotation matrix R_{i} and the base classifier h_{i} are learned from balanced datasets. Besides, unlike conventional rotation-forest-based methods, which select and eliminate a random nonempty subset of classes, ERT does not manipulate classes, because only two classes are considered in this paper.
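A minimal sketch of the ERT training loop in plain Python follows the structure above: undersample the normal class, split the features into disjoint subsets, assemble a sparse block rotation matrix, and train a tree on the rotated balanced set. To stay dependency-free it substitutes random orthonormal blocks for the per-subset PCA components and takes the tree learner as a parameter; all names are illustrative:

```python
import random

def random_rotation(L, rng):
    """A random L x L orthonormal matrix via Gram-Schmidt (a stand-in for
    the per-subset PCA components used by ERT)."""
    basis = []
    while len(basis) < L:
        v = [rng.gauss(0, 1) for _ in range(L)]
        for b in basis:  # remove components along existing basis vectors
            d = sum(x * y for x, y in zip(v, b))
            v = [x - d * y for x, y in zip(v, b)]
        norm = sum(x * x for x in v) ** 0.5
        if norm > 1e-9:
            basis.append([x / norm for x in v])
    return basis

def build_rotation(n, L, rng):
    """Split the n features into disjoint subsets of (up to) size L and
    assemble the blocks into a sparse n x n rotation matrix R."""
    feats = list(range(n))
    rng.shuffle(feats)
    R = [[0.0] * n for _ in range(n)]
    for s in range(0, n, L):
        block = feats[s:s + L]
        rot = random_rotation(len(block), rng)
        for i, fi in enumerate(block):
            for j, fj in enumerate(block):
                R[fi][fj] = rot[i][j]
    return R

def rotate(X, R):
    """Project every example through R (matrix product X R)."""
    n = len(R)
    return [[sum(x[k] * R[k][j] for k in range(n)) for j in range(n)]
            for x in X]

def ert_fit(X_normal, X_abnormal, M, L, train_tree, seed=0):
    """Ensemble of rotation trees: each member is trained on a balanced
    undersample of the normal class, projected through its own rotation."""
    rng = random.Random(seed)
    n = len(X_abnormal[0])
    ensemble = []
    for _ in range(M):
        D_n = rng.sample(X_normal, k=len(X_abnormal))  # balanced undersample
        X = D_n + list(X_abnormal)
        y = [0] * len(D_n) + [1] * len(X_abnormal)
        R = build_rotation(n, L, rng)
        ensemble.append((R, train_tree(rotate(X, R), y)))
    return ensemble
```

Prediction would rotate a query example with each member's R before asking that member's tree, and combine the votes as in bagging.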

In this paper, we chose decision trees as the base classifiers because they are sensitive to the rotation of the feature axes and can still be very accurate. The feature extraction is based on principal component analysis (PCA) [24], following rotation forest [20]. The running time of ERT is mainly dominated by constructing decision trees, running PCA, and rotating the datasets; therefore, the computational complexity of ERT is the same as that of rotation forest [20].
3.2. Discussion
Two issues should be addressed when building ensembles for imbalanced medical datasets: high performance of the individual ensemble members biased toward the abnormal class, and diversity between the members. The undersampling technique is applied to the normal class so that individual base classifiers focus more on the abnormal class. Specifically, ERT (the proposed method) undersamples the normal class set so that the learned rotation matrices better capture the distribution of the abnormal class set, which enhances the performance of individual classifiers on the abnormal class (line 4, Pseudocode 1). Besides, ERT learns each individual classifier on a rebalanced dataset obtained by undersampling the training set (line 15, Pseudocode 1).
Diversity is a major factor in the success of an ensemble, and the intended diversity in the proposed model comes from the following two approaches: (1) the undersampling technique used to sample the normal class (refer to line 4 in Pseudocode 1) and (2) the difference in the possible feature subsets (refer to lines 6–14 in Pseudocode 1). For the first approach, the larger the ratio between the sizes of the normal class set and the abnormal class set, the larger the diversity of the individual classifiers. For the second approach, the number of different partitions of the feature set into n/L subsets is

T = n!/((L!)^{n/L} × (n/L)!).
For an ensemble with M members, the probability that all members use different partitions can be calculated by

P = T(T − 1) ⋯ (T − M + 1)/T^{M}.
For example, the probability that all classifiers of an ensemble with 50 members are different for n = 9 is less than 0.01, and thus, an extra randomization of the ensemble is meaningful, especially for balanced datasets. Following rotation forest [20], we draw a bootstrap sample of objects and apply PCA on the subset.
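These two quantities are easy to check numerically. A small stdlib-only sketch, assuming n is divisible by L and that partitions are drawn independently and uniformly (a birthday-problem argument):

```python
from math import factorial, prod

def n_partitions(n, L):
    """Number of ways to split n features into n/L unordered disjoint
    subsets of size L: n! / ((L!)^(n/L) * (n/L)!)."""
    K = n // L
    return factorial(n) // (factorial(L) ** K * factorial(K))

def p_all_different(n, L, M):
    """Probability that M independently drawn partitions are all distinct:
    T(T-1)...(T-M+1) / T^M, with T the partition count above."""
    T = n_partitions(n, L)
    return prod(1 - i / T for i in range(M))
```

For n = 9 and L = 3 this gives T = 280 possible partitions, and the probability that 50 members all use different partitions is indeed below 0.01.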
4. Experiments
4.1. Evaluation Metrics
An evaluation metric is essential for assessing the effectiveness of an algorithm, and traditionally, accuracy is the most frequently used one. The examples classified by a classifier can be grouped into four categories as shown in Table 1, and accuracy is defined as

accuracy = (TP + TN)/(TP + TN + FP + FN).

However, accuracy is inadequate for imbalanced medical problems, and other metrics have been proposed, including precision, recall, f-measure, g-mean, and AUC. Precision and recall are, respectively, defined as

precision = TP/(TP + FP), recall = TP/(TP + FN).
F-measure is a harmonic mean of recall and precision. Specifically, f-measure is defined as

f-measure = ((1 + δ^{2}) × recall × precision)/(δ^{2} × precision + recall),

where δ, often set to 1, is a coefficient that adjusts the relative importance of precision versus recall.
Like f-measure, g-mean is another metric that considers both the normal and the abnormal class. Specifically, g-mean measures the balanced performance of a classifier using the geometric mean of the recall of the abnormal class and that of the normal class. Formally,

g-mean = sqrt((TP/(TP + FN)) × (TN/(TN + FP))).
Besides, AUC is a commonly used measure for evaluating model performance. According to [25], AUC can be estimated by

AUC = (1 + TP_{rate} − FP_{rate})/2, where TP_{rate} = TP/(TP + FN) and FP_{rate} = FP/(FP + TN).
In this paper, we employ recall, f-measure, g-mean, and AUC to evaluate classification performance on imbalanced datasets.
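Computed from a single confusion matrix, the metrics above reduce to a few lines; the function and dictionary interface here are our own convention:

```python
from math import sqrt

def metrics(tp, fn, fp, tn, delta=1.0):
    """Recall, precision, f-measure, g-mean, and the single-point AUC
    estimate (1 + TPrate - FPrate)/2 from confusion-matrix counts."""
    recall = tp / (tp + fn)          # TP rate on the abnormal class
    precision = tp / (tp + fp)
    f = (1 + delta ** 2) * precision * recall / (delta ** 2 * precision + recall)
    g = sqrt(recall * (tn / (tn + fp)))   # geometric mean of both recalls
    auc = (1 + recall - fp / (fp + tn)) / 2
    return {"recall": recall, "precision": precision,
            "f-measure": f, "g-mean": g, "AUC": auc}
```

For example, a classifier with TP = 8, FN = 2, FP = 4, TN = 86 has a high accuracy (94%) but a g-mean well below 0.9, which is exactly the gap these metrics are meant to expose.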
4.2. Datasets and Experimental Setup
Eight medical datasets are used in this paper; all are two-class imbalanced medical datasets [26]. The imbalance degree of these datasets varies from 0.061 (highly imbalanced) to 0.349 (only slightly imbalanced), where the imbalance degree is defined as the ratio of the size of the abnormal class to that of the normal class. The details of the datasets are shown in Table 2, where #Degree is the imbalance degree, #Size is the size of the dataset, and #Attrs is the number of attributes.

A 10-fold cross-validation [27] is performed to test model performance: each dataset is randomly divided into ten folds; for each fold, the other nine folds are used to train a model and the current fold is used to test it. We run the 10-fold cross-validation ten times, so 100 models are constructed for each dataset.
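The evaluation protocol above can be sketched as an index generator. This stdlib-only sketch reshuffles before each repeat and, for simplicity, does not stratify the folds by class:

```python
import random

def repeated_kfold(n, k=10, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` runs of k-fold
    cross-validation over n examples; k * repeats splits in total."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k near-equal folds
        for t in range(k):
            test = folds[t]
            train = [j for f in folds[:t] + folds[t + 1:] for j in f]
            yield train, test
```

With k = 10 and repeats = 10 this produces the 100 train/test splits (hence 100 models per dataset) described above.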
To evaluate the performance of ERT (the proposed method), we compare it with RURF [23], EasyEnsemble [5], BalanceCascade [5], Bagging [28], and C4.5 [29]:
(i) RURF is a class-imbalance-oriented version of rotation forest (RF) that learns projection matrices on random undersampling (RU) datasets. C4.5 was selected as the base learner, and the number of base classifiers was set to 100.
(ii) EasyEnsemble samples T subsets from the normal class and uses AdaBoost with C4.5 as the weak learner to learn M base classifiers on each subset. We set T = M = 10, so 100 trees are learned.
(iii) BalanceCascade is similar to EasyEnsemble except that it removes normal class examples that are correctly classified by the trained learners from further consideration. T and M are both set to 10, so 100 trees are learned.
(iv) Bagging learns each base classifier on a resampled dataset. C4.5 is the weak classifier, and the number of base classifiers is set to 100.
(v) ERT is the method proposed in this paper; the number of base classifiers is set to 100, and C4.5 is used to train the base classifiers (refer to Pseudocode 1).
4.3. Experimental Results
To evaluate the performance of ERT (the proposed method), ERT is compared with RURF, EasyEnsemble, BalanceCascade, Bagging, and C4.5 (for details, refer to Section 4.2). The corresponding results are reported in four tables and one figure: the tables report the results of the six compared methods on recall, f-measure, g-mean, and AUC, and the figure reports the corresponding ranks. In these tables, a bullet (an open circle) next to a result indicates that ERT significantly outperforms (is outperformed by) the respective method (column) on the respective dataset (row) in a pairwise t-test at the 0.05 significance level. The last row in each table gives the average results. The ranks shown in Figure 1 are calculated as follows [30, 31]: on a dataset, the best performing algorithm gets the rank of 1.0, the second best gets the rank of 2.0, and so on; in case of ties, average ranks are assigned.
Figure 1: ranks of the six methods on (a) recall, (b) f-measure, (c) g-mean, and (d) AUC.
Table 3 and Figure 1(a) show the summary results and the ranks of the six compared methods on recall, respectively. From Table 3, ERT significantly outperforms both Bagging and C4.5 on all eight medical datasets, and the average recall of ERT is 0.2087 higher than that of C4.5 (recall ∈ [0, 1]). ERT also statistically outperforms RURF, EasyEnsemble, and BalanceCascade on eight, seven, and six of the datasets, respectively, and outperforms them (though not always significantly) on all datasets. Besides, from Figure 1(a), we observe that the average ranks of ERT, RURF, EasyEnsemble, BalanceCascade, Bagging, and C4.5 are 1.0, 4.3, 2.4, 2.8, 5.3, and 5.3, respectively.
 
●: ERT is significantly better; level of significance: 0.05. 
Table 4 and Figure 1(b) illustrate the summary results and the ranks of ERT, RURF, EasyEnsemble, BalanceCascade, Bagging, and C4.5 on f-measure, respectively. From Table 4, ERT shows much better performance than the other methods. Specifically, ERT statistically outperforms RURF, EasyEnsemble, BalanceCascade, Bagging, and C4.5 on four, eight, eight, seven, and seven out of the eight datasets, respectively, and Figure 1(b) shows that ERT wins against them on six, eight, eight, seven, and seven out of the eight datasets. Besides, ERT is statistically outperformed by RURF, Bagging, and C4.5 on “sick.” Combining this with the results of Table 3 and Figure 1(a), we conclude that ERT obtains its high recall on “sick” by sacrificing precision.
 
●: ERT is significantly better; ○: ERT is significantly worse; level of significance: 0.05. 
G-mean summaries and the corresponding ranks of ERT, RURF, EasyEnsemble, BalanceCascade, Bagging, and C4.5 are reported in Table 5 and Figure 1(c), respectively. Table 5 shows that ERT significantly outperforms RURF, EasyEnsemble, BalanceCascade, Bagging, and C4.5 on all eight datasets, and Figure 1(c) shows that ERT ranks first with an average rank of 1.0, followed by BalanceCascade (2.9), EasyEnsemble (3.4), RURF (3.5), Bagging (4.5), and C4.5 (5.13).
 
●: ERT is significantly better; level of significance: 0.05. 
Table 6 and Figure 1(d) depict the AUC results and the ranks of ERT, RURF, EasyEnsemble, BalanceCascade, Bagging, and C4.5, respectively. Similar to the results on g-mean, ERT significantly wins on all eight datasets compared with the other methods. The average AUC values (ranks) of ERT, RURF, EasyEnsemble, BalanceCascade, Bagging, and C4.5 are 0.8573 (1.0), 0.8093 (3.1), 0.8096 (4.3), 0.8098 (3.1), 0.7959 (4.6), and 0.7899 (3.3), respectively.
 
●: ERT is significantly better; ○: ERT is significantly worse; level of significance: 0.05. 
5. Conclusion
In this paper, we propose a novel method called ensemble of rotation trees (ERT), which builds accurate and diverse classifiers to handle imbalanced medical data. The main heuristic consists of (1) sampling subsets from the normal class, (2) learning a rotation matrix on each subset, and (3) learning a tree on each subset combined with the abnormal class set in the new feature space. Experimental results show that ERT performs better than other state-of-the-art classification methods on the measures of recall, f-measure, g-mean, and AUC on medical datasets.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this article.
Acknowledgments
This work is in part supported by the National Natural Science Foundation of China (nos. 61572417 and 615013933), in part by the Project of Science and Technology Department of Henan Province (no. 182102210132), and in part by the Nanhu Scholars Program for Young Scholars of XYNU.
References
[1] H. He and Y. Ma, Eds., Imbalanced Learning: Foundations, Algorithms, and Applications, Wiley-IEEE Press, New York, NY, USA, 2013.
[2] P. Yao, Z. Wang, H. Jiang, and Z. Liu, “Fault diagnosis method based on CS-boosting for unbalanced training data,” Journal of Vibration Measurement & Diagnosis, vol. 33, no. 1, pp. 111–115, 2013.
[3] P. D. Martin, “Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation,” Journal of Machine Learning Technologies, vol. 2, no. 1, pp. 37–63, 2011.
[4] X.-Y. Liu, Q.-Q. Li, and Z.-H. Zhou, “Learning imbalanced multiclass data with optimal dichotomy weights,” in 2013 IEEE 13th International Conference on Data Mining, pp. 478–487, Dallas, TX, USA, 2013.
[5] X.-Y. Liu, J. Wu, and Z.-H. Zhou, “Exploratory undersampling for class-imbalance learning,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 39, no. 2, pp. 539–550, 2009.
[6] G. Batista, R. Prati, and M. Monard, “A study of the behavior of several methods for balancing machine learning training data,” ACM SIGKDD Explorations Newsletter, vol. 6, no. 1, pp. 20–29, 2004.
[7] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.
[8] Z. Zhou, Y. Wang, Q. M. J. Wu, C.-N. Yang, and X. Sun, “Effective and efficient global context verification for image copy detection,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 1, pp. 48–63, 2017.
[9] Z. Xia, X. Wang, L. Zhang, Z. Qin, X. Sun, and K. Ren, “A privacy-preserving and copy-deterrence content-based image retrieval scheme in cloud computing,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 11, pp. 2594–2608, 2016.
[10] J. Li, X. Li, B. Yang, and X. Sun, “Segmentation-based image copy-move forgery detection scheme,” IEEE Transactions on Information Forensics and Security, vol. 10, no. 3, pp. 507–518, 2015.
[11] Z. Zhou, C.-N. Yang, B. Chen, X. Sun, Q. Liu, and Q. M. J. Wu, “Effective and efficient image copy detection with resistance to arbitrary rotation,” IEICE Transactions on Information and Systems, vol. E99.D, no. 6, pp. 1531–1540, 2016.
[12] S. Wang and X. Yao, “Diversity analysis on imbalanced data sets by using ensemble models,” in 2009 IEEE Symposium on Computational Intelligence and Data Mining, pp. 324–331, Nashville, TN, USA, 2009.
[13] N. Chawla, A. Lazarevic, L. Hall, and K. Bowyer, “SMOTEBoost: improving prediction of the minority class in boosting,” in Knowledge Discovery in Databases: PKDD 2003, N. Lavrač, D. Gamberger, L. Todorovski, and H. Blockeel, Eds., vol. 2838 of Lecture Notes in Computer Science, pp. 107–119, Springer, Berlin, Heidelberg, 2003.
[14] C. Seiffert, T. Khoshgoftaar, J. Van Hulse, and A. Napolitano, “RUSBoost: a hybrid approach to alleviating class imbalance,” IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 40, no. 1, pp. 185–197, 2010.
[15] H. Guo, H. Liu, C.-A. Wu, W. Liu, and W. She, “Ensemble of rotation trees for imbalanced medical datasets,” in The International Conference on Healthcare Science and Engineering, Zhengzhou, China, 2017.
[16] B. X. Wang and N. Japkowicz, “Boosting support vector machines for imbalanced data sets,” Knowledge and Information Systems, vol. 25, no. 1, pp. 1–20, 2010.
[17] N. V. Chawla, “Many are better than one: improving probabilistic estimates from decision trees,” in Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Textual Entailment, J. Quiñonero-Candela, I. Dagan, B. Magnini, and F. d’Alché-Buc, Eds., vol. 3944 of Lecture Notes in Computer Science, pp. 41–55, Springer, Berlin, Heidelberg, 2006.
[18] R. E. Banfield, L. O. Hall, K. W. Bowyer, and W. P. Kegelmeyer, “A comparison of decision tree ensemble creation techniques,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 1, pp. 173–180, 2007.
[19] M. Galar, A. Fernandez, E. Barrenechea, H. Bustince, and F. Herrera, “A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), vol. 42, no. 4, pp. 463–484, 2012.
[20] J. J. Rodriguez and L. I. Kuncheva, “Rotation forest: a new classifier ensemble method,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1619–1630, 2006.
[21] C. Su, S. Ju, Y. Liu, and Z. Yu, “Improving random forest and rotation forest for highly imbalanced datasets,” Intelligent Data Analysis, vol. 19, no. 6, pp. 1409–1432, 2015.
[22] M. Hosseinzadeh and M. Eftekharia, “Improving rotation forest performance for imbalanced data classification through fuzzy clustering,” in 2015 International Symposium on Artificial Intelligence and Signal Processing (AISP), pp. 35–40, Mashhad, Iran, 2015.
[23] X. Fang, X. Zheng, Y. Tan, and H. Zhang, “Highly imbalanced classification using improved rotation forests,” International Journal of Wireless and Mobile Computing, vol. 10, no. 1, pp. 35–41, 2016.
[24] C. Yuan, X. Sun, and R. Lv, “Fingerprint liveness detection based on multi-scale LPQ and PCA,” China Communications, vol. 13, no. 7, pp. 60–65, 2016.
[25] F. J. Provost and T. Fawcett, “Analysis and visualization of classifier performance: comparison under imprecise class and cost distributions,” in Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97), pp. 43–48, Huntington Beach, CA, USA, 1997.
[26] J. V. Hulse, T. M. Khoshgoftaar, and A. Napolitano, “Experimental perspectives on learning from imbalanced data,” in ICML '07: Proceedings of the 24th International Conference on Machine Learning, pp. 935–942, Corvalis, OR, USA, 2007.
[27] N. García-Pedrajas, C. García-Osorio, and C. Fyfe, “Nonlinear boosting projections for ensemble construction,” Journal of Machine Learning Research, vol. 8, pp. 1–33, 2007.
[28] L. Breiman, “Bagging predictors,” Machine Learning, vol. 24, no. 2, pp. 123–140, 1996.
[29] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1993.
[30] J. Demsar, “Statistical comparisons of classifiers over multiple data sets,” Journal of Machine Learning Research, vol. 6, pp. 1–30, 2006.
[31] S. García, A. Fernández, J. Luengo, and F. Herrera, “Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power,” Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.
Copyright
Copyright © 2018 Huaping Guo et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.