Advances in Bioinformatics
Volume 2016, Article ID 1058305, 16 pages
http://dx.doi.org/10.1155/2016/1058305
Research Article

Robust Feature Selection from Microarray Data Based on Cooperative Game Theory and Qualitative Mutual Information

Atiyeh Mortazavi1 and Mohammad Hossein Moattar2

1Department of Computer Engineering, Imam Reza International University, Mashhad, Iran
2Department of Software Engineering, Islamic Azad University, Mashhad Branch, Mashhad, Iran

Received 28 November 2015; Revised 20 February 2016; Accepted 22 February 2016

Academic Editor: Pietro H. Guzzi

Copyright © 2016 Atiyeh Mortazavi and Mohammad Hossein Moattar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

High dimensionality of microarray data sets may lead to low efficiency and overfitting. In this paper, a multiphase cooperative game theoretic feature selection approach is proposed for microarray data classification. In the first phase, given the high dimensionality of microarray data sets, the number of features is reduced using one of two filter-based feature selection methods, namely, mutual information and the Fisher ratio. In the second phase, the Shapley value is used to evaluate the power of each feature. The main innovation of the proposed approach is to employ Qualitative Mutual Information (QMI) for this purpose. QMI makes the selected features more stable, and this stability helps deal with the problems of data imbalance and scarcity. In the third phase, a forward selection scheme is applied, which uses a scoring function to weight each feature. The performance of the proposed method is compared with popular feature selection algorithms such as the Fisher ratio and minimum redundancy maximum relevance, and with previous work on cooperative game based feature selection. The results on eleven microarray data sets show that the proposed method improves both average classification accuracy and average stability compared with the other approaches.

1. Introduction

Feature selection chooses those features of a data set that are effective for predicting the target class. By eliminating superfluous features, the efficiency of the learning models increases dramatically. Genetic data sets have high dimensionality and small sample sizes and are usually imbalanced. High dimensionality increases classification complexity and can reduce classification accuracy, while the small size of the data sets poses a further challenge [1]. The robustness issue is often ignored in feature selection: adding or removing training samples in a nonrobust feature selection algorithm leads to different selected features [2].

Feature selection methods are classified into filter methods, wrapper methods, and embedded methods [3]. Filter methods are independent of the learning algorithm; they are statistical tests that rely on basic properties of the training data and have much lower computational complexity than wrapper methods [4]. These methods rely on measures such as distance measures [5, 6], rough set theory [7], and information theoretic measures [8]. A common distance measure is the Euclidean distance, applied in the Relief method by Kira and Rendell [5] to assign a weight to each feature; Kononenko [6] later extended the Relief algorithm to handle multiclass problems. Peng et al. [9] proposed the minimum redundancy maximum relevance (mRMR) approach, which selects features that have the highest relevance with the target class while being minimally redundant. Wang et al. [10] proposed a filtering algorithm based on the maximum weight minimum redundancy (MWMR) criterion, in which the weight of a feature shows its importance and redundancy represents the correlation between features. Nguyen et al. [11] proposed a hierarchical approach called Modified Analytic Hierarchy Process (MAHP), which combines five individual gene ranking methods, namely, the t-test, entropy, the Receiver Operating Characteristics (ROC) curve, the Wilcoxon test, and the Signal to Noise Ratio (SNR), for feature ranking and selection.

In wrapper methods, each feature subset is evaluated using a specific learning algorithm, and the optimal features are selected to improve classification performance [3]. For example, Inza et al. [12] evaluated classical wrapper algorithms (i.e., sequential forward and backward selection, floating selection, and best-first search) on three microarray data sets. Another wrapper approach for gene selection from microarray data, based on a modified ant colony optimization, is proposed in [13], and a further heuristic wrapper approach based on Genetic Bee Colony (GBC) optimization is proposed in [14]. Further material on wrapper methods can be found in [15-18].

Like wrapper methods, embedded feature selection methods are classifier dependent, but the coupling is stronger in embedded approaches. Guyon et al. [19] proposed Support Vector Machine based Recursive Feature Elimination (SVM-RFE) for feature selection in cancer classification. Canul-Reich et al. [20] introduced the Iterative Feature Perturbation (IFP) method, an embedded gene selector, and applied it to four microarray data sets.

Robust feature selection algorithms are another active line of research. Yang and Mao [2] improved the robustness of feature selection with an ensemble method called Multicriterion Fusion-Based Recursive Feature Elimination (MCF-RFE). Yassi and Moattar [1] presented robust and stable feature selection by integrating ranking methods and a wrapper technique for genetic data classification.

The authors of [3, 21] addressed the weaknesses of feature selection methods using cooperative game theory. They introduced a framework based on cooperative game theory to evaluate the power of each feature, using power indices to weight the features: [3] used the Banzhaf power index, while [21] used the Shapley value. Reference [22] presented a feature selection approach called Neighborhood Entropy-Based Cooperative Game Theory (NECGT), based on information theoretic measures; evaluations on several UCI data sets showed that it outperforms classical methods in terms of accuracy.

Stability and robustness are especially important when data is scarce and classes are imbalanced. In this paper, cooperative game theory is used for robust feature selection: a cooperative game framework based on the Shapley value is introduced to evaluate the power of each feature. To score the features within the Shapley value index, we propose the Qualitative Mutual Information (QMI) criterion, with the Fisher ratio as its utility function, which yields more stable results even in the presence of class imbalance and data scarcity and thus improves the robustness of the feature selection algorithm. The rest of this paper is organized as follows. Section 2 introduces the fundamental materials and methods. Section 3 describes the proposed method. Section 4 discusses the simulation results and the evaluation of the proposed method. Finally, Section 5 presents conclusions and future work.

2. Mutual Information, Qualitative Mutual Information, and Conditional Mutual Information

The entropy $H(X)$ of a discrete random variable $X$ with probability mass function $p(x)$ is defined as follows:

$$H(X) = -\sum_{x \in X} p(x) \log p(x).$$

Also, the mutual information (MI) between two random variables $X$ and $Y$ with a joint probability distribution $p(x,y)$ is formulated as follows [22]:

$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}.$$

Hence, Qualitative Mutual Information (QMI) is defined by introducing a utility function $U(x,y)$ into the mutual information formula [23, 24]:

$$QMI(X;Y) = \sum_{x \in X} \sum_{y \in Y} U(x,y)\, p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}.$$

Different informative functions can be applied as the utility function; in the proposed approach, we use the Fisher ratio [2]. Also, Conditional Mutual Information (CMI), denoted $I(X;Y \mid Z)$, measures the amount of information shared between $X$ and $Y$ when $Z$ is given and is formulated as follows [21]:

$$I(X;Y \mid Z) = \sum_{z \in Z} \sum_{y \in Y} \sum_{x \in X} p(x,y,z) \log \frac{p(x,y \mid z)}{p(x \mid z)\, p(y \mid z)}.$$
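To make these measures concrete, the following is a minimal sketch (not the authors' code) of plug-in estimators for entropy, MI, CMI, and QMI on discrete (e.g., binned) variables in Python. The utility function passed to the QMI estimator is a free parameter; the paper instantiates it with the Fisher ratio.

import numpy as np
from collections import Counter

def entropy(x):
    # H(X) = -sum_x p(x) log2 p(x), estimated from sample frequencies.
    n = len(x)
    return -sum((c / n) * np.log2(c / n) for c in Counter(x).values())

def mutual_information(x, y):
    # I(X;Y) = H(X) + H(Y) - H(X,Y).
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def conditional_mutual_information(x, y, z):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z).
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(z))

def qualitative_mutual_information(x, y, utility):
    # QMI: each (x, y) term of I(X;Y) is weighted by utility(x, y).
    n = len(x)
    p_xy, p_x, p_y = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(utility(xv, yv) * (c / n)
               * np.log2((c / n) / ((p_x[xv] / n) * (p_y[yv] / n)))
               for (xv, yv), c in p_xy.items())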

3. Proposed Approach

The proposed method consists of three phases, which are described as follows. Figure 1 depicts the flowchart of the proposed feature selection algorithm.

Figure 1: Flowchart of the proposed feature selection algorithm.
3.1. Filter Approaches for Dimension Reduction

Due to the high dimensionality of microarray data, in the first phase of the proposed method the features are reduced using one of two filter-based feature selection methods, namely, mutual information and the Fisher ratio. Fisher's ratio is an individual feature evaluation criterion: it measures the discriminative power of a feature as the ratio of interclass to intraclass variability. For a two-class problem it is computed as follows [2]:

$$F(f_i) = \frac{(\mu_{i1} - \mu_{i2})^2}{\sigma_{i1}^2 + \sigma_{i2}^2},$$

where $\mu_{ij}$ is the sample mean of feature $f_i$ in class $j$ and $\sigma_{ij}^2$ is the variance of feature $f_i$ in class $j$.
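As an illustration, here is a minimal sketch of this filtering step, assuming the common two-class form of the Fisher ratio given above; for the multiclass data sets, an interclass-to-intraclass variance ratio over all classes would be used instead.

import numpy as np

def fisher_ratio(X, y):
    # Per-feature Fisher ratio for a two-class problem.
    # X: (n_samples, n_features) array; y: binary label vector.
    labels = np.unique(y)
    c1, c2 = X[y == labels[0]], X[y == labels[1]]
    between = (c1.mean(axis=0) - c2.mean(axis=0)) ** 2
    within = c1.var(axis=0) + c2.var(axis=0) + 1e-12  # guard against zero variance
    return between / within

def top_k_by_fisher(X, y, k=300):
    # Indices of the k features with the largest Fisher ratio,
    # mirroring the 300-feature prefiltering used in the experiments.
    return np.argsort(fisher_ratio(X, y))[::-1][:k]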

3.2. Feature Evaluation

In large data sets there are intrinsic correlations among features; however, most filter-based feature selection algorithms discard redundant features that are highly correlated with the already selected ones [21]. In the proposed approach, each feature is weighted using information theoretic measures such as CMI and QMI [21].

Theoretically, a feature is more relevant when it shares more information with the target class [21, 25]. Feature $f_i$ is said to be redundant with feature $f_s$ if the following condition holds [21]:

$$I(C; f_i \mid f_s) < I(C; f_i).$$

Two features $f_i$ and $f_s$ are interdependent if the following condition holds [21, 26]:

$$I(C; f_i \mid f_s) > I(C; f_i).$$

The relevance criterion introduced by Peng et al. [9] measures the relevance of a set $S$ of features with the target class $C$:

$$D(S, C) = \frac{1}{|S|} \sum_{f_i \in S} I(C; f_i).$$

The change of the relevance of a set $S$ of features with the target class caused by a feature $f_i$, following Jain et al. [27], is measured by the following formula:

$$\Delta(f_i, S) = I(C; S \mid f_i) - I(C; S).$$
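A small sketch of the pairwise tests follows, reusing the discrete estimators sketched in Section 2; the decision simply compares I(C; f_i | f_s) against I(C; f_i), as in the two conditions above.

def pair_relation(f_i, f_s, c):
    # Compare I(C; f_i | f_s) with I(C; f_i) to label the pair.
    cmi = conditional_mutual_information(f_i, c, f_s)  # I(C; f_i | f_s)
    mi = mutual_information(f_i, c)                    # I(C; f_i)
    if cmi < mi:
        return "redundant"        # f_s already carries part of f_i's information
    if cmi > mi:
        return "interdependent"   # f_s unlocks extra information in f_i
    return "independent"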

3.3. Feature Evaluation via Shapley Value

In the second phase of the proposed method, after filtering the features, the weight of each feature is obtained using the Shapley value [28, 29] over all features; in this work, QMI is used in all formulations. The Shapley value is an efficient way of estimating the importance (weight) of each feature. The Shapley value of the $i$th feature, denoted $\phi_i(v)$, is formulated as follows:

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{f_i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\,\big(v(S \cup \{f_i\}) - v(S)\big),$$

where $N$ is the set of all features, $n$ denotes the total number of features, and the marginal contribution $v(S \cup \{f_i\}) - v(S)$ is supplied by the interdependence index defined below.

The interdependence index $\sigma(S, f_i)$ provides this marginal contribution and is defined as follows:

$$v(S \cup \{f_i\}) - v(S) = \sigma(S, f_i) = \begin{cases} 1, & \text{if } I(C; S \mid f_i) > I(C; S) \text{ and } f_i \text{ is interdependent with at least half of the features in } S, \\ 0, & \text{otherwise}, \end{cases}$$

which means that the feature $f_i$ is appropriate for a coalition $S$ if it increases the relevance of the subset $S$ on the target class and is interdependent with at least half of its features [21]. Figure 2 shows the flowchart of feature evaluation. The output is a weight vector whose $i$th entry is the Shapley value $\phi_i(v)$ of feature $f_i$. In all the above relationships, the QMI criterion is used in place of plain mutual information for more robustness of the selected features.

Figure 2: Feature evaluation flowchart.

In step 2 of the proposed algorithm, for each feature $f_i$, we enumerate coalitions (subsets) of features that do not contain $f_i$; for tractability, only two-member subsets are considered. Then, the Conditional Mutual Information between each subset $S$ and the target class given $f_i$ is calculated, together with the Qualitative Mutual Information between each feature and the class, namely, $QMI(f_i; C)$; both quantities feed the interdependence index $\sigma(S, f_i)$. Finally, the Shapley value (weight) of each feature is computed according to the Shapley formula above.
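The following sketch mirrors this step under stated assumptions: coalitions are restricted to two members, the relevance gain is taken as I(C; S | f_i) - I(C; S), plain MI stands in where the paper uses the Fisher-weighted QMI, and the helpers from the earlier sketches are reused. It is an illustration of the computation, not the authors' implementation.

from itertools import combinations
from math import factorial

def shapley_weights(features, c):
    # features: list of discrete feature columns; c: class label column.
    n = len(features)
    weights = []
    for i in range(n):
        phi = 0.0
        others = [j for j in range(n) if j != i]
        for S in combinations(others, 2):          # two-member coalitions only
            joint_S = list(zip(*(features[j] for j in S)))
            # Does f_i increase the relevance of S on the class C?
            gain = (conditional_mutual_information(joint_S, c, features[i])
                    - mutual_information(joint_S, c))
            # Is f_i interdependent with at least half of S's members?
            interdep = sum(pair_relation(features[i], features[j], c)
                           == "interdependent" for j in S)
            sigma = 1.0 if gain > 0 and interdep >= len(S) / 2 else 0.0
            # Shapley coefficient |S|! (n - |S| - 1)! / n! times the payoff.
            phi += (factorial(len(S)) * factorial(n - len(S) - 1)
                    / factorial(n)) * sigma
        weights.append(phi)
    return weights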

3.4. Final Feature Selection Phase

In the third and final phase of the proposed method, feature selection is performed using the weight of each feature: the weights are used to reevaluate the features, and the features with the highest weights are retained. A straightforward optimization can employ any information criterion, such as BIF [26], SU [30], or mRMR [9]; in this paper, the SU (symmetric uncertainty) criterion is used. It is computed as follows:

$$SU(f_i, C) = \frac{2 \cdot MI(f_i; C)}{H(f_i) + H(C)},$$

where $H$ indicates the entropy and $MI(f_i; C)$ is the mutual information between feature $f_i$ and the class. The flowchart of this phase is shown in Figure 3.

Figure 3: The flowchart of the final feature selection.

To select the optimal feature in each iteration, a victory criterion is defined to evaluate the superiority of each feature over the others. The criterion function (i.e., SU) scores each feature through relevance and redundancy analysis, and the weight $\phi_i$, which denotes the impact of feature $f_i$ on the whole feature space, regulates the relative importance of this evaluation value. Finally, the features with the largest victory measure are chosen, as sketched below.
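A sketch of this final phase follows. The symmetric uncertainty function implements the SU formula above; combining it with the Shapley weight multiplicatively (victory_i = w_i * SU(f_i, C)) is an assumption of this sketch, since the paper states only that the weight regulates the evaluation value.

def symmetric_uncertainty(f, c):
    # SU(f_i, C) = 2 * I(f_i; C) / (H(f_i) + H(C)).
    return 2.0 * mutual_information(f, c) / (entropy(f) + entropy(c))

def forward_select(features, c, weights, n_select=50):
    # Greedy forward selection by the (assumed) victory score w_i * SU(f_i, C).
    # A fuller implementation would also penalize redundancy against
    # the features already selected.
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < n_select:
        best = max(remaining,
                   key=lambda i: weights[i] * symmetric_uncertainty(features[i], c))
        selected.append(best)
        remaining.remove(best)
    return selected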

4. Experimental Results and Discussion

4.1. Data Set Descriptions

In this paper, we used the cancer microarray data sets from Plymouth University [31]. To evaluate the proposed approach thoroughly, we applied it to 11 high dimensional microarray data sets, which are briefly summarized in Table 1.

Table 1: Characteristics of the evaluation data sets.
4.2. Evaluation Criteria

The criteria used to evaluate the proposed method are classification accuracy, precision, the F-measure, and a stability criterion for the feature selection algorithm.

4.2.1. Accuracy of Classification

Classification accuracy is the main criterion for evaluating the classification and prediction of the samples. The classification accuracy is as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$$

where accuracy is described in terms of true positives (TP), true negatives (TN), false negatives (FN), and false positives (FP).

4.2.2. Precision, Recall, and F-Measure

Precision and recall are two other evaluation criteria, and the F-measure integrates precision and recall into a single criterion for comparison:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad F = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}.$$

4.2.3. Robustness Measure

The robustness (stability) of a feature selection algorithm can be evaluated through its ability to select the same features across different samplings of data from the same distribution [2]. Let $S_i$ and $S$ denote the feature subsets selected from the $i$th resampled version of the data and from the full data, respectively. The similarity between two feature subsets is measured by the following similarity index:

$$Sim(S_i, S) = \frac{c + r}{k},$$

where $c$ is the number of common features between $S_i$ and $S$, $r$ is the sum of the absolute correlation values between the differing features of $S_i$ and $S$, and $k$ is the size of $S_i$ and $S$. Assume $m$ versions of the data are produced by resampling and $m$ feature subsets are selected. The robustness (stability) measure of the feature selection algorithm is then the average similarity:

$$\text{Stability} = \frac{1}{m} \sum_{i=1}^{m} Sim(S_i, S).$$
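A sketch of this stability computation follows, assuming the similarity index takes the form Sim(S_i, S) = (c + r) / k implied by the description above, with r summed over the non-shared features of the two subsets.

import numpy as np

def similarity(S_i, S, X):
    # S_i, S: feature-index subsets of equal size k; X: (samples, features) data.
    k = len(S)
    common = set(S_i) & set(S)
    diff_i, diff = set(S_i) - common, set(S) - common
    r = sum(abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
            for a in diff_i for b in diff)
    return (len(common) + r) / k

def stability(resampled_subsets, S_full, X):
    # Average similarity of the m resampled subsets to the full-data subset.
    return float(np.mean([similarity(S_i, S_full, X)
                          for S_i in resampled_subsets]))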

4.3. Classification Results

First, in the preprocessing phase, the features of the data sets are normalized, except for the SRBCT data, whose feature values are already small. The microarray data sets are classified using KNN, SVM, Naive Bayes (NB), and Classification and Regression Tree (CART) classifiers. For the SVM classifier, a Gaussian kernel with standard deviation equal to 20 is used. A fixed number of neighbors is used for the KNN classifier, except for the Lung_Cancer data, where a different value is used because the default number of neighbors performs poorly on that set. Among the 11 data sets, only the Prostate_Tumor and DLBCL data sets have two classes; therefore, the precision and F-measure criteria are reported only for these two data sets, and for the others only accuracy is examined. The 300 top features of the 11_Tumors, 14_Tumors, Leukemia1, Leukemia2, and DLBCL data sets are selected using the mutual information criterion; for the other data sets, the Fisher ratio is applied for this prior feature selection.

For comparison, classification accuracy is plotted against the number of selected features: the x-axis is the number of features (up to 50) and the y-axis is the classification accuracy. The proposed method is compared with feature selection using Fisher's ratio, mRMR, and CGFS-SU. The mRMR algorithm serves as the information-criterion baseline, Fisher's ratio is a univariate filter method that evaluates each feature individually, and CGFS-SU is the cooperative game based feature selection method proposed in [21]. Tenfold cross-validation is used to estimate the performance of the classification algorithms. Figures 4-18 show the classification results for the eleven data sets using the KNN classifier.
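A sketch of this evaluation protocol using scikit-learn follows (the tooling is an assumption; the paper does not name it). Here, ranked is assumed to hold the feature indices ordered by the method under test, and the neighbor count k stands in for the value used in the paper, which is not stated in this text.

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def accuracy_curve(X, y, ranked, max_features=50, k=5):
    # Tenfold cross-validated accuracy of the top-m ranked features, m = 1..50.
    scores = []
    for m in range(1, max_features + 1):
        clf = KNeighborsClassifier(n_neighbors=k)
        scores.append(cross_val_score(clf, X[:, ranked[:m]], y, cv=10).mean())
    return scores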

Figure 4: KNN classification accuracy for 11_Tumors.
Figure 5: KNN classification accuracy for 14_Tumors.
Figure 6: KNN classification accuracy for 9_Tumors.
Figure 7: KNN classification accuracy for Brain_Tumor1.
Figure 8: KNN classification accuracy for Leukemia1.
Figure 9: KNN classification accuracy for Brain_Tumor2.
Figure 10: KNN classification accuracy for Leukemia2.
Figure 11: KNN classification accuracy for Lung_Cancer.
Figure 12: KNN classification accuracy for SRBCT.
Figure 13: KNN classification accuracy for Prostate_Tumor.
Figure 14: KNN classification precision for Prostate_Tumor.
Figure 15: KNN classification F-measure for Prostate_Tumor.
Figure 16: KNN classification accuracy for DLBCL.
Figure 17: KNN classification precision for DLBCL.
Figure 18: KNN classification F-measure for DLBCL.

The KNN classifier results show that the proposed method improves on the other feature selection methods in most cases. For the 11_Tumors data, the proposed method achieves a maximum accuracy of 74.73 percent. The CGFS-SU method comes close at 74.08 percent, but over the first 30 features its accuracy is lower than that of the proposed method. On the 14_Tumors data, the results are low for all feature selection algorithms; the proposed method still attains the maximum accuracy, 48.34 percent, while the other methods score much lower, which is attributable to the weak correlation between the selected features. Moreover, the proposed method achieves higher accuracy than the other methods across all 50 feature counts.

The accuracy is higher on the SRBCT data, where the proposed method and the Fisher and mRMR algorithms all reach the maximum accuracy of 98.75 percent, while the CGFS-SU method reaches 97.56 percent. On the Prostate_Tumor data, the proposed method achieves the maximum value among all feature selection methods for all three evaluation criteria in all cases. For the DLBCL data, the proposed method achieves the maximum value in most cases for each of the three criteria, reaching the highest accuracy of 93 percent, precision of 91 percent, and F-measure of 88 percent. This indicates that the proposed method reduces the interdependency between groups of features and is effective compared with the other algorithms. Figures 19-33 show the classification results using the SVM classifier on the evaluation data sets.

Figure 19: SVM classification accuracy for 11_Tumors.
Figure 20: SVM classification accuracy for 14_Tumors.
Figure 21: SVM classification accuracy for 9_Tumors.
Figure 22: SVM classification accuracy for Brain_Tumor1.
Figure 23: SVM classification accuracy for Brain_Tumor2.
Figure 24: SVM classification accuracy for Leukemia1.
Figure 25: SVM classification accuracy for Leukemia2.
Figure 26: SVM classification accuracy for Lung_Cancer.
Figure 27: SVM classification accuracy for SRBCT.
Figure 28: SVM classification accuracy for Prostate_Tumor.
Figure 29: SVM classification precision for Prostate_Tumor.
Figure 30: SVM classification F-measure for Prostate_Tumor.
Figure 31: SVM classification accuracy for DLBCL.
Figure 32: SVM classification precision for DLBCL.
Figure 33: SVM classification F-measure for DLBCL.

For most data sets, the results with the SVM classifier are better than with the KNN classifier. On the SRBCT data, both the proposed method and mRMR reach the highest accuracy of 100 percent for 10 to 40 features, whereas the CGFS-SU method peaks at 97.63 percent and the Fisher criterion at 97.5 percent. This may be because the relationship between the features and the target classes is strongest for this data set, and the mRMR method and the proposed approach are able to retain this relationship among the selected features. Also, on the DLBCL data, the proposed method is superior to the other approaches for all three criteria.

4.4. Stability Results

In the stability diagrams, the x-axis shows the number of features and the y-axis shows the stability index, which, as noted above, ranges between 0 and 1. For this criterion, features are normalized except for the SRBCT data, and the Fisher ratio is applied to preselect the top 300 features of all data sets. Tenfold cross-validation is used to estimate the stability. The stability results are shown in Figures 34-44.

Figure 34: Stability results for 11_Tumors.
Figure 35: Stability results for 14_Tumors.
Figure 36: Stability results for 9_Tumors.
Figure 37: Stability results for Brain_Tumor1.
Figure 38: Stability results for Brain_Tumor2.
Figure 39: Stability results for Leukemia1.
Figure 40: Stability results for Leukemia2.
Figure 41: Stability results for Lung_Cancer.
Figure 42: Stability results for SRBCT.
Figure 43: Stability results for Prostate_Tumor.
Figure 44: Stability results for DLBCL.

The stability values of the CGFS-SU method and the proposed method are very close, but the proposed method reaches the maximum stability on the 11_Tumors, 14_Tumors, Brain_Tumor2, and Lung_Cancer data sets, whereas the CGFS-SU method achieves the maximum stability only on the Brain_Tumor2 data. This indicates that the proposed method is more robust than the other feature selection approaches. The stability of the remaining feature selection methods differs markedly from that of CGFS-SU and the proposed method; only on the Brain_Tumor2 data is the stability of the mRMR method approximately equal to these two methods. Table 2 shows the maximum and average accuracy of the feature selection methods on the evaluation microarray data using the KNN, SVM, NB, and CART classifiers.

Table 2: Maximum values of accuracy and average accuracy of feature selection methods on eleven microarray data sets.

According to Table 2, the proposed method achieves the highest accuracy among all the algorithms in most cases. The average classification accuracy results show that the proposed method improves average accuracy by 1 to 5 percentage points over the CGFS-SU method, by 2 to 14 percentage points over the Fisher ratio method, and by 1 to 14 percentage points over the minimum redundancy maximum relevance approach. Table 3 reports the average stability ± standard deviation of the four feature selection methods.

Table 3: Average stability ± standard deviation of the evaluated feature selection approaches.

In these experiments, an approach is considered more robust when its average stability is high and its standard deviation is low. Among the eleven microarray data sets, the stability of the proposed method exceeds that of the other feature selection methods on eight data sets and falls below that of the CGFS-SU method on only three. Furthermore, these experiments show that the proposed method improves average stability by 0.001 to 0.01 over the CGFS-SU method, by 0.1 to 0.5 over the Fisher ratio method, and by 0.01 to 0.3 over the minimum redundancy maximum relevance method.

5. Conclusion and Future Work

This paper proposed a feature selection approach for robust gene selection from microarray data. In the proposed method, a cooperative game theoretic procedure evaluates the weight of each feature while accounting for the complex intrinsic relationships among features, and the Qualitative Mutual Information (QMI) measure is used for more robust feature selection; the key idea for stability is to use the Fisher ratio as the utility function in calculating QMI. The results on eleven microarray data sets show that the proposed method is effective and stable in reducing the dimensionality of the data and achieves relative improvement over the other feature selection methods. As future work, we propose calculating the weight of each feature using a fuzzy Shapley value or a fuzzy Banzhaf power index.

Competing Interests

The authors declare that they have no competing interests.

References

  1. M. Yassi and M. H. Moattar, “Robust and stable feature selection by integrating ranking methods and wrapper technique in genetic data classification,” Biochemical and Biophysical Research Communications, vol. 446, no. 4, pp. 850–856, 2014.
  2. F. Yang and K. Z. Mao, “Robust feature selection for microarray data based on multicriterion fusion,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 4, pp. 1080–1092, 2011.
  3. X. Sun, Y. Liu, J. Li, J. Zhu, H. Chen, and X. Liu, “Feature evaluation and selection with cooperative game theory,” Pattern Recognition, vol. 45, no. 8, pp. 2992–3002, 2012.
  4. V. Bolón-Canedo, N. Sánchez-Maroño, A. Alonso-Betanzos, J. M. Benítez, and F. Herrera, “A review of microarray datasets and applied feature selection methods,” Information Sciences, vol. 282, pp. 111–135, 2014.
  5. K. Kira and L. Rendell, “The feature selection problem: traditional methods and a new algorithm,” in Proceedings of the 10th National Conference on Artificial Intelligence (AAAI '92), pp. 129–134, AAAI Press, 1992.
  6. I. Kononenko, “Estimating attributes: analysis and extensions of relief,” in Machine Learning: ECML-94, F. Bergadano and L. De Raedt, Eds., vol. 784 of Lecture Notes in Computer Science, pp. 171–182, Springer, 1994.
  7. K. Thangavel and A. Pethalakshmi, “Dimensionality reduction based on rough set theory: a review,” Applied Soft Computing Journal, vol. 9, no. 1, pp. 1–12, 2009.
  8. G. Brown, “A new perspective for information theoretic feature selection,” in Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (AISTATS '09), pp. 49–56, Clearwater Beach, Fla, USA, 2009.
  9. H. Peng, F. Long, and C. Ding, “Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226–1238, 2005.
  10. J. Wang, L. Wu, J. Kong, Y. Li, and B. Zhang, “Maximum weight and minimum redundancy: a novel framework for feature subset selection,” Pattern Recognition, vol. 46, no. 6, pp. 1616–1627, 2013.
  11. T. Nguyen, A. Khosravi, D. Creighton, and S. Nahavandi, “A novel aggregate gene selection method for microarray data classification,” Pattern Recognition Letters, vol. 60-61, pp. 16–23, 2015.
  12. I. Inza, B. Sierra, R. Blanco, and P. Larrañaga, “Gene selection by sequential search wrapper approaches in microarray cancer class prediction,” Journal of Intelligent & Fuzzy Systems, vol. 12, no. 1, pp. 25–33, 2002.
  13. S. Tabakhi, A. Najafi, R. Ranjbar, and P. Moradi, “Gene selection for microarray data classification using a novel ant colony optimization,” Neurocomputing, vol. 168, pp. 1024–1036, 2015.
  14. H. M. Alshamlan, G. H. Badr, and Y. A. Alohali, “Genetic bee colony (GBC) algorithm: a new gene selection method for microarray cancer classification,” Computational Biology and Chemistry, vol. 56, pp. 49–60, 2015.
  15. R. Ruiz, J. C. Riquelme, and J. S. Aguilar-Ruiz, “Incremental wrapper-based gene selection from microarray data for cancer classification,” Pattern Recognition, vol. 39, no. 12, pp. 2383–2392, 2006.
  16. A. Sharma, S. Imoto, and S. Miyano, “A top-r feature selection algorithm for microarray gene expression data,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 3, pp. 754–764, 2012.
  17. M. Wanderley, V. Gardeux, R. Natowicz, and A. Braga, “GA-KDE-BAYES: an evolutionary wrapper method based on non-parametric density estimation applied to bioinformatics problems,” in Proceedings of the 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN '13), pp. 155–160, 2013.
  18. J. Apolloni, G. Leguizamón, and E. Alba, “Two hybrid wrapper-filter feature selection algorithms applied to high-dimensional microarray experiments,” Applied Soft Computing, vol. 38, pp. 922–932, 2016.
  19. I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, “Gene selection for cancer classification using support vector machines,” Machine Learning, vol. 46, no. 1-3, pp. 389–422, 2002.
  20. J. Canul-Reich, L. O. Hall, D. Goldgof, J. N. Korecki, and S. Eschrich, “Iterative feature perturbation as a gene selector for microarray data,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 26, no. 5, Article ID 1260003, 25 pages, 2012.
  21. X. Sun, Y. Liu, J. Li, J. Zhu, X. Liu, and H. Chen, “Using cooperative game theory to optimize the feature selection problem,” Neurocomputing, vol. 97, pp. 86–93, 2012.
  22. K. Zeng, K. She, and X. Niu, “Feature selection with neighborhood entropy-based cooperative game theory,” Computational Intelligence and Neuroscience, vol. 2014, Article ID 479289, 10 pages, 2014.
  23. Y. Lv, S. Liao, H. Shi, Y. Qian, and S. Ji, “QMIQPN: an enhanced QPN based on qualitative mutual information for reducing ambiguity,” Knowledge-Based Systems, vol. 71, pp. 114–125, 2014.
  24. H. Luan, F. Qi, and D. Shen, “Multi-modal image registration by quantitative-qualitative measure of mutual information (Q-MI),” in Computer Vision for Biomedical Image Applications, vol. 3765 of Lecture Notes in Computer Science, pp. 378–387, Springer, Berlin, Germany, 2005.
  25. R. Kohavi and G. H. John, “Wrappers for feature subset selection,” Artificial Intelligence, vol. 97, no. 1-2, pp. 273–324, 1997.
  26. L. Yu and H. Liu, “Efficient feature selection via analysis of relevance and redundancy,” Journal of Machine Learning Research, vol. 5, pp. 1205–1224, 2004.
  27. A. K. Jain, R. P. W. Duin, and J. C. Mao, “Statistical pattern recognition: a review,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4–37, 2000.
  28. S. Cohen, G. Dror, and E. Ruppin, “Feature selection via coalitional game theory,” Neural Computation, vol. 19, no. 7, pp. 1939–1961, 2007.
  29. L. S. Shapley, “A value for n-person games,” in Contributions to the Theory of Games, vol. 2, no. 28 of Annals of Mathematics Studies, pp. 307–317, Princeton University Press, Princeton, NJ, USA, 1953.
  30. P. E. Meyer, C. Schretter, and G. Bontempi, “Information-theoretic feature selection in microarray data using variable complementarity,” IEEE Journal on Selected Topics in Signal Processing, vol. 2, no. 3, pp. 261–274, 2008.
  31. Microarray Cancers, Plymouth University, January 2014, http://www.tech.plym.ac.uk/spmc/links/bioinformatics/microarray/microarray_cancers.html.