Special Issue: Artificial Evolution Methods in the Biological and Biomedical Sciences
Research Article | Open Access
Nicoletta Dessì, Barbara Pes, "An Evolutionary Method for Combining Different Feature Selection Criteria in Microarray Data Classification", Journal of Artificial Evolution and Applications, vol. 2009, Article ID 803973, 10 pages, 2009. https://doi.org/10.1155/2009/803973
An Evolutionary Method for Combining Different Feature Selection Criteria in Microarray Data Classification
The classification of cancers from gene expression profiles is a challenging research area in bioinformatics, because the high dimensionality of microarray data introduces irrelevant and redundant information that degrades classification performance. This paper proposes an evolutionary algorithm that selects relevant gene subsets for later use in the classification task. The method combines the most valuable results of different feature ranking methods into feature pools, whose dimensionality is then reduced by a wrapper approach involving a genetic algorithm (GA) and a support vector machine (SVM) classifier. Specifically, the GA explores the space defined by each feature pool, looking for solutions that balance the size of the feature subsets against their classification accuracy. Experiments show that the proposed method provides good results in comparison with several state-of-the-art methods for the classification of microarray data.
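The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the union-of-top-k pooling rule, the fitness weight of 0.75, and the GA settings are all assumptions chosen for illustration, and `toy_accuracy` is a stand-in for the cross-validated accuracy of an SVM trained on the selected genes.

```python
import random

def build_pool(rankings, top_k):
    """Merge the top_k features of each ranking (best first) into one pool."""
    pool = []
    for ranking in rankings:
        for feature in ranking[:top_k]:
            if feature not in pool:
                pool.append(feature)
    return pool

def fitness(mask, pool, accuracy_fn, weight=0.75):
    """Trade accuracy off against subset size; weight is an assumed value."""
    subset = [f for f, bit in zip(pool, mask) if bit]
    if not subset:
        return 0.0  # an empty subset cannot classify anything
    size_term = 1.0 - len(subset) / len(pool)
    return weight * accuracy_fn(subset) + (1.0 - weight) * size_term

def run_ga(pool, accuracy_fn, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Simple elitist GA over bit masks; expects a pool of at least 2 features."""
    rng = random.Random(seed)
    n = len(pool)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def score(mask):
        return fitness(mask, pool, accuracy_fn)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=score)
    return [f for f, bit in zip(pool, best) if bit]

def toy_accuracy(subset):
    """Hypothetical stand-in for SVM accuracy: features 0 and 3 are informative."""
    return 0.5 + 0.25 * len({0, 3} & set(subset))

# Two hypothetical rankings of six genes, e.g. from two different filter criteria.
rankings = [[0, 1, 2, 3, 4, 5], [3, 4, 0, 5, 1, 2]]
pool = build_pool(rankings, top_k=2)
selected = run_ga(pool, toy_accuracy)
print("pool:", pool)
print("selected subset:", selected)
```

In a real setting, `accuracy_fn` would train and validate a classifier on the expression values of the chosen genes, so each fitness evaluation is far more expensive than in this toy example. The size term rewards smaller subsets, so two subsets with equal accuracy are ranked by how few genes they use.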
Copyright © 2009 Nicoletta Dessì and Barbara Pes. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.