Particle Swarm for Attribute Selection in Bayesian Classification: An Application to Protein Function Prediction

Correa, Elon S.; Freitas, Alex A.; Johnson, Colin G.

doi:https://doi.org/10.1155/2008/876746

Journal of Artificial Evolution and Applications

Special Issue

Particle Swarms: The Second Decade

Research Article | Open Access

Volume 2008 | Article ID 876746 | https://doi.org/10.1155/2008/876746

Elon S. Correa,¹ Alex A. Freitas,¹ and Colin G. Johnson¹

Academic Editor: Jim Kennedy

Received 29 Jul 2007

Revised 26 Nov 2007

Accepted 10 Jan 2008

Published 18 Mar 2008

Abstract

The discrete particle swarm optimization (DPSO) algorithm is an optimization technique which belongs to the fertile paradigm of Swarm Intelligence. Designed for the task of attribute selection, the DPSO deals with discrete variables in a straightforward manner. This work empowers the DPSO algorithm by extending it in two ways. First, it enables the DPSO to select attributes for a Bayesian network algorithm, which is more sophisticated than the Naive Bayes classifier previously used by the original DPSO algorithm. Second, it applies the DPSO to a set of challenging protein functional classification data, involving a large number of classes to be predicted. The work then compares the performance of the DPSO algorithm against the performance of a standard Binary PSO algorithm on the task of selecting attributes on those data sets. The criteria used for this comparison are (1) maximizing predictive accuracy and (2) finding the smallest subset of attributes.
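As a rough illustration of the baseline mentioned above, the following is a minimal sketch of a standard binary PSO applied to attribute selection. It is not the authors' implementation and does not reproduce the DPSO. The function names, the parameter values (`w`, `c1`, `c2`, `vmax`), and `toy_fitness` (a hypothetical stand-in for the accuracy of a trained classifier) are all assumptions made for the example. The abstract names the two comparison criteria but not how they are combined; the `better` function assumes accuracy is compared first and subset size breaks ties.

```python
import math
import random


def sigmoid(v):
    """Squash a velocity into a probability of setting the bit to 1."""
    return 1.0 / (1.0 + math.exp(-v))


def better(a, b):
    """Compare (accuracy, n_selected) pairs: higher accuracy wins,
    ties go to the smaller attribute subset (assumed ordering)."""
    return (a[0], -a[1]) > (b[0], -b[1])


def toy_fitness(bits):
    """Hypothetical stand-in for classifier accuracy: pretends that only
    attributes 0 and 2 carry information about the class."""
    relevant = (0, 2)
    hits = sum(1 for i in relevant if bits[i])
    return (hits / len(relevant), sum(bits))


def binary_pso(n_attrs, fitness, n_particles=10, iters=30,
               w=0.8, c1=2.0, c2=2.0, vmax=4.0, seed=0):
    rng = random.Random(seed)
    # Each particle is a bit string: bit d == 1 means attribute d is selected.
    xs = [[rng.randint(0, 1) for _ in range(n_attrs)] for _ in range(n_particles)]
    vs = [[0.0] * n_attrs for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_fit = [fitness(x) for x in xs]
    g = 0
    for i in range(1, n_particles):
        if better(pbest_fit[i], pbest_fit[g]):
            g = i
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_attrs):
                # Velocity update pulled toward personal and global bests.
                v = (w * vs[i][d]
                     + c1 * rng.random() * (pbest[i][d] - xs[i][d])
                     + c2 * rng.random() * (gbest[d] - xs[i][d]))
                vs[i][d] = max(-vmax, min(vmax, v))  # clamp velocity
                # Position update: sample the bit from sigmoid(velocity).
                xs[i][d] = 1 if rng.random() < sigmoid(vs[i][d]) else 0
            f = fitness(xs[i])
            if better(f, pbest_fit[i]):
                pbest[i], pbest_fit[i] = xs[i][:], f
                if better(f, gbest_fit):
                    gbest, gbest_fit = xs[i][:], f
    return gbest, gbest_fit


if __name__ == "__main__":
    subset, (accuracy, size) = binary_pso(6, toy_fitness)
    print("selected attributes:", [d for d, bit in enumerate(subset) if bit])
    print("toy accuracy:", accuracy, "subset size:", size)
```

In a real experiment, `toy_fitness` would be replaced by a function that trains the chosen classifier (Naive Bayes or a Bayesian network) on the selected attributes and returns its estimated predictive accuracy together with the subset size.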

Copyright

Copyright © 2008 Elon S. Correa et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
