BioMed Research International
Volume 2014 (2014), Article ID 103054, 12 pages
Research Article

Protein Sequence Classification with Improved Extreme Learning Machine Algorithms

1Institute of Information and Control, Hangzhou Dianzi University, Zhejiang 310018, China
2School of Mathematics and Computer Science, Yunnan University of Nationalities, Kunming 650500, China
3School of Mathematics and Statistics, Yunnan University, Kunming 650091, China

Received 17 December 2013; Revised 15 February 2014; Accepted 16 February 2014; Published 30 March 2014

Academic Editor: Tao Huang

Copyright © 2014 Jiuwen Cao and Lianglin Xiong. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Precisely classifying a protein sequence against a large biological protein sequence database plays an important role in developing competitive pharmacological products. Conventional methods compare the unseen sequence with all identified protein sequences and return the category index of the protein with the highest similarity score, which is usually time-consuming. An efficient protein sequence classification system is therefore needed. In this paper, we study the performance of protein sequence classification using single hidden layer feedforward neural networks (SLFNs). The recent and efficient extreme learning machine (ELM) and its variants are used as the training algorithms. The optimal pruned ELM (OP-ELM) is first employed for protein sequence classification in this paper. To further enhance the performance, an ensemble-based SLFN structure is constructed in which multiple SLFNs with the same number of hidden nodes and the same activation function serve as ensemble members. Every member is trained with the same algorithm, and the final category index is derived by majority voting. Two training approaches, namely, the basic ELM and the OP-ELM, are adopted for the ensemble-based SLFNs. The performance is analyzed and compared with that of several existing methods using datasets obtained from the Protein Information Resource center. The experimental results show the superiority of the proposed algorithms.
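The ensemble scheme summarized above can be illustrated with a minimal sketch. It assumes that each protein sequence has already been converted into a fixed-length numeric feature vector (the abstract does not describe the feature extraction), uses a sigmoid activation, and covers only the basic ELM, not the OP-ELM pruning step. The function names, the NumPy implementation, and all parameter choices are hypothetical illustrations rather than the authors' code.

```python
import numpy as np

def _hidden(X, W, b):
    # Sigmoid hidden-layer output of an SLFN (activation choice is an assumption).
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, y, n_hidden, n_classes, rng):
    """Basic ELM: random hidden layer, least-squares output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never tuned
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = _hidden(X, W, b)
    T = np.eye(n_classes)[y]                         # one-hot class targets
    beta = np.linalg.pinv(H) @ T                     # Moore-Penrose pseudoinverse solution
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.argmax(_hidden(X, W, b) @ beta, axis=1)

def majority_vote(votes, n_classes):
    """votes has shape (n_models, n_samples); ties go to the smallest label."""
    counts = np.stack([(votes == c).sum(axis=0) for c in range(n_classes)])
    return np.argmax(counts, axis=0)

def train_ensemble(X, y, n_models, n_hidden, n_classes, seed=0):
    # Every member shares the hidden-node count, activation, and training
    # algorithm; members differ only in their random hidden-layer weights.
    rng = np.random.default_rng(seed)
    return [train_elm(X, y, n_hidden, n_classes, rng) for _ in range(n_models)]

def predict_ensemble(models, X, n_classes):
    votes = np.stack([predict_elm(m, X) for m in models])
    return majority_vote(votes, n_classes)
```

Because the hidden-layer weights are drawn at random and only the output weights are solved for, training each member reduces to a single pseudoinverse computation, and the vote across members helps average out the effect of any one unlucky random draw.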