International Journal of Genomics
Volume 2016 (2016), Article ID 7604641, 11 pages
Research Article

Accurate Identification of Cancerlectins through Hybrid Machine Learning Technology

1School of Software, Tianjin University, Tianjin, China
2School of Computer Science and Technology, Tianjin University, Tianjin, China
3School of Information Science and Technology, Xiamen University, Xiamen, China
4College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China
5School of Computer Science and Technology, Heilongjiang University, Harbin, China
6State Key Laboratory of Medicinal Chemical Biology, NanKai University, Tianjin, China

Received 18 April 2016; Revised 24 May 2016; Accepted 14 June 2016

Academic Editor: Qin Ma

Copyright © 2016 Jieru Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Cancerlectins are cancer-related proteins that function as lectins. Computational techniques have been used to identify them, but these techniques sometimes fail because of the sequence diversity among cancerlectins. Machine learning methods, such as support vector machines combined with basic sequence features (n-grams), have also been applied to this task. In this study, various protein fingerprint features and advanced classifiers, including ensemble learning techniques, were used to identify this group of proteins. Compared with the original feature extraction methods and classification algorithms, prediction accuracy improved by more than 10% on average. Our work provides a basis for the computational identification of cancerlectins and demonstrates the power of hybrid machine learning techniques in computational proteomics.
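
As a rough illustration of the basic n-gram sequence features mentioned above, the following Python sketch computes normalized n-gram frequencies for a protein sequence. It is not the authors' feature extraction code: the function name `ngram_features`, the fixed 20-letter amino acid alphabet, and the example sequence are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids form the feature vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ngram_features(sequence, n=2):
    """Return normalized n-gram frequencies as a fixed-length vector.

    The vector has 20**n entries, one per possible n-gram, in
    lexicographic order of AMINO_ACIDS. N-grams containing
    non-standard residues (e.g. 'X') count toward the total but
    have no entry of their own in the vector.
    """
    sequence = sequence.upper()
    # Slide a window of width n over the sequence.
    grams = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    # Sequences shorter than n yield an all-zero vector.
    return [counts[g] / total if total else 0.0 for g in vocab]


# Hypothetical example sequence, used only to show the call.
features = ngram_features("MKTAYIAKQR", n=2)
```

A vector of this kind could then be passed to a classifier such as a support vector machine; the fingerprint features and ensemble classifiers used in the study itself are more elaborate than this sketch.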