BioMed Research International
Volume 2016 (2016), Article ID 6802832, 10 pages
Research Article

ProFold: Protein Fold Classification with Additional Structural Features and a Novel Ensemble Classifier

1 College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
2 Shanghai University of Medicine & Health Sciences, Shanghai 201318, China

Received 1 June 2016; Revised 15 July 2016; Accepted 7 August 2016

Academic Editor: Dariusz Mrozek

Protein fold classification plays an important role in both protein functional analysis and drug design. The number of proteins in the PDB is very large, but only a small fraction of them has been categorized and stored in the SCOPe database. An efficient method for protein fold classification is therefore needed. In recent years, a variety of classification methods have been applied to this problem. In this study, we propose a novel classification method called proFold. We incorporate protein tertiary structure information at the feature extraction stage and employ a novel ensemble strategy at the classifier training stage. Evaluated against existing similar ensemble classifiers on the same widely used dataset (DD-dataset), proFold achieves 76.2% overall accuracy. On two other commonly used datasets, the EDD-dataset and the TG-dataset, it achieves accuracies of 93.2% and 94.3%, respectively, higher than those of existing methods. The proFold web server is publicly available.