Table of Contents Author Guidelines Submit a Manuscript
Mathematical Problems in Engineering
Volume 2013, Article ID 283129, 7 pages
Research Article

Identification of Protein Pupylation Sites Using Bi-Profile Bayes Feature Extraction and Ensemble Learning

1College of Computer Science and Information Technology, Northeast Normal University, 2555 Jingyue Street, Changchun 130117, China
2Key Laboratory of Intelligent Information Processing of Jilin Universities, Northeast Normal University, Changchun 130117, China

Received 13 July 2013; Accepted 1 August 2013

Academic Editor: William Guo

Copyright © 2013 Xiaowei Zhao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Pupylation, one of the most important posttranslational modifications of proteins, typically takes place when prokaryotic ubiquitin-like protein (Pup) is attached to specific lysine residues on a target protein. Identification of pupylation substrates and their corresponding sites will facilitate the understanding of the molecular mechanism of pupylation. Comparing with the labor-intensive and time-consuming experiment approaches, computational prediction of pupylation sites is much desirable for their convenience and fast speed. In this study, a new bioinformatics tool named EnsemblePup was developed that used an ensemble of support vector machine classifiers to predict pupylation sites. The highlight of EnsemblePup was to utilize the Bi-profile Bayes feature extraction as the encoding scheme. The performance of EnsemblePup was measured with a sensitivity of 79.49%, a specificity of 82.35%, an accuracy of 85.43%, and a Matthews correlation coefficient of 0.617 using the 5-fold cross validation on the training dataset. When compared with other existing methods on a benchmark dataset, the EnsemblePup provided better predictive performance, with a sensitivity of 80.00%, a specificity of 83.33%, an accuracy of 82.00%, and a Matthews correlation coefficient of 0.629. The experimental results suggested that EnsemblePup presented here might be useful to identify and annotate potential pupylation sites in proteins of interest. A web server for predicting pupylation sites was developed.