BioMed Research International
Volume 2016 (2016), Article ID 4525786, 5 pages
http://dx.doi.org/10.1155/2016/4525786
Research Article

Positive-Unlabeled Learning for Pupylation Sites Prediction

1 School of Electronic Engineering, Dongguan University of Technology, Dongguan 523808, China
2 School of Control Science and Engineering, Dalian University of Technology, Dalian 116024, China

Received 11 May 2016; Revised 26 June 2016; Accepted 5 July 2016

Academic Editor: Qin Ma

Copyright © 2016 Ming Jiang and Jun-Zhe Cao. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Pupylation, a crucial posttranslational modification in prokaryotes, plays a key role in regulating various protein functions. To understand the molecular mechanism of pupylation, it is important to identify pupylation substrates and sites accurately. Because traditional experimental methods are time-consuming and labor-intensive, several computational methods have been developed to identify pupylation sites. The existing computational methods use the experimentally annotated pupylation sites as the positive training set and the remaining nonannotated lysine residues as the negative training set, and build classifiers to predict new pupylation sites in unknown proteins. However, the remaining nonannotated lysine residues may contain pupylation sites that have not yet been experimentally validated. Unlike previous methods, this study uses the experimentally annotated pupylation sites as the positive training set and the remaining nonannotated lysine residues as the unlabeled training set. A novel method named PUL-PUP is proposed to predict pupylation sites using a positive-unlabeled learning technique. Our experimental results indicate that PUL-PUP significantly outperforms the other methods for the prediction of pupylation sites. As an application, PUL-PUP was also used to predict the most likely pupylation sites among the nonannotated lysine residues.
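The data setup described above can be illustrated with a minimal sketch: every lysine (K) in a protein sequence yields one candidate site, annotated sites go into the positive set, and all other lysines go into the unlabeled set rather than being treated as negatives. This is not the PUL-PUP implementation; the function names, the window half-width `flank`, the padding character `X`, and the toy sequence are all assumptions made for illustration only.

```python
from typing import List, Set, Tuple


def lysine_windows(sequence: str, flank: int = 10, pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every lysine (K) in the sequence.

    Each window is centred on the lysine and spans `flank` residues on each
    side; positions beyond the termini are filled with `pad` (an assumption).
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in `sequence` sits at i + flank in `padded`,
            # so the window starts at i and has length 2 * flank + 1.
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


def split_positive_unlabeled(
    sequence: str, annotated_sites: Set[int], flank: int = 10
) -> Tuple[List[str], List[str]]:
    """Split lysine windows into positive (annotated) and unlabeled sets.

    Nonannotated lysines are kept as *unlabeled*, not negative, because
    some of them may be pupylation sites that are not yet validated.
    """
    positive: List[str] = []
    unlabeled: List[str] = []
    for position, window in lysine_windows(sequence, flank):
        if position in annotated_sites:
            positive.append(window)
        else:
            unlabeled.append(window)
    return positive, unlabeled


# Hypothetical toy sequence with one annotated site at position 2.
toy_sequence = "MKTAYIAKQRK"
positive_set, unlabeled_set = split_positive_unlabeled(toy_sequence, {2}, flank=2)
```

A positive-unlabeled learner would then be trained on `positive_set` and `unlabeled_set`, instead of a conventional classifier trained on positive and negative sets.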