BioMed Research International
Volume 2013, Article ID 414069, 7 pages
Research Article

Method for Rapid Protein Identification in a Large Database

1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
2 State Key Laboratory of Computer Architecture, ICT, CAS, Beijing 100190, China
3 Graduate University of Chinese Academy of Sciences, Beijing 100049, China

Received 17 May 2013; Revised 10 July 2013; Accepted 14 July 2013

Academic Editor: Lei Chen

Copyright © 2013 Wenli Zhang and Xiaofang Zhao. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Protein identification is an integral part of proteomics research. Available tools for identifying proteins in tandem mass spectrometry experiments are not optimized for current demands on identification scale and speed. These demands stem from the exponential growth of protein databases, the accelerating generation of mass spectrometry data, and the need to handle nonspecific digestion and post-translational modifications when identifying complex samples. A rapid method is therefore required to address this complexity and computational load. This paper presents an open search method that avoids restricting enzyme and modification specificity when searching a large database. We designed and developed a distributed program so that the method can make effective use of available computing resources. With this optimization, nearly linear speedup and real-time support are achieved on a large database with nonspecific digestion, as shown in tests with two classical large protein databases on a 20-blade cluster. This work aids in the discovery of more significant biological results, such as modification sites, and enables the identification of more complex samples, such as metaproteomics samples.
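To give a feel for why nonspecific digestion enlarges the search space, the following minimal sketch counts candidate peptides for one toy sequence under a simplified tryptic rule and under no enzyme rule at all. It is an illustration only, not the method of this paper: the function names, the length window of 6 to 30 residues, the no-missed-cleavage simplification, and the sequence itself are all assumptions made for the example.

```python
# Toy sequence, used only for illustration (an assumption of this sketch).
TOY_PROTEIN = "MKWVTFISLLLLFSSAYSRGVFRRDTHKSEIAHRFKDLGE"


def tryptic_peptides(protein, min_len=6, max_len=30):
    """Fully tryptic peptides with no missed cleavages.

    Simplified rule: cleave after K or R unless the next residue is P.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        cleaves = residue in "KR" and not is_last and protein[i + 1] != "P"
        if cleaves or is_last:
            pep = protein[start:i + 1]
            if min_len <= len(pep) <= max_len:
                peptides.append(pep)
            start = i + 1
    return peptides


def nonspecific_peptides(protein, min_len=6, max_len=30):
    """Every substring in the length window: no enzyme rule is applied."""
    n = len(protein)
    return [
        protein[s:s + length]
        for s in range(n)
        for length in range(min_len, max_len + 1)
        if s + length <= n
    ]


if __name__ == "__main__":
    print("tryptic candidates:    ", len(tryptic_peptides(TOY_PROTEIN)))
    print("nonspecific candidates:", len(nonspecific_peptides(TOY_PROTEIN)))
```

For this 40-residue toy sequence the tryptic rule yields 2 candidates, while the nonspecific enumeration yields 575. The gap widens further once variable modifications are considered for each candidate, which is the kind of growth that motivates distributing the search.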