International Journal of Proteomics
Volume 2014 (2014), Article ID 845479, 22 pages
Review Article

A Survey of Computational Intelligence Techniques in Protein Function Prediction

Department of Computer Science & Engineering, Indian Institute of Technology (BHU), Varanasi 221005, India

Received 10 September 2014; Revised 31 October 2014; Accepted 7 November 2014; Published 11 December 2014

Academic Editor: Yaoqi Zhou

Copyright © 2014 Arvind Kumar Tiwari and Rajeev Srivastava. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


The advancement of high-throughput microarray technologies has brought a massive growth in data on proteins of unknown function. Protein function prediction is one of the most challenging problems in bioinformatics. In the past, homology-based approaches were used to predict protein function, but they fail when a new protein differs from previously characterized ones. To alleviate the problems associated with these traditional homology-based approaches, numerous computational intelligence techniques have been proposed in recent years. This paper presents a comprehensive, state-of-the-art review of computational intelligence techniques for protein function prediction using sequence, structure, protein-protein interaction network, and gene expression data. These techniques are applied in a wide range of areas, such as prediction of DNA and RNA binding sites, subcellular localization, enzyme functions, signal peptides, catalytic residues, nuclear/G-protein coupled receptors, membrane proteins, and pathway analysis from gene expression datasets. This paper also summarizes the results obtained by many researchers who addressed these problems using computational intelligence techniques with appropriate datasets to improve prediction performance. The summary shows that ensemble classifiers and the integration of multiple heterogeneous data sources are useful for protein function prediction.