Review Article

A Survey of Computational Intelligence Techniques in Protein Function Prediction

Table 1

Summary of computational intelligence (CI) techniques used in the prediction of binding sites.

| Reference | CI technique | Binding sites/residues | Performance | Datasets (input features) |
|---|---|---|---|---|
| [23] | ANN | DNA | Accuracy: 64%, sensitivity: 69% | Amino acid sequence composition, solvent accessibility, and secondary structure |
| [24] | ANN | DNA | Accuracy: 73.6% | Position-specific scoring matrices (PSSM) |
| [25] | SVM | DNA | Accuracy: 90% | Surface and overall composition, overall charge, and positive potential patches on the protein surface |
| [26] | SVM | DNA | Accuracy: 82.30% | Amino acid sequence, PSSM, and low-resolution structural information |
| [18] | SVM | DNA | Accuracy: 77.2%, sensitivity: 76.4%, specificity: 76.6% | PSSM |
| [27] | SVM | DNA | Accuracy: 96.6%, sensitivity: 90.7% | Amino acid sequence, pseudo amino acid composition, autocross-covariance transforms, and dipeptide composition |
| [28] | SVM | DNA | Accuracy: 80%, sensitivity: 85.1%, specificity: 85.3% | Normalized PSSM score, normalized solvent accessible surface area, and protein backbone structure |
| [29] | SVM | DNA | MCC: 0.67, accuracy: 89.6%, sensitivity: 88.4%, specificity: 90.8% | PSSM, amino acid composition, hydrophobicity, polarity, polarizability, secondary structure, solvent accessibility, normalized van der Waals volume, binding propensity, and nonbinding propensity |
| [30] | Ensemble of ANN and SVM | DNA | Accuracy: 89.00% | PSSM and structural features such as secondary structure, solvent accessibility, and globularity |
| [31] | Random forest | DNA | Accuracy: 78.20%, sensitivity: 78.06%, specificity: 78.22% | PSSM with mean and standard deviation, side chain pKa value, hydrophobicity index, and molecular mass |
| [32] | Random forest | DNA | Accuracy: 91.41%, MCC: 0.70, AUC: 0.913 | PSSM, secondary structure information, orthogonal binary vector information, and two physicochemical properties (dipoles and volumes of the side chains) |
| [33] | Random forest | DNA | Accuracy: 83.96% | Pseudo amino acid composition |
| [34] | Gaussian naive Bayes | DNA | Accuracy: 79.10%, MCC: 0.583 | PSSM, predicted secondary structure, and predicted relative solvent accessibility |
| [35] | Naive Bayes classifier | RNA | Accuracy: 85.00% | Amino acid sequence, relative accessible surface area, sequence entropy, hydrophobicity, secondary structure, and electrostatic potential |
| [36] | SVM | RNA | MCC: 0.31 | Amino acid sequence and PSSM |
| [37] | SVM | RNA | Accuracy: 87.99%, sensitivity: 79.95%, specificity: 90.36% | Smoothed PSSM with the correlation and dependency from the neighboring residues |
| [38] | SVM | RNA | AUC: 0.83 | PSSM, accessible surface area, betweenness centrality, and retention coefficient |
| [39] | Random forest | RNA | MCC: 0.5637, accuracy: 88.63%, sensitivity: 53.70%, specificity: 96.97% | PSSM, physicochemical properties of amino acids, polarity-charge, and hydrophobicity |
| [40] | SVM | RNA | Accuracy: 79.72%, MCC: 0.59 | Protein sequence, amino acid composition, hydrophobicity, secondary structure, predicted solvent accessibility, normalized van der Waals volume, polarity, and polarizability |
| [41] | SVM | rRNA, RNA, and DNA | rRNA accuracy: 84%, RNA accuracy: 78%, DNA accuracy: 72% | Protein sequence, amino acid composition, hydrophobicity, secondary structure, predicted solvent accessibility, normalized van der Waals volume, polarity, and polarizability |
| [19] | SVM | DNA and RNA | DNA: sensitivity 69.40%, specificity 70.47%; RNA: sensitivity 66.28%, specificity 69.84% | Side chain pKa value, hydrophobicity index, and molecular mass |
| [42] | SVM | DNA and RNA | DNA: accuracy 79.00%, sensitivity 77.30%, specificity 79.30%; RNA: accuracy 77.70%, sensitivity 71.60%, specificity 78.70% | PSSM with mean and standard deviation, side chain pKa value, hydrophobicity index, and molecular mass |
| [43] | SVM | Metal binding | Accuracy: 78.10% | Physicochemical properties of the amino acid sequences |
| [44] | Bayesian classifier | Zinc | Specificity: 99.8%, sensitivity: 75.5% | Structural properties of a protein |
| [45] | Structural comparison | DNA | Accuracy: 98%, precision: 84% | Combination of structural comparison and the evaluation of statistical potential |
| [46] | Structural comparison | RNA | RNA-binding proteins (RBPs): accuracy 98%, precision 91%; RNA-binding residues: accuracy 93%, precision 78% | Distance-scaled, finite, ideal gas reference based statistical energy function and structural alignment |
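
The performance column of Table 1 mixes accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC). As a reading aid, the sketch below computes these four measures from the counts of a binary confusion matrix using their standard definitions. It is not taken from any of the surveyed works; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / nonbinding.
    tn/fp: nonbinding residues predicted as nonbinding / binding.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    sensitivity = ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = ratio(tn, tn + fp)  # true negative rate
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical, imbalanced example: 100 binding and 500 nonbinding residues.
metrics = binary_metrics(tp=50, fp=50, tn=450, fn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, accuracy is high although only half of the binding residues are recovered. The same pattern appears in rows such as [39], where high accuracy and specificity accompany a much lower sensitivity, which is one reason MCC is often reported alongside accuracy when binding residues are the minority class.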