Research Article

An Empirical Study of Different Approaches for Protein Classification

Table 1

Summary of the datasets. Where a dataset has separate training and independent test sets, both counts are given in the column “Number of samples” (training + independent). The column BKB reports whether the PDB entries of the proteins can be obtained from the dataset in order to extract the backbone structure.

| Name | Short name | Number of samples | Number of classes | Protocol | BKB |
|---|---|---|---|---|---|
| Membrane subcellular | MEM | 3249 + 4333 | 8 | Independent training and testing sets | NO |
| Human pairs | HU | 1882 | 2 | 10-fold cross validation | NO |
| Protein fold | PF | 698 | 27 | Independent training and testing sets | YES |
| GPCR | GP | 730 | 2 | 10-fold cross validation | NO |
| GRAM | GR | 452 | 5 | 10-fold cross validation | NO |
| Viral | VR | 112 | 4 | 10-fold cross validation | NO |
| Cysteines | CY | 957 | 3 | 10-fold cross validation | YES |
| SubCell | SC | 121 | 3 | 10-fold cross validation | YES |
| DNA-binding proteins | DNA | 349 | 2 | 10-fold cross validation | YES |
| Enzyme | ENZ | 1094 | 6 | 10-fold cross validation | YES |
| GO dataset | GO | 168 | 4 | 10-fold cross validation | YES |
| Human interaction | HI | 8161 | 2 | 10-fold cross validation | NO |
| Submitochondria locations | SL | 317 | 3 | 10-fold cross validation | NO |
| Virulent independent set 1 | VI1 | 2055 + 83 | 2 | Independent training and testing sets | NO |
| Virulent independent set 2 | VI2 | 2055 + 284 | 2 | Independent training and testing sets | NO |
| Adhesins | AD | 2055 + 1172 | 2 | Independent training and testing sets | NO |
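
Most datasets in Table 1 are evaluated with 10-fold cross validation. The following is a minimal sketch of how such a split could be generated, not the procedure used in the study: the function name `k_fold_indices`, the fixed random seed, and the plain (unstratified) shuffling are all illustrative assumptions, since the table does not say how the folds were built.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation.

    Hypothetical helper: unstratified folds, assumed for illustration only.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Example: the GR dataset in Table 1 lists 452 samples.
splits = list(k_fold_indices(452, k=10))
```

Each sample appears in exactly one test fold, so every sample is used for testing once and for training in the other nine rounds. For the datasets with independent training and testing sets, no such partitioning is needed: the model is trained on the first set and evaluated on the second.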