Research Article

A Novel Approach for Protein-Named Entity Recognition and Protein-Protein Interaction Extraction

Table 5

Gain Ratio and ranking of the features and distribution of entities by features.

Feature         Gain Ratio  Ranking  Pro_NE (%)  Not Pro_NE (%)  Pro_NE/Not_Pro_NE
Sent_Uppercase  0.08827     1        24.3        15.7            1.55
Word_Num        0.0839      2        10.5        17.8            0.59
Word_Symbol     0.05184     3        37.3        16.7            2.23
Suffix_Letter   0.04518     4
Word_Uppercase  0.04284     5        24.5        15.5            1.58
Suffix_Word     0.0388      6
Prefix_Word     0.03851     7
Prefix_Letter   0.03254     8
Length          0.02472     9
POS             0.02108     10
Word_Alphabet   0.01411     11       22.5        5.5             4.09
Biology_Word    0.00833     12       21.8        7.9             2.76
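
For readers unfamiliar with the measure behind the Gain Ratio column, the following is a minimal sketch of the standard definition (information gain divided by split information). The table does not say which tool or feature discretization produced the reported values, so the function names and the toy data below are hypothetical and serve only to illustrate the formula.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a feature divided by its split information."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature's own value distribution.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: "P" marks a protein entity, "N" a non-protein token.
labels = ["P", "P", "N", "N"]
perfect_feature = [1, 1, 0, 0]     # separates the two classes exactly
useless_feature = [1, 0, 1, 0]     # carries no information about the class

print(gain_ratio(perfect_feature, labels))
print(gain_ratio(useless_feature, labels))

# The last column of the table is the quotient of the two percentage columns,
# e.g. for Sent_Uppercase:
print(round(24.3 / 15.7, 2))
```

A feature that separates the two classes exactly scores 1, and one that is independent of the class scores 0. The small values in the table (all below 0.09) therefore indicate that each feature on its own is only weakly informative.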