Research Article

Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection

Table 1

The measure results when the is applied to the when the top 100, 300, 500, 700, 900, 1100, 1300, 1500, 1700, and 1900 features are selected. In each column, the bold value marks the best performance among the feature selection methods for that feature set size. The “best” column gives the best performance that each feature selection method achieves, with the corresponding feature set size in parentheses.

Number of features 100 300 500 700 900 1100 1300 1500 1700 1900 best

72.32 74.04 73.30 73.34 70.96 71.79 71.97 71.49 71.27 71.21 74.60(2300)
72.02 74.56 74.14 72.50 71.53 72.21 71.77 71.51 71.21 72.18 74.80(2300)
71.89 73.91 74.45 74.49 74.46 74.25 73.70 72.73 71.56 71.99 75.04(2100)
72.40 74.22 74.60 73.49 74.15 74.01 72.63 72.77 72.58 73.22 75.00(2100)
72.40 74.01 73.51 73.61 74.62 75.11 75.55 75.11 75.01 74.66 75.55(1300)
72.16 74.11 74.64 74.70 74.82 75.41 75.81 75.51 74.70 74.18 75.81(1300)
72.03 74.52 74.97 76.14 76.63 76.66 77.07 76.81 76.60 75.96 77.07(1300)
72.15 74.77 74.90 76.09 76.55 76.60 76.97 76.65 76.47 76.16 76.97(1300)
72.55 74.57 74.87 75.99 76.35 76.47 76.90 75.50 75.38 75.81 76.90(1300)
72.13 74.53 74.90 76.06 76.58 76.61 76.84 76.60 76.51 76.15 76.84(1300)