Predicting Interactions between Virus and Host Proteins Using Repeat Patterns and Composition of Amino Acids
Table 8
Numbers of host proteins and of virus proteins shared by the training (TR) and test (TS) datasets used to assess the applicability of the SVM model to new viruses and to new hosts.
Dataset                              TR1    TS1    TR1    TS2    TR1    TS3    TR1    TS4    TR1    TS5
#PPIs                                638    515    638    30     638    377    638    319    638    1578
#Virus proteins                      25     11     25     12     25     10     25     11     25     46
#Host proteins                       499    424    499    27     499    307    499    298    499    1056
#Host proteins common to TR and TS   63 (14.9%)    5 (18.5%)     68 (22.1%)    22 (7.4%)     122 (11.6%)

Dataset                              TR2    TS6    TR2    TS7    TR2    TS8    TR2    TS9    TR2    TS10
#PPIs                                689    191    689    125    689    86     689    57     689    78
#Virus proteins                      35     116    35     34     35     24     35     10     35     27
#Host proteins                       522    141    522    87     522    79     522    38     522    64
#Virus proteins common to TR and TS  9 (7.8%)      1 (2.9%)      4 (16.7%)     0 (0.0%)      0 (0.0%)
The percentages in parentheses give the number of common proteins as a proportion of the corresponding (host or virus) proteins in the test dataset.
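The parenthesized proportions can be reproduced with a simple set intersection. The following is a minimal sketch, not the authors' code: the function name shared_fraction and the toy protein identifiers are hypothetical, and only the final check uses figures taken from the table (63 shared host proteins out of 424 in TS1).

```python
def shared_fraction(train_proteins, test_proteins):
    """Return (count, percent) of distinct test proteins also present in training."""
    train, test = set(train_proteins), set(test_proteins)
    if not test:
        raise ValueError("test set is empty")
    common = train & test
    # The denominator is the test set, matching the table footnote.
    return len(common), 100.0 * len(common) / len(test)


# Toy identifiers (hypothetical, not from the paper's datasets).
tr = ["P1", "P2", "P3", "P4"]
ts = ["P3", "P4", "P5", "P6", "P7"]
count, pct = shared_fraction(tr, ts)
print(f"{count} ({pct:.1f}%)")  # 2 (40.0%)

# Arithmetic check against the TS1 column: 63 of 424 host proteins.
print(f"{100.0 * 63 / 424:.1f}%")  # 14.9%
```

Dividing by the size of the test set rather than the training set measures how much of each test set the model has already seen, which is the quantity relevant to judging performance on new viruses or new hosts.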