Research Article

Chemical Entity Recognition and Resolution to ChEBI

Table 1

Evaluation of entity recognition against the full gold standard of 18,061 chemical entities. The table shows named entity recognition results for each assessment and method. The dictionary method recognized a total of 18,683 entities, while the machine-learning method recognized 13,832 entities. True positives (TP) is the number of recognized entities that agree with the gold standard under each assessment. Precision, recall, and F-measure are given as percentages.

Assessment           Method            TP      Precision (%)  Recall (%)  F-measure (%)

Exact matching       Dictionary        5,868   31.41          32.49       31.94
Exact matching       Machine learning  9,094   65.76          50.35       57.03
Left matching        Dictionary        6,868   36.76          38.03       37.38
Left matching        Machine learning  9,892   71.53          54.77       62.04
Right matching       Dictionary        8,015   42.90          44.38       43.63
Right matching       Machine learning  10,419  75.34          57.69       65.34
Left/right matching  Dictionary        9,015   48.25          49.91       49.07
Left/right matching  Machine learning  11,217  81.11          62.11       70.35
Partial matching     Dictionary        12,780  68.40          70.76       69.56
Partial matching     Machine learning  12,328  89.15          68.26       77.32
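
The relationship between the TP counts and the reported percentages can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes the usual definitions (precision = TP / recognized entities, recall = TP / gold-standard entities) and assumes that F-measure means the balanced harmonic mean of the two. The function name precision_recall_f is hypothetical. The counts are taken from the caption and the dictionary rows of the table.

```python
def precision_recall_f(tp, recognized, gold):
    """Return (precision, recall, F-measure) as percentages rounded to 2 places.

    Assumes the standard definitions; F-measure is taken to be the
    balanced harmonic mean of precision and recall (an assumption).
    """
    precision = tp / recognized   # share of recognized entities that are correct
    recall = tp / gold            # share of gold-standard entities that were found
    f_measure = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * v, 2) for v in (precision, recall, f_measure))


GOLD = 18_061                   # gold-standard entities (from the caption)
DICTIONARY_RECOGNIZED = 18_683  # entities recognized by the dictionary method

# Dictionary method, exact matching: TP = 5,868
print(precision_recall_f(5_868, DICTIONARY_RECOGNIZED, GOLD))
# prints (31.41, 32.49, 31.94)
```

Under these assumptions the dictionary rows for exact and partial matching are reproduced to two decimal places.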