Research Article

Chemical Entity Recognition and Resolution to ChEBI

Table 2

Evaluation of entity recognition on the subset of the gold standard comprising the 9,696 chemical entities that have a mapping to ChEBI. The table shows the results of entity identification (named entity recognition and resolution) for each matching assessment and method. The dictionary method recognized and mapped a total of 18,683 entities, while the machine-learning method recognized and mapped 10,681 entities. True positives (TP) is the number of recognized entities that agree with the gold standard under each assessment. Precision, recall, and F-measure are given as percentages.

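| Assessment          | Method           | TP    | Precision (%) | Recall (%) | F-measure (%) |
|---------------------|------------------|-------|---------------|------------|---------------|
| Exact matching      | Dictionary       | 5,651 | 30.25         | 58.28      | 38.83         |
| Exact matching      | Machine learning | 5,830 | 54.60         | 60.13      | 57.23         |
| Left matching       | Dictionary       | 5,913 | 31.65         | 60.98      | 41.67         |
| Left matching       | Machine learning | 6,084 | 56.98         | 62.75      | 59.72         |
| Right matching      | Dictionary       | 6,158 | 32.96         | 63.51      | 43.40         |
| Right matching      | Machine learning | 5,948 | 55.70         | 61.34      | 58.39         |
| Left/right matching | Dictionary       | 6,435 | 34.44         | 66.37      | 45.35         |
| Left/right matching | Machine learning | 6,307 | 59.07         | 65.05      | 61.91         |
| Partial matching    | Dictionary       | 7,654 | 40.97         | 78.94      | 53.94         |
| Partial matching    | Machine learning | 6,703 | 62.78         | 69.13      | 65.80         |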
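
The metrics in the table can be recomputed from the counts given in the caption. The sketch below is illustrative only: it assumes that precision is TP divided by the number of entities a method recognized and mapped, that recall is TP divided by the 9,696 gold-standard entities, and that the F-measure is the balanced harmonic mean of the two (F1). The function name `precision_recall_f` and the constant names are hypothetical and do not come from the article.

```python
def precision_recall_f(tp, predicted, gold):
    """Return (precision, recall, F-measure) as percentages rounded to 2 decimals.

    Assumes the balanced F-measure (harmonic mean of precision and recall);
    the article does not state which F variant was used.
    """
    if tp <= 0 or predicted <= 0 or gold <= 0:
        raise ValueError("tp, predicted and gold must all be positive")
    precision = tp / predicted   # share of recognized entities that are correct
    recall = tp / gold           # share of gold-standard entities that were found
    f_measure = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * v, 2) for v in (precision, recall, f_measure))


# Counts taken from the table caption.
GOLD_ENTITIES = 9696          # gold-standard entities with a ChEBI mapping
DICTIONARY_PREDICTED = 18683  # entities recognized and mapped by the dictionary method

# Dictionary method, left matching (TP = 5,913).
print(precision_recall_f(5913, DICTIONARY_PREDICTED, GOLD_ENTITIES))
# → (31.65, 60.98, 41.67)
```

Under these assumptions the dictionary row for left matching is reproduced to two decimals. The same function can be applied to the other rows by substituting the corresponding TP and method total.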