Research Article
ChemTok: A New Rule Based Tokenizer for Chemical Named Entity Recognition
Table 7
NER performance (F-score in %) of classifiers using the DrugBank and Medline corpora of the DDI SemEval data set.
| Data set | Tokenizer | Classification algorithm: CRF | Classification algorithm: SVM |
|---|---|---|---|
| DrugBank | White space | 77.89 | 82.85 |
| DrugBank | ChemSpot | 87.16 | 89.10 |
| DrugBank | tmVar | 84.74 | 90.34 |
| DrugBank | ChemTok | 88.65 | 91.79 |
| Medline | White space | 51.51 | 42.41 |
| Medline | ChemSpot | 62.72 | 67.48 |
| Medline | tmVar | 62.04 | 67.50 |
| Medline | ChemTok | 64.88 | 68.51 |