Research Article

ChemTok: A New Rule Based Tokenizer for Chemical Named Entity Recognition

Table 7

NER performance (F-score in %) of classifiers using the DrugBank and Medline corpora of the DDI SemEval data set.

Data set   Tokenizer     Classification algorithm
                         CRF     SVM

DrugBank   White space   77.89   82.85
           ChemSpot      87.16   89.10
           tmVar         84.74   90.34
           ChemTok       88.65   91.79

Medline    White space   51.51   42.41
           ChemSpot      62.72   67.48
           tmVar         62.04   67.50
           ChemTok       64.88   68.51