Research Article
ChemTok: A New Rule Based Tokenizer for Chemical Named Entity Recognition
Table 3
Details of the DDI corpus.
| Data set | Number of documents | Number of sentences | Drug NEs | Group NEs | Brand NEs | Drug_n NEs | Total number of named entities |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Train | | | | | | | |
| DrugBank | 572 | 5675 | 8197 | 3206 | 1423 | 103 | 12929 |
| Medline | 142 | 1301 | 1228 | 193 | 14 | 401 | 1836 |
| Test | | | | | | | |
| DrugBank | 54 | 145 | 180 | 65 | 53 | 5 | 303 |
| Medline | 58 | 520 | 171 | 90 | 6 | 115 | 382 |