Research Article

iSentenizer-μ: Multilingual Sentence Boundary Detection Model

Table 3

Number of abbreviations in corpora.

Corpus               Number of abbreviations (train)   Number of abbreviations (test)
WSJ corpus           27,960                            3,110
Brown corpus         644                               158
Tycho Brahe corpus   382                               8