The Scientific World Journal, 2014
Research Article
iSentenizer-μ: Multilingual Sentence Boundary Detection Model
Table 4
Size of the Brown, WSJ, and Tycho Brahe corpora.
Corpus               Sentences (training data)   Sentences (test data)   Tokens
WSJ corpus           41,977                      4,671                   1,153,993
Brown corpus         51,599                      5,801                   1,155,242
Tycho Brahe corpus   38,000                      5,102                   953,080