Research Article
Proposed Framework for the Evaluation of Standalone Corpora Processing Systems: An Application to Arabic Corpora
Table 1
Number of words, number of texts, and average text length for the corpora used in the performance evaluation.
| Corpus | Number of words | Number of texts | Average text length (words) |
| --- | --- | --- | --- |
| 2012 CAN | 2,207,469 | 2,910 | 759 |
| 4M KACST ATCC | 4,356,509 | 5,939 | 734 |
| 7M KACST ATCC | 7,198,767 | 11,719 | 614 |
| KACST ATCC | 11,555,276 | 17,658 | 654 |
| KSUCCA | 50,602,412 | 410 | 123,421 |
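
As a quick sanity check on Table 1, the average text length can be recomputed from the other two columns. The sketch below assumes the average is simply the number of words divided by the number of texts, rounded to the nearest whole word; the article does not state the rounding rule, and the names `corpora` and `average_text_length` are illustrative, not taken from the article.

```python
# Word and text counts copied from Table 1: name -> (words, texts).
corpora = {
    "2012 CAN": (2_207_469, 2_910),
    "4M KACST ATCC": (4_356_509, 5_939),
    "7M KACST ATCC": (7_198_767, 11_719),
    "KACST ATCC": (11_555_276, 17_658),
    "KSUCCA": (50_602_412, 410),
}


def average_text_length(words: int, texts: int) -> int:
    """Mean words per text, rounded to the nearest whole word (assumed rule)."""
    return round(words / texts)


# Recompute the last column of Table 1 from the first two.
averages = {
    name: average_text_length(words, texts)
    for name, (words, texts) in corpora.items()
}

for name, avg in averages.items():
    print(f"{name}: {avg:,} words per text")
```

Under this assumption the recomputed values match the last column of the table. The word and text counts of the 4M and 7M KACST ATCC rows also add up to those of the full KACST ATCC row, which is consistent with the two being disjoint subsets of that corpus.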