Research Article
A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation
Table 3
Distribution of the number of words (NW) and the number of sentences (NS) in the experimental corpora of 11,000 sentence pairs, at character level (CL) and word level (WL).
| Corpora | Set | NS | CL NW | CL NW/NS | WL NW | WL NW/NS |
| --- | --- | --- | --- | --- | --- | --- |
| Chinese | Training | 9,900 | 99,026 | 10.0 | 72,541 | 7.3 |
| Chinese | Developing | 550 | 5,645 | 10.3 | 4,138 | 7.5 |
| Chinese | Testing | 550 | 5,598 | 10.2 | 4,092 | 7.4 |
| Vietnamese | Training | 9,900 | 107,153 | 10.8 | 93,909 | 9.5 |
| Vietnamese | Developing | 550 | 6,151 | 11.2 | 5,401 | 9.8 |
| Vietnamese | Testing | 550 | 5,985 | 10.9 | 5,272 | 9.6 |
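The NW/NS columns in Table 3 are average sentence lengths, that is, the token count divided by the sentence count, rounded to one decimal place. The following is a minimal sketch that recomputes them from the counts in the table. The names `TABLE3` and `avg_len` are illustrative assumptions and do not come from the article.

```python
# Counts copied from Table 3: (NS, NW at character level, NW at word level).
# TABLE3 and avg_len are hypothetical names used only for this illustration.
TABLE3 = {
    ("Chinese", "Training"): (9900, 99026, 72541),
    ("Chinese", "Developing"): (550, 5645, 4138),
    ("Chinese", "Testing"): (550, 5598, 4092),
    ("Vietnamese", "Training"): (9900, 107153, 93909),
    ("Vietnamese", "Developing"): (550, 6151, 5401),
    ("Vietnamese", "Testing"): (550, 5985, 5272),
}


def avg_len(nw: int, ns: int) -> float:
    """Average sentence length (NW/NS), rounded to one decimal place."""
    return round(nw / ns, 1)


# Map each (language, set) pair to its (CL NW/NS, WL NW/NS) averages.
ratios = {
    key: (avg_len(cl_nw, ns), avg_len(wl_nw, ns))
    for key, (ns, cl_nw, wl_nw) in TABLE3.items()
}

# Total number of sentences per language across the three sets.
total_sentences = {
    lang: sum(ns for (l, _), (ns, _, _) in TABLE3.items() if l == lang)
    for lang in ("Chinese", "Vietnamese")
}

if __name__ == "__main__":
    for (lang, split), (cl, wl) in ratios.items():
        print(f"{lang:<10} {split:<10} CL={cl:.1f} WL={wl:.1f}")
    print(total_sentences)
```

Summing NS over the training, developing, and testing sets gives the 11,000 sentence pairs stated in the caption.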