Research Article
A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation
Table 4
Distribution of the number of words and the number of sentences in the experimental corpora of 22,000 sentence pairs (NS: number of sentences; NW: number of words; CL: character level; WL: word level).
| Corpora | Set | NS | NW (CL) | NW/NS (CL) | NW (WL) | NW/NS (WL) |
|---|---|---|---|---|---|---|
| Chinese | Training | 19,800 | 196,903 | 9.9 | 144,475 | 7.3 |
| Chinese | Developing | 1,100 | 11,292 | 10.3 | 8,237 | 7.5 |
| Chinese | Testing | 1,100 | 11,056 | 10.1 | 8,090 | 7.4 |
| Vietnamese | Training | 19,800 | 211,179 | 10.7 | 185,346 | 9.4 |
| Vietnamese | Developing | 1,100 | 12,028 | 10.9 | 10,534 | 9.6 |
| Vietnamese | Testing | 1,100 | 11,803 | 10.7 | 10,376 | 9.4 |
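The NW/NS columns of Table 4 are the average sentence length, that is, the word count divided by the sentence count. The following is a minimal Python sketch that recomputes them from the counts in the table. The names `TABLE_4` and `avg_length` are hypothetical, and reading CL and WL as character level and word level is an assumption based on the article title.

```python
# Counts copied from Table 4.
# Key: (corpus, set); value: (NS, NW at CL, NW at WL).
# Assumption: CL = character level, WL = word level.
TABLE_4 = {
    ("Chinese", "Training"): (19_800, 196_903, 144_475),
    ("Chinese", "Developing"): (1_100, 11_292, 8_237),
    ("Chinese", "Testing"): (1_100, 11_056, 8_090),
    ("Vietnamese", "Training"): (19_800, 211_179, 185_346),
    ("Vietnamese", "Developing"): (1_100, 12_028, 10_534),
    ("Vietnamese", "Testing"): (1_100, 11_803, 10_376),
}


def avg_length(nw: int, ns: int) -> float:
    """Average number of words per sentence, rounded to one decimal place."""
    return round(nw / ns, 1)


def chinese_sentence_total() -> int:
    """Sum NS over the three Chinese sets (each sentence has one Vietnamese pair)."""
    return sum(ns for (corpus, _), (ns, _, _) in TABLE_4.items() if corpus == "Chinese")


if __name__ == "__main__":
    for (corpus, subset), (ns, nw_cl, nw_wl) in TABLE_4.items():
        print(corpus, subset, avg_length(nw_cl, ns), avg_length(nw_wl, ns))
    print("sentence pairs:", chinese_sentence_total())
```

Running the script prints one line per corpus and set, which can be compared against the NW/NS columns, followed by the total number of sentence pairs across the training, developing, and testing sets.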