Research Article

A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation

Table 5

Distribution of the number of words and the number of sentences in the experimental corpora of 33,372 sentence pairs.

Corpora     Set         NS      CL NW    CL NW/NS  WL NW    WL NW/NS

Chinese     Training    30,036  301,630  10.0      221,419  7.4
Chinese     Developing  1,668   16,973   10.2      12,468   7.5
Chinese     Testing     1,668   17,049   10.2      12,453   7.5

Vietnamese  Training    30,036  316,453  10.5      278,232  9.3
Vietnamese  Developing  1,668   17,839   10.7      15,679   9.4
Vietnamese  Testing     1,668   17,745   10.6      15,617   9.4

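NS is the number of sentences, NW is the number of words, and NW/NS is the average number of words per sentence. CL and WL denote the character-level and word-level settings, respectively.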
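The NW/NS columns of Table 5 can be re-derived from the NS and NW counts, which also serves as a consistency check on the table. The following is a minimal sketch, assuming that CL and WL stand for character level and word level; the names `TABLE_5`, `words_per_sentence`, and `summarize` are hypothetical and introduced only for illustration.

```python
# Counts transcribed from Table 5: (NS, NW at character level, NW at word level).
# The dictionary layout and all function names are illustrative assumptions.
TABLE_5 = {
    ("Chinese", "Training"): (30036, 301630, 221419),
    ("Chinese", "Developing"): (1668, 16973, 12468),
    ("Chinese", "Testing"): (1668, 17049, 12453),
    ("Vietnamese", "Training"): (30036, 316453, 278232),
    ("Vietnamese", "Developing"): (1668, 17839, 15679),
    ("Vietnamese", "Testing"): (1668, 17745, 15617),
}


def words_per_sentence(nw, ns):
    """Average number of words per sentence, rounded to one decimal place."""
    return round(nw / ns, 1)


def summarize(table):
    """Map each (language, set) pair to its (CL NW/NS, WL NW/NS) averages."""
    return {
        key: (words_per_sentence(nw_cl, ns), words_per_sentence(nw_wl, ns))
        for key, (ns, nw_cl, nw_wl) in table.items()
    }


# Each sentence pair is counted once, so sum NS over one language only.
total_sentences = sum(
    ns for (lang, _), (ns, _, _) in TABLE_5.items() if lang == "Chinese"
)

if __name__ == "__main__":
    for (lang, split), (cl, wl) in summarize(TABLE_5).items():
        print(f"{lang:<11}{split:<11}CL NW/NS={cl}  WL NW/NS={wl}")
    print("Total sentence pairs:", total_sentences)
```

The training, developing, and testing sets add up to the 33,372 sentence pairs stated in the caption, and the recomputed averages agree with the NW/NS values reported in the table.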
We used the BLEU and TER scores to evaluate the performance of the translation systems.
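As an illustration of how BLEU rewards n-gram overlap, the following is a minimal sketch of corpus-level BLEU with one reference per hypothesis and no smoothing. It is not the evaluation tool used in this work, which the text does not name; the function names `ngrams` and `corpus_bleu` are hypothetical. TER, which additionally counts the edits (including phrase shifts) needed to turn a hypothesis into the reference, is not sketched here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (single reference, uniform weights, no smoothing).

    Both arguments are lists of token lists. Tokens may be characters or
    words, depending on the segmentation level being evaluated.
    """
    matched = [0] * max_n
    total = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipping: an n-gram is credited at most as many times
            # as it occurs in the reference.
            matched[n - 1] += sum(
                min(count, ref_counts[gram]) for gram, count in hyp_counts.items()
            )
            total[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any zero n-gram precision gives a score of zero.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # The brevity penalty lowers the score of hypotheses shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    reference = [["a", "b", "c", "d", "e"]]
    print(corpus_bleu([["a", "b", "c", "d", "e"]], reference))
    print(corpus_bleu([["a", "b", "c", "d"]], reference))
```

In this sketch a hypothesis identical to its reference scores 1.0, while a hypothesis that is a shorter prefix of the reference keeps perfect n-gram precision but is reduced by the brevity penalty. Scores are often reported multiplied by 100.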