Research Article

A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation

Table 3

Distribution of the number of words and the number of sentences in the experimental corpora of 11,000 sentence pairs.

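Corpora                     NS       CL                   WL
                                     NW        NW/NS      NW        NW/NS

Chinese      Training       9,900    99,026    10.0       72,541    7.3
             Developing     550      5,645     10.3       4,138     7.5
             Testing        550      5,598     10.2       4,092     7.4

Vietnamese   Training       9,900    107,153   10.8       93,909    9.5
             Developing     550      6,151     11.2       5,401     9.8
             Testing        550      5,985     10.9       5,272     9.6

NS: number of sentences; NW: number of words; NW/NS: average number of words per sentence; CL: character level; WL: word level.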
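
The NW/NS columns of Table 3 appear to be the word counts divided by the sentence counts, rounded to one decimal place. The following Python sketch recomputes them as a consistency check. The counts are copied from the table; the dictionary layout and the names `CORPORA`, `average_length`, and `summarize` are illustrative assumptions, not part of the original work.

```python
# Counts copied from Table 3 as (NS, NW at character level, NW at word level).
# The data layout and all names here are illustrative assumptions.
CORPORA = {
    ("Chinese", "Training"): (9900, 99026, 72541),
    ("Chinese", "Developing"): (550, 5645, 4138),
    ("Chinese", "Testing"): (550, 5598, 4092),
    ("Vietnamese", "Training"): (9900, 107153, 93909),
    ("Vietnamese", "Developing"): (550, 6151, 5401),
    ("Vietnamese", "Testing"): (550, 5985, 5272),
}


def average_length(num_words, num_sentences):
    """Return NW/NS rounded to one decimal place, as reported in the table."""
    return round(num_words / num_sentences, 1)


def summarize(corpora):
    """Map each (language, split) to its (CL NW/NS, WL NW/NS) pair."""
    return {
        key: (average_length(cl_nw, ns), average_length(wl_nw, ns))
        for key, (ns, cl_nw, wl_nw) in corpora.items()
    }


if __name__ == "__main__":
    for (language, split), (cl_avg, wl_avg) in summarize(CORPORA).items():
        print(f"{language:<11}{split:<11}CL {cl_avg:>5}  WL {wl_avg:>5}")
```

Under these assumptions, the recomputed averages can be compared row by row against the NW/NS columns, and the three splits of each language sum to the 11,000 sentence pairs stated in the caption.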