Research Article

A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation

Table 4

Number of words (NW) and number of sentences (NS) in the experimental corpora of 22,000 sentence pairs, at character level (CL) and word level (WL).

Corpora                   NS        CL                   WL
                                    NW        NW/NS      NW        NW/NS

Chinese      Training     19,800    196,903    9.9       144,475   7.3
             Developing    1,100     11,292   10.3         8,237   7.5
             Testing       1,100     11,056   10.1         8,090   7.4

Vietnamese   Training     19,800    211,179   10.7       185,346   9.4
             Developing    1,100     12,028   10.9        10,534   9.6
             Testing       1,100     11,803   10.7        10,376   9.4
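
The NW/NS columns appear to be the word count divided by the sentence count, rounded to one decimal place. The following is a minimal sketch, assuming that reading, that recomputes the ratios from the counts in Table 4. The names `TABLE_4` and `avg_per_sentence` are hypothetical and chosen only for illustration; they do not come from the article.

```python
# Counts copied from Table 4: (NS, NW at character level, NW at word level).
# The dictionary layout is an illustrative assumption, not the authors' code.
TABLE_4 = {
    ("Chinese", "Training"): (19800, 196903, 144475),
    ("Chinese", "Developing"): (1100, 11292, 8237),
    ("Chinese", "Testing"): (1100, 11056, 8090),
    ("Vietnamese", "Training"): (19800, 211179, 185346),
    ("Vietnamese", "Developing"): (1100, 12028, 10534),
    ("Vietnamese", "Testing"): (1100, 11803, 10376),
}


def avg_per_sentence(nw: int, ns: int) -> float:
    """Return NW/NS rounded to one decimal place (assumed rounding)."""
    return round(nw / ns, 1)


if __name__ == "__main__":
    for (language, split), (ns, nw_cl, nw_wl) in TABLE_4.items():
        print(language, split, avg_per_sentence(nw_cl, ns), avg_per_sentence(nw_wl, ns))

    # The three splits of one language should add up to the 22,000 pairs in the caption.
    total_pairs = sum(ns for (language, _), (ns, _, _) in TABLE_4.items()
                      if language == "Chinese")
    print("Sentence pairs:", total_pairs)
```

Under this assumption the recomputed ratios can be compared row by row with the NW/NS columns of the table.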