Research Article

Research on Uyghur-Chinese Neural Machine Translation Based on the Transformer at Multistrategy Segmentation Granularity

Table 1

Corpus data at different granularities.

Granularity | Training set: number of sentences | Training set: average sentence length | Test set: number of sentences | Test set: average sentence length

Syllable | 147434 | 81.58 | 1000 | 46.21
Marked syllable | 147434 | 81.65 | 1000 | 46.36
Words | 147434 | 82.64 | 1000 | 46.97
Syllable word fusion | 147434 | 99.69 | 1000 | 57.35