Research Article
Research on Uyghur-Chinese Neural Machine Translation Based on the Transformer at Multistrategy Segmentation Granularity
Table 1
Corpus data at different granularities.
| Granularity | Training set: number of sentences | Training set: average sentence length | Test set: number of sentences | Test set: average sentence length |
| --- | --- | --- | --- | --- |
| Syllable | 147434 | 81.58 | 1000 | 46.21 |
| Marked syllable | 147434 | 81.65 | 1000 | 46.36 |
| Words | 147434 | 82.64 | 1000 | 46.97 |
| Syllable word fusion | 147434 | 99.69 | 1000 | 57.35 |