Research Article
Heavyweight Statistical Alignment to Guide Neural Translation
Table 1
Some basic statistics of the datasets.
| English-Vietnamese | Training | Development | Testing |
| --- | --- | --- | --- |
| Sentence pairs | 42026 | 1482 | 1527 |
| Average lengths | 19.2–26.2 | 17.8–24.5 | 20.6–28.3 |
| Words | 806456–1099205 | 26315–36276 | 31513–43286 |
| Dictionaries | 36672–16441 | 4981–2720 | 6211–3462 |