Research Article

An Improved Transformer-Based Neural Machine Translation Strategy: Interacting-Head Attention

Table 1

Number of sequence pairs and sequence length in IWSLT16 DE-EN, WMT17 EN-DE, and WMT17 EN-CS.

| Dataset | Train | Eval | Test | Vocab | Total length of the train set | Mean length of the train set |
|---|---|---|---|---|---|---|
| IWSLT16 DE-EN | 181,149 | 12,098 | 11,825 | 39,645 | 3,766,897/3,826,038 | 20/21 |
| WMT17 EN-DE | 5,852,457 | 3000 | 3003/2169/2999/3004 | 84,441 | 148,665,327/149,176,205 | 25/25 |
| WMT17 EN-CS | 1,010,918 | 3000 | 3003/2656/2999/3005 | 25,560/39,670 | 27,825,644/26,152,299 | 27/25 |

Note. Train, Eval, and Test give the number of sequence pairs in each data subset; length refers to the number of tokens in a sentence; total length of the train set is the total number of tokens in the training set; mean length of the train set is the ratio of the total length to the number of sequence pairs in the training set.
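
The mean-length definition in the note can be checked with a short calculation. The following is a minimal sketch in Python, not the authors' code: the names `TRAIN_STATS`, `mean_length`, and `mean_lengths_truncated` are hypothetical, the numbers are copied from Table 1, and truncating the ratio to an integer is an assumption inferred from the reported values rather than something the source states.

```python
# Train-set sizes and total token counts copied from Table 1.
# Each total is a pair of slash-separated values, kept in table order.
TRAIN_STATS = {
    "IWSLT16 DE-EN": (181_149, (3_766_897, 3_826_038)),
    "WMT17 EN-DE": (5_852_457, (148_665_327, 149_176_205)),
    "WMT17 EN-CS": (1_010_918, (27_825_644, 26_152_299)),
}


def mean_length(total_tokens: int, num_pairs: int) -> float:
    """Mean length as defined in the note: total tokens / sequence pairs."""
    return total_tokens / num_pairs


def mean_lengths_truncated(stats):
    """Return the integer part of each mean length, per dataset.

    Assumption: the table truncates rather than rounds (inferred from
    its values, not stated in the source).
    """
    return {
        name: tuple(total // pairs for total in totals)
        for name, (pairs, totals) in stats.items()
    }


if __name__ == "__main__":
    for name, means in mean_lengths_truncated(TRAIN_STATS).items():
        print(name, "/".join(str(m) for m in means))
```

Under the truncation assumption the computed values agree with the last column of Table 1 for all three datasets. Rounding to the nearest integer would not: the first WMT17 EN-CS ratio is roughly 27.5, which rounds to 28 while the table reports 27.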