Research Article

An Improved Transformer-Based Neural Machine Translation Strategy: Interacting-Head Attention

Table 9

Overall BLEU scores of interacting-head attention on the IWSLT16 DE-EN development and test sets.

Model                        Dataset   Subset   Number of heads / head size
                                                32/16      64/8
Interacting-head attention   IWSLT16   dev      24.38      19.98
Interacting-head attention   IWSLT16   test     22.85      18.54

Note. All scores are BLEU.