Research Article

An Improved Transformer-Based Neural Machine Translation Strategy: Interacting-Head Attention

Figure 4

Per-epoch training time of the four models on the IWSLT16 DE-EN dataset.