Research Article

An Improved Transformer-Based Neural Machine Translation Strategy: Interacting-Head Attention

Figure 2: Comparison between multihead attention (a) and our mechanism (b) with 4 heads.