Research Article

An Improved Transformer-Based Neural Machine Translation Strategy: Interacting-Head Attention

Figure 1: Associations between the head size of different subspaces.