| Cross-layer parameter sharing | Paper | Performance decrease | Performance increase |
| --- | --- | --- | --- |
| All-shared | [15, 20, 29, 32–36, 40, 44, 46, 47, 51] | [20, 29, 32, 44, 46, 47, 51] | [15, 33–36, 40] |
| Shared-attention | [28] | [28] | - |
| Shared-FFN | [48] | [48] | - |
| Not-shared | [10, 11, 14, 27, 30, 31, 37–39, 41–43, 45, 49, 50, 52] | [10, 11, 14, 27, 30, 31, 37, 41–43, 45, 49, 50, 52] | [38] |