Review Article

Bidirectional Language Modeling: A Systematic Literature Review

Table 8

Cross-layer parameter sharing.

| Cross-layer parameter sharing | Paper | Performance (decrease / increase) |
| --- | --- | --- |
| All-shared | [15, 20, 29, 32–36, 40, 44, 46, 47, 51] | Decrease: [20, 29, 32, 44, 46, 47, 51]; increase: [15, 33–36, 40] |
| Shared-attention | [28] | [28] |
| Shared-FFN | [48] | [48] |
| Not-shared | [10, 11, 14, 27, 30, 31, 37–39, 41–43, 45, 49, 50, 52] | Decrease: [10, 11, 14, 27, 30, 31, 37, 41–43, 45, 49, 50, 52]; increase: [38] |
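
The four sharing schemes in Table 8 can be illustrated with a rough parameter count. The sketch below is not taken from any of the reviewed papers. It assumes the common reading of the categories (all-shared reuses one attention block and one feed-forward block across every layer, shared-attention reuses only the attention block, shared-FFN reuses only the feed-forward block, and not-shared keeps a separate copy per layer). The function names `layer_block_params` and `encoder_params` are hypothetical, and biases, layer normalization, and embeddings are ignored for simplicity.

```python
def layer_block_params(hidden, ffn):
    """Approximate weight counts for one Transformer encoder layer.

    Assumption: only the dense weight matrices are counted
    (no biases, no LayerNorm, no embeddings).
    """
    attention = 4 * hidden * hidden   # query, key, value and output projections
    feed_forward = 2 * hidden * ffn   # two dense layers of the FFN
    return attention, feed_forward


def encoder_params(num_layers, hidden, ffn, sharing="not-shared"):
    """Count the distinct weights stored under a given sharing scheme."""
    schemes = ("all-shared", "shared-attention", "shared-FFN", "not-shared")
    if sharing not in schemes:
        raise ValueError(f"unknown sharing scheme: {sharing!r}")

    attention, feed_forward = layer_block_params(hidden, ffn)

    # A shared block is stored once; an unshared block is stored once per layer.
    attn_copies = 1 if sharing in ("all-shared", "shared-attention") else num_layers
    ffn_copies = 1 if sharing in ("all-shared", "shared-FFN") else num_layers

    return attn_copies * attention + ffn_copies * feed_forward


if __name__ == "__main__":
    # Illustrative sizes only (12 layers, hidden size 768, FFN size 3072).
    for scheme in ("all-shared", "shared-attention", "shared-FFN", "not-shared"):
        print(scheme, encoder_params(12, 768, 3072, scheme))
```

Under these assumptions, sharing reduces the number of stored weights but not the amount of computation per forward pass, since every layer is still executed.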