Review Article

Bidirectional Language Modeling: A Systematic Literature Review

Table 10

Overall data.

| Paper | Name | Training data size | Tokens | Training dataset name | Model type | Sentence learning | Cross-layer parameter sharing |
|---|---|---|---|---|---|---|---|
| [17] | RoBERTa (large) | 160 GB | ~2.2T | BooksCorpus + Wikipedia + CC-News + OpenWebText + Stories | Autoencoding | None | False |
| [10] | SpanBERT | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | False |
| [11] | SemBERT | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | False |
| [14] | ERNIE | 9 GB+ | 4.5B + 140M | English Wikipedia + Wikidata | Autoencoding | NSP | False |
| [15] | ERNIE 2.0 | 13 GB+ | 8B | Encyclopedia + BooksCorpus + Dialog + Discourse Relation Data | Autoencoding | None | True |
| [27] | BERT (base) | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | NSP | False |
| [28] | XLNet | 126 GB | 32.89B | BooksCorpus + Wikipedia + Giga5 + ClueWeb 2012B + Common Crawl | Autoencoding + autoregressive | None | True |
| [29] | UniLM | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | NSP | True |
| [20] | | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | True |
| [30] | StructBERT | 16 GB | 2.5B+ | English Wikipedia + BooksCorpus | Autoencoding | NSP | False |
| [31] | TinyBERT | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | NSP | False |
| [32] | MT-DNN | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | NSP | True |
| [33] | ALBERT (xxlarge) | 16 GB | | BooksCorpus + Wikipedia | Autoencoding | SOP | True |
| [34] | Megatron-LM | 174 GB | | Wikipedia + CC-Stories + Real News + OpenWebText | Autoencoding | SOP | True |
| [35] | ALBERT (xxlarge-ensemble) | 16 GB | | BooksCorpus + Wikipedia | Autoencoding | None | True |
| [36] | T5 | 29 TB | | Colossal Clean Crawled Corpus | Autoencoding | None | True |
| [37] | SMART-RoBERTa | 160 GB | | BooksCorpus + Wikipedia + CC-News + OpenWebText + Stories | Autoencoding | None | False |
| [38] | FreeLB-RoBERTa | 160 GB | ~2.2T | BooksCorpus + Wikipedia + CC-News + OpenWebText + Stories | Autoencoding | None | False |
| [39] | SesameBERT | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | False |
| [40] | ELECTRA-1.75M | 126 GB | 33B | BooksCorpus + Wikipedia + Giga5 + ClueWeb 2012B + Common Crawl | Autoencoding | None | True |
| [41] | MiniLMa | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | False |
| [42] | SBERT-WK | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | False |
| [43] | PowerBERT | 11 GB | 3.4B | BooksCorpus + Wikipedia | Autoencoding | None | False |
| [44] | BAM | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | True |
| [45] | StackBERT | 11 GB | 3.4B | BooksCorpus + Wikipedia | Autoencoding | None | False |
| [46] | MT-DNN-KD | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | True |
| [47] | HUBERT | 16 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | True |
| [48] | AdaBERT | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | True |
| [49] | BERTQA | 13 GB | 3.8B | BooksCorpus + Wikipedia | Autoencoding | None | False |
| [50] | BART (large) | 160 GB | 2.2T | 160 GB + Wikipedia | Autoregressive | None | False |
| [51] | Nezha | | 10.5B | Chinese Wikipedia + Baidu Baike + Chinese News | Autoencoding + autoregressive | NSP | True |
| [52] | UniLMv2 | 160 GB | | BooksCorpus + Wikipedia + CC-News + OpenWebText + Stories | Autoencoding and partially autoregressive | None | False |