Review Article

A Comprehensive Survey of Abstractive Text Summarization Based on Deep Learning

Table 2

The statistics of DUC/TAC datasets.

Dataset    #Documents        Language     #Ground-truth summaries  Summary length
DUC 2001   60 × 10           Eng.         3 per cluster            50, 100, 200, 400 tokens
DUC 2002   60 × 10           Eng.         128                      10, 50, 100, 200 tokens
DUC 2003   60 × 10, 30 × 25  Eng.         128                      200, 400 tokens
DUC 2004   100 × 10          Ara. & Eng.  4 per cluster            100 tokens
DUC 2005   50 × 32           Eng.         4 per cluster            665 bytes
DUC 2006   50 × 25           Eng.         4 per cluster            250 tokens
DUC 2007   25 × 10           Eng.         4 per cluster            250 tokens
TAC 2008   48 × 20           Eng.         4 per cluster            250 tokens
TAC 2009   44 × 20           Eng.         4 per cluster            250 tokens
TAC 2010   46 × 20           Eng.         8 per cluster            100 tokens
TAC 2011   44 × 20           Eng.         8 per cluster            100 tokens