Research Article

CharTeC-Net: An Efficient and Lightweight Character-Based Convolutional Network for Text Classification

Table 1

Large-scale text classification datasets used in our experiments.

Dataset                  #train   #test    #classes   Classification task
AG’s news                120 k    7.6 k    4          English news categorization
Sogou news               450 k    60 k     5          Chinese news categorization
DBPedia                  560 k    70 k     14         Ontology classification
Yelp review polarity     560 k    38 k     2          Sentiment analysis
Yelp review full         650 k    50 k     5          Sentiment analysis
Yahoo! Answers           1.4 M    60 k     10         Topic classification
Amazon Review Full       3.0 M    650 k    5          Sentiment analysis
Amazon Review Polarity   3.6 M    400 k    2          Sentiment analysis

Note: see [19] for a detailed description of each dataset.