Research Article
CharTeC-Net: An Efficient and Lightweight Character-Based Convolutional Network for Text Classification
Table 1
Large-scale text classification datasets used in our experiments.
| Dataset | #train | #test | #classes | Classification task |
|---|---|---|---|---|
| AG’s news | 120 k | 7.6 k | 4 | English news categorization |
| Sogou news | 450 k | 60 k | 5 | Chinese news categorization |
| DBPedia | 560 k | 70 k | 14 | Ontology classification |
| Yelp review polarity | 560 k | 38 k | 2 | Sentiment analysis |
| Yelp review full | 650 k | 50 k | 5 | Sentiment analysis |
| Yahoo! Answers | 1.4 M | 60 k | 10 | Topic classification |
| Amazon Review Full | 3.0 M | 650 k | 5 | Sentiment analysis |
| Amazon Review Polarity | 3.6 M | 400 k | 2 | Sentiment analysis |
Note: see [19] for a detailed description.