Research Article

Improving Feature Representation Based on a Neural Network for Author Profiling in Social Media Texts

Table 10

Obtained results (accuracy, %) for age and gender classification on the PAN author profiling 2016 Spanish training corpus under 10-fold cross-validation.

Age

Feature set              LR-NP   LR-WP   SVM-NP  SVM-WP
D2V (1-gram)             44.40   46.00   44.80   43.60
D2V (1 + 2-grams)        47.20   52.40   46.40   51.20
D2V (1 + 2 + 3-grams)    51.60   56.00   48.80   57.20
Character 3-grams        50.80   52.00   48.00   47.60
Bag-of-Words             48.00   47.60   44.00   48.40

Gender

Feature set              LR-NP   LR-WP   SVM-NP  SVM-WP
D2V (1-gram)             71.20   68.00   67.60   64.80
D2V (1 + 2-grams)        69.60   71.60   68.00   70.40
D2V (1 + 2 + 3-grams)    70.40   75.60   69.20   73.60
Character 3-grams        68.00   69.60   61.60   63.20
Bag-of-Words             66.40   63.60   58.80   72.00