Research Article

Improving Feature Representation Based on a Neural Network for Author Profiling in Social Media Texts

Table 6

Results (accuracy, %) for age and gender classification on the PAN 2015 author profiling Spanish training corpus under 10-fold cross-validation.

Feature set              Age
                         LR-NP    LR-WP    SVM-NP   SVM-WP
D2V (1-gram)             59.00    62.00    62.00    60.00
D2V (1 + 2-grams)        59.00    65.00    65.00    69.00
D2V (1 + 2 + 3-grams)    62.00    66.00    64.00    66.00
Character 3-grams        66.00    66.00    64.00    67.00
Bag-of-Words             65.00    62.00    62.00    60.00

Feature set              Gender
                         LR-NP    LR-WP    SVM-NP   SVM-WP
D2V (1-gram)             65.00    63.00    63.00    66.00
D2V (1 + 2-grams)        68.00    66.00    66.00    61.00
D2V (1 + 2 + 3-grams)    71.00    67.00    71.00    66.00
Character 3-grams        73.00    73.00    75.00    74.00
Bag-of-Words             72.00    71.00    73.00    72.00