Research Article

Improving Feature Representation Based on a Neural Network for Author Profiling in Social Media Texts

Table 5

Accuracy (%) obtained for age and gender classification on the PAN author profiling 2015 English training corpus under 10-fold cross-validation.

Feature set              Age
                         LR-NP    LR-WP    SVM-NP   SVM-WP

D2V (1-gram)             66.45    66.45    68.42    69.73
D2V (1 + 2-grams)        71.05    74.34    71.05    72.36
D2V (1 + 2 + 3-grams)    69.73    70.39    68.42    70.39
Character 3-grams        65.78    67.76    66.44    67.10
Bag-of-Words             65.78    65.78    65.13    65.13

Feature set              Gender
                         LR-NP    LR-WP    SVM-NP   SVM-WP

D2V (1-gram)             59.87    66.44    56.57    69.07
D2V (1 + 2-grams)        63.15    69.73    61.84    71.05
D2V (1 + 2 + 3-grams)    65.13    69.07    65.78    71.71
Character 3-grams        57.23    61.84    59.21    62.50
Bag-of-Words             60.52    56.57    61.84    55.26