Research Article

Improving Feature Representation Based on a Neural Network for Author Profiling in Social Media Texts

Table 9

Obtained results (accuracy, %) for age and gender classification on the PAN author profiling 2016 English training corpus under 10-fold cross-validation.

Age

Feature set              LR-NP    LR-WP    SVM-NP   SVM-WP
D2V (1-gram)             44.71    44.84    41.78    41.65
D2V (1 + 2-grams)        43.53    44.37    42.82    42.49
D2V (1 + 2 + 3-grams)    41.41    46.71    40.71    44.13
Character 3-grams        39.53    41.78    37.65    42.96
Bag-of-Words             42.82    39.44    40.94    39.91

Gender

Feature set              LR-NP    LR-WP    SVM-NP   SVM-WP
D2V (1-gram)             73.18    75.59    72.71    70.66
D2V (1 + 2-grams)        73.41    78.64    71.53    74.88
D2V (1 + 2 + 3-grams)    71.53    77.46    69.41    76.76
Character 3-grams        68.47    73.24    69.65    71.83
Bag-of-Words             69.18    72.77    67.76    69.01