Research Article

Nonlinear Dynamic Feature Extraction Based on Phase Space Reconstruction for the Classification of Speech and Emotion

Table 7

Four types of features used to obtain a confusion matrix for the mixed language emotional speech recognition task.

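| Feature type     | Emotional state | CASIA-Chinese | Berlin-German | Average |
|------------------|-----------------|---------------|---------------|---------|
| Prosody          | Happiness       | 42.65         | 66.67         | 54.66   |
|                  | Sadness         | 52.94         | 70.00         | 64.47   |
|                  | Neutral         | 48.53         | 69.23         | 58.88   |
|                  | Anger           | 69.12         | 62.96         | 66.04   |
|                  | Fear            | 44.12         | 17.39         | 33.76   |
|                  | Average         | 51.47         | 57.50         | 54.49   |
| MFCC             | Happiness       | 55.88         | 54.17         | 55.03   |
|                  | Sadness         | 52.94         | 80.00         | 66.47   |
|                  | Neutral         | 75.00         | 76.92         | 75.96   |
|                  | Anger           | 69.12         | 85.19         | 77.16   |
|                  | Fear            | 38.26         | 34.78         | 36.52   |
|                  | Average         | 58.24         | 66.67         | 62.46   |
| NLD-1            | Happiness       | 50.00         | 54.17         | 52.09   |
|                  | Sadness         | 47.06         | 60.00         | 53.53   |
|                  | Neutral         | 76.47         | 80.77         | 78.62   |
|                  | Anger           | 73.53         | 77.78         | 76.65   |
|                  | Fear            | 44.12         | 69.57         | 56.85   |
|                  | Average         | 55.29         | 69.17         | 62.23   |
| NLD-2            | Happiness       | 52.94         | 70.83         | 61.89   |
|                  | Sadness         | 48.53         | 80.00         | 64.27   |
|                  | Neutral         | 44.12         | 69.23         | 56.68   |
|                  | Anger           | 79.41         | 85.19         | 82.30   |
|                  | Fear            | 54.11         | 73.91         | 64.01   |
|                  | Average         | 55.88         | 75.83         | 65.86   |
| Prosody+MFCC+NLD | Happiness       | 75.00         | 72.06         | 73.53   |
|                  | Sadness         | 76.09         | 72.06         | 74.08   |
|                  | Neutral         | 77.78         | 73.53         | 75.66   |
|                  | Anger           | 86.96         | 79.41         | 83.19   |
|                  | Fear            | 79.17         | 70.59         | 74.88   |
|                  | Average         | 79.17         | 73.53         | 76.35   |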