Mathematical Problems in Engineering

Research Article

Improved Emotion Recognition Using Gaussian Mixture Model and Extreme Learning Machine in Speech and Glottal Signals

Table 1: Some of the significant works on speech emotion recognition.


Ref. number	Database	Signals	Emotions	Methods	Best result

[49]	BES	Speech signals	Anger, boredom, disgust, fear, happiness, sadness, and neutral	Nonlinear dynamic features + prosodic + spectral features + SVM classifier	82.72% (females); 85.90% (males)
[50]	BES	Speech signals	Neutral, fear, and anger	Nonlinear dynamic features + neural network	93.78%
[51]	BES	Speech signals	Anger, boredom, disgust, fear, happiness, sadness, and neutral	Modulation spectral features (MSFs) + multiclass SVM	85.60%
[30]	BES	Speech signals	Anger, boredom, disgust, fear, happiness, sadness, and neutral	Combination of spectral excitation source features + autoassociative neural network	82.16%
[27]	BES	Speech signals	Anger, boredom, disgust, fear, happiness, sadness, and neutral	Combination of utterance-wise global and local prosodic features + SVM classifier	62.43%
[52]	BES	Speech signals	Anger, boredom, disgust, fear, happiness, sadness, and neutral	LPCCs + formants + GMM classifier	68%
[28]	BES	Speech signals	Anger, boredom, fear, happiness, sadness, and neutral	Discriminative band wavelet packet power coefficients (db-WPPC) with Daubechies filter of order 40 + GMM classifier	75.64%
[53]	BES	Speech signals	Anger, boredom, disgust, fear, happiness, sadness, and neutral	Low-level audio descriptors and high-level perceptual descriptors + linear SVM	87.7%
[54]	BES	Speech signals	Anger, boredom, disgust, fear, happiness, sadness, and neutral	MPEG-7 low-level audio descriptors + SVM with radial basis function kernel	77.88%
[55]	SAVEE	Speech signals	Anger, surprise, sadness, happiness, fear, disgust, and neutral	Mel-frequency cepstral coefficients + signal energy + correlation-based feature selection + SVM with radial basis function kernel	79%
[56]	SAVEE	Speech signals	Anger, surprise, sadness, happiness, fear, disgust, and neutral	Energy intensity + pitch + standard deviation + jitter + shimmer + NN	74.39%
[57]	SAVEE	Speech signals	Anger, surprise, sadness, happiness, fear, disgust, and neutral	Audio features + LDA feature reduction + single-component Gaussian classifier	63%
[20]	SAVEE	Speech signals	Anger, surprise, sadness, happiness, fear, disgust, and neutral	Pitch + energy + duration + spectral + Gaussian classifier	59.2%