The Scientific World Journal
Volume 2013, Article ID 162093, 13 pages
http://dx.doi.org/10.1155/2013/162093
Research Article

Recognition of Emotions in Mexican Spanish Speech: An Approach Based on Acoustic Modelling of Emotion-Specific Vowels

Santiago-Omar Caballero-Morales

Technological University of the Mixteca, Road to Acatlima Km. 2.5, 69000 Huajuapan de León, Oaxaca, Mexico

Received 30 March 2013; Accepted 6 June 2013

Academic Editors: R. J. Ferrari and S. Wu

Copyright © 2013 Santiago-Omar Caballero-Morales. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis et al., “Emotion recognition in human-computer interaction,” IEEE Signal Processing Magazine, vol. 18, no. 1, pp. 32–80, 2001.
  2. B. Schuller, G. Rigoll, and M. Lang, “Hidden Markov model-based speech emotion recognition,” in Proceedings of the International Conference on Multimedia and Expo (ICME '03), vol. 1, pp. 401–404, 2003.
  3. S. Emerich and E. Lupu, “Improving speech emotion recognition using frequency and time domain acoustic features,” in Proceedings of the Signal Processing and Applied Mathematics for Electronics and Communications (SPAMEC '11), pp. 85–88, Cluj-Napoca, Romania, 2011.
  4. Y. Li and Y. Zhao, “Recognizing emotions in speech using short-term and long-term features,” in Proceedings of the International Conference on Spoken Language Processing (ICSLP '98), pp. 1–4, 1998.
  5. C. E. Williams and K. N. Stevens, “Emotions and speech: some acoustical correlates,” Journal of the Acoustical Society of America, vol. 52, no. 4, pp. 1238–1250, 1972.
  6. H. Levin and W. Lord, “Speech pitch frequency as an emotional state indicator,” IEEE Transactions on Systems, Man and Cybernetics, vol. 5, no. 2, pp. 259–273, 1975.
  7. T. L. Nwe, S. W. Foo, and L. C. De Silva, “Speech emotion recognition using hidden Markov models,” Speech Communication, vol. 41, no. 4, pp. 603–623, 2003.
  8. D. Ververidis and C. Kotropoulos, “Emotional speech recognition: resources, features, and methods,” Speech Communication, vol. 48, no. 9, pp. 1162–1181, 2006.
  9. F. J. Tolkmitt and K. R. Scherer, “Effect of experimentally induced stress on vocal parameters,” Journal of Experimental Psychology, vol. 12, no. 3, pp. 302–313, 1986.
  10. D. J. France, R. G. Shiavi, S. Silverman, M. Silverman, and M. Wilkes, “Acoustical properties of speech as indicators of depression and suicidal risk,” IEEE Transactions on Biomedical Engineering, vol. 47, no. 7, pp. 829–837, 2000.
  11. L. Deng and D. O'Shaughnessy, Speech Processing: A Dynamic and Optimization-Oriented Approach, Marcel Dekker, New York, NY, USA, 2003.
  12. S. B. Davis and P. Mermelstein, “Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 28, no. 4, pp. 357–366, 1980.
  13. J. Wagner, T. Vogt, and E. André, “A systematic comparison of different HMM designs for emotion recognition from acted and spontaneous speech,” in Affective Computing and Intelligent Interaction, vol. 4738 of Lecture Notes in Computer Science, pp. 114–125, Springer, Berlin, Germany, 2007.
  14. A. B. Kandali, A. Routray, and T. K. Basu, “Emotion recognition from Assamese speeches using MFCC features and GMM classifier,” in Proceedings of the IEEE Region 10 Conference (TENCON '08), Hyderabad, India, November 2008.
  15. S. Wu, T. H. Falk, and W.-Y. Chan, “Automatic recognition of speech emotion using long-term spectro-temporal features,” in Proceedings of the 16th International Conference on Digital Signal Processing (DSP '09), Santorini-Hellas, Greece, July 2009.
  16. S. Emerich, E. Lupu, and A. Apatean, “Emotions recognition by speech and facial expressions analysis,” in Proceedings of the 17th European Signal Processing Conference (EUSIPCO '09), pp. 1617–1621, 2009.
  17. B. D. Womack and J. H. L. Hansen, “Classification of speech under stress using target driven features,” Speech Communication, vol. 20, no. 1-2, pp. 131–150, 1996.
  18. R. Tato, “Emotional space improves emotion recognition,” in Proceedings of the International Conference on Spoken Language Processing (ICSLP '02), vol. 3, pp. 2029–2032, 2002.
  19. R. Fernandez and R. W. Picard, “Modeling drivers' speech under stress,” Speech Communication, vol. 40, no. 1-2, pp. 145–159, 2003.
  20. K. Alter, E. Rank, and S. A. Kotz, “Accentuation and emotions—two different systems?” in Proceedings of the ISCA Workshop on Speech and Emotion, vol. 1, pp. 138–142, 2000.
  21. A. Batliner, C. Hacker, S. Steidl et al., “‘You stupid tin box’: children interacting with the AIBO robot: a cross-linguistic emotional speech corpus,” in Proceedings of the 4th International Conference of Language Resources and Evaluation (LREC '04), pp. 171–174, 2004.
  22. F. Burkhardt, A. Paeschke, M. Rolfes, W. Sendlmeier, and B. Weiss, “A database of German emotional speech,” in Proceedings of the 9th European Conference on Speech Communication and Technology, pp. 1517–1520, Lisbon, Portugal, September 2005.
  23. M. Grimm, K. Kroschel, and S. Narayanan, “The Vera am Mittag German audio-visual emotional speech database,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '08), pp. 865–868, Hannover, Germany, June 2008.
  24. P. Ekman, “An argument for basic emotions,” Cognition and Emotion, vol. 6, pp. 169–200, 1992.
  25. F. Yu, E. Chang, Y. Q. Xu, and H. Y. Shum, “Emotion detection from speech to enrich multimedia content,” in Proceedings of the IEEE Pacific Rim Conference on Multimedia, vol. 1, pp. 550–557, Shanghai, China, 2001.
  26. S. Yildirim, M. Bulut, C. M. Lee et al., “An acoustic study of emotions expressed in speech,” in Proceedings of the International Conference on Spoken Language Processing (ICSLP '04), vol. 1, pp. 2193–2196, 2004.
  27. C. M. Lee, S. Yildirim, M. Bulut et al., “Emotion recognition based on phoneme classes,” in Proceedings of the International Conference on Spoken Language Processing (ICSLP '04), vol. 1, pp. 889–892, 2004.
  28. J. Přibil and A. Přibilová, “Spectral properties and prosodic parameters of emotional speech in Czech and Slovak,” in Speech and Language Technologies, pp. 175–200, InTech, 2011.
  29. E. Uraga and L. Pineda, “Automatic generation of pronunciation lexicons for Spanish,” in Proceedings of the International Conference on Computational Linguistics and Intelligent Text Processing (CICLing '02), A. Gelbukh, Ed., pp. 300–308, Springer, 2002.
  30. J. Cuétara, Fonética de la ciudad de México: aportaciones desde las tecnologías del habla [M.S. dissertation], National Autonomous University of Mexico (UNAM), Mexico, 2004.
  31. A. Li, Q. Fang, F. Hu, L. Zheng, H. Wang, and J. Dang, “Acoustic and articulatory analysis on Mandarin Chinese vowels in emotional speech,” in Proceedings of the 7th International Symposium on Chinese Spoken Language Processing (ISCSLP '10), pp. 38–43, Tainan, Taiwan, December 2010.
  32. B. Vlasenko, D. Prylipko, D. Philippou-Hübner, and A. Wendemuth, “Vowels formants analysis allows straightforward detection of high arousal acted and spontaneous emotions,” in Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech '11), pp. 1577–1580, Florence, Italy, 2011.
  33. J. H. L. Hansen and B. D. Womack, “Feature analysis and neural network-based classification of speech under stress,” IEEE Transactions on Speech and Audio Processing, vol. 4, no. 4, pp. 307–313, 1996.
  34. S. Young and P. Woodland, The HTK Book (for HTK Version 3.4), Cambridge University Engineering Department, Cambridge, UK, 2006.
  35. D. Jurafsky and J. H. Martin, Speech and Language Processing, Pearson Prentice Hall, Upper Saddle River, NJ, USA, 2009.
  36. B. D. Womack and J. H. L. Hansen, “N-channel hidden Markov models for combined stressed speech classification and recognition,” IEEE Transactions on Speech and Audio Processing, vol. 7, no. 6, pp. 668–677, 1999.
  37. L. Pineda, L. Villaseñor, J. Cuétara et al., “The corpus DIMEX100: transcription and evaluation,” Language Resources and Evaluation, vol. 44, pp. 347–370, 2010.
  38. G. Bonilla-Enríquez and S. O. Caballero-Morales, “Communication interface for Mexican Spanish dysarthric speakers,” Acta Universitaria, vol. 22, no. NE-1, pp. 98–105, 2012.