BioMed Research International
Volume 2015, Article ID 259239, 13 pages
http://dx.doi.org/10.1155/2015/259239
Research Article

Modulation Spectra Morphological Parameters: A New Method to Assess Voice Pathologies according to the GRBAS Scale

ETSIST, Universidad Politécnica de Madrid, Campus Sur, Carretera de Valencia km 7, 28031 Madrid, Spain

Received 23 January 2015; Revised 4 May 2015; Accepted 4 May 2015

Academic Editor: Adam Klein

Copyright © 2015 Laureano Moro-Velázquez et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. C. Sapienza and B. Hoffman-Ruddy, Voice Disorders, Plural Publishing, 2009.
  2. D. K. Wilson, Voice Problems of Children, Williams & Wilkins, Baltimore, Md, USA, 1987.
  3. G. B. Kempster, B. R. Gerratt, K. V. Abbott, J. Barkmeier-Kraemer, and R. E. Hillman, “Consensus auditory-perceptual evaluation of voice: development of a standardized clinical protocol,” American Journal of Speech-Language Pathology, vol. 18, no. 2, pp. 124–132, 2009.
  4. M. Hirano, Clinical Examination of Voice, Springer, 1981.
  5. M. S. De Bodt, F. L. Wuyts, P. H. van de Heyning, and C. Croux, “Test-retest study of the GRBAS scale: influence of experience and professional background on perceptual rating of voice quality,” Journal of Voice, vol. 11, no. 1, pp. 74–80, 1997.
  6. I. V. Bele, “Reliability in perceptual analysis of voice quality,” Journal of Voice, vol. 19, no. 4, pp. 555–573, 2005.
  7. A. Tsanas, M. A. Little, P. E. McSharry, and L. O. Ramig, “Accurate telemonitoring of Parkinson's disease progression by noninvasive speech tests,” IEEE Transactions on Biomedical Engineering, vol. 57, no. 4, pp. 884–893, 2010.
  8. A. Tsanas, M. A. Little, C. Fox, and L. O. Ramig, “Objective automatic assessment of rehabilitative speech treatment in Parkinson's disease,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 22, no. 1, pp. 181–190, 2014.
  9. C. Fredouille, G. Pouchoulin, A. Ghio, J. Revis, J.-F. Bonastre, and A. Giovanni, “Back-and-forth methodology for objective voice quality assessment: from/to expert knowledge to/from automatic classification of dysphonia,” EURASIP Journal on Advances in Signal Processing, vol. 2009, Article ID 982102, 13 pages, 2009.
  10. P. Gómez-Vilda, R. Fernández-Baillo, V. Rodellar-Biarge et al., “Glottal source biometrical signature for voice pathology detection,” Speech Communication, vol. 51, no. 9, pp. 759–781, 2009.
  11. M. Markaki and Y. Stylianou, “Voice pathology detection and discrimination based on modulation spectral features,” IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 7, pp. 1938–1948, 2011.
  12. J. D. Arias-Londoño, J. I. Godino-Llorente, M. Markaki, and Y. Stylianou, “On combining information from modulation spectra and mel-frequency cepstral coefficients for automatic detection of pathological voices,” Logopedics Phoniatrics Vocology, vol. 36, no. 2, pp. 60–69, 2011.
  13. R. Wielgat, T. P. Zieliński, T. Woźniak, S. Grabias, and D. Król, “Automatic recognition of pathological phoneme production,” Folia Phoniatrica et Logopaedica, vol. 60, no. 6, pp. 323–331, 2009.
  14. L. Salhi and A. Cherif, “Robustness of auditory teager energy cepstrum coefficients for classification of pathological and normal voices in noisy environments,” The Scientific World Journal, vol. 2013, Article ID 435729, 8 pages, 2013.
  15. M. Rosa, J. Pereira, M. Greller, and A. Carvalho, “Signal processing and statistical procedures to identify laryngeal pathologies,” in Proceedings of the 6th IEEE International Conference on Electronics, Circuits and Systems (ICECS '99), vol. 1, pp. 423–426, Pafos, Cyprus, 1999.
  16. G. Muhammad, M. Alsulaiman, A. Mahmood, and Z. Ali, “Automatic voice disorder classification using vowel formants,” in Proceedings of the 12th IEEE International Conference on Multimedia and Expo (ICME '11), pp. 1–6, IEEE, July 2011.
  17. P. Yu, Z. Wang, S. Liu, N. Yan, L. Wang, and M. Ng, “Multidimensional acoustic analysis for voice quality assessment based on the GRBAS scale,” in Proceedings of the 9th International Symposium on Chinese Spoken Language Processing (ISCSLP '14), pp. 321–325, IEEE, Singapore, September 2014.
  18. A. Tsanas, M. A. Little, P. E. McSharry, and L. O. Ramig, “Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average Parkinson's disease symptom severity,” Journal of the Royal Society Interface, vol. 8, no. 59, pp. 842–855, 2011.
  19. G. Pouchoulin, C. Fredouille, J.-F. Bonastre, A. Ghio, and A. Giovanni, “Frequency study for the characterization of the dysphonic voices,” in Proceedings of the 8th Annual Conference of the International Speech Communication Association (Interspeech '07), pp. 1198–1201, Antwerp, Belgium, August 2007.
  20. G. Pouchoulin, C. Fredouille, J. Bonastre et al., “Dysphonic voices and the 0–3000 Hz frequency band,” in Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech '08), pp. 2214–2217, ISCA, Brisbane, Australia, September 2008.
  21. N. Sáenz-Lechón, J. I. Godino-Llorente, V. Osma-Ruiz, M. Blanco-Velasco, and F. Cruz-Roldán, “Automatic assessment of voice quality according to the GRBAS scale,” in Proceedings of the 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS '06), pp. 2478–2481, September 2006.
  22. A. Stráník, R. Čmejla, and J. Vokřál, “Acoustic parameters for classification of breathiness in continuous speech according to the GRBAS scale,” Journal of Voice, vol. 28, no. 5, pp. 653.e9–653.e17, 2014.
  23. M. Markaki and Y. Stylianou, “Modulation spectral features for objective voice quality assessment: the breathiness case,” in Proceedings of the 6th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, Firenze, Italy, December 2009.
  24. W. S. Winholtz and L. O. Ramig, “Vocal tremor analysis with the vocal demodulator,” Journal of Speech and Hearing Research, vol. 35, no. 3, pp. 562–573, 1992.
  25. M. A. Little, P. E. McSharry, S. J. Roberts, D. A. E. Costello, and I. M. Moroz, “Exploiting nonlinear recurrence and fractal scaling properties for voice disorder detection,” BioMedical Engineering Online, vol. 6, article 23, 2007.
  26. C. Peng, W. Chen, X. Zhu, B. Wan, and D. Wei, “Pathological voice classification based on a single vowel's acoustic features,” in Proceedings of the 7th IEEE International Conference on Computer and Information Technology (CIT '07), pp. 1106–1110, October 2007.
  27. J. I. Godino-Llorente and P. Gómez-Vilda, “Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors,” IEEE Transactions on Biomedical Engineering, vol. 51, no. 2, pp. 380–384, 2004.
  28. C. Maguire, P. de Chazal, R. B. Reilly, and P. D. Lacy, “Identification of voice pathology using automated speech analysis,” in Proceedings of the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA '03), pp. 259–262, Florence, Italy, December 2003.
  29. M. Marinaki and C. Kotropoulos, “Automatic detection of vocal fold paralysis and edema,” in Proceedings of the ICSLP, Jeju Island, Republic of Korea, 2004.
  30. J. Godino-Llorente, P. Gómez-Vilda, N. Sáenz-Lechón, M. Blanco-Velasco, F. Cruz-Roldán, and M. A. Ferrer, “Discriminative methods for the detection of voice disorders,” in Proceedings of the International Conference on Non-Linear Speech Processing (NOLISP '05), pp. 158–167, Barcelona, Spain, April 2005.
  31. K. Shama, A. Krishna, and N. U. Cholayya, “Study of harmonics-to-noise ratio and critical-band energy spectrum of speech as acoustic indicators of laryngeal and voice pathology,” EURASIP Journal on Advances in Signal Processing, vol. 2007, no. 1, Article ID 85286, 2007.
  32. A. I. Fontes, P. T. Souza, A. D. Neto, A. d. Martins, and L. F. Silveira, “Classification system of pathological voices using correntropy,” Mathematical Problems in Engineering, vol. 2014, Article ID 924786, 7 pages, 2014.
  33. R. Kohavi, “A study of cross-validation and bootstrap for accuracy estimation and model selection,” in Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI '95), vol. 2, pp. 1137–1145, 1995.
  34. S. Jannetts and A. Lowit, “Cepstral analysis of hypokinetic and ataxic voices: correlations with perceptual and other acoustic measures,” Journal of Voice, vol. 28, no. 6, pp. 673–680, 2014.
  35. P. Boersma, “Praat, a system for doing phonetics by computer,” Glot International, vol. 5, no. 9-10, pp. 341–345, 2002.
  36. J. D. Arias-Londoño, J. I. Godino-Llorente, N. Sáenz-Lechón et al., “Automatic GRBAS assessment using complexity measures and a multiclass GMM-based detector,” in Proceedings of the 7th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2011.
  37. N. C. Singh and F. E. Theunissen, “Modulation spectra of natural sounds and ethological theories of auditory processing,” The Journal of the Acoustical Society of America, vol. 114, no. 6, pp. 3394–3411, 2003.
  38. L. Atlas and S. A. Shamma, “Joint acoustic and modulation frequency,” EURASIP Journal on Applied Signal Processing, vol. 2003, no. 7, pp. 668–675, 2003.
  39. S.-C. Lim, S.-J. Jang, S.-P. Lee, and M. Y. Kim, “Music genre/mood classification using a feature-based modulation spectrum,” in Proceedings of the International Conference on Mobile IT-Convergence (ICMIC '11), pp. 133–136, IEEE, September 2011.
  40. W.-Y. Chu, J.-W. Hung, and B. Chen, “Modulation spectrum factorization for robust speech recognition,” in Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC '11), pp. 1–6, October 2011.
  41. H.-T. Fan, Y.-C. Tsai, and J.-W. Hung, “Enhancing the sub-band modulation spectra of speech features via nonnegative matrix factorization for robust speech recognition,” in Proceedings of the International Conference on System Science and Engineering (ICSSE '12), pp. 179–182, July 2012.
  42. E. Bozkurt, O. Toledo-Ronen, A. Sorin, and R. Hoory, “Exploring modulation spectrum features for speech-based depression level classification,” in Proceedings of the 15th Annual Conference of the International Speech Communication Association, Singapore, September 2014.
  43. K. M. Carbonell, R. A. Lester, B. H. Story, and A. J. Lotto, “Discriminating simulated vocal tremor source using amplitude modulation spectra,” Journal of Voice, vol. 29, no. 2, pp. 140–147, 2015.
  44. P. H. Dejonckere, C. Obbens, G. M. de Moor, and G. H. Wieneke, “Perceptual evaluation of dysphonia: reliability and relevance,” Folia Phoniatrica, vol. 45, no. 2, pp. 76–83, 1993.
  45. M. P. Karnell, S. D. Melton, J. M. Childes, T. C. Coleman, S. A. Dailey, and H. T. Hoffman, “Reliability of clinician-based (GRBAS and CAPE-V) and patient-based (V-RQOL and IPVI) documentation of voice disorders,” Journal of Voice, vol. 21, no. 5, pp. 576–590, 2007.
  46. L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice Hall, New York, NY, USA, 1993.
  47. S. M. Schimmel, L. E. Atlas, and K. Nie, “Feasibility of single channel speaker separation based on modulation frequency analysis,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), vol. 4, pp. IV605–IV608, April 2007.
  48. L. Atlas, P. Clark, and S. Schimmel, “Modulation Toolbox Version 2.1 for MATLAB,” 2010, http://isdl.ee.washington.edu/projects/modulationtoolbox/.
  49. S. M. Schimmel and L. E. Atlas, “Coherent envelope detection for modulation filtering of speech,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol. 1, pp. I221–I224, IEEE, March 2005.
  50. R. Cusack and N. Papadakis, “New robust 3-D phase unwrapping algorithms: application to magnetic field mapping and undistorting echoplanar images,” NeuroImage, vol. 16, no. 3, pp. 754–764, 2002.
  51. B. Gajić and K. K. Paliwal, “Robust speech recognition in noisy environments based on subband spectral centroid histograms,” IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 2, pp. 600–608, 2006.
  52. R. J. Elble, “Central mechanisms of tremor,” Journal of Clinical Neurophysiology, vol. 13, no. 2, pp. 133–144, 1996.
  53. M. Bové, N. Daamen, C. Rosen, C.-C. Wang, L. Sulica, and J. Gartner-Schmidt, “Development and validation of the vocal tremor scoring system,” The Laryngoscope, vol. 116, no. 9, pp. 1662–1667, 2006.
  54. R. Peters and R. Strickland, “Image complexity metrics for automatic target recognizers,” in Proceedings of the Automatic Target Recognizer System and Technology Conference, October 1990.
  55. Voice Disorders Database, Kay Elemetrics Corporation, Lincoln Park, NJ, USA, 1994.
  56. V. Parsa and D. G. Jamieson, “Identification of pathological voices using glottal noise measures,” Journal of Speech, Language, and Hearing Research, vol. 43, no. 2, pp. 469–485, 2000.
  57. L. Smith, A Tutorial on Principal Components Analysis, vol. 51, Cornell University, Ithaca, NY, USA, 2002.
  58. R. Haeb-Umbach and H. Ney, “Linear discriminant analysis for improved large vocabulary continuous speech recognition,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '92), vol. 1, pp. 13–16, San Francisco, Calif, USA, March 1992.
  59. B. Efron and G. Gong, “A leisurely look at the bootstrap, the jackknife, and cross-validation,” The American Statistician, vol. 37, no. 1, pp. 36–48, 1983.
  60. T. K. Moon, “The expectation-maximization algorithm,” IEEE Signal Processing Magazine, vol. 13, no. 6, pp. 47–60, 1996.
  61. J. Cohen, “A coefficient of agreement for nominal scales,” Educational and Psychological Measurement, vol. 20, no. 1, pp. 37–46, 1960.
  62. D. G. Altman, Practical Statistics for Medical Research, CRC Press, 1990.
  63. H. He and E. A. Garcia, “Learning from imbalanced data,” IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 9, pp. 1263–1284, 2009.
  64. D. G. Childers, “Detection of laryngeal function using speech and electroglottographic data,” IEEE Transactions on Biomedical Engineering, vol. 39, no. 1, pp. 19–25, 1992.
  65. N. Sáenz-Lechón, J. I. Godino-Llorente, V. Osma-Ruiz, and P. Gómez-Vilda, “Methodological issues in the development of automatic systems for voice pathology detection,” Biomedical Signal Processing and Control, vol. 1, no. 2, pp. 120–128, 2006.
  66. V. N. Vapnik, Statistical Learning Theory, vol. 2 of Adaptive and Learning Systems for Signal Processing, Communications, and Control, John Wiley & Sons, 1998.
  67. R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society Series B: Methodological, vol. 58, no. 1, pp. 267–288, 1996.
  68. R. Fraile, N. Sáenz-Lechón, J. I. Godino-Llorente, V. Osma-Ruiz, and C. Fredouille, “Automatic detection of laryngeal pathologies in records of sustained vowels by means of mel-frequency cepstral coefficient parameters and differentiation of patients by sex,” Folia Phoniatrica et Logopaedica, vol. 61, no. 3, pp. 146–152, 2009.