Computational and Mathematical Methods in Medicine
Volume 2013, Article ID 297860, 15 pages
http://dx.doi.org/10.1155/2013/297860
Research Article

Estimation of Phoneme-Specific HMM Topologies for the Automatic Recognition of Dysarthric Speech

Technological University of the Mixteca, Road to Acatlima Km. 2.5, Huajuapan de León, 69000 Oaxaca, OAX, Mexico

Received 31 May 2013; Revised 16 August 2013; Accepted 25 August 2013

Academic Editor: Volkhard Helms

Copyright © 2013 Santiago-Omar Caballero-Morales. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. F. L. Darley, A. E. Aronson, and J. R. Brown, “Differential diagnostic patterns of dysarthria,” Journal of Speech and Hearing Research, vol. 12, no. 2, pp. 246–269, 1969.
  2. F. L. Darley, A. E. Aronson, and J. R. Brown, “Clusters of deviant speech dimensions in the dysarthrias,” Journal of Speech and Hearing Research, vol. 12, no. 3, pp. 462–496, 1969.
  3. A. B. Kain, J.-P. Hosom, X. Niu, J. P. H. van Santen, M. Fried-Oken, and J. Staehely, “Improving the intelligibility of dysarthric speech,” Speech Communication, vol. 49, no. 9, pp. 743–759, 2007.
  4. H. V. Sharma, Acoustic model adaptation for recognition of dysarthric speech [Ph.D. dissertation], University of Illinois, Urbana, Ill, USA, 2012.
  5. P. C. Doyle, H. A. Leeper, A. L. Kotler et al., “Dysarthric speech: a comparison of computerized speech recognition and listener intelligibility,” Journal of Rehabilitation Research and Development, vol. 34, no. 3, pp. 309–316, 1997.
  6. A. Kain, X. Niu, J. P. Hosom, Q. Miao, and J. P. H. van Santen, “Formant re-synthesis of dysarthric speech,” in Proceedings of the ISCA Speech Synthesis Workshop, 2004.
  7. P. D. Polur and G. E. Miller, “Effect of high-frequency spectral components in computer recognition of dysarthric speech based on a Mel-cepstral stochastic model,” Journal of Rehabilitation Research and Development, vol. 42, no. 3, pp. 363–372, 2005.
  8. W. J. Hardcastle, R. A. Morgan Barry, and C. J. Clark, “Articulatory and voicing characteristics of adult dysarthric and verbal dyspraxic speakers: an instrumental study,” British Journal of Disorders of Communication, vol. 20, no. 3, pp. 249–270, 1985.
  9. G. Weismer, “Articulatory characteristics of Parkinsonian dysarthria: segmental and phrase-level timing, spirantization, and glottal-supraglottal coordination,” in The Dysarthrias: Physiology, Acoustics, Perception, Management, M. R. McNeil, J. C. Rosenbek, and A. E. Aronson, Eds., pp. 101–130, Timonium, Maryland, Md, USA, College Hill Press, San Diego, Calif, USA.
  10. R. D. Kent and J. C. Rosenbek, “Acoustic patterns of apraxia of speech,” Journal of Speech and Hearing Research, vol. 26, no. 2, pp. 231–249, 1983.
  11. H. Ackermann and I. Hertrich, “Dysarthria in Friedreich's ataxia: timing of speech segments,” Clinical Linguistics and Phonetics, vol. 7, no. 1, pp. 75–91, 1993.
  12. H. Kim, M. Hasegawa-Johnson, and A. Perlman, “Acoustic cues to lexical stress in spastic dysarthria,” in Proceedings of Speech Prosody, vol. 100891, pp. 1–4, 2010.
  13. R. Patel, “Prosodic control in severe dysarthria: preserved ability to mark the question-statement contrast,” Journal of Speech, Language, and Hearing Research, vol. 45, no. 5, pp. 858–870, 2002.
  14. R. D. Kent, J. F. Kent, J. R. Duffy, and G. Weismer, “The dysarthrias: speech-voice profiles, related dysfunctions, and neuropathology,” Journal of Medical Speech-Language Pathology, vol. 6, no. 4, pp. 165–211, 1998.
  15. W. Ziegler and P. Hoole, “Voice quality measurement,” in Neurologic Disease, R. D. Kent and M. J. Ball, Eds., pp. 397–410, Singular, 2000.
  16. R. D. Kent, H. K. Vorperian, J. F. Kent, and J. R. Duffy, “Voice dysfunction in dysarthria: application of the multi-dimensional voice program,” Journal of Communication Disorders, vol. 36, no. 4, pp. 281–306, 2003.
  17. P. Raghavendra, E. Rosengren, and S. Hunnicutt, “An investigation of different degrees of dysarthric speech as input to speaker-adaptive and speaker-dependent recognition systems,” Augmentative and Alternative Communication, vol. 17, no. 4, pp. 265–275, 2001.
  18. S. O. Caballero Morales and S. J. Cox, “Modelling errors in automatic speech recognition for dysarthric speakers,” EURASIP Journal on Advances in Signal Processing, vol. 2009, Article ID 308340, 14 pages, 2009.
  19. F. Hamidi, M. Baljko, N. Livingston et al., “A customizable speech interface for people with dysarthric speech,” in Proceedings of the 12th International Conference on Computers Helping People with Special Needs (ICCHP '10), K. Miesenberger, J. Klaus, W. Zagler, and A. Karshmer, Eds., vol. 6179 of Lecture Notes in Computer Science (LNCS), pp. 605–612, Springer, Berlin, Germany, 2010.
  20. K. Rosen and S. Yampolsky, “Automatic speech recognition and a review of its functioning with dysarthric speech,” Augmentative and Alternative Communication, vol. 16, no. 1, pp. 48–60, 2000.
  21. L. Ferrier, H. Shane, H. Ballard, T. Carpenter, and A. Benoit, “Dysarthric speaker's intelligibility and speech characteristics in relation to computer speech recognition,” Augmentative and Alternative Communication, vol. 11, no. 3, pp. 165–175, 1995.
  22. N. J. Manasse, K. Hux, and J. L. Rankin-Erickson, “Speech recognition training for enhancing written language generation by a traumatic brain injury survivor,” Brain Injury, vol. 14, no. 11, pp. 1015–1034, 2000.
  23. N. Manasse, K. Hux, J. Rankin-Erickson, and E. Lauritzen, “Accuracy of three speech recognition systems: case study of dysarthric speech,” Augmentative and Alternative Communication, vol. 16, no. 3, pp. 186–196, 2000.
  24. G. Jayaram and K. Abdelhamied, “Experiments in dysarthric speech recognition using artificial neural networks,” Journal of Rehabilitation Research and Development, vol. 32, no. 2, pp. 162–169, 1995.
  25. H. Strik, E. Sanders, M. Ruiter, and L. Beijer, “Automatic recognition of Dutch dysarthric speech: a pilot study,” in Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP '02), pp. 661–664, 2002.
  26. H. Matsumasa, T. Takiguchi, Y. Ariki, I. Li, and T. Nakabayashi, “Integration of metamodel and acoustic model for speech recognition,” in Proceedings of the 9th Annual Conference of the International Speech Communication Association (Interspeech '08), pp. 2234–2237, September 2008.
  27. M. S. Hawley, P. Enderby, P. Green et al., “A speech-controlled environmental control system for people with severe dysarthria,” Medical Engineering & Physics, vol. 29, no. 5, pp. 586–593, 2007.
  28. M. S. Hawley, P. Enderby, P. Green, S. Cunningham, and R. Palmer, “Development of a voice-input voice-output communication aid (VIVOCA) for people with severe dysarthria,” in Proceedings of the 10th International Conference on Computers Helping People with Special Needs (ICCHP '06), K. Miesenberger, J. Klaus, W. L. Zagler, and A. I. Karshmer, Eds., vol. 4061 of Lecture Notes in Computer Science (LNCS), pp. 882–885, 2006.
  29. L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proceedings of the IEEE, vol. 77, no. 2, pp. 257–286, 1989.
  30. P. D. Polur and G. E. Miller, “Investigation of an HMM/ANN hybrid structure in pattern recognition application using cepstral analysis of dysarthric (distorted) speech signals,” Medical Engineering & Physics, vol. 28, no. 8, pp. 741–748, 2006.
  31. M. S. Yakoub, S.-A. Selouani, and D. O'Shaughnessy, “Speech assistive technology to improve the interaction of dysarthric speakers with machines,” in Proceedings of the 3rd International Symposium on Communications, Control, and Signal Processing (ISCCSP '08), pp. 1150–1154, March 2008.
  32. M. S. Hawley, P. Enderby, P. Green et al., “A voice-input voice-output communication aid for people with severe speech impairment,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 21, no. 1, pp. 23–31, 2013.
  33. H.-P. Chang, “Speech input for dysarthric users,” in Proceedings of the Meeting of the Acoustical Society of America, 1993.
  34. N. Thomas-Stonell, A.-L. Kotler, H. A. Leeper, and P. C. Doyle, “Computerized speech recognition: influence of intelligibility and perceptual consistency on recognition accuracy,” Augmentative and Alternative Communication, vol. 14, no. 1, pp. 51–56, 1998.
  35. M. Hasegawa-Johnson, J. Gunderson, A. Perlman, and T. Huang, “HMM-based and SVM-based recognition of the speech of talkers with spastic dysarthria,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '06), vol. 3, pp. 1060–1063, May 2006.
  36. W. K. Seong, J. H. Park, and H. K. Kim, “Dysarthric speech recognition error correction using weighted finite state transducers based on context-dependent pronunciation variation,” in Proceedings of the 13th International Conference on Computers Helping People with Special Needs (ICCHP '12), K. Miesenberger, A. I. Karshmer, P. Penaz, and W. L. Zagler, Eds., vol. 7383 of Lecture Notes in Computer Science, pp. 475–482, Springer, 2012.
  37. W. K. Seong, J. H. Park, and H. K. Kim, “Multiple pronunciation lexical modeling based on phoneme confusion matrix for dysarthric speech recognition,” Advanced Science and Technology Letters, vol. 14, pp. 57–60, 2012.
  38. S. O. Caballero-Morales and F. Trujillo-Romero, “Dynamic estimation of phoneme confusion patterns with a genetic algorithm to improve the performance of metamodels for recognition of disordered speech,” in Advances in Computational Intelligence, I. Batyrshin and M. González-Mendoza, Eds., vol. 7630 of Lecture Notes in Artificial Intelligence, pp. 175–186, Springer, 2013.
  39. P. Green, J. Carmichael, A. Hatzis, P. Enderby, M. S. Hawley, and M. Parker, “Automatic speech recognition with sparse training data for dysarthric speakers,” in Proceedings of the 8th European Conference on Speech Communication and Technology (EUROSPEECH '03), pp. 1189–1192, 2003.
  40. M. Frikha and A. B. Hamida, “A comparative survey of ANN and hybrid HMM/ANN architectures for robust speech recognition,” American Journal of Intelligent Systems, vol. 2, no. 1, pp. 1–8, 2012.
  41. X. Menéndez-Pidal, J. B. Polikoff, S. M. Peters, J. E. Leonzio, and H. T. Bunnell, “The Nemours database of dysarthric speech,” in Proceedings of the 4th International Conference on Spoken Language Processing (ICSLP '96), vol. 3, pp. 1962–1965, October 1996.
  42. C. W. Chau, S. Kwong, K. F. Man, and K. S. Tang, “Optimisation of HMM topology and its model parameters by genetic algorithms,” Pattern Recognition, vol. 34, no. 2, pp. 509–522, 2001.
  43. S. Young and P. Woodland, The HTK Book (for HTK Version 3.4), Cambridge University Engineering Department, 2006.
  44. D. Jurafsky and J. H. Martin, Speech and Language Processing, Pearson Prentice Hall, 2009.
  45. F. Rudzicz, “Comparing speaker-dependent and speaker-adaptive acoustic models for recognizing dysarthric speech,” in Proceedings of the 9th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS '07), pp. 255–256, October 2007.
  46. D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
  47. L. J. Ferrier, J. R. Deller, and D. Hsu, “On the use of hidden Markov modelling for recognition of dysarthric speech,” Computer Methods and Programs in Biomedicine, vol. 35, no. 2, pp. 125–139, 1991.
  48. Q. Y. Hong and S. Kwong, “A genetic classification method for speaker recognition,” Engineering Applications of Artificial Intelligence, vol. 18, no. 1, pp. 13–19, 2005.
  49. T. Takara, Y. Iha, and I. Nagayama, “Selection of the optimal structure of the continuous HMM using the genetic algorithm,” in Proceedings of the 5th International Conference on Spoken Language Processing (ICSLP '98), 1998.
  50. G. A. Bakare, G. K. Venayagamoorthy, and U. O. Aliyu, “Reactive power and voltage control of the Nigerian grid system using micro-genetic algorithm,” in Proceedings of IEEE Power Engineering Society General Meeting, vol. 2, pp. 1916–1922, June 2005.
  51. K. F. Leung, F. H. F. Leung, H. K. Lam, and S. H. Ling, “Application of a modified neural fuzzy network and an improved genetic algorithm to speech recognition,” Neural Computing and Applications, vol. 16, no. 4-5, pp. 419–431, 2007.
  52. B. Kumar and R. Dhiman, “Tuning of PID controller for liquid level tank system using intelligent techniques,” International Journal of Computer Science and Technology, vol. 2, no. 4, pp. 257–260, 2011.
  53. R. Kumar, “An experimental analysis of explorative and exploited operators of genetic algorithm for operating system process scheduling problem,” International Journal of Engineering and Technology, vol. 2, no. 6, pp. 472–476, 2010.
  54. T. Nomura, “Analysis on linear crossover for real number chromosomes in an infinite population size,” in Proceedings of IEEE International Conference on Evolutionary Computation (ICEC '97), pp. 111–114, April 1997.
  55. J. Xiao, L. Zou, and C. Li, “Optimization of hidden Markov model by a genetic algorithm for web information extraction,” in Proceedings of the International Conference on Intelligent Systems and Knowledge Engineering (ISKE '07), 2007.
  56. D. T. Vollmer, T. Soule, and M. Manic, “A distance measure comparison to improve crowding in multi-modal optimization problems,” in Proceedings of the 3rd International Symposium on Resilient Control Systems (ISRCS '10), pp. 31–36, August 2010.
  57. J. P. Hosom, A. B. Kain, T. Mishra, J. P. H. van Santen, M. Fried-Oken, and J. Staehely, “Intelligibility of modifications to dysarthric speech,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 1, pp. 924–927, April 2003.
  58. W. K. Seong, J. H. Park, and H. K. Kim, “Performance improvement of dysarthric speech recognition using context-dependent pronunciation variation modeling based on Kullback-Leibler distance,” Advanced Science and Technology Letters, vol. 14, pp. 53–56, 2012.
  59. H. T. Bunnell and J. B. Polikoff, “The Nemours database of dysarthric speech: a perceptual analysis,” in Proceedings of the 14th International Congress of Phonetic Sciences, vol. 1, pp. 783–786, 1999.
  60. L. Gillick and S. J. Cox, “Some statistical issues in the comparison of speech recognition algorithms,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), pp. 532–535, May 1989.
  61. National Institute of Standards and Technology (NIST), The History of Automatic Speech Recognition Evaluations at NIST, 2009, http://www.itl.nist.gov/iad/mig/publications/ASRhistory/index.html.
  62. J. M. Noyes and C. R. Frankish, “Speech recognition technology for individuals with disabilities,” Augmentative and Alternative Communication, vol. 8, no. 4, pp. 297–303, 1992.