Journal of Healthcare Engineering
Volume 2017, Article ID 8783751, 13 pages
https://doi.org/10.1155/2017/8783751
Research Article

Development of the Arabic Voice Pathology Database and Its Evaluation by Using Speech Features and Machine Learning Algorithms

1ENT Department, College of Medicine, King Saud University, Riyadh, Saudi Arabia
2Digital Speech Processing Group, Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia

Correspondence should be addressed to Zulfiqar Ali; zulfiqarbutt2000@gmail.com

Received 14 December 2016; Revised 4 April 2017; Accepted 2 May 2017; Published 19 October 2017

Academic Editor: Tiago H. Falk

Copyright © 2017 Tamer A. Mesallam et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. J. H. Walton and R. F. Orlikoff, “Speaker race identification from acoustic cues in the vocal signal,” Journal of Speech and Hearing Research, vol. 37, no. 4, pp. 738–745, 1994.
  2. C. M. Sapienza, “Aerodynamic and acoustic characteristics of the adult African American voice,” Journal of Voice, vol. 11, no. 4, pp. 410–416, 1997.
  3. P. H. Boshoff, “The anatomy of the South African Negro larynx,” South African Journal of Medical Sciences, vol. 10, pp. 113–11, 1945.
  4. K. H. Malki, S. F. Al-Habib, A. A. Hagr, and M. M. Farahat, “Acoustic analysis of normal Saudi adult voices,” Saudi Medical Journal, vol. 30, no. 8, pp. 1081–1086, 2009.
  5. Kay Elemetrics Corp, “Multi-Dimensional Voice Program (MDVP) Ver. 3.3,” Lincoln Park, NJ, 1993.
  6. J.-W. Lee, H.-G. Kang, J.-Y. Choi, and Y.-I. Son, “An investigation of vocal tract characteristics for acoustic discrimination of pathological voices,” BioMed Research International, vol. 2013, Article ID 758731, 11 pages, 2013.
  7. D. Martínez, E. Lleida, A. Ortega, and A. Miguel, “Score level versus audio level fusion for voice pathology detection on the Saarbrücken Voice Database,” Communications in Computer and Information Science, vol. 328, pp. 110–120, 2012.
  8. A. Maier, T. Haderlein, U. Eysholdt et al., “PEAKS - A system for the automatic evaluation of voice and speech disorders,” Speech Communication, vol. 51, no. 5, pp. 425–437, 2009.
  9. J. D. Arias-Londoño, J. I. Godino-Llorente, N. Sáenz-Lechón, V. Osma-Ruiz, and G. Castellanos-Domínguez, “An improved method for voice pathology detection by means of a HMM-based feature space transformation,” Pattern Recognition, vol. 43, no. 9, pp. 3100–3112, 2010.
  10. M. Vasilakis and Y. Stylianou, “Voice pathology detection based on short-term jitter estimations in running speech,” Folia Phoniatrica et Logopaedica, vol. 61, no. 3, pp. 153–170, 2009.
  11. G. B. Kempster, B. R. Gerratt, K. V. Abbott, J. Barkmeier-Kraemer, and R. E. Hillman, “Consensus auditory-perceptual evaluation of voice: development of a standardized clinical protocol,” American Journal of Speech-Language Pathology, vol. 18, no. 2, pp. 124–132, 2009.
  12. M. S. de Bodt, F. L. Wuyts, P. H. van de Heyning, and C. Croux, “Test-retest study of the GRBAS scale: influence of experience and professional background on perceptual rating of voice quality,” Journal of Voice, vol. 11, no. 1, pp. 74–80, 1997.
  13. N. Bhattacharyya, “The prevalence of voice problems among adults in the United States,” Laryngoscope, vol. 124, no. 10, pp. 2359–2362, 2014.
  14. Research Chair of Voicing and Swallowing Disorders, 2016, http://faculty.ksu.edu.sa/kmalky/Pages/researchchair.aspx.
  15. N. Roy, R. M. Merrill, S. Thibeault, R. A. Parsa, S. D. Gray, and E. M. Smith, “Prevalence of voice disorders in teachers and the general population,” Journal of Speech, Language, and Hearing Research, vol. 47, no. 2, pp. 281–293, 2004.
  16. A. Alatabani, A. Mashi, N. Mahdi, M. Alkhelaif, E. Alhwsawi, and S. Madkhaly, “Mothers knowledge about the otitis media risk factors among children: multi-centric Saudi study,” International Journal of Advanced Research, vol. 5, no. 1, pp. 980–985, 2017.
  17. Z. Ali, M. Alsulaiman, G. Muhammad, I. Elamvazuthi, and T. A. Mesallam, “Vocal fold disorder detection based on continuous speech by using MFCC and GMM,” in Proceedings of the 2013 7th IEEE GCC Conference and Exhibition, GCC 2013, pp. 292–297, Doha, Qatar, November 2013.
  18. L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice Hall Press, Englewood Cliffs, NJ, USA, 1993.
  19. M. A. Anusuya and S. K. Katti, “Front end analysis of speech recognition: a review,” International Journal of Speech Technology, vol. 14, no. 2, pp. 99–145, 2011.
  20. B. S. Atal and S. L. Hanauer, “Speech analysis and synthesis by linear prediction of the speech wave,” Journal of the Acoustical Society of America, vol. 50, no. 2, pp. 637–655, 1971.
  21. H. Hermansky, “Perceptual linear predictive (PLP) analysis of speech,” Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738–1752, 1990.
  22. H. Hermansky and N. Morgan, “RASTA Processing of Speech,” IEEE Transactions on Speech and Audio Processing, vol. 2, no. 4, pp. 578–589, 1994.
  23. W. Roberts and J. Willmore, “Automatic speaker recognition using Gaussian mixture models,” in Proceedings of the 1999 Information, Decision and Control. Data and Information Fusion Symposium, Signal Processing and Communications Symposium and Decision and Control Symposium. Proceedings (Cat. No.99EX251), pp. 465–470, Adelaide, Australia, February 1999.
  24. A. Zulfiqar, A. Muhammad, A. M. Martinez-Enriquez, and G. Escalada-Imaz, “Text-independent speaker identification using VQ-HMM model based multiple classifier system,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6438, no. 2, pp. 116–125, 2010.
  25. L. E. Baum and T. Petrie, “Statistical inference for probabilistic functions of finite state Markov chains,” The Annals of Mathematical Statistics, vol. 37, no. 6, pp. 1554–1563, 1966.
  26. C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
  27. Y. Linde, A. Buzo, and R. M. Gray, “An algorithm for vector quantizer design,” IEEE Transactions on Communications, vol. 28, no. 1, pp. 84–95, 1980.
  28. P. Boersma, “Praat, a system for doing phonetics by computer,” Glot International, vol. 5, pp. 341–345, 2001.
  29. Z. Ali, G. Muhammad, and M. F. Alhamid, “An Automatic Health Monitoring System for Patients Suffering From Voice Complications in Smart Cities,” IEEE Access, vol. 5, pp. 3900–3908, 2017.
  30. J. Benesty and T. Gänsler, “Computation of the condition number of a nonsingular symmetric Toeplitz matrix with the Levinson-Durbin algorithm,” IEEE Transactions on Signal Processing, vol. 54, no. 6 I, pp. 2362–2364, 2006.
  31. Z. Ali, I. Elamvazuthi, M. Alsulaiman, and G. Muhammad, “Automatic Voice Pathology Detection With Running Speech by Using Estimation of Auditory Spectrum and Cepstral Coefficients Based on the All-Pole Model,” Journal of Voice, vol. 30, no. 6, pp. 757–757.e19, 2016.
  32. Y. Lin and W. H. Abdulla, “Principles of Psychoacoustics,” in Audio Watermark: A Comprehensive Foundation Using MATLAB, Springer International Publishing, Cham, Switzerland, 2015.
  33. E. Zwicker, “Subdivision of the audible frequency range into critical bands,” The Journal of the Acoustical Society of America, vol. 33, no. 2, article 248, 1961.
  34. S. S. Stevens, “On the psychophysical law,” Psychological Review, vol. 64, no. 3, pp. 153–181, 1957.
  35. M. K. Arjmandi, M. Pooyan, M. Mikaili, M. Vali, and A. Moqarehzadeh, “Identification of voice disorders using long-time features and support vector machine with different feature reduction methods,” Journal of Voice, vol. 25, no. 6, pp. e275–e289, 2011.
  36. J. I. Godino-Llorente, P. Gómez-Vilda, and M. Blanco-Velasco, “Dimensionality reduction of a pathological voice quality assessment system based on gaussian mixture models and short-term cepstral parameters,” IEEE Transactions on Biomedical Engineering, vol. 53, no. 10, pp. 1943–1953, 2006.
  37. B. Yildiz, J. I. Bilbao, and A. B. Sproul, “A review and analysis of regression and machine learning models on commercial building electricity load forecasting,” Renewable and Sustainable Energy Reviews, vol. 73, pp. 1104–1122, 2017.
  38. J. Hagenauer and M. Helbich, “A comparative study of machine learning classifiers for modeling travel mode choice,” Expert Systems with Applications, vol. 78, pp. 273–282, 2017.
  39. C. M. Bishop, Pattern Recognition and Machine Learning, Springer, New York, NY, USA, 2006.
  40. Massachusetts Eye & Ear Infirmary Voice & Speech Lab, Disordered Voice Database Model 4337 (Ver. 1.03), Kay Elemetrics Corp, Lincoln Park, NJ, USA, 1994.
  41. T. Villa-Canas, E. Belalcazar-Bolamos, S. Bedoya-Jaramillo et al., “Automatic detection of laryngeal pathologies using cepstral analysis in Mel and Bark scales,” in Proceedings of the 17th Symposium of Image, Signal Processing, and Artificial Vision, STSIVA 2012, pp. 116–121, Colombia, September 2012.
  42. G. Muhammad and M. Melhem, “Pathological voice detection and binary classification using MPEG-7 audio features,” Biomedical Signal Processing and Control, vol. 11, pp. 1–9, 2014.
  43. M. Markaki and Y. Stylianou, “Voice pathology detection and discrimination based on modulation spectral features,” IEEE Transactions on Audio, Speech and Language Processing, vol. 19, no. 7, pp. 1938–1948, 2011.
  44. V. Parsa and D. G. Jamieson, “Identification of pathological voices using glottal noise measures,” Journal of Speech, Language, and Hearing Research, vol. 43, no. 2, pp. 469–485, 2000.
  45. G. De Krom, “Consistency and reliability of voice quality ratings for different types of speech fragments,” Journal of Speech and Hearing Research, vol. 37, no. 5, pp. 985–1000, 1994.
  46. A. Dibazar, S. Narayanan, and T. Berger, “Feature analysis for automatic detection of pathological speech,” in Proceedings of the Second Joint EMBS-BMES Conference 2002 24th Annual International Conference of the Engineering in Medicine and Biology Society. Annual Fall Meeting of the Biomedical Engineering Society, pp. 182–183, Houston, TX, USA.
  47. Z. Ali, M. Alsulaiman, I. Elamvazuthi et al., “Voice pathology detection based on the modified voice contour and SVM,” Biologically Inspired Cognitive Architectures, vol. 15, pp. 10–18, 2016.
  48. D. D. Deliyski, H. S. Shaw, and M. K. Evans, “Influence of sampling rate on accuracy and reliability of acoustic voice analysis,” Logopedics Phoniatrics Vocology, vol. 30, no. 2, pp. 55–62, 2005.
  49. Y. Horii, “Jitter and shimmer in sustained vocal fry phonation,” Folia Phoniatrica, vol. 37, no. 2, pp. 81–86, 1985.
  50. J. L. Fitch, “Consistency of fundamental frequency and perturbation in repeated phonations of sustained vowels, reading, and connected speech,” Journal of Speech and Hearing Disorders, vol. 55, no. 2, pp. 360–363, 1990.
  51. M. O. Sarria Paja, G. Daza Santacoloma, J. I. Godino Llorente, C. G. Castellanos Domínguez, and N. Sáenz Lechón, “Feature selection in pathological voice classification using dynamic of component analysis,” in Proceedings of the 4th International Symposium on Image/Video Communications, Bilbao, Spain, 2008.
  52. D. Martínez, E. Lleida, A. Ortega, A. Miguel, and J. Villalba, “Voice pathology detection on the Saarbrücken voice database with calibration and fusion of scores using multifocal toolkit,” in Advances in Speech and Language Technologies for Iberian Languages, vol. 328 of Communications in Computer and Information Science, pp. 99–109, Springer, Berlin, Germany, 2012.
  53. N. Sáenz-Lechón, J. I. Godino-Llorente, V. Osma-Ruiz, and P. Gómez-Vilda, “Methodological issues in the development of automatic systems for voice pathology detection,” Biomedical Signal Processing and Control, vol. 1, no. 2, pp. 120–128, 2006.