BioMed Research International
Volume 2013 (2013), Article ID 720834, 27 pages
Research Article

Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection

1 Department of Computer and Information Science, University of Macau, Macau
2 School of Computer Science and Engineering, University of New South Wales, Kensington, NSW 2052, Australia

Received 25 June 2013; Accepted 1 August 2013

Academic Editor: Sabah Mohammed

Copyright © 2013 Simon Fong et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. S. Fong, “Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification,” Journal of Biomedicine and Biotechnology, vol. 2012, Article ID 215019, 12 pages, 2012.
  2. S. Fong, K. Lan, P. Sun, O. Mohammed, J. Fiaidhi, and S. Mohammed, “A timeseries pre-processing methodology for biosignal classification using statistical feature extraction,” in Proceedings of the 10th IASTED International Conference on Biomedical Engineering (Biomed '13), pp. 207–214, Innsbruck, Austria, February 2013.
  3. C. F. Chan and W. M. E. Yu, “An abnormal sound detection and classification system for surveillance applications,” in Proceedings of the European Signal Processing Conference (EUSIPCO '10), pp. 1–2, Aalborg, Denmark, August 2010.
  4. G. Peeters, “A large set of audio features for sound description (similarity and classification) in the cuidado project,” CUIDADO Project Report, 2004.
  5. C. Aguiar, Modelling the Excitation Function to Improve Quality in LPC's Resynthesis, Center for Computer Research in Music and Acoustics, Stanford University, Stanford, Calif, USA.
  6. L. R. Rabiner and M. R. Sambur, “Application of an LPC distance measure to the voiced-unvoiced-silence detection problem,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 25, no. 4, pp. 338–343, 1977.
  7. L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Prentice Hall, Englewood Cliffs, NJ, USA.
  8. G. Antoniol, V. F. Rollo, and G. Venturi, “Linear Predictive Coding and Cepstrum coefficients for mining time variant information from software repositories,” in Proceedings of the 2005 International Workshop on Mining Software Repositories, pp. 1–5, July 2005.
  9. N. Awasthy, J. P. Saini, and D. S. Chauhan, “Spectral analysis of speech: a new technique,” International Journal of Information and Communication Engineering, vol. 2, no. 1, pp. 19–28, 2006.
  10. B. Logan, “Mel frequency cepstral coefficients for music modeling,” in Proceedings of the International Symposium on Music Information Retrieval, pp. 1–3, 2000.
  11. S. V. Chapaneri, “Spoken digits recognition using weighted MFCC and improved features for dynamic time warping,” International Journal of Computer Applications, vol. 40, no. 3, pp. 6–12, 2012.
  12. X. Zhou, Y. Fu, M. Liu, M. Hasegawa-Johnson, and T. S. Huang, “Robust analysis and weighting on MFCC components for speech recognition and speaker identification,” in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME '07), pp. 188–191, July 2007.
  13. H. Hermansky, “Perceptual linear predictive (PLP) analysis of speech,” Journal of the Acoustical Society of America, vol. 87, no. 4, pp. 1738–1752, 1990.
  14. H. Hermansky and N. Morgan, “RASTA processing of speech,” IEEE Transactions on Speech and Audio Processing, vol. 2, no. 4, pp. 578–589, 1994.
  15. T. Nitta, “Feature extraction for speech recognition based on orthogonal acoustic-feature planes and LDA,” in Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-99), pp. 421–424, March 1999.
  16. J. H. Lee, H. Y. Jung, T. W. Lee, and S. Y. Lee, “Speech feature extraction using independent component analysis,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 1631–1634, June 2000.
  17. B. J. Lee, B. Ku, K. Park, K. H. Kim, and J. Y. Kim, “A new method of diagnosing constitutional types based on vocal and facial features for personalized medicine,” Journal of Biomedicine and Biotechnology, vol. 2012, Article ID 818607, 8 pages, 2012.
  18. D. Maunder, J. Epps, E. Ambikairajah, and B. Celler, “Robust sounds of activities of daily living classification in two-channel audio-based telemonitoring,” International Journal of Telemedicine and Applications, vol. 2013, Article ID 696813, 12 pages, 2013.
  19. K. Chenausky, J. MacAuslan, and R. Goldhor, “Acoustic analysis of PD speech,” Parkinson's Disease, vol. 2011, Article ID 435232, 13 pages, 2011.
  20. S. Fong, “Opportunities and challenges of integrating bio-inspired optimization and data mining algorithms,” in Swarm Intelligence and Bioinspired Computation, chapter 18, pp. 385–401, Elsevier, 2013.
  21. R. Daniloff, G. Schuckers, and L. Feth, The Physiology of Speech and Hearing: An Introduction, Prentice Hall, 1980.
  22. A. V. Oppenheim and R. W. Schafer, Digital Signal Processing, Prentice Hall, Englewood Cliffs, NJ, USA, 1975.
  23. J. G. Proakis and M. Salehi, Communication Systems Engineering, Prentice Hall, Upper Saddle River, NJ, USA, 2nd edition, 2002.
  24. A. Ó. Cinnéide, Linear Prediction: The Technique, Its Solution and Application to Speech, Dublin Institute of Technology, Dublin, Ireland.
  25. S. Furui, “Cepstral analysis technique for automatic speaker verification,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 29, no. 2, pp. 254–272, 1981.
  26. G. Box, G. M. Jenkins, and G. C. Reinsel, Time Series Analysis: Forecasting and Control, Prentice-Hall, 3rd edition, 1994.
  27. R. F. Engle, “Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation,” Econometrica, vol. 50, no. 4, pp. 987–1007, 1982.
  28. E. Keogh and C. A. Ratanamahatana, “Exact indexing of dynamic time warping,” Knowledge and Information Systems, vol. 7, no. 3, pp. 358–386, 2005.
  29. VidTIMIT Audio-Video Dataset, Conrad Sanderson, 2001–2009, School of Information Technology and Electrical Engineering (ITEE), University of Queensland, St Lucia, Australia, 2013, http://itee.uq.edu.au/~conrad/vidtimit/.
  30. “A database of German emotional speech,” Institute of Communication Science of the TU-Berlin (Technical University of Berlin) and funded by the German Research Community (DFG), 2013, http://pascal.kgw.tu-berlin.de/emodb/.
  31. Y. Obuchi, The PDA speech database, Carnegie Mellon University (CMU), 2003, http://www.speech.cs.cmu.edu/databases/pda/README.html.
  32. Microsoft Text-to-Speech engine, 2013, http://msdn.microsoft.com/en-us/library/hh361572.aspx.
  33. R. Tang and S. Fong, “Wolf search algorithm with ephemeral memory,” in Proceedings of the 7th International Conference on Digital Information Management (ICDIM '12), pp. 1–3, University of Macau, Macau, China, August 2012.
  34. C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, Cambridge, UK, 2008.
  35. “Chi2 Feature Selection,” Stanford Natural Language Processing Group, 2009, http://nlp.stanford.edu/IR-book/html/htmledition/feature-selectionchi2-feature-selection-1.html.
  36. F. García López, M. García Torres, B. Melián Batista, J. A. Moreno Pérez, and J. M. Moreno-Vega, “Solving feature subset selection problem by a parallel scatter search,” European Journal of Operational Research, vol. 169, no. 2, pp. 477–489, 2006.
  37. H. Peng, F. Long, and C. Ding, “Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226–1238, 2005.
  38. M. A. Osman, A. Nasser, H. M. Magboub, and S. A. Alfandi, “Speech compression using LPC and wavelet,” in Proceedings of the 2nd International Conference on Computer Engineering and Technology (ICCET '10), vol. 7, pp. 92–99, April 2010.
  39. A. G. Janecek, W. N. Gansterer, M. A. Demel, and G. F. Ecker, “On the relationship between feature selection and classification accuracy,” JMLR Workshop and Conference Proceedings, vol. 4, pp. 90–105, 2008.
  40. N. Landwehr, M. Hall, and E. Frank, “Logistic model trees,” Machine Learning, vol. 59, no. 1-2, pp. 161–205, 2005.
  41. J. Carletta, “Squibs and discussions: assessing agreement on classification tasks: the kappa statistic,” Computational Linguistics, vol. 22, no. 2, pp. 248–254, 1996.
  42. A. J. Viera and J. M. Garrett, “Understanding interobserver agreement: the kappa statistic,” Family Medicine, vol. 37, no. 5, pp. 360–363, 2005.
  43. D. M. W. Powers, “Evaluation: from precision, recall and F-factor to ROC, informedness, markedness & correlation,” Journal of Machine Learning Technologies, vol. 2, no. 1, pp. 37–63, 2011.
  44. D. L. Olson and D. Delen, Advanced Data Mining Techniques, Springer, 1st edition, 2008.
  45. P. Baldi, S. Brunak, Y. Chauvin, C. A. F. Andersen, and H. Nielsen, “Assessing the accuracy of prediction algorithms for classification: an overview,” Bioinformatics, vol. 16, no. 5, pp. 412–424, 2000.