ISRN Signal Processing
Volume 2012, Article ID 306305, 9 pages
http://dx.doi.org/10.5402/2012/306305
Research Article

Direct Recovery of Clean Speech Using a Hybrid Noise Suppression Algorithm for Robust Speech Recognition System

1School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798
2School of Electronic and Information Engineering, Beihang University, China

Received 9 November 2012; Accepted 28 November 2012

Academic Editors: L. Fan and A. M. Peinado

Copyright © 2012 Peng Dai et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
  2. B. Gold and N. Morgan, Speech and Audio Signal Processing: Processing and Perception of Speech and Music, John Wiley & Sons, 2000.
  3. Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 6, pp. 1109–1121, 1984.
  4. D. Yu, L. Deng, J. Droppo, J. Wu, Y. Gong, and A. Acero, “Robust speech recognition using a cepstral minimum-mean-square-error-motivated noise suppressor,” IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 5, pp. 1061–1070, 2008.
  5. K. M. Indrebo, R. J. Povinelli, and M. T. Johnson, “Minimum mean-squared error estimation of mel-frequency cepstral coefficients using a novel distortion model,” IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 8, pp. 1654–1661, 2008.
  6. J. Chen, J. Benesty, Y. Huang, and S. Doclo, “New insights into the noise reduction Wiener filter,” IEEE Transactions on Audio, Speech and Language Processing, vol. 14, no. 4, pp. 1218–1233, 2006.
  7. L. Deng, J. Droppo, and A. Acero, “Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features,” IEEE Transactions on Speech and Audio Processing, vol. 12, no. 3, pp. 218–233, 2004.
  8. S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 27, no. 2, pp. 113–120, 1979.
  9. European Telecommunications Standards Institute (ETSI), ETSI ES 202 050 V1.1.5, 2007.
  10. C.-P. Chen and J. A. Bilmes, “MVA processing of speech features,” IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 1, pp. 257–270, 2007.
  11. A. Acero, Acoustical and environmental robustness in automatic speech recognition [Ph.D. thesis], Department of Electrical and Computer Engineering, Carnegie Mellon University, 1990.
  12. K. Ishizuka, T. Nakatani, M. Fujimoto, and N. Miyazaki, “Noise robust voice activity detection based on periodic to aperiodic component ratio,” Speech Communication, vol. 52, no. 1, pp. 41–60, 2010.
  13. R. Martin, “Noise power spectral density estimation based on optimal smoothing and minimum statistics,” IEEE Transactions on Speech and Audio Processing, vol. 9, no. 5, pp. 504–512, 2001.
  14. H. Hirsch and D. Pearce, “The Aurora experimental framework for the performance evaluations of speech recognition system under noisy conditions,” in Proceedings of the 7th International Conference on Information, Communications and Signal Processing (ICICS '09), Paris, France, 2000.
  15. ITU-T, Recommendation G.712. Transmission Performance Characteristics for Pulse Code Modulation Channels, Geneva, Switzerland, 1996.
  16. R. G. Leonard, “A database for speaker independent digit recognition,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '84), vol. 3, pp. 42–53, 1984.
  17. M. Brookes, Voicebox, http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html.
  18. J. Droppo, A. Acero, and L. Deng, “Evaluation of the SPLICE algorithm on the Aurora 2 database,” in Proceedings of the Eurospeech Conference, International Speech Communication Association, Aalborg, Denmark, September 2001.
  19. R. Martin, “Spectral subtraction based on minimum statistics,” in Proceedings of the European Signal Processing Conference (EUSIPCO '94), pp. 1182–1185, 1994.