The Scientific World Journal
Volume 2014 (2014), Article ID 146040, 12 pages
http://dx.doi.org/10.1155/2014/146040
Research Article

Voice Activity Detection in Noisy Environments Based on Double-Combined Fourier Transform and Line Fitting

¹Department of Biomicrosystem Technology, Korea University, Anam-dong, Seongbuk-gu, Seoul 136-713, Republic of Korea
²School of Computer Science Engineering, Incheon National University, Songdo-dong, Yeonsu-gu, Incheon 406-772, Republic of Korea
³Office of Naval Research, Arlington, VA 22203, USA
⁴School of Electrical Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul 136-713, Republic of Korea

Received 7 February 2014; Revised 4 July 2014; Accepted 10 July 2014; Published 6 August 2014

Academic Editor: Juan Manuel Gorriz Saez

Copyright © 2014 Jinsoo Park et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. J. Beh, R. H. Baran, and H. Ko, “Dual channel based speech enhancement using novelty filter for robust speech recognition in automobile environment,” IEEE Transactions on Consumer Electronics, vol. 52, no. 2, pp. 583–589, 2006.
  2. J. Beh and H. Ko, “Spectral subtraction using spectral harmonics for robust speech recognition in car environments,” in Computational Science, vol. 2660 of Lecture Notes in Computer Science, pp. 1109–1116, 2003.
  3. J. G. Wilpon and L. R. Rabiner, “Application of hidden Markov models to automatic speech endpoint detection,” Computer Speech and Language, vol. 2, no. 3-4, pp. 321–341, 1987.
  4. B. Wu and K. Wang, “Robust endpoint detection algorithm based on the adaptive band-partitioning spectral entropy in adverse environments,” IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 762–774, 2005.
  5. N. S. A. Kadel and A. M. Refat, “End points detection for noisy speech using a wavelet based algorithm,” in Proceedings of the 16th National Radio Science Conference (NRSC ’99), pp. C18/1–C18/5, February 1999.
  6. “Voice Activity Detector (VAD) for Adaptive Multi-Rate (AMR) Speech Traffic Channels,” ETSI EN 301 708 Recommendation, ETSI, 1999.
  7. “Speech processing, transmission and quality aspects (STQ), Distributed speech recognition; Front-end feature extraction algorithm; Compression algorithm,” ETSI ES 202 050 Recommendation, ETSI, 2002.
  8. M. Bahoura and J. Rouat, “Wavelet speech enhancement based on the Teager energy operator,” IEEE Signal Processing Letters, vol. 8, no. 1, pp. 10–12, 2001.
  9. J. Ramírez, J. C. Segura, C. Benítez, Á. de la Torre, and A. Rubio, “Efficient voice activity detection algorithms using long-term speech information,” Speech Communication, vol. 42, no. 3-4, pp. 271–287, 2004.
  10. J. Sohn, N. S. Kim, and W. Sung, “A statistical model-based voice activity detection,” IEEE Signal Processing Letters, vol. 6, no. 1, pp. 1–3, 1999.
  11. J. M. Górriz, J. Ramírez, E. W. Lang, and C. G. Puntonet, “Jointly gaussian pdf-based likelihood ratio test for voice activity detection,” IEEE Transactions on Audio, Speech and Language Processing, vol. 16, no. 8, pp. 1565–1578, 2008.
  12. J. Ramírez, J. C. Segura, C. Benítez, L. García, and A. Rubio, “Statistical voice activity detection using a multiple observation likelihood ratio test,” IEEE Signal Processing Letters, vol. 12, no. 10, pp. 689–692, 2005.
  13. J. Chang, N. S. Kim, and S. K. Mitra, “Voice activity detection based on multiple statistical models,” IEEE Transactions on Signal Processing, vol. 54, no. 6, pp. 1965–1976, 2006.
  14. J. Ramírez, J. C. Segura, J. M. Górriz, and L. García, “Improved voice activity detection using contextual multiple hypothesis testing for robust speech recognition,” IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 8, pp. 2177–2189, 2007.
  15. J. M. Górriz, J. Ramírez, E. W. Lang, C. G. Puntonet, and I. Turias, “Improved likelihood ratio test based voice activity detector applied to speech recognition,” Speech Communication, vol. 52, no. 7-8, pp. 664–677, 2010.
  16. Q. Li and A. Tsai, “A matched filter approach to endpoint detection for robust speaker verification,” in Proceedings of the IEEE Workshop on Automatic Identification, Summit, NJ, USA, October 1999.
  17. Q. Li, J. Zheng, A. Tsai, and Q. Zhou, “Robust endpoint detection and energy normalization for real-time speech and speaker recognition,” IEEE Transactions on Speech and Audio Processing, vol. 10, no. 3, pp. 146–157, 2002.
  18. T. Fukuda, O. Ichikawa, and M. Nishimura, “Long-term spectro-temporal and static harmonic features for voice activity detection,” IEEE Journal on Selected Topics in Signal Processing, vol. 4, no. 5, pp. 834–844, 2010.
  19. S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 27, no. 2, pp. 113–120, 1979.
  20. Y. Lee, B. Kim, Y. Kim, D. Choi, K. Lee, and Y. Um, “Creation and assessment of Korean speech and noise DB in car environment,” in Proceedings of the International Conference on Language Resources and Evaluation, pp. 1403–1406, Lisbon, Portugal, May 2004.
  21. Y. Lee, B. Kim, and Y. Um, “Speech information technology & industry promotion center in Korea: activities and directions,” in Proceedings of the International Conference on Language Resources and Evaluation, pp. 1851–1854, 2002.