EURASIP Journal on Audio, Speech, and Music Processing
Volume 2007 (2007), Article ID 65420, 13 pages
doi:10.1155/2007/65420
Research Article
An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition
¹Mitsubishi Electric Research Laboratories (MERL), 201 Broadway, Cambridge, MA 02139-4307, USA
²Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
Received 29 November 2006; Revised 14 March 2007; Accepted 23 April 2007
Academic Editor: Stephen Voran
Copyright © 2007 Bhiksha Raj et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. Because it is built on the FFT, it is also computationally efficient and well suited to digital implementation. In an automotive digit recognition task using the CU-Move database, which was recorded in real environmental noise, the algorithm yields a relative improvement in word error rate of 12.5% at −5 dB signal-to-noise ratio (SNR) and of 6.2% across all SNRs (−5 dB to +15 dB). On the Aurora-2 database, in which noise from several environments was added artificially, the algorithm improves the word error rate in almost all conditions.
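
The abstract reports its results as relative improvements in word error rate but does not spell out the formula. The sketch below assumes the conventional definition: the reduction in word error rate divided by the baseline word error rate, expressed in percent. The function name and the example values are hypothetical and are not taken from the paper's results.

```python
def relative_wer_improvement(baseline_wer: float, processed_wer: float) -> float:
    """Return the relative word error rate improvement in percent.

    Assumes the conventional definition (not stated in the abstract):
        100 * (baseline - processed) / baseline
    Both arguments are word error rates in the same unit (for example, percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - processed_wer) / baseline_wer


# Hypothetical values for illustration only, not figures from the paper:
# a baseline WER of 20% that drops to 18% after preprocessing.
example = relative_wer_improvement(20.0, 18.0)
print(f"{example:.1f}% relative improvement")
```

Under this definition, a relative figure says nothing about the absolute error rates involved, so the same relative improvement can correspond to very different absolute reductions at low and high SNR.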