EURASIP Journal on Audio, Speech, and Music Processing
Volume 2008 (2008), Article ID 278185, 14 pages
doi:10.1155/2008/278185
Abstract
The behavior of time delay estimation (TDE) is well understood, which makes it attractive for acoustic source localization (ASL). A time delay between a pair of microphones maps to a hyperbola in the spatial domain. Furthermore, the likelihoods for different time delays are mapped into a set of weighted nonoverlapping hyperbolae in the spatial domain. Combining TDE functions from several microphone pairs results in a spatial likelihood function (SLF), which is a combination of sets of weighted hyperbolae. Traditionally, the maximum point of the SLF is taken as the source location, but this estimate is corrupted by reverberation and noise. Particle filters utilize past source information to improve localization performance in such environments. However, uncertainty exists on how to combine the TDE functions. Results from simulated dialogues in various conditions favor TDE combination using intersection-based methods over union. The real-data dialogue results agree with the simulations, showing a 45% RMSE reduction when choosing the intersection over the union of TDE functions.
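The difference between union-like and intersection-like combination can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 2-D microphone geometry, the Gaussian stand-in for a TDE function (`pair_likelihood`), the value of `sigma`, and the choice of a sum for the union and a product for the intersection. A point that lies on the hyperbola of only one microphone pair keeps a high score under the sum but is suppressed by the product.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def tdoa(point, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Time difference of arrival (s) between mic_a and mic_b for a source at point."""
    return (math.dist(point, mic_a) - math.dist(point, mic_b)) / c


def pair_likelihood(point, mic_a, mic_b, measured_delay, sigma=5e-5):
    """Hypothetical TDE function: a Gaussian bump centred on the measured delay.

    Points with equal likelihood lie on one hyperbola with foci mic_a and mic_b.
    """
    err = tdoa(point, mic_a, mic_b) - measured_delay
    return math.exp(-0.5 * (err / sigma) ** 2)


def slf_union(point, pairs, delays):
    """Union-like SLF (assumed here to be a sum of pairwise likelihoods)."""
    return sum(pair_likelihood(point, a, b, d) for (a, b), d in zip(pairs, delays))


def slf_intersection(point, pairs, delays):
    """Intersection-like SLF (assumed here to be a product of pairwise likelihoods)."""
    return math.prod(pair_likelihood(point, a, b, d) for (a, b), d in zip(pairs, delays))


# Illustrative geometry in metres (hypothetical, not the paper's setup).
source = (2.0, 1.5)
pairs = [
    ((1.75, 0.0), (2.25, 0.0)),  # source is on this pair's bisector x = 2
    ((0.0, 1.0), (0.0, 2.0)),
    ((4.0, 0.0), (4.0, 0.5)),
]
# Noise-free delays generated from the true source position.
delays = [tdoa(source, a, b) for a, b in pairs]

# A "ghost" point on the first pair's zero-delay hyperbola (the line x = 2)
# that does not lie on the hyperbolae of the other two pairs.
ghost = (2.0, 2.5)

union_at_source = slf_union(source, pairs, delays)
inter_at_source = slf_intersection(source, pairs, delays)
union_at_ghost = slf_union(ghost, pairs, delays)
inter_at_ghost = slf_intersection(ghost, pairs, delays)
```

In this toy setup the ghost point retains roughly one third of the peak score under the sum, since one pair fully supports it, whereas the product drives it close to zero. This is only meant to show why an intersection-style rule can reject locations supported by a single pair; it does not reproduce the combination methods or the results evaluated in the article.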