Abstract

The rich literature on acoustic source localization mostly relies on the assumption of a constant value for the speed of sound. This hypothesis allows establishing simple relations between range differences and time differences and leads to effective estimation algorithms. However, it must be challenged for certain applications of wireless acoustic sensor networks in multizone buildings and outdoor environments. This article revisits the source localization problem for the more general case of an unknown value for the speed of sound. It reviews the physical foundations for the dependence of the speed of sound on the air temperature and presents the essential approaches to acoustic source localization. On this basis, several methods for source localization under uncertain or variable speed of sound conditions from the literature are discussed. Applications from different fields are shown. They comprise the localization of sources, sensors, and reflecting surfaces, time-difference-of-arrival disambiguation, and the direct determination of the speed of sound or the air temperature from acoustic measurements.

1. Introduction

Acoustic sensor networks attempt to record one or more desired sound sources in the presence of other unwanted sources. These unwanted sources may be competing sources, noise sources, or reflections thereof. A popular way to distinguish between desired and unwanted sources is by their direction or location. For mirrored reflections also the reflector position may be of interest. The localization of sound sources provides information to enhance the desired source and to attenuate unwanted sources by beamforming techniques [1]. Source localization is also a topic in its own right for the analysis of acoustic scenes or for tracking of sound sources.

Several methods for source localization from multimicrophone recordings are available [2]. Among these, the classical time-of-arrival (TOA) and time-difference-of-arrival (TDOA) methods are still competitive due to their computational simplicity and reasonable performance. This property carries over to wireless acoustic sensor networks where the computing power is restricted by hardware and energy constraints.

Both TOA- and TDOA-methods estimate time differences between source and receiver or between receivers and convert them to range differences. The conversion factor is the propagation speed of the sound waves or briefly the speed of sound. In many applications it is considered to be constant. This choice is derived from the historical roots of TDOA-methods (and similar for TOA) in phased array antennas. There, the corresponding conversion factor is the propagation speed of electromagnetic waves in the earth’s atmosphere which is not subject to appreciable environmental influence.

The corresponding assumption of a constant speed of sound is also justified for indoor applications under controlled environmental conditions, in particular for constant room temperature. Typical applications are laboratory environments or office spaces for video and speech communications. However, wireless acoustic sensor networks have potential outdoor applications where the air temperature may be subject to considerable daily or seasonal variations. Also large temperature variations along the propagation path are conceivable for communication in spaces with extreme working conditions like steel mills or cold storage houses.

The fact that environmental conditions may strongly affect wave propagation is well known from underwater acoustics. There, variations of the salinity of ocean water lead to curved propagation paths and multiple reflections at the water surface [3–6]. Are similar effects also possible for outdoor sound propagation, and how would TDOA measurements be affected?

This article reviews the effect of temperature variations on sound propagation in air and further on the accuracy of source localization with acoustic sensors. These results are of interest also to wireless acoustic sensor networks where synchronization between all sensors is not guaranteed.

Wireless networks have received considerable attention in the recent literature. A hierarchical approach has been presented in [7] where a distributed network consists of multiple compact arrays, the so-called network nodes. The microphones in each node are synchronized and allow TDOA estimates but there is no synchronization between the nodes. Ranging and self-positioning in wireless acoustic sensor networks are considered in [8]. Here active nodes with one microphone and one loudspeaker each permit TOA estimation by emitting test signals. The nodes are not synchronized among each other. A similar problem is addressed in [9] where TDOA estimates with unknown time offset are used. The special problems of low-cost wireless acoustic sensor networks are discussed in [10]. Here a statistical framework is invoked to provide an efficient localization algorithm. Unsynchronized communication between wireless acoustic sensors is studied in [11] where bearing-only information is exchanged between the network nodes. In all these cases, the conversion from time delay estimates to range estimates (where required) is based on the assumption of a common and known speed of sound.

This limiting assumption is dropped here and the consequences for source localization are investigated. This article is an extended version of a slide presentation at [12]. It is structured as follows: Section 2 gives a brief overview on delay-based source localization. Then Section 3 investigates the influence of the propagation speed on the shape of the propagation path and on the travelled distance of sound waves. Section 4 discusses the mathematical formulation of the TDOA-based source localization problem in some detail. This formulation is extended in Section 5 to include the propagation speed as an additional unknown in the estimation process. Some applications from the literature are reviewed in Section 6 before Section 7 concludes the article.

2. Delay-Based Localization of a Sound Source

This section briefly reviews the foundations of delay-based localization. It is meant to be a first introduction to the basic idea. A more profound discussion follows in Section 4. The topic is also well covered in the literature; see, for example, [1, 2, 13–17].

Figure 1 shows on the left a point-like sound source. Its sound waves are recorded by four microphones in an arbitrary geometric arrangement. All four microphones record the same waveform, but with a slightly different time delay according to their individual distance from the source. An example of the resulting microphone signals is shown on the right-hand side.

The absolute delay of the individual signals corresponds to the distance between source and microphone. It is called the time-of-arrival (TOA) and can be inferred from the recorded signal only when source and receiver are synchronized. In all other cases, only the differences between the arrival times at pairs of the four microphones can be determined, as time-differences-of-arrival (TDOAs). Since the TDOA case is more important for practical applications, it will be emphasized in the sequel.

Figure 1(b) is somewhat idealistic since it might suggest that the estimation of TDOAs is an easy task. Actually, it can be quite problematic for general speech and audio signals and for different kinds of environments. Two methods are established to determine TDOAs from sensor signals: correlation of the sensor signals or estimation of the room impulse response.

Correlation methods determine the time lag between the arrival times of a source signal at two different sensor positions from the cross correlation between the corresponding sensor signals. Ideally, the location of the maximum of the cross correlation function is detected as the most dominant peak. However, due to self-similarity of the source signal, multiple source signals, or reverberation, the cross correlation may have multiple peaks and its true maximum may be ambiguous.

Estimates of the room impulse responses between source and sensors cannot rely on the knowledge of the source signal or source position. Thus they have to be conducted in a blind fashion from measured sensor signals only. Differences in the onset of the impulse responses with respect to two different sensors indicate the time-difference-of-arrival. Again the quality of the estimation may be impaired by competing sources and by reverberation. Temperature effects have been described in [18].

The estimation of TDOAs by both methods is well documented [13, 15–17, 19–21] and is not further elaborated here. As a very general statement, correlation methods are faster to compute than blind estimation methods. Therefore correlation is often preferred for real-time source localization, for example, for tracking purposes.
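The correlation approach can be illustrated with a minimal sketch. It assumes a single source, an integer-sample delay, and neither noise nor reverberation; the function name `estimate_tdoa`, the sampling rate, and the synthetic signals are illustrative choices and are not taken from the cited works.

```python
import numpy as np

def estimate_tdoa(sig_a, sig_b, fs):
    """Delay of sig_a relative to sig_b in seconds (positive if sig_a arrives later)."""
    xcorr = np.correlate(sig_a, sig_b, mode="full")
    # Lag axis that matches the ordering of NumPy's "full" correlation output.
    lags = np.arange(-(len(sig_b) - 1), len(sig_a))
    return lags[np.argmax(xcorr)] / fs

fs = 16000                            # assumed sampling rate in Hz
rng = np.random.default_rng(0)
source = rng.standard_normal(2048)    # white noise stands in for the source signal
delay = 12                            # assumed true delay in samples

mic_b = source
mic_a = np.concatenate([np.zeros(delay), source[:-delay]])  # delayed copy

tdoa = estimate_tdoa(mic_a, mic_b, fs)
print(f"estimated TDOA: {tdoa * 1e3:.3f} ms")
```

With self-similar source signals or reverberation, the correlation would show several peaks, and a plain `argmax` as used here could select the wrong one.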

Once a time difference $\tau$ between two microphones has been estimated, the corresponding range difference $d$ can be obtained from
$$d = c\,\tau \tag{1}$$
for a constant speed of sound $c$. If the speed of sound is unknown, then it has to be estimated from the recorded signals. Further, if the speed of sound varies along the propagation path, then the assumption of propagation along a straight line is questionable and (1) might not hold anymore. These cases are further investigated in Section 3.

3. Foundations from Physics

This section reviews some of the physical foundations of the propagation of sound waves. These comprise the relation between travel time and travelled distance, the dependence of the propagation speed on the air temperature, and the shape of the propagation path. Then some conclusions for the application to source localization are drawn.

3.1. Relation between Travel Time and Travelled Distance

The relation between travelled distance and the travel time of sound waves is determined by the acoustic wave equation. Since the acoustic wave equation admits very general and complex solutions often approximations are applied. An approximate solution which is valid for short wavelengths is the eikonal equation (see, e.g., [22, 23]). It describes the wavefronts and their gradients which point into the direction of propagation similar to rays in optics. Following the ray from source to receiver reveals the propagation path which is not necessarily a straight line.

Figure 2 shows the general case of a propagation path between a sound source and a microphone as receiver. The locations of source and receiver are given by the position vectors $\mathbf{x}_s$ and $\mathbf{x}_r$, respectively. The length of the propagation path is measured by a coordinate $s$ along the path.

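The propagation speed $c(s)$ at each location on the path is simply the time derivative of the path coordinate $s$. It can also be characterized by its inverse, the slowness $\sigma(s)$:
$$c(s) = \frac{\mathrm{d}s}{\mathrm{d}t}, \qquad \sigma(s) = \frac{1}{c(s)} = \frac{\mathrm{d}t}{\mathrm{d}s}. \tag{2}$$
The length $\ell$ of the propagation path, that is, the distance travelled by a wave from source to receiver, as well as the travel time $T$ can be obtained by integration along the path:
$$\ell = \int_{\text{path}} \mathrm{d}s, \qquad T = \int_{\text{path}} \sigma(s)\,\mathrm{d}s = \int_{\text{path}} \frac{\mathrm{d}s}{c(s)}. \tag{3}$$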

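The relation for the travel time in (3) is trivial if the propagation path is a straight line of length $\ell$ between the source position $\mathbf{x}_s$ and the receiver position $\mathbf{x}_r$ and if the propagation speed is constant, $c(s) = c$:
$$T = \frac{\ell}{c} = \frac{\|\mathbf{x}_r - \mathbf{x}_s\|}{c} \tag{4}$$
with the Euclidean norm $\|\cdot\|$. Indeed, most papers assume this case with $c = 343\,\mathrm{m/s}$ or similar values.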

However, the propagation speed of sound waves depends on the air temperature and—to a lesser extent—on humidity. Thus it may vary along the propagation path and it may vary with time at a fixed location.

3.2. Dependence of the Speed of Sound on the Air Temperature

For a specific gas the speed of sound depends on various variables and physical constants; see, for example, [24]. The variable quantities include air temperature, pressure, and density. Since variations of the air temperature are dominant over variations of air pressure and density, the latter are considered to be constant here. Then the dependence of the speed of sound on the air temperature can be written in a simple form as
$$c = \sqrt{\gamma\,R_s\,\Theta} \tag{5}$$
with the absolute temperature $\Theta$, the adiabatic index $\gamma$, and the specific gas constant $R_s$.

The adiabatic index is the ratio of the heat capacity at constant pressure to the heat capacity at constant volume. Its value can be deduced from the degrees of freedom of the gas molecules. Dry air is mainly composed of $\mathrm{N_2}$ and $\mathrm{O_2}$ with two atoms per molecule and an adiabatic index of $\gamma = 1.4$. Environments with considerable concentrations of other gases require slightly different values [24]. The specific gas constant is calculated from the universal gas constant and the molecular mass of air. The value for dry air is $R_s \approx 287\,\mathrm{J/(kg\,K)}$.

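For engineering applications, the Celsius scale is more convenient than the absolute temperature scale. Furthermore, a Taylor expansion of the square root law around the zero point of the Celsius temperature $\vartheta$ (i.e., around $\Theta_0 = 273.15\,\mathrm{K}$) gives a linear approximation
$$c(\vartheta) \approx c_0 + \beta\,\vartheta \tag{6}$$
with
$$c_0 = \sqrt{\gamma\,R_s\,\Theta_0} \approx 331.3\,\mathrm{m/s}, \qquad \beta = \frac{c_0}{2\,\Theta_0} \approx 0.606\,\mathrm{m/(s\,K)}. \tag{7}$$
For room temperature at $\vartheta = 20\,^{\circ}\mathrm{C}$ the familiar value of $c \approx 343\,\mathrm{m/s}$ results.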

The linear approximation (6) is a good fit to the square root law (5) as shown in Figure 3. The temperature range shown there has been chosen to match the temperature specification of many mobile consumer devices. Therefore, the linear approximation is used from now on in lieu of the physically correct model from (5).

Obviously, there is a considerable variation of the speed of sound over the shown temperature range. It can be calculated quickly from the temperature sensitivity of the linear approximation (6):
$$\frac{\mathrm{d}c}{\mathrm{d}\vartheta} \approx 0.6\,\mathrm{m/(s\,K)}. \tag{8}$$
It may be concluded that the variation of the speed of sound over the operation range of sensor devices cannot be neglected.
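The relation between the square root law and its linearization can be checked numerically. The following sketch assumes dry air with the rounded constants $\gamma = 1.4$ and $R_s = 287\,\mathrm{J/(kg\,K)}$; the function names and the sample temperatures are illustrative.

```python
import math

GAMMA = 1.4        # adiabatic index of dry air
R_S = 287.0        # specific gas constant of dry air in J/(kg K), rounded
THETA_0 = 273.15   # 0 degrees Celsius expressed in kelvin

def c_sqrt_law(theta_celsius):
    """Speed of sound in m/s from the square root law."""
    return math.sqrt(GAMMA * R_S * (THETA_0 + theta_celsius))

# Coefficients of the first-order Taylor expansion around 0 degrees Celsius.
C_0 = c_sqrt_law(0.0)
BETA = C_0 / (2.0 * THETA_0)

def c_linear(theta_celsius):
    """Linear approximation of the speed of sound in m/s."""
    return C_0 + BETA * theta_celsius

for theta in (-20.0, 0.0, 20.0, 40.0):   # sample temperatures, chosen for illustration
    print(f"{theta:6.1f} C: sqrt law {c_sqrt_law(theta):7.2f} m/s, "
          f"linear {c_linear(theta):7.2f} m/s")
```

The two models agree to within a fraction of a meter per second near room temperature, while the speed itself changes by roughly 0.6 m/s per kelvin.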

3.3. Shape of the Propagation Path

Having investigated the variation range of the speed of sound, it is now of interest to consider the shape of the propagation path. The relations between speed of sound variations and the path shape or the curvature of wavefronts is well studied; see, for example, [5, 22, 25] and references therein. A simplified analysis is presented here. It discusses only the extent to which the propagation path deviates from a straight line. To this end, the path from Figure 2 is approximated by the segment of a circle. A justification of this assumption is given at the end of this section.

Figure 4 shows the adopted circular shape of the propagation path. The full circle is characterized by its radius $\rho$ and the segment by the angle $\alpha$. The path length $\ell$ is connected to $\rho$ and $\alpha$ by the definition of the radian, $\ell = \rho\,\alpha$, while the baseline $\ell_0$ of the segment is connected to $\rho$ and $\alpha$ by the definition of the trigonometric sine function, $\ell_0 = 2\rho\,\sin(\alpha/2)$.

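Eliminating the angle from the basic geometric relations in Figure 4 results in an expression with an inverse sine function which can be approximated by a truncated Taylor series. With the length $\ell_0$ of the baseline (the direct path) and the radius $\rho$ of the circle, it reads
$$\ell = 2\rho\,\arcsin\!\left(\frac{\ell_0}{2\rho}\right) \approx 2\rho\left[\frac{\ell_0}{2\rho} + \frac{1}{6}\left(\frac{\ell_0}{2\rho}\right)^{3}\right]. \tag{9}$$
This approximation gives an algebraic relation for the path length $\ell$:
$$\ell \approx \ell_0 + \frac{\ell_0^{3}}{24\,\rho^{2}}, \tag{10}$$
which can be separated into a direct connection with a relative length of unity and into an excess length caused by the curvature of the path:
$$\frac{\ell}{\ell_0} \approx 1 + \frac{1}{24}\left(\frac{\ell_0}{\rho}\right)^{2}. \tag{11}$$
The relative excess length is obviously given by the ratio between the length $\ell_0$ of the direct path and the radius $\rho$.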

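The connection to the speed of sound is now established through the eikonal equation [22]. It states that the curvature, as the inverse radius $1/\rho$, is connected to the speed of sound $c$ and its spatial derivative along the propagation path with the path coordinate $s$ (see Figure 2):
$$\frac{1}{\rho} = \frac{\left(\dfrac{\mathrm{d}c}{\mathrm{d}s}\right)}{c}. \tag{12}$$
The term in parentheses can be approximated by the total variation $\Delta c$ of the propagation speed along the path of length $\ell_0$, and the denominator can be approximated by the mean velocity $\bar{c}$ along the path,
$$\frac{\mathrm{d}c}{\mathrm{d}s} \approx \frac{\Delta c}{\ell_0}, \qquad c \approx \bar{c}, \tag{13}$$
such that
$$\frac{\ell_0}{\rho} \approx \frac{\Delta c}{\bar{c}}. \tag{14}$$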

For a rough estimate of the numerical values encountered in acoustic source localization consider an example where the source is at a temperature of and the receiver at a temperature of (see Table 1). The corresponding values for the speed of sound follow from (6) and (7).

Inserting the values from Table 1 into (12) and (11) gives

This example shows that a temperature variation of along the propagation path leads to a negligible relative excess length. Thus the curvature effect predicted by the eikonal equation is negligible under these conditions and the simple model of propagation along a straight line is a very good approximation.

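It remains to justify the assumption of the circular shape of the propagation path. It is equivalent to a constant value of the curvature. To show that this is a reasonable assumption for slight variations of the speed of sound along the propagation path with the path coordinate $s$, assume a speed of sound of
$$c(s) = c(0)\,(1 + \varepsilon s), \qquad |\varepsilon s| \ll 1. \tag{16}$$
A comparison with the values of Table 1 shows that this assumption is indeed realistic. For the components of (12) the following holds:
$$\frac{\mathrm{d}c}{\mathrm{d}s} = c(0)\,\varepsilon, \qquad \frac{1}{c(s)} = \frac{1}{c(0)\,(1 + \varepsilon s)} \approx \frac{1}{c(0)}\,(1 - \varepsilon s), \tag{17}$$
where the latter result is a first-order Taylor approximation. Using the relation from (16) and inserting into (12) give
$$\frac{1}{\rho} \approx \varepsilon\,(1 - \varepsilon s) \approx \varepsilon, \tag{18}$$
which is indeed a constant value along the propagation path.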

This analysis shows that the initial assumption of a circular shape of the propagation path is justified since the curvature is constant. Furthermore this constant has been shown to be negligible such that the propagation path can be well approximated by a straight line.
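The order of magnitude of the excess length can be reproduced with a short calculation. The sketch below is only an illustration under stated assumptions: the temperatures of 20 °C and 30 °C are hypothetical and are not the values of Table 1, the ratio of direct path length to radius is approximated by $\Delta c / \bar{c}$, and the relative excess length is taken as one twenty-fourth of the square of that ratio, as it follows from the circular-arc geometry.

```python
import math

def speed_linear(theta_celsius, c0=331.3, beta=0.606):
    """Linear temperature model of the speed of sound in m/s."""
    return c0 + beta * theta_celsius

def relative_excess_length(theta_source, theta_receiver):
    """Relative excess length of a circular-arc path versus the straight line.

    Returns the truncated Taylor approximation and the exact arc/chord value.
    The two temperatures must differ, otherwise the path is straight.
    """
    c_s = speed_linear(theta_source)
    c_r = speed_linear(theta_receiver)
    delta_c = abs(c_r - c_s)
    c_mean = 0.5 * (c_s + c_r)
    ratio = delta_c / c_mean                 # approximates (direct length) / radius
    approx = ratio ** 2 / 24.0               # truncated Taylor series
    exact = 2.0 / ratio * math.asin(ratio / 2.0) - 1.0   # exact circular arc
    return approx, exact

# Hypothetical temperatures in degrees Celsius at source and receiver.
approx, exact = relative_excess_length(20.0, 30.0)
print(f"relative excess length: {approx:.2e} (Taylor), {exact:.2e} (exact arc)")
```

For these hypothetical values the relative excess length is on the order of $10^{-5}$, that is, a few hundredths of a millimeter per meter of direct path.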

3.4. Lessons Learnt from Physics

What does the above digression into the physics of ideal gases and the acoustic wave equation tell about source localization?

At first, it can be stated that the travel time between source and receiver depends on
(i) the length of the propagation path,
(ii) the speed of sound along the propagation path.

Thus travel times (or their differences) can be converted into ranges (or range differences) only when the speed of sound is known. The popular assumption of a constant and known propagation speed implies the assumption of a constant and known air temperature.
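The effect of a wrongly assumed air temperature on a range estimate can be quantified with a few lines. This is a sketch with hypothetical numbers: a true distance of 10 m, a true air temperature of 0 °C, and an assumed room temperature of 20 °C, using the linear temperature model of the speed of sound.

```python
def speed_linear(theta_celsius, c0=331.3, beta=0.606):
    """Linear temperature model of the speed of sound in m/s."""
    return c0 + beta * theta_celsius

def range_from_travel_time(travel_time, theta_assumed):
    """Convert a travel time in seconds into a range in meters."""
    return speed_linear(theta_assumed) * travel_time

true_range = 10.0                                  # hypothetical distance in m
travel_time = true_range / speed_linear(0.0)       # true air temperature: 0 C
estimated_range = range_from_travel_time(travel_time, 20.0)  # assumed: 20 C
range_error = estimated_range - true_range

print(f"range error: {range_error:.3f} m")
```

In this example the mismatch of 20 K translates into a range error of about 0.37 m, or roughly 3.7 percent of the distance.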

Furthermore, temperature variations along the propagation path have the potential to bend the propagation path away from a straight line, depending on the spatial derivative of the speed of sound. However, within the temperature range as experienced by human users such a deflection from the straight line is negligible. Similar considerations apply to variations of the air density (e.g., humid air) or the air pressure.

In conclusion, the conversion between time differences and range differences for source localization can be based on the following assumptions:
(i) The propagation path is a straight line.
(ii) For constant conditions with respect to temperature, density, and pressure, the propagation speed is also constant.
(iii) Otherwise, variations of the propagation speed may require special consideration.

3.5. Mean Propagation Speed

The calculation of the speed of sound from the temperature and air density and pressure according to (5) is not practical since these quantities are not accessible by measurements. Even when the temperature at the source and the receiver were known from local measurements, the temperature variation along the propagation path would still not be available.

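Therefore a mean propagation speed $\bar{c}$ is introduced, which is treated as an unknown quantity and subject to estimation. A rigorous definition can be derived from (3) as the ratio of the path length $\ell$ and the travel time $T$:
$$\bar{c} = \frac{\ell}{T} = \frac{\displaystyle\int_{\text{path}} \mathrm{d}s}{\displaystyle\int_{\text{path}} \frac{\mathrm{d}s}{c(s)}}. \tag{19}$$
In the sequel, the notation “propagation speed $c$” is used in lieu of the mean propagation speed $\bar{c}$.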
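The definition of the mean propagation speed as path length divided by travel time can be evaluated numerically. The sketch below assumes a straight path of 10 m with a hypothetical linear temperature profile from 20 °C to 30 °C and integrates the slowness with the trapezoidal rule; the function name and all numbers are illustrative.

```python
import numpy as np

def mean_propagation_speed(s, c_of_s):
    """Mean speed = path length / travel time, with travel time = integral of slowness."""
    slowness = 1.0 / c_of_s
    # Trapezoidal rule for the integral of the slowness along the path.
    travel_time = np.sum(0.5 * (slowness[1:] + slowness[:-1]) * np.diff(s))
    return (s[-1] - s[0]) / travel_time, travel_time

s = np.linspace(0.0, 10.0, 1001)               # path coordinate in m
theta = 20.0 + (30.0 - 20.0) * s / s[-1]       # hypothetical temperature profile in C
c = 331.3 + 0.606 * theta                      # linear temperature model in m/s

c_mean, travel_time = mean_propagation_speed(s, c)
print(f"travel time {travel_time * 1e3:.3f} ms, mean speed {c_mean:.3f} m/s")
```

The result is slightly below the arithmetic mean of the two end-point speeds, because averaging the slowness weights slow segments more strongly.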

3.6. Different Levels of Assumptions

The investigation of the physical background shows that the propagation speed of sound waves is an unknown quantity. However, its value can be predicted if standard conditions are assumed. The following assumptions are found in the literature on source localization:
(1) The propagation speed is constant with respect to time and space and its value is known. This assumption is adopted in most of the literature with a fixed room temperature (e.g., $20\,^{\circ}\mathrm{C}$). Other conditions (like dry air, height at sea level) are assumed implicitly.
(2) The propagation speed is constant with respect to space and also with respect to time during a short measurement period. Its actual value is unknown. This case is considered in some publications as discussed in Section 5.
(3) The propagation speed is constant with respect to time during a short measurement period but it may vary in an unknown way along the propagation path. This case can be reduced to assumption (2) by considering the mean propagation speed from (19).

4. Source Localization

This section discusses the foundations of acoustic source localization from measured time differences in some detail. The estimation of these time differences from recorded microphone tracks is not reviewed here; see the literature references in Section 2. Instead the redundancy of the estimated time differences is discussed. To this end, methods from graph theory turn out to be useful. Further, the general estimation problem is formulated and approaches for its solution are discussed.

This field is well researched for the assumption of a constant speed of sound. Therefore the presentation is focussed on those topics which are of importance for discussing the case of unknown propagation speed in Section 5.

The general procedure is shown in Figure 5. At first time differences are estimated, for example, by pairwise correlation between microphone signals; see Figure 1. Then these time differences are converted into range differences with relations like (19) with known or unknown propagation speed. Finally geometric relations set up a system of equations with the source position as unknown variable.

The time differences can be estimated between source and receiver as times-of-arrival (TOAs), when synchronization between source and receivers can be established (see, e.g., [26]). In many practical applications such synchronization is not possible, for example, for human speaker localization. In these cases, time differences are estimated between the receivers as time-differences-of-arrival (TDOAs). Both approaches are similar. For TOA, the synchronized source can be regarded as a reference, while, for TDOA, one or more of the receivers are selected as reference. The further presentation is therefore formulated for the TDOA-case. Suitable extensions to the TOA case follow in a similar fashion.

4.1. Time-Differences-of-Arrival (TDOAs)

The time-differences-of-arrival are the differences in the absolute arrival times of a wavefront from a distant source measured at a pair of microphones (see Figure 6). The absolute arrival times are not required if one microphone is chosen as a reference, for example, microphone 0 in Figure 6.

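The values of the time-difference-of-arrival (TDOA-values, TDOAs) between microphones $i$ and $j$ with the arrival times $t_i$ and $t_j$,
$$\tau_{ij} = t_i - t_j, \tag{20}$$
exhibit an odd symmetry:
$$\tau_{ii} = 0, \tag{21}$$
$$\tau_{ij} = -\tau_{ji}. \tag{22}$$
The arrangement of these TDOA-values in a matrix with row index $i$ and column index $j$ is shown in
$$\begin{bmatrix}
0 & \tau_{01} & \tau_{02} & \cdots & \tau_{0N} \\
\boldsymbol{\tau}_{10} & 0 & \tau_{12} & \cdots & \tau_{1N} \\
\boldsymbol{\tau}_{20} & \boldsymbol{\tau}_{21} & 0 & \cdots & \tau_{2N} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\boldsymbol{\tau}_{N0} & \boldsymbol{\tau}_{N1} & \boldsymbol{\tau}_{N2} & \cdots & 0
\end{bmatrix}. \tag{23}$$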

This matrix is called a redundant TDOA matrix because not all elements are independent of each other. According to (21) and (22) the elements on the main diagonal are zero and the elements in the upper triangle are equal to the lower triangle up to a sign change. Therefore the elements in the lower triangle are sufficient to establish the complete matrix. They are called the full TDOA set. These values are shown in bold face in (23) while the redundant elements are shown in normal type.

The mathematical properties of TDOA matrices have been investigated recently in [27]. It is shown how the resulting algebraic structures can be exploited for TDOA denoising and missing data recovery.

4.2. Graph Representation of TDOAs

Further insight can be gained by a representation of the TDOA-values as a weighted and directed graph (see, e.g., [16]). Such a graph is shown in Figure 7. The microphones are the nodes (black with white numbers). The edges represent the time differences between the microphones with the TDOA-values as weights. The graph is directed because changing the direction causes a sign inversion or, equivalently, interchanges the index values in (22).

It is also of interest to investigate the meshes of the graph from Figure 7. One of these meshes is shown by a blue path. Following this path shows that the TDOA-values in a mesh through the microphones $i$, $j$, and $k$ add up to zero:
$$\tau_{ij} + \tau_{jk} + \tau_{ki} = 0. \tag{24}$$
This fact is known as the cyclic sum property [16]. It is analogous to the sum of the voltages around a mesh in an electrical circuit. The same result can also be established for the other meshes which involve the reference point 0:
$$\tau_{i0} + \tau_{0j} + \tau_{ji} = 0. \tag{25}$$
The cyclic sum condition allows expressing the TDOA-values between two nonreference microphones (i.e., $i, j \neq 0$) by two TDOAs which both involve the reference microphone, $\tau_{ij} = \tau_{i0} - \tau_{j0}$.

Thus also the full TDOA set still contains some redundancy and the elements in the first column of the lower triangle are sufficient. They are called the spherical TDOA set. However, this statement holds only under ideal conditions, that is, no noise and known speed of sound.
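The redundancy of the TDOA matrix can be made concrete with a small numerical sketch. It assumes the sign convention that the TDOA between microphones $i$ and $j$ is the arrival time at $i$ minus the arrival time at $j$, uses hypothetical noise-free arrival times at four microphones, and takes microphone 0 as the reference.

```python
import numpy as np

# Hypothetical arrival times in seconds at microphones 0..3 (0 is the reference).
arrival = np.array([0.0100, 0.0112, 0.0095, 0.0108])

# Redundant TDOA matrix with tau[i, j] = t_i - t_j.
tau = arrival[:, None] - arrival[None, :]

# Spherical TDOA set: first column below the main diagonal.
spherical = tau[1:, 0]

# Rebuild the complete matrix from the spherical set via tau_ij = tau_i0 - tau_j0.
first_col = np.concatenate(([0.0], spherical))
tau_rebuilt = first_col[:, None] - first_col[None, :]

# Cyclic sum around the mesh 1 -> 2 -> 3 -> 1.
mesh_sum = tau[1, 2] + tau[2, 3] + tau[3, 1]

print("antisymmetric:", np.allclose(tau, -tau.T))
print("rebuilt from spherical set:", np.allclose(tau_rebuilt, tau))
print("cyclic sum:", mesh_sum)
```

With measured TDOAs the cyclic sums are generally nonzero, which is why the reduction to the spherical set holds only under ideal conditions.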

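The spherical set is indicated in Figure 7(b) in bold face and underlined ($\underline{\boldsymbol{\tau}}_{i0}$), while the dependent values on the outer edges are shown in bold face ($\boldsymbol{\tau}_{ij}$). The corresponding matrix representation is shown in (26). Again the spherical set is printed in bold face and underlined (first column of the lower triangular matrix), while the elements with $i > j > 0$ are shown in bold face. The elements which follow from the symmetry conditions (21) and (22) are shown in normal type:
$$\begin{bmatrix}
0 & \tau_{01} & \tau_{02} & \cdots & \tau_{0N} \\
\underline{\boldsymbol{\tau}}_{10} & 0 & \tau_{12} & \cdots & \tau_{1N} \\
\underline{\boldsymbol{\tau}}_{20} & \boldsymbol{\tau}_{21} & 0 & \cdots & \tau_{2N} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\underline{\boldsymbol{\tau}}_{N0} & \boldsymbol{\tau}_{N1} & \boldsymbol{\tau}_{N2} & \cdots & 0
\end{bmatrix}. \tag{26}$$

So far a minimum set of TDOAs has been identified. It represents the information on the arrival time of a wave front which can be retrieved by a sensor array. It is the basis for many source localization algorithms as outlined in Section 4.3.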

4.3. Methods for TDOA-Based Source Localization

This section describes the general procedure of TDOA-based source localization. It presents only an overview of the main approach, which is subject to many variations.

The first step is to determine from a set of $N+1$ microphones a spherical set of TDOAs $\tau_{n0}$ for $n = 1, \ldots, N$. The microphone $n = 0$ has been chosen as the reference. The TDOAs are all measured with respect to the reference.

Next, the time differences $\tau_{n0}$ are converted into range differences $d_n$ for a fixed and known speed of sound $c$:
$$d_n = c\,\tau_{n0}, \qquad n = 1, \ldots, N. \tag{27}$$

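Figure 8 shows an example. With the reference microphone at the origin and the remaining microphones at the positions $\mathbf{x}_n$, the unknown source position $\mathbf{x}$ must satisfy
$$\|\mathbf{x} - \mathbf{x}_n\| - r = d_n, \qquad n = 1, \ldots, N, \tag{28}$$
where $r = \|\mathbf{x}\|$ denotes the Euclidean distance of the source position from the origin and $d_n$ the range difference of microphone $n$. The difference of two distances in (28) is not very suitable for a solution by methods of linear algebra. Therefore, it is a common approach (see, e.g., [17, Chap. 51.3.5]) to rearrange (28) and square the result to arrive at
$$\mathbf{x}_n^{\mathsf{T}}\mathbf{x} + d_n\,r = \tfrac{1}{2}\left(\|\mathbf{x}_n\|^{2} - d_n^{2}\right), \tag{29}$$
or in matrix notation
$$\boldsymbol{\Phi}\,\mathbf{y} = \mathbf{b} \tag{30}$$
with the vectors and matrices
$$\boldsymbol{\Phi} = \begin{bmatrix}\mathbf{X} & \mathbf{d}\end{bmatrix}, \qquad \mathbf{y} = \begin{bmatrix}\mathbf{x} \\ \|\mathbf{x}\|\end{bmatrix}, \tag{31}$$
$$\mathbf{X} = \begin{bmatrix}\mathbf{x}_1^{\mathsf{T}} \\ \vdots \\ \mathbf{x}_N^{\mathsf{T}}\end{bmatrix}, \qquad \mathbf{d} = \begin{bmatrix}d_1 \\ \vdots \\ d_N\end{bmatrix}, \qquad \mathbf{b} = \frac{1}{2}\begin{bmatrix}\|\mathbf{x}_1\|^{2} - d_1^{2} \\ \vdots \\ \|\mathbf{x}_N\|^{2} - d_N^{2}\end{bmatrix}. \tag{32}$$

Equation (30) is a matrix equation which is easy to solve for the vector of unknowns $\mathbf{y}$. However, the presence of both $\mathbf{x}$ and $\|\mathbf{x}\|$ makes the system of equations nonlinear. Furthermore, (30) describes an ideal case where all time differences and thus the range differences have been determined without any errors. Actually, the equality in (30) does not hold exactly due to measurement noise and uncertainties in the value of the speed of sound. Thus (30) has to be considered as a nonlinear estimation problem rather than a system of linear equations.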

4.4. Approaches for Solving the Estimation Problem

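Reconsidering the estimation problem (30) from a slightly different viewpoint turns the nonlinear estimation problem into a linear one. Let $\mathbf{X}$ denote the matrix of microphone positions, $\mathbf{d}$ the vector of range differences, and $\mathbf{b}$ the right-hand side of (30). Replacing (30) by
$$\begin{bmatrix}\mathbf{X} & \mathbf{d}\end{bmatrix}\begin{bmatrix}\mathbf{x} \\ r\end{bmatrix} = \mathbf{b} \tag{33}$$
with the constraint
$$r = \|\mathbf{x}\| \tag{34}$$
allows solving (33) as a linear problem where the two- or three-dimensional position vector $\mathbf{x}$ and the source distance $r$ are considered as independent variables. This solution can be improved by exploiting constraint (34).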

This very general principle has been turned into practical estimation algorithms in many different ways; see, for example, [28, 29]. However, least squares approaches prevail [30–33]. They can be grouped into so-called unconstrained and constrained least squares methods.

4.4.1. Unconstrained Least Squares Method

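The family of unconstrained least squares (ULS) methods solves (33) with the pseudoinverse $(\cdot)^{+}$ as
$$\begin{bmatrix}\hat{\mathbf{x}} \\ \hat{r}\end{bmatrix} = \begin{bmatrix}\mathbf{X} & \mathbf{d}\end{bmatrix}^{+}\mathbf{b}, \tag{35}$$
where $\mathbf{X}$ is the matrix of microphone positions, $\mathbf{d}$ the vector of range differences, and $\mathbf{b}$ the right-hand side of (33). Then $\hat{\mathbf{x}}$ is accepted as the position estimate, while $\hat{r}$ is disregarded. Thus constraint (34) is neglected for the sake of simplicity.

In this case it is of advantage to perform the calculation of $\hat{\mathbf{x}}$ and $\hat{r}$ separately. To this end define the orthogonal projection matrices
$$\mathbf{P}_{\mathbf{d}}^{\perp} = \mathbf{I} - \frac{\mathbf{d}\,\mathbf{d}^{\mathsf{T}}}{\mathbf{d}^{\mathsf{T}}\mathbf{d}}, \qquad \mathbf{P}_{\mathbf{X}}^{\perp} = \mathbf{I} - \mathbf{X}\left(\mathbf{X}^{\mathsf{T}}\mathbf{X}\right)^{-1}\mathbf{X}^{\mathsf{T}}. \tag{36}$$
Then the estimate of the source position and, if desired, of the source distance can be obtained with the pseudoinverses of the projected matrices as
$$\hat{\mathbf{x}} = \left(\mathbf{P}_{\mathbf{d}}^{\perp}\mathbf{X}\right)^{+}\mathbf{b}, \qquad \hat{r} = \left(\mathbf{P}_{\mathbf{X}}^{\perp}\mathbf{d}\right)^{+}\mathbf{b}. \tag{37}$$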
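The unconstrained least squares solution can be sketched for a known speed of sound. The example assumes noise-free TDOAs, a reference microphone at the origin, and hypothetical positions of six further microphones and of the source. The linear system stacks, for each microphone, the squared range-difference relation in which the source position and the source distance appear as independent unknowns.

```python
import numpy as np

c = 343.0                                    # assumed known speed of sound in m/s
mics = np.array([[1.0, 0.0, 0.0],            # hypothetical positions x_n in m;
                 [0.0, 1.0, 0.0],            # the reference microphone sits at
                 [0.0, 0.0, 1.0],            # the origin and is not listed
                 [1.0, 1.0, 0.0],
                 [0.0, 1.0, 1.0],
                 [1.0, 0.0, 1.5]])
source = np.array([2.0, 1.5, 0.7])           # hypothetical source position in m

# Noise-free TDOAs with respect to the reference, then range differences.
tau = (np.linalg.norm(source - mics, axis=1) - np.linalg.norm(source)) / c
d = c * tau

# Linear system [X d] [x; r] = b with b_n = (||x_n||^2 - d_n^2) / 2.
b = 0.5 * (np.sum(mics**2, axis=1) - d**2)
Phi = np.column_stack([mics, d])
y_hat, *_ = np.linalg.lstsq(Phi, b, rcond=None)
x_hat, r_hat = y_hat[:3], y_hat[3]

# Same position estimate after projecting out the range-difference vector d.
P_d = np.eye(len(d)) - np.outer(d, d) / (d @ d)
x_proj = np.linalg.pinv(P_d @ mics) @ (P_d @ b)

print("joint solution:     ", x_hat, r_hat)
print("projection solution:", x_proj)
```

With exact data both variants return the true position, and the estimated distance equals the norm of the position estimate. With noisy TDOAs the two quantities generally disagree, which is the starting point for the constrained methods.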

4.4.2. Constrained Least Squares Method

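More accurate solutions can be obtained by solving (33) in the fashion of (35) and then by exploiting constraint (34) through minimization of the residual error
$$\min_{\mathbf{x},\,r}\ \left\|\begin{bmatrix}\mathbf{X} & \mathbf{d}\end{bmatrix}\begin{bmatrix}\mathbf{x} \\ r\end{bmatrix} - \mathbf{b}\right\|^{2} \quad \text{subject to} \quad r = \|\mathbf{x}\|. \tag{38}$$
These approaches are called constrained least squares (CLS) methods.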

Both unconstrained and constrained least squares methods are well established in acoustic source localization for the case of known and constant speed of sound. However, both methods are also starting points for more general source localization algorithms where also the speed of sound is considered as an additional unknown.

5. Estimation of the Propagation Speed

This section reviews a number of source localization methods where the assumption of a constant and well known propagation speed is dropped. Instead the propagation speed is considered to be an additional unknown variable which is included into the estimation process.

5.1. Motivation

There are a number of reasons for extending the classical source localization methods to the more general case of variable speed of sound conditions. The most important is certainly to make source localization more robust against variations of the environmental conditions. This need arises for wireless acoustic sensor networks, since they provide the technical means for outdoor applications. Daily temperature variations of the day-night-cycle or seasonal variations may affect the speed of sound considerably as discussed in Section 3.

Another point where speed of sound estimates may matter is the detection of misestimated TDOAs. Time differences are mainly estimated from correlations between microphone signals. These correlations appear due to propagation delays of the same source signal and are related to range differences according to (27). However, self-similarity in the source signal also leads to correlations which are unrelated to any range differences. Estimating the propagation speed for each candidate TDOA estimate and comparing it to physically meaningful values can unveil misestimated TDOAs [34–37].

Another application where highly accurate localization matters is the calibration of microphone or loudspeaker arrays [26, 38, 39]. In a similar way, the localization of reflecting surfaces, their properties, and thus the determination of the room geometry also depend on correct estimates of the speed of sound [40–44]. Finally, source localization can be used for the direct estimation of temperature and flow in air [45, 46] and in other fluids [47, 48].

Most of these methods are extensions of the basic principle for constant speed of sound discussed in Section 4.

5.2. General Procedure

The general procedure to include the propagation speed as an additional unknown into the estimation process is summarized in (39). There are different ways to reformulate the linear problem (30) as
$$\mathbf{A}\,\mathbf{z} = \mathbf{b}, \tag{39}$$
where the matrix $\mathbf{A}$ contains the known sensor positions $\mathbf{x}_n$ and the measured TDOAs $\tau_{n0}$. The vector of unknowns $\mathbf{z}$ contains the unknown source position $\mathbf{x}$ and the unknown propagation speed $c$. The vector $\mathbf{b}$ may contain one or more of the known sensor positions $\mathbf{x}_n$, the measured TDOAs $\tau_{n0}$, and the unknown propagation speed $c$. Various approaches are presented in detail in the following sections.

5.3. Individual Methods
5.3.1. Extension of the Unconstrained LS Method

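Various authors have reconsidered the unconstrained least squares approach from Section 4.4.1 [49–51]. To this end the square of (28) is written similar to (29) as
$$\mathbf{x}_n^{\mathsf{T}}\mathbf{x} + d_n\,r + \tfrac{1}{2}\,d_n^{2} = \tfrac{1}{2}\,\|\mathbf{x}_n\|^{2} \tag{40}$$
and the range differences $d_n$ are replaced by time differences $\tau_{n0}$ with (27):
$$\mathbf{x}_n^{\mathsf{T}}\mathbf{x} + \tau_{n0}\,(c\,r) + \tfrac{1}{2}\,\tau_{n0}^{2}\,c^{2} = \tfrac{1}{2}\,\|\mathbf{x}_n\|^{2}. \tag{41}$$
The matrix notation of these equations is given by
$$\begin{bmatrix}\mathbf{x}_1^{\mathsf{T}} & \tau_{10} & \tfrac{1}{2}\tau_{10}^{2} \\ \vdots & \vdots & \vdots \\ \mathbf{x}_N^{\mathsf{T}} & \tau_{N0} & \tfrac{1}{2}\tau_{N0}^{2}\end{bmatrix}\begin{bmatrix}\mathbf{x} \\ c\,r \\ c^{2}\end{bmatrix} = \frac{1}{2}\begin{bmatrix}\|\mathbf{x}_1\|^{2} \\ \vdots \\ \|\mathbf{x}_N\|^{2}\end{bmatrix}. \tag{42}$$
A comparison with (31) shows that the vector of unknowns now also contains the propagation speed $c$.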

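An estimate for the source position is obtained from the unconstrained solution
$$\hat{\mathbf{z}} = \mathbf{A}^{+}\mathbf{b}, \tag{43}$$
where $\mathbf{A}$ and $\mathbf{b}$ denote the matrix and the right-hand side of (42) and $\hat{\mathbf{z}}$ the estimated vector of unknowns. The first elements of $\hat{\mathbf{z}}$ give the position estimate, but also the propagation speed can be read from (42) as the square root of the last element of $\hat{\mathbf{z}}$. Note as a caveat that the matrix $\mathbf{A}$ is likely to be ill-conditioned [35, 49, 51, 52]. Moreover, the obtained speed value tends to be unreliable even if the localization result is acceptable. Thus this approach is not useful when propagation speed estimation is the main purpose.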
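The extended unconstrained solution can be sketched as follows. The example assumes noise-free TDOAs, a reference microphone at the origin, and hypothetical random microphone positions. Two choices are made here for illustration and are not taken from the cited works: the unknowns are ordered as source position, speed times source distance, and squared speed, and times are expressed in milliseconds so that the columns of the system matrix have comparable magnitudes.

```python
import numpy as np

rng = np.random.default_rng(1)
c_true = 0.343                                # assumed speed of sound in m/ms
mics = rng.uniform(-1.0, 1.0, size=(8, 3))    # hypothetical positions x_n in m
source = np.array([1.5, 1.0, 0.5])            # hypothetical source position in m

# Noise-free TDOAs in ms with respect to a reference microphone at the origin.
tau = (np.linalg.norm(source - mics, axis=1) - np.linalg.norm(source)) / c_true

# Each row: x_n^T x + tau_n (c r) + 0.5 tau_n^2 c^2 = 0.5 ||x_n||^2
A = np.column_stack([mics, tau, 0.5 * tau**2])
b = 0.5 * np.sum(mics**2, axis=1)

z_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
x_hat = z_hat[:3]                   # source position in m
c_hat = np.sqrt(z_hat[4])           # speed of sound in m/ms
r_hat = z_hat[3] / c_hat            # source distance in m

print("position estimate:", x_hat)
print(f"speed estimate: {1000.0 * c_hat:.3f} m/s")
print(f"condition number of A: {np.linalg.cond(A):.1f}")
```

With exact data the estimates coincide with the true values. The printed condition number indicates how strongly measurement noise would be amplified, which relates to the caveat on ill-conditioning mentioned above.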

5.3.2. Plane Wave Approximation

The method from Section 5.3.1 can be simplified and made more robust by a plane wave approximation of the arriving wavefront [35]. In this case there is no source position but only a direction to a distant plane wave source. Consequently only the corresponding unit vector can be estimated.

From Figure 9 it follows by geometrical reasoning that each TDOA is given by the projection of the sensor spacing onto the direction of arrival, divided by the propagation speed. Again a matrix equation can be set up from these relations. Similar to Section 5.3.1, its solution gives estimates of both the unit vector and the propagation speed. Indeed, this method works well for plane waves, but it gives erroneous estimates for nearby sources; see [35] for an error analysis.
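The plane wave idea can be sketched in a few lines. The sketch below is an illustration under stated assumptions and not the implementation from [35]: the reference sensor sits at the origin, u is the unit vector pointing from the array towards the source, and the TDOAs obey τ_n = -(x_n·u)/c. Fitting the vector s = -u/c by least squares then yields the speed as 1/|s| and the direction as -s/|s|. The function name and the test geometry are hypothetical.

```python
import math

def plane_wave_fit(sensors, tdoas):
    """Fit tau_n = x_n . s for the 2D vector s = -u / c by least squares.

    Returns the unit vector u towards the source and the propagation speed c.
    """
    # Entries of the 2x2 normal equations
    sxx = sum(x * x for x, _ in sensors)
    sxy = sum(x * y for x, y in sensors)
    syy = sum(y * y for _, y in sensors)
    bx = sum(x * t for (x, _), t in zip(sensors, tdoas))
    by = sum(y * t for (_, y), t in zip(sensors, tdoas))
    det = sxx * syy - sxy * sxy
    # Closed-form solution of the 2x2 system
    s = ((syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det)
    norm = math.hypot(*s)
    speed = 1.0 / norm
    direction = (-s[0] / norm, -s[1] / norm)
    return direction, speed

# Ideal plane wave from 30 degrees at 340 m/s (assumed values)
theta, c_true = math.radians(30.0), 340.0
u_true = (math.cos(theta), math.sin(theta))
sensors = [(0.2, 0.0), (0.0, 0.2), (-0.2, 0.1), (0.1, -0.2)]
tdoas = [-(x * u_true[0] + y * u_true[1]) / c_true for x, y in sensors]
direction, speed = plane_wave_fit(sensors, tdoas)
```

Because the model has no source distance, TDOAs from a nearby source violate the assumed linear relation, which is one way to see why the estimates degrade in the near field.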

5.3.3. Exploit the Constraint of the ULS Method

Another extension of the method from Section 5.3.1 is to exploit the constraint of the ULS method [53–55]. In contrast to the CLS method from Section 4.4.2, the constraint is not used to improve the position estimate but to estimate the propagation speed.

Here it is advantageous to reformulate (35) and to introduce the vector of TDOAs. From (27) follows the relation between the range vector and the vector of TDOAs, from (36) the corresponding relation between the projection matrices, and further from (32) the vector of first-order polynomials given in (51).

With suitable abbreviations, the estimates for source position and source distance from (32) can now be formulated in a concise form.

In this form it is straightforward to minimize the constraint (54) by seeking a physically meaningful value of the speed of sound for which the constraint becomes zero. Squaring both sides and rearranging gives (56). Inserting (51) shows that (56) is a cubic polynomial in the unknown. It can be solved in closed form by Cardano’s method for those values which satisfy constraint (54).

5.3.4. Exploit the Constraint of the ULS Method by a Linear Approximation

The method presented in Section 5.3.3 makes no assumptions whatsoever about the unknown propagation speed. In most applications, however, the range of its probable values is known quite well; see, for example, Figure 3. This rough knowledge can be exploited by expanding constraint (54) into a Taylor series (57) around a suitable value, for example, the value at room temperature [44]. The linear approximation (57) of the constraint is much easier to solve than the third-order polynomial (56). The zero of the linearized constraint is given by (58), where the parameters can be expressed with (54).

The constraint from (56) and its linearized version from (57) are compared in Figure 10 for a given actual temperature. This temperature is a zero of the constraint. The zero of the linearized constraint is off by only 0.5 K.

This comparison shows that solving the linearized constraint is a good alternative to solving the third-order equation (56) whenever the expected temperatures lie in a range around room temperature.
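The principle behind the linearization can be written down generically. In the following sketch, g(c) is an assumed name for the constraint as a function of the propagation speed and c_0 is the expansion point; the exact parameterization used in (57) and (58) may differ.

```latex
g(c) \;\approx\; g(c_0) + g'(c_0)\,(c - c_0) \;=\; 0
\quad\Longrightarrow\quad
\hat{c} \;=\; c_0 - \frac{g(c_0)}{g'(c_0)}
```

This is a single Newton step starting from c_0, so the approximation error is small as long as the true propagation speed lies close to the expansion point.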

5.3.5. Exploit the Full TDOA Set

The method presented in Section 5.3.4 uses only one reference microphone, that is, the spherical TDOA set; see Section 4.2. As has been noted in Section 4.1, the full TDOA set contains information which is redundant under ideal conditions. However, for unknown propagation speed, the additional information in the full TDOA set may contribute to a more robust source localization [44].

To this end, constraint (54) is reformulated as a separate constraint for each reference microphone. Calculating the first terms of the Taylor series (57) for all reference microphones and arranging them in vector form gives (59).

The resulting equations for a least squares solution are compiled in Table 2. The left column corresponds to (57) and (58) with a single reference microphone. The right column lists the corresponding relations where all available TDOAs have been used.

5.3.6. Exploit Multiple Sources

Depending on the method to determine the arrival times, the full TDOA set may not always be available. This is the case when the TDOAs are estimated from measured acoustic room impulse responses. Nevertheless, the principle introduced in Section 5.3.5 can still be applied as long as the additional information is supplied in another way.

One possibility is to estimate the speed of sound not from one but from multiple different source locations [44]. The TDOA sets for different reference microphones as in Section 5.3.5 can then be replaced by TDOA sets for different sources.

To be specific, review the relation (60) for the full TDOA set from Table 2. The summation runs over multiple reference microphones, and the two quantities involved represent different estimates for the distance between a single source and the respective reference microphone. If, on the other hand, multiple sources are present, then the constraint can be formulated as (61), where the two quantities are different estimates for the distance between the respective source and the single reference microphone. Due to the formal equivalence between (60) and (61), the speed of sound estimate is calculated similarly to Table 2.

5.3.7. Uncertain Sensor Positions

The possibility of uncertain sensor positions for the unconstrained least squares method from Section 5.3.1 has been considered in [56]. The vectors and matrices from (42) have been used in the general form of the estimation problem (39), which has been augmented by two noise terms (62). These terms denote sensor position errors and sensor noise with an error and a noise covariance matrix, respectively. Then the cost function (63) is established and minimized by a bi-iterative procedure shown in Figure 11.

6. Applications

This section presents selected instances of combined position and speed of sound estimation. The recent literature on different application fields is surveyed for instructive examples.

The most important application field is the improvement of localization results not only for source bearing or source position but also for the localization of sensors or reflectors. Another application of speed of sound estimation is the disambiguation of TDOAs. Knowledge of the approximate propagation speed can help to discard peaks in the cross correlation which are not caused by time differences. Finally the knowledge of the propagation speed or of the air temperature can be of interest in its own right.

6.1. Improvement of Localization Results

Localization from TOAs or TDOAs requires converting time differences into range differences. Thus its accuracy depends on the correct estimation of the speed of sound as has been demonstrated by various experiments.
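A short numerical sketch shows the order of magnitude of this effect. It assumes the common approximation c = 331.3·sqrt(1 + ϑ/273.15) m/s for dry air with ϑ in degrees Celsius, which may differ in detail from the relation used earlier in this article. The function names and the numerical values are illustrative assumptions.

```python
import math

def speed_of_sound(temp_celsius):
    """Approximate speed of sound in dry air (m/s) for a temperature in deg C."""
    return 331.3 * math.sqrt(1.0 + temp_celsius / 273.15)

def range_difference_error(tdoa_s, assumed_temp, actual_temp):
    """Error (m) of a range difference converted with the wrong temperature."""
    return (speed_of_sound(assumed_temp) - speed_of_sound(actual_temp)) * tdoa_s

c20 = speed_of_sound(20.0)   # assumed standard condition
c26 = speed_of_sound(26.0)   # actual condition, 6 K warmer
err = range_difference_error(0.010, 20.0, 26.0)   # TDOA of 10 ms
```

For a TDOA of 10 ms, a mismatch of 6 K changes the range difference by a few centimetres, which is relevant for array calibration and reflector localization.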

6.1.1. Source Localization in a Synchronized Array

The estimation of the source bearing in outdoor situations has been investigated in [57]. Since a non-Gaussian TDOA error distribution is assumed, the TDOAs are estimated by an expectation maximization algorithm. Here the value for the speed of sound is calculated from assumed values of the temperature, wind speed, and wind direction.

6.1.2. Source Localization in a Distributed Sensor Network

So far, the problem of localizing acoustic sources by means of a compact sensor array has been considered. However the growing availability of devices endowed with multiple sensors and wireless communication capability allows addressing localization problems by exploiting distributed networks of multiple compact arrays (known as network nodes). A common issue in this scenario is the lack of synchronization between network nodes; as a consequence TDOAs can be calculated only between sensors of the same node.

Such a problem is typically solved by implementing a collaborative approach between the unsynchronized arrays as described, for example, in [7]. A common assumption is that the propagation speed is a known constant; moreover it is implicitly assumed that the propagation speed is the same for each node. This means that at the central node the same speed value is used to convert the TDOAs into range differences to be used for source localization regardless of the node of origin (see Figure 12). This is accomplished by minimization of a cost function based on a hypercone equation metric [7, Eq. ].

Nonetheless, wireless acoustic sensor networks (WASN) might present a different speed of sound value at each node due to local temperature variations caused, for example, by device overheating, proximity to indoor heat sources, and different outdoor environmental conditions. State-of-the-art techniques that rely on the conversion of TDOAs into range differences may easily improve their accuracy and robustness by estimating the actual propagation speed at each node from the collected TDOAs. With this approach, local speed of sound variations will be compensated without interfering with the final localization carried out at the central node as depicted in Figure 13. To this end, the hypercone equation metric associated with Figure 12 can be calculated with improved range differences for each node. In particular, the range differences for each node are calculated from the selected TDOAs with the node-specific speed of sound value.

6.1.3. Improvement of Sensor Localization

A sensor localization experiment under unknown speed of sound conditions has been described in [55]. The unknown positions of four microphones were estimated from signals in response to six different source positions. From estimated TOAs the sensor positions were determined by two different approaches: first using a constant standard value for the speed of sound and second using a propagation speed estimate similar to Section 5.3.1. The actual room temperature varied during the course of the measurements. Even with a small temperature difference of five to seven degrees, the sensor location estimates turned out to be considerably more accurate with the second approach.

6.1.4. Improvement of Reflector Localization

The effect of temperature variations on the localization of reflectors has been evaluated in [44, 58]. An array of five microphones and a set of four loudspeakers were used to estimate the acoustic impulse responses in an almost square enclosure. The estimation was based on the multiple source approach from Section 5.3.6. The positions of the walls were estimated from reflections extracted from the acoustic impulse responses. Mismatches between the actual and the assumed room temperature showed up as misestimations of the room size. An error analysis showed that reflector localization in larger rooms is more sensitive to temperature variations than in small rooms.

6.2. TDOA Disambiguation

The determination of TDOAs from the peaks of cross-correlations is unreliable when the acoustic sensor signals are collected in adverse environments. Also self-similarities in source signals lead to peaks which are not related to source TDOAs. The distinction and exclusion of peaks caused by reverberation and self-similarity is called TDOA disambiguation. Its aim is to keep only those peaks that are related to direct path TDOAs.

Approaches to TDOA disambiguation are based on the correlation among the outputs of blind-source-separation algorithms [59], statistical models of the acoustical propagation delay [60, 61], consistent graph synthesis based on the zero-cyclic sum condition [16, 37], and speed of sound estimation [35].

These approaches have different kinds of drawbacks: statistical models require high computing power, the zero-cyclic sum condition does not distinguish between direct and reflected paths, and the method based on speed of sound estimation [35] is formulated for plane waves only.

However, a two-stage combination has been shown to be efficient for TDOA disambiguation [62]. First, consistent graph synthesis removes all correlations which do not satisfy the zero-cyclic sum condition; that is, it keeps only direct and reflected path TDOAs. Then the speed of sound value for each path is estimated as described in Section 5.3.1. The relation between speed of sound and the travelled distance is used to distinguish between direct and reflected path TDOAs. The reflected path TDOAs are removed such that only direct path TDOAs are left.
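The first stage can be illustrated for a single loop of three microphones. The sketch below is a simplified illustration and not the consistent graph synthesis of [16, 37, 62]: with the assumed convention τ_ij = t_j − t_i, the zero-cyclic sum condition requires τ_01 + τ_12 − τ_02 = 0, and only candidate combinations that satisfy it within a tolerance are kept. The function name, the tolerance, and the candidate values are hypothetical.

```python
from itertools import product

def consistent_triples(cand_01, cand_12, cand_02, tol=1e-4):
    """Keep TDOA candidate triples with tau_01 + tau_12 - tau_02 close to zero."""
    return [
        (t01, t12, t02)
        for t01, t12, t02 in product(cand_01, cand_12, cand_02)
        if abs(t01 + t12 - t02) <= tol
    ]

# Candidate correlation peaks in seconds; the second entry of each list is spurious
cand_01 = [0.0012, 0.0031]
cand_12 = [-0.0007, 0.0020]
cand_02 = [0.0005, -0.0026]
kept = consistent_triples(cand_01, cand_12, cand_02)
```

In this example only the combination (0.0012, −0.0007, 0.0005) survives. A reflected path can also produce a consistent combination, which is why the second stage based on speed of sound estimation is needed.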

6.3. Determination of the Mean Air Temperature

Acoustic sensor arrays can also be applied to determine the mean air temperature. The term mean temperature is used in the same sense as the mean propagation speed in Section 3.5; that is, the mean temperature along the propagation path. This mean value cannot be obtained by point measurements with dedicated temperature sensors (thermometers).

A corresponding experiment is described in [44]. An array of ten acoustic sensors is used for source localization of an acoustic source which is moved to 48 equidistant positions around a circle. The source position does not need to be calibrated because the focus is not on the source position but on the speed of sound which minimizes constraint (57). The corresponding temperature can be inferred from the linear approximation shown in Figure 10. Exploiting the full TDOA set as in Section 5.3.5 allows a more robust estimation.
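The conversion from an estimated speed of sound to a temperature can be sketched as follows. The sketch assumes the dry-air approximation c = 331.3·sqrt(1 + ϑ/273.15) m/s, solved for ϑ in degrees Celsius; this is an assumption and not necessarily the relation underlying Figure 10. The function name is hypothetical.

```python
def temperature_from_speed(speed_m_s):
    """Invert c = 331.3 * sqrt(1 + T / 273.15) for the temperature T in deg C."""
    return 273.15 * ((speed_m_s / 331.3) ** 2 - 1.0)

temp = temperature_from_speed(343.21)                 # close to 20 deg C
sensitivity = temperature_from_speed(344.21) - temp   # effect of +1 m/s
```

Near room temperature, an error of 1 m/s in the speed estimate corresponds to roughly 1.7 K, so a temperature accuracy of ±1 K requires the speed of sound to be estimated to within about ±0.6 m/s.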

Figure 14 shows the results for two different methods, the minimization of the linearized constraint for the full TDOA set from Table 2 and the plane wave approach from [35]. The experiment was conducted twice at two different room temperatures. These temperature values were determined by point sensor measurements and do not constitute a ground truth for the mean temperature along the path. Nevertheless, under the controlled conditions of a laboratory room, they can be expected to be a good reference.

The temperature readings of the plane wave approach are strongly dependent on the source angle, with four minima and maxima in between. There is a vertical offset between both curves, but the angular variation exceeds this offset by far. It is thus not possible to infer reliable temperature information from the plane wave approach.

The estimated temperature from the linearized constraint for the full TDOA set shows a variation of ±1 K around the mean value of each of the two experiments. These mean temperatures are compatible with the point measurements. This example shows that the mean temperature can indeed be determined as a by-product of source localization.

6.4. Determination of the Speed of Sound

As in TDOA disambiguation (Section 6.2), the speed of sound is an important parameter for distinguishing between different propagation media or types of waves in solids. The following examples are not directly related to wireless acoustic sensor networks, but they show how speed of sound estimates can be exploited to obtain quite diverse information from time delay measurements.

6.4.1. Biomedical Imaging

Speed of sound estimation for biomedical ultrasound imaging by synthetic aperture sequential beamforming is investigated in [63]. In these applications, the speed of sound depends on the kind of tissue under investigation and cannot be inferred from simple laws like (5). For the backscattering geometry of the ultrasound beamformer, a relation between propagation time and length of the travel path is established. Its square describes a parabola for the unknown speed of sound similar to (41). Starting from an initial guess, the speed of sound estimate is obtained by an iterative curve fitting procedure.

6.4.2. Building Technology

A method to localize footsteps in buildings has been developed in [64]. It is based on TOA and TDOA estimation by wave propagation within the structure of buildings. The speed of sound in typical building materials is well known and not subject to large environmental influences. However, the kind of relevant materials depends on the unknown location. Therefore a set of physically plausible propagation speeds is used for classical TDOA-based source localization and the best fit to an underlying model is selected.
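The selection among a set of plausible propagation speeds can be illustrated by a toy example. The sketch below is not the method of [64]: it simply evaluates a TDOA residual on a position grid for every candidate speed and returns the combination with the smallest residual. The sensor layout, the candidate speeds, and all function names are assumptions made for illustration.

```python
import math

def tdoa_residual(sensors, tdoas, point, speed):
    """Sum of squared differences between modelled and measured TDOAs.

    TDOAs are taken relative to the first sensor in the list.
    """
    d_ref = math.dist(point, sensors[0])
    return sum(
        ((math.dist(point, s) - d_ref) / speed - t) ** 2
        for s, t in zip(sensors[1:], tdoas)
    )

def best_speed_and_position(sensors, tdoas, candidate_speeds, grid):
    """Return (residual, speed, position) with the smallest residual."""
    return min(
        ((tdoa_residual(sensors, tdoas, p, c), c, p)
         for c in candidate_speeds for p in grid),
        key=lambda item: item[0],
    )

# Noise-free synthetic data on a 4 m x 4 m floor (assumed values)
sensors = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0), (2.0, 0.5)]
source, true_speed = (1.5, 2.5), 1500.0
tdoas = [(math.dist(source, s) - math.dist(source, sensors[0])) / true_speed
         for s in sensors[1:]]
grid = [(0.5 * i, 0.5 * j) for i in range(9) for j in range(9)]
residual, speed, position = best_speed_and_position(
    sensors, tdoas, [500.0, 1500.0, 3000.0], grid)
```

An exhaustive grid search is chosen here only because it is easy to follow; any TDOA-based localization method could be run once per candidate speed instead.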

A nondestructive testing method for structural monitoring of buildings is discussed in [27]. It is based on sound propagation in solids and is intended to reveal the existence and location of cracks. Solids support not only longitudinal waves (like sound waves in air) but also transverse waves. These different types of wave propagation are distinguished by their respective propagation speeds based on a joint speed and position estimation from [51].

7. Conclusion

Variations of the speed of sound along the propagation path from source to receiver may affect source localization results from microphone recordings. The main causes of variations in the speed of sound are temperature differences along the propagation path. This effect is negligible in controlled indoor environments, in particular when air conditioning is in effect.

Temperature differences are more likely when acoustic sensor networks are deployed in multizone buildings or in outdoor environments. Then two possible effects can be expected: the propagation path might deviate from a straight line and the relation between time differences and range differences is unknown.

Theoretical considerations have shown that temperature variations in natural environments do not cause a tangible deviation from straight propagation paths. Therefore ray models for sound propagation are a valid assumption also under temperature variations.

On the other hand, even moderate temperature variations affect the relation between time and range differences and thus impair the accuracy of localization results. This effect can be counteracted by considering the speed of sound as an additional unknown variable in algorithms for position estimation. A variety of different approaches have been presented.

Wireless acoustic sensor networks open up new applications beyond the familiar indoor speech communication scenario. Examples are mobile devices for mixed indoor and outdoor use, outdoor devices which are robust against day/night or seasonal variations, or speech control for devices in outdoor or thermally stressed environments. In all these cases, unknown and variable propagation speed is an issue.

This article has tried to create awareness of this problem and has collected some possible solutions. However, many research topics are still open: there are no comprehensive comparisons of the different estimation methods for variable speed of sound conditions. For several methods, theoretical bounds on their accuracy have been investigated, but application-specific experimental validation is missing. A comparison with related work in underwater acoustics is also of interest.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.