Abstract

Acoustic source mapping techniques using acoustic sensor arrays and delay-and-sum beamforming suffer from poor spatial resolution at low aperture-based Helmholtz numbers. This is especially a problem for three-dimensional map grids when the sensor array is not arranged around the region spanned by the grid but on only one side of it. Then, the spatial resolution of the result map in the direction pointing away from the array is much worse than in the lateral directions. Consequently, deconvolution techniques need to be applied. Some of the most efficient deconvolution techniques rely on the properties of the spatial beamformer filters used. As these properties are governed by the steering vectors, four different steering vector formulations from the literature are examined, and their theoretical background is discussed. It is found that none of the formulations provides both the correct location and the correct source strength. As a practical example, the CLEAN-SC deconvolution methodology is applied to simulated data for a three-source scenario. It is shown that the different steering vector formulations are not equally well suited for three-dimensional application. The two preferred formulations enable the correct estimation of the source location at the cost of a negligible error in the estimated source strength.

1. Introduction

In the context of acoustic measurements, methods based on acoustic sensor arrays can be used to locate acoustic sources and to estimate their strength [1]. In most cases, these methods are adopted to produce acoustic source maps. In general, such maps can be thought of as an image of the spatial distribution of an indicator quantity of source strength.

Acoustic source mapping techniques using beamforming methods have been widely applied for the study of acoustic sources (e.g., for trains [2], aeroacoustic testing [3, 4], airframe noise [5, 6], noise source characterization at a helicopter [7], and jet noise [8]). These methods use the signals from an array of acoustic sensors (mostly microphones) to filter out the signal from a source at an assumed location. Such a spatial filter behaves like a directional sound receiver with a directional characteristic that favours sound emanating from the assumed source location [9]. If several of such spatial filters are applied in parallel for a number of different assumed source locations, an acoustic source mapping may be generated from the filter outputs. Usually the assumed source locations are arranged in some sort of grid and the amplitude of the output from each individual filter is mapped on the respective grid location. Thus, the filters constitute a mapping device.

The map produced by this device is an image of the spatial source distribution. Ideally, it should have the following properties: if a certain assumed source location in the grid coincides with an actual source location, the map shows a higher value at this location. If there is no coincidence of assumed and actual source location, the map shows a lower value. Moreover, stronger sources should result in higher values in the map. Thus, the map provides information about the location of the source, and it allows the strength of the source to be estimated.

The feasibility of this approach depends on the properties of the beamformer mapping device as given by the point spread function. The point spread function is the spatial impulse response of the beamformer. It can be thought of as the map that is produced if only one single point source is present at a certain location. It shows the image of the source as a spot at the source location (main lobe) that is accompanied by a number of spots (side lobes) at other locations and lower in level. This mapping is imperfect for two reasons. First, the width of the main lobe limits the spatial resolution because sources that are too close to each other will produce a mapping very similar to that of a single source. Second, the images of weaker sources may be masked by the side lobes of a stronger source. The point spread function depends on a number of factors: the number and the geometrical layout of the array microphones, the array aperture, the frequency, and the type and properties of the filter. It also depends on the source location.

A number of deconvolution techniques have been developed to recover the true spatial source distribution from the beamformer result by removing the influence of the point spread function. Some rely on precalculated point spread functions (e.g., DAMAS [10]), whereas other approaches assess the point spread function from the acoustic data recorded during the measurement (e.g., CLEAN-SC [11]). A critical point, especially for the latter techniques, is that they require the maximum in the map to coincide with an actual source position.

The beamforming source mapping approach is typically applied using a planar two-dimensional grid. In this case, all sources are mapped into one plane regardless of their actual position. In situations where the acoustic sources under test are not in a common plane (e.g., for complex machinery parts, engines, and some aeroacoustic sources such as landing gear and pantographs), this leads to an erroneous source mapping. Therefore, a three-dimensional mapping is desirable that allows source localisation in three dimensions. In principle, the three-dimensional application of beamforming techniques is straightforward and can be easily realised by using a three-dimensional grid [12–20]. However, there appear to be some practical problems in the application.

First, the resolution in the third dimension (depth-wise) is much worse than in the other dimensions unless the microphone array arrangement encloses the source region to be mapped. The second problem is the larger number of points in a three-dimensional grid. While for a two-dimensional mapping a few thousand grid points may be sufficient, a three-dimensional grid can easily have several hundred thousand points. Because one filter per grid point is required to calculate the result, the computational effort increases considerably. That is why most applications use only a few tens of thousands of grid points.

As with two-dimensional map grids, the resolution in the third dimension can also be improved using deconvolution techniques. The computational effort connected with these techniques is generally high and depends on the number of grid points. Those deconvolution techniques that require precalculated point spread functions (e.g., DAMAS [10]) also require solving a system of equations with as many unknowns as there are grid points. The effort for the estimation of the point spread function increases with the fourth power of the number of grid points in the map grid. The effort to solve the system of equations itself grows with the second to third power. Thus, the large number (hundreds of thousands or more) of grid points required for a three-dimensional mapping with fine spatial resolution renders the application of these techniques inefficient. A possible solution that has been proposed [21] is to assume that the point spread function is shift invariant. The effort would then be reduced substantially: it would increase only with the second power of the number of grid points for the estimation of the point spread function and, using the fast Fourier transform, with somewhat less than the second power for the solution of the system of equations. However, this approach is limited to cases where the source region is small compared with its distance to the array [21]. Thus, it is not applicable to the general case of a larger source region.

Computationally less demanding deconvolution techniques such as CLEAN-SC, which do not require solving a huge system of equations, in turn need to find sources by maximums in the three-dimensional map. Therefore, a final problem in the three-dimensional application of beamforming mapping techniques is to find spatial filters that have the desired properties also in three dimensions to provide these maximums. While this problem arises specifically for deconvolution methods that require maximums in the map to coincide with acoustic sources, it is relevant because the computational cost of these methods increases only roughly linearly with the number of grid points. This makes these techniques most appropriate for three-dimensional application and allows for practical application with several hundred thousand grid points [17–19].

The problem of desired beamformer filter properties shall be considered here by comparing different spatial filter characteristics given by different steering vectors and analysing their properties with regard to three-dimensional beamforming with deconvolution. In the remainder of this contribution the theoretical basis for the beamforming methods is briefly presented, and four different spatial filter characteristics are discussed. Their properties are demonstrated using a single-source scenario as an example. Finally, a slightly more realistic case is considered. Simulated data for three-dimensional source mapping is analysed using the different beamformers and the CLEAN-SC deconvolution approach. The results are compared with results from two-dimensional source mapping.

2. Theory

First, the analysis of a single sound source located at using an array of microphones is assumed. The complex-valued sound pressure at the -th microphone at is The strength of the source is characterised by the sound pressure at the reference location due to that source. Though in principle can be freely chosen, for the purpose of the following analysis it is set to the array centre at . The transfer function depends on the type of the source, its location , and the environmental conditions. If a monopole source under free-field conditions is assumed and no flow is present, the transfer function is given by: with and indicating the distance between the source and the microphone location and the distance between the source and the array centre location, respectively, and is the wave number. The vector of sound pressures at the microphones due to a source at is given by . The transfer vector contains all respective transfer functions and accounts for the individual time delays and attenuations of the sound that travels from the source to the microphones.
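The monopole model described above can be sketched numerically. The following is a minimal sketch, assuming free-field propagation without flow and a source strength defined as the sound pressure at the reference point; the function name `transfer_vector` and its argument names are illustrative choices and are not taken from the original.

```python
import numpy as np

def transfer_vector(mics, source, ref, k):
    """Free-field monopole transfer functions, normalised to the reference point.

    mics   : (N, 3) microphone positions
    source : (3,) source position
    ref    : (3,) reference position (here: the array centre)
    k      : wave number
    """
    mics = np.asarray(mics, dtype=float)
    source = np.asarray(source, dtype=float)
    ref = np.asarray(ref, dtype=float)
    r_mic = np.linalg.norm(mics - source, axis=1)  # source-to-microphone distances
    r_ref = np.linalg.norm(ref - source)           # source-to-reference distance
    # amplitude ratio and phase delay relative to the reference point
    return (r_ref / r_mic) * np.exp(-1j * k * (r_mic - r_ref))
```

With this normalisation, a microphone placed exactly at the reference point has a transfer function of one, so that the pressure vector at the microphones is the transfer vector multiplied by the source strength.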

The beamformer filter is realised by calculating the weighted sum of the microphone sound pressures using complex-valued weight factors. The vector of these factors is called the steering vector and depends on an assumed source location . The filter output is then: where the superscript denotes the hermitian transpose. Instead of , the real-valued autopower spectrum of the filter output can be used as a quantity to construct a source map. Using the cross spectral matrix of microphone signals , it can be written as with denoting the expectation operator and the superscript denoting the complex conjugate.
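The weighted sum and its autopower can be illustrated with a few lines of code. This is a minimal sketch, assuming that the cross spectral matrix is estimated by averaging over a set of pressure snapshots; the function names `beamformer_power` and `csm_from_snapshots` are hypothetical.

```python
import numpy as np

def beamformer_power(h, G):
    """Autopower of the filter output, h^H G h, for steering vector h and CSM G."""
    h = np.asarray(h, dtype=complex)
    G = np.asarray(G, dtype=complex)
    # np.vdot conjugates its first argument, which gives h^H (G h)
    return np.vdot(h, G @ h).real

def csm_from_snapshots(P):
    """Estimate the cross spectral matrix E{p p^H} from snapshots P, shape (n_snapshots, N)."""
    P = np.asarray(P, dtype=complex)
    # average of the outer products p p^H over all snapshots
    return P.T @ P.conj() / P.shape[0]
```

For a single snapshot the result equals the squared magnitude of the filter output, which shows that the autopower is real-valued and nonnegative.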

Two properties of the beamformer filter are desirable for a successful application in the case of acoustic source mapping. First, the filter should provide maximum output power when the assumed and the actual source position are the same: This property is essential to have a map showing the peak value at the source position. Second, if the assumed and the actual source coincide, the filter output should be a measure of the source strength. This is true, if holds, where is an arbitrary constant.

The properties of the beamformer filter are governed by the steering vector. This vector depends on the assumed source location also referred to as steering location. While not further considered here, it can be noted that it may also depend on the measured data itself and adapt the filter properties to the data. Sometimes the distance between array and source is very large and a plane wave propagation can be assumed. In this case, is replaced by the direction of arrival and the distance between array and source is set to be practically infinite. However, for three-dimensional application this approach is not feasible.

An in-depth examination of the literature on acoustic beamforming reveals that there are at least four choices available regarding the formulation of the steering vector elements under these circumstances. In the following, these formulations are presented and analysed regarding the desirable properties stated in (5) and (6).

2.1. Formulation I

The most basic idea is to simply compensate for the phase delay [22] between assumed source and the individual microphone. The steering vector elements are then derived from the phase part of the transfer vector elements for the assumed source location: The latter part of this equation holds if the transfer function from (2) is assumed and and (see Figure 1). Given this steering vector formulation that shall be referred to here as “formulation I”, the output power of the beamformer filter can be derived for the case : Because , the output is an estimate of the source strength. However, it becomes obvious that the condition (6) is met only approximately.

Condition (5) requires a local maximum of at . A necessary condition for this is that all partial derivatives of with respect to the elements of are zero. For this, it is sufficient to have This holds for the partial derivatives that can be computed when the steering vector given by (7) is applied in (4). Thus, the necessary condition to have a maximum response at the source location is met in this case.

2.2. Formulation II

Another formulation of the steering vector that is frequently used in the literature (e.g., [3, 5]) aims at compensating also for the amplitude: This formulation assures that condition (6) is met because . However, this comes at the cost that the derivatives in (9) do not vanish. There is no maximum at , and consequently condition (5) is not met.

2.3. Formulation III

The third formulation of the steering vector [23] that should be discussed is based on the idea that signals from the assumed source position should pass undistorted through the filter, so that . At the same time signals from all other positions should be attenuated as much as possible. This is equivalent to minimising the filter response to spatially white noise [24]. By solving this optimisation problem, the formulation arises. The steering vector here is parallel to when . Again, this formulation does not meet condition (5).

2.4. Formulation IV

Formulation IV introduces a steering vector that is also parallel to when but uses a normalisation to ensure that remains constant. This formulation is usually developed via a least square minimisation of the error between modelled and measured sound pressures at the microphones (e.g., [4, 8]). Using the normalisation , the formulation reads In this case condition (5) is met, but the response for is only an approximate measure of the source strength:
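The four formulations can be compared side by side in code. The sketch below is one reading of the verbal descriptions above under the free-field monopole model: phase compensation only (I), additional amplitude compensation (II), the undistorted-response solution a/(a^H a) (III), and a vector parallel to the transfer vector with h^H h = 1/N (IV). The 1/N scaling in formulations I and II, the function name, and the argument names are assumptions made for illustration.

```python
import numpy as np

def steering_vector(mics, focus, ref, k, formulation):
    """Steering vector for an assumed source location `focus` (formulations "I" to "IV")."""
    mics = np.asarray(mics, dtype=float)
    focus = np.asarray(focus, dtype=float)
    ref = np.asarray(ref, dtype=float)
    r_i = np.linalg.norm(mics - focus, axis=1)  # focus-to-microphone distances
    r_0 = np.linalg.norm(ref - focus)           # focus-to-reference distance
    n = len(r_i)
    phase = np.exp(-1j * k * (r_i - r_0))       # phase part of the transfer functions
    if formulation == "I":      # phase compensation only
        return phase / n
    if formulation == "II":     # phase and amplitude compensation
        return (r_i / r_0) * phase / n
    if formulation == "III":    # a / (a^H a)
        return phase / (r_0 * r_i * np.sum(r_i ** -2.0))
    if formulation == "IV":     # a / sqrt(N a^H a), so that h^H h = 1/N
        return phase / (r_i * np.sqrt(n * np.sum(r_i ** -2.0)))
    raise ValueError("formulation must be 'I', 'II', 'III' or 'IV'")
```

Under these assumptions, formulations II and III return a unit response when the steering location equals the source location, while formulations I and IV keep h^H h fixed at 1/N for every steering location.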

2.5. Comparison of Formulations

It can be concluded that in theory none of these four formulations of the steering vector has both properties desirable for the application of the beamformer to acoustic source mapping. The first formulation (7) and the fourth formulation (12) provide the correct location but produce an error in the source strength, whereas the second formulation (10) and the third formulation (11) provide the correct source strength, but their maximum does not coincide with the correct location. However, it should be borne in mind that for practical application it suffices to get approximate estimates of both location and source strength with acceptable accuracy.
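The trade-off in source strength can be made explicit under the monopole model. As a sketch in notation assumed here (not necessarily that of the original equations), let $r_i$ and $r_0$ denote the distances from the source to microphone $i$ and to the array centre, $a_i = (r_0/r_i)\,e^{-jk(r_i - r_0)}$ the transfer functions, and $N$ the number of microphones. If the steering location equals the source location, the four formulations, in the form assumed here, give the following filter responses:

```latex
\begin{align*}
\text{I:}\quad   & h_i = \frac{1}{N}\,\frac{a_i}{|a_i|},
  & \mathbf{h}^H\mathbf{a} &= \frac{1}{N}\sum_{i=1}^{N}\frac{r_0}{r_i}, \\
\text{II:}\quad  & h_i = \frac{1}{N}\,\frac{a_i}{|a_i|^2},
  & \mathbf{h}^H\mathbf{a} &= 1, \\
\text{III:}\quad & \mathbf{h} = \frac{\mathbf{a}}{\mathbf{a}^H\mathbf{a}},
  & \mathbf{h}^H\mathbf{a} &= 1, \\
\text{IV:}\quad  & \mathbf{h} = \frac{\mathbf{a}}{\sqrt{N\,\mathbf{a}^H\mathbf{a}}},
  & \mathbf{h}^H\mathbf{a} &= \sqrt{\frac{1}{N}\sum_{i=1}^{N}\frac{r_0^2}{r_i^2}}.
\end{align*}
```

The output power is $|\mathbf{h}^H\mathbf{a}|^2$ times the autopower of the source strength, so the level error at the source location is $20\lg|\mathbf{h}^H\mathbf{a}|$ dB. It is zero for formulations II and III and is small for formulations I and IV whenever the microphone distances $r_i$ differ only little from $r_0$.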

In contrast to the theoretical model used up to here, in a practical scenario there is often more than one source. The beamformer output is then a superposition of contributions from the individual sources. Thus, the presence of multiple sources has an impact on the performance of the beamformer filter (see [23]). If the signals from these sources are mutually uncorrelated, is the sum of their nonnegative contributions. In such a case the results will deviate somewhat from those of the mathematical analysis presented for the one source scenario. Any conclusions that were derived for the properties of the different formulations regarding conditions (5) and (6) are not rigorously valid, but only approximately. This becomes especially important when sources are closely spaced with distances less than the wavelength. As a consequence, the low-frequency application of source-mapping beamforming techniques leads to results of limited value. Appropriate deconvolution techniques may improve the results in this case and are frequently applied for this reason.

While for all four formulations of the steering vector meaningful practical results have been reported for acoustic source mapping applications, there are no results available yet that compare the mappings produced by the application of these different steering vectors. Moreover, the vast majority of these applications use two-dimensional mapping grids and assume a priori that all relevant sources are located in the mapping plane. A three-dimensional mapping grid does not need this assumption but will usually require a deconvolution technique to deal with the otherwise inadequate depth-wise resolution.

The large number of points in a three-dimensional mapping grid calls for a deconvolution technique that is computationally efficient. Computationally less demanding deconvolution techniques, like CLEAN-SC, rely on the correct estimation of the source location from the maximum in the map. Because of this requirement, it is important to assess the quality of the results that follow the application of the different formulations of the steering vector for three-dimensional source mapping. In what follows, the effect of using different steering vector formulations shall be analysed for both two- and three-dimensional acoustic source mapping on the basis of simulated measurements.

3. Results

Most practical applications are concerned with sources that are not compact but spatially extended. It is commonly assumed that such sources can be seen as spatial distributions of uncorrelated point sources. Thus, in order to analyse the different steering vectors, no extended source is considered here but a simple point source scenario. Results for practical cases with extended sources that use three-dimensional mapping can be found elsewhere [18, 19, 25].

The source mappings that should be discussed here are based on simulated data that was generated using the set-up shown in Figure 2. It uses a 64 microphone array to analyse three point sources. The microphone array is planar and has a layout that consists of 7 spiral arms that contain 9 logarithmically spaced microphones each and an additional microphone in the centre. The aperture of the array, defined by the diameter of the smallest circle containing all microphones, is used as a scaling parameter in the analysis. All coordinates and lengths in the analysis are nondimensionalised by the aperture. Frequencies are given in nondimensional form as Helmholtz numbers with being the wavelength and being the speed of sound.

The positions of the sources are chosen randomly to span a region with lateral dimensions comparable to the array itself. The distance to the array varies between 0.75 and 1.25 apertures. This choice was made because the application of three-dimensional source mapping is most interesting when distances of the sources to the array are somewhat different. However, very large ratios of the distances (i.e., one source very close and another source far away) are not likely to appear in a practical application. Altogether the scenario is somewhat representative of a situation where the object under test has dimensions comparable to the aperture. Typically this would allow for the beamforming analysis at frequencies with wavelengths much smaller than the object. An aperture much larger than the object is desirable for the analysis at lower frequencies. When the application of large aperture arrays is not practicable, the analysis requires deconvolution methods. As results from deconvolution are of special interest here, a scenario representative of this case was chosen as an illustrative example.

The simulated microphone signals were calculated using a transfer function similar to (2). The point sources were driven by simulated white noise signals from different Gaussian random processes to ensure that they are not coherent. All three sources had the same power. Nevertheless, because of the different distances to the array plane and therefore also to the array centre, the relative sound pressure levels at the array centre due to the individual sources were different: 0 dB, 1.7 dB, and −2.2 dB for source A, B, and C, respectively.

The array was placed in the plane and two different grids were used. The first grid used for the three-dimensional source mapping covered a block-shaped region with , , and . It had a uniform grid spacing of 1/32 aperture and the overall number of grid points was 257,725. The second grid for two-dimensional source mapping in a plane parallel to the array had the same spacing and had the same extent , , but for . The overall number of points in this case was 4225. The simulated microphone signals were sampled at a rate that corresponds to . A fast Fourier transform with prior von Hann weighting was applied for every channel to 1000 consecutive, 50% overlapping blocks of 1024 samples each. All 64² cross spectra were calculated and averaged over the 1000 blocks to produce the cross spectral matrix.
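The block-wise estimation of the cross spectral matrix can be sketched as follows. This is a minimal sketch of averaging over overlapping, von Hann weighted blocks as described above; the function name, the array layout, and the omission of any window or spectral scaling are assumptions made here and not details from the original processing.

```python
import numpy as np

def cross_spectral_matrix(signals, block_size=1024, overlap=0.5):
    """Averaged cross spectra of multichannel time data.

    signals : (n_channels, n_samples) real-valued time signals
    returns : (block_size // 2 + 1, n_channels, n_channels) complex array,
              one Hermitian matrix per frequency line (unscaled)
    """
    signals = np.asarray(signals, dtype=float)
    n_ch, n_samples = signals.shape
    step = int(block_size * (1.0 - overlap))       # hop size between blocks
    n_blocks = (n_samples - block_size) // step + 1
    if n_blocks < 1:
        raise ValueError("signal is shorter than one block")
    window = np.hanning(block_size)                # von Hann weighting
    csm = np.zeros((block_size // 2 + 1, n_ch, n_ch), dtype=complex)
    for b in range(n_blocks):
        seg = signals[:, b * step:b * step + block_size] * window
        spec = np.fft.rfft(seg, axis=1)            # shape (n_ch, n_freq)
        # accumulate the outer product p p^H for every frequency line
        csm += np.einsum('if,jf->fij', spec, spec.conj())
    return csm / n_blocks
```

With blocks of 1024 samples and 50% overlap, 1000 blocks correspond to 999 × 512 + 1024 = 512,512 samples per channel, and 64 channels yield 64² = 4096 cross spectra per frequency line.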

The quality of an acoustic source mapping can be determined by evaluating the errors in the source levels and source locations that are estimated using the mapping result. According to the definition in (1), the source level, of a certain source is defined as the sound pressure level calculated from the sound pressure at the array centre caused by that source ( is the reference sound pressure). The estimated source level is then and the error in this level is given by , where is taken to be the true level.

The source location is a vector quantity and involves three components. While the error here could be given as the distance between the true and the estimated location, this quantity is always positive and contains no information on the spatial arrangement. Instead, the error in the estimated source location shall be defined here as an error of the distance between source and the array centre. It is given by , where and are the estimated and true distances, respectively. As it is reasonable to assume that this error will also increase with the distance, it is used here in the normalised form . As the errors in source location and level also depend on the frequency, they are discussed here regarding their dependence on the Helmholtz number.
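The two error measures can be written down compactly. The following is a minimal sketch, assuming that the level is computed from the autopower of the sound pressure at the array centre and that the distance error is normalised by the true distance; the reference pressure of 2e-5 Pa is the customary value for airborne sound and, like the function names, is an assumption made here.

```python
import numpy as np

P_REF = 2e-5  # reference sound pressure in Pa (assumed, customary value for air)

def source_level(autopower, p_ref=P_REF):
    """Sound pressure level in dB from the pressure autopower at the array centre."""
    return 10.0 * np.log10(autopower / p_ref ** 2)

def level_error(level_estimated, level_true):
    """Error of the estimated source level in dB."""
    return level_estimated - level_true

def relative_distance_error(x_estimated, x_true, centre):
    """Signed error of the source-to-centre distance, normalised by the true distance."""
    centre = np.asarray(centre, dtype=float)
    r_est = np.linalg.norm(np.asarray(x_estimated, dtype=float) - centre)
    r_true = np.linalg.norm(np.asarray(x_true, dtype=float) - centre)
    return (r_est - r_true) / r_true
```

A negative distance error then indicates a source that is mapped closer to the array centre than it really is, which is the information a plain Euclidean distance between true and estimated location would not provide.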

3.1. Single Source

In the first test case, only source A was operated. In this simple single source scenario it is feasible to use the classic beamforming approach (4) and to do without a deconvolution technique. In Figure 3, the results for all four formulations of the steering vector (I–IV) are compared for . The source mapping itself is three-dimensional. However, for clarity of presentation only a slice of the mapping along the plane (perpendicular to the array plane) is shown which contains the true source location. The four formulations obviously lead to different mapping results. In agreement with the theoretical analysis, both formulations I and IV meet the condition (5) that the maximum in the map coincides with the actual source position. For both formulations II and III the maximums in the map are situated somewhere between the actual source position and the array centre. Thus, it would not be possible to estimate the exact source position from the maximum in the source mapping. The actual source position is located on the 0 dB contour for both formulations II and III. In agreement with theory, this shows that condition (6) is met.

For formulations I and IV, the error in source location shown in Figure 4(a) is zero regardless of the Helmholtz number, whereas for formulations II and III this error becomes less than 5% only above . However, the estimated distance between array centre and source is never larger than the actual distance.

In the present case, the only option to estimate the source level is to use the maximum in the map. Figure 4(b) shows the error in comparison for all four formulations. Formulations I and IV show constant, small errors that follow the theoretical analysis in (8) and (13), respectively. For low Helmholtz numbers, the formulations II and III lead to larger errors because of the error in the estimated source location (the value at the location of the maximum in the map taken as source level). The error vanishes for larger Helmholtz numbers which is in agreement with the theoretical analysis.

3.2. Three Sources, Two-Dimensional Mapping

The second test case where all three sources are operated is a slightly more realistic scenario. In this case, the strength and the location of all three sources are of interest. If a classic beamforming approach with the mapping plane parallel to the array is used, the two-dimensional beamforming maps (Figure 5) show only minor differences for the four formulations. While only source A is actually located in the mapping plane at , contributions from all three sources appear in the maps. Thus, without any further analysis the maps suggest that all sources are located within this plane.

With the exception of source A, the maximums do not coincide with the projected source positions for both Helmholtz numbers shown. The reason is that sources B and C are not located in the mapping plane. The results show the tendency to map the sources nearer to the array more into the direction of the projected array centre, while sources with a distance greater than that of the mapping plane are mapped to an apparent position further away from the centre. If there is no information available about the true distance between source and array, there is no way to estimate the exact source positions from the mapping result. For the same reason, it is not feasible to estimate the source strength or to rank the sources using the result from the two-dimensional mapping. While all sources have the same source strength, the result shows the respective sound pressure level contribution at the array centre. If all sources are assumed to be within the mapping plane, B then appears to be the strongest and C appears to be the weakest source.

At the lower frequency shown (), the sources are less clearly distinguishable because of the large main lobe width at this frequency. If the source spacing were smaller, the same would happen even at higher frequencies. Thus, the result can be improved if the real beamformer filter properties are taken into account by using a deconvolution technique. In the present case, the deconvolution method CLEAN-SC was applied to the beamforming map (see [11] for more details). The result is a map that shows nonzero entries only at those grid points where a source is found. This corresponds to a negligible main lobe width regardless of frequency.
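The principle of such a deconvolution can be illustrated by a strongly simplified sketch in the spirit of CLEAN-SC. It is not the implementation used for the results reported here: it omits diagonal removal and other refinements, and the loop gain, the stopping rule, and all names are assumptions; see [11] for the actual method. In each iteration the peak of the current map is located, the part of the cross spectral matrix that is coherent with this peak is subtracted, and a corresponding share of the peak power is stored at the peak location.

```python
import numpy as np

def clean_sc_sketch(G, H, loop_gain=0.6, max_iter=100):
    """Simplified CLEAN-SC-like deconvolution (no diagonal removal).

    G : (N, N) cross spectral matrix
    H : (n_grid, N) steering vectors, one row per grid point
    returns : (n_grid,) clean map with nonzero entries only at detected peaks
    """
    G = np.array(G, dtype=complex)
    H = np.asarray(H, dtype=complex)
    clean = np.zeros(H.shape[0])
    for _ in range(max_iter):
        # current ("dirty") map: h^H G h for every grid point
        dirty = np.einsum('gi,ij,gj->g', H.conj(), G, H).real
        j = int(np.argmax(dirty))
        p_max = dirty[j]
        if p_max <= 0.0:
            break
        # source component: part of G that is coherent with the peak
        g = G @ H[j] / p_max
        G_new = G - loop_gain * p_max * np.outer(g, g.conj())
        if np.linalg.norm(G_new) >= np.linalg.norm(G):
            break  # stop when the residual matrix no longer shrinks
        clean[j] += loop_gain * p_max
        G = G_new
    return clean
```

Because every detected source is placed at the maximum of the current map, the scheme can only return the correct location if the steering vectors put that maximum at the true source position, which is exactly the property examined in this contribution.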

When applied to the two-dimensional beamforming results from Figure 5, CLEAN-SC delivers maps that allow for an easy separation of the sources even at the lower frequency (Figure 6). Similar to the result from classic beamforming, these maps show all sources as if they were situated in the mapping plane. The estimated locations of sources B and C again do not coincide with the projected source positions. The four different steering vector formulations lead to some differences in the estimated source level but show no divergent effects otherwise. To summarise, these results allow the conclusion that there are at least three sources, but the information about the (projected) location is limited and a ranking of the sources is not possible.

The effect on the estimated source level as a function of frequency can also be estimated. If the source positions are known, the source level can be estimated from the map by simply taking the values at the grid points that are located at source positions. To allow for small errors in the estimated source positions, in the present case the source level was estimated by integrating over small square regions of the map. These regions were centred at the nominal source positions and had a side length of 0.1 array apertures.

The results in Figures 7(a), 7(c), and 7(e) show that the errors of the estimated source levels tend to be larger at very low Helmholtz numbers and become smaller for . However, for the sources B and C that are not situated within the mapping plane, the error again increases above , obviously as a result of the wrong mapping. Similar to the result shown in Figure 6, the four different steering vector formulations show only small differences, with formulations I and IV giving somewhat smaller levels compared to formulations II and III.

3.3. Three Sources, Three-Dimensional Mapping

More information can be gathered when the deconvolution is applied to a three-dimensional beamforming map. To study the results in comparison to two-dimensional beamforming, the analysis of a slice from the three-dimensional result is one option. Figure 8 shows such a slice at that is equivalent to the mapping plane shown in Figure 2. Sources B and C that are not situated within this plane do not appear in any of the maps. Source A is within the plane but appears only for formulations I and IV and in case of formulation III for the higher frequency (). The reason for the absence of any source in the remaining maps is that the sources are mapped at the wrong position in the direction. This becomes obvious in the two-view orthographic projections (Figure 9) of the three-dimensional map that is another option for the graphical representation of the result.

The projections along the -axis (--plane) reveal that the sources are indeed mapped much closer to the array than they really are for formulations II and III and low frequencies (). For formulations I and IV, the error in the location is much smaller at this frequency but not zero (). This effect is present only for the multisource scenario. It can be attributed to the fact that the location of the maximum in the beamforming map for a certain source is slightly shifted by the influence of other sources. For , no error is visible in the maps with the exception of source A and formulation II. This small error also vanishes at even higher frequencies. While the estimated source levels are somewhat different for the different formulations, the errors are small, and no formulation seems to produce distinctly smaller errors than the others.

Finally, the error of the estimated source level shall be examined. Again, the source level was estimated by integrating over regions with a side length of 0.1 array apertures centred at the source position, but this time the regions were cube-shaped. Figures 7(b), 7(d), and 7(f) show the errors of the estimated source levels for all three sources. The results for formulations I and IV are again very similar and show a small negative error over a wide range of Helmholtz numbers. This error is consistent with the results from the theoretical analysis in (8) and (13) and is negligible for most practical applications. In contrast to the theoretical analysis, the error does not vanish completely for formulations II and III, though it is also negligibly small. For lower Helmholtz numbers, the beamforming and deconvolution method maps the source to a location outside the sector used for the integration. Thus, the estimated source level becomes infinitely small and consequently . While this is the case for for all three sources and formulations I and IV, the estimated source level for formulations II and III vanishes already below . The error does not increase for higher Helmholtz numbers as is the case for the levels estimated from two-dimensional mapping.

It can be concluded that the principal theoretical findings regarding the different formulations I–IV in a single-source scenario remain true for the multiple source case: formulations I and IV deliver the correct source locations already for low Helmholtz numbers but show a small systematic error in the estimated source level. Formulations II and III deliver a slightly less erroneous level, but only for higher Helmholtz numbers, when the error in the estimated source location is small enough to place the source within the integration sector. Thus, for the given scenario the practical differences between the different formulations are generally small for higher Helmholtz numbers. However, in applications where small Helmholtz numbers arise, only formulations I and IV can be applied.

Because the errors in source level are very small for all formulations, formulations I and IV seem to be preferable for the practical application of three-dimensional acoustic source mapping using a beamforming approach. Moreover, once the correct position of a source is estimated, the systematic errors in the source levels for formulations I and IV can be corrected for by taking into account the factors in (8) and (13), respectively. Finally, formulation I has the extra advantage that the calculation of the steering vectors requires fewer arithmetic operations.

4. Conclusion

A crucial element for the three-dimensional application of beamforming source mapping techniques using a microphone array is the formulation of the steering vectors. It was shown here that four different formulations found in the literature lead to different results. In theory, no formulation produces both correct source location and strength. Two formulations lead to the correct location at the cost of a small error in the estimated source strength. The other two formulations estimate the correct strength but show an error in the estimated location of the source. Using simulated measurement data, it was shown that this error is relevant especially at low Helmholtz numbers based on the array aperture. In a simulated three-source scenario with CLEAN-SC deconvolution, all four formulations lead to small errors in the estimated strength of the sources. Unlike the systematic errors in source location, the systematic errors in the level can be corrected for in principle. Thus, the major conclusion is that for three-dimensional source mapping those steering vector formulations are preferable that enable the best estimation of the source location, for example, formulations I or IV.