Abstract

Compressive beamforming with planar microphone arrays can effectively estimate the two-dimensional directions of arrival (DOAs) and quantify the strengths of acoustic sources. The multiple-snapshot grid-free method has recently attracted attention because it circumvents the basis mismatch problem of the conventional grid-based method and improves on the performance of the single-snapshot grid-free method. The existing atomic norm minimization based strategy uses an off-the-shelf interior point method (IPM) based solver to solve the positive semidefinite programming equivalent to the atomic norm minimization. In this paper, we present an alternative algorithm based on the alternating direction method of multipliers (ADMM). Both simulations and experiments demonstrate that, whether a standard uniform rectangular array or a non-uniform array constituted by a small number of microphones is employed, the two-dimensional multiple-snapshot grid-free compressive beamforming using our ADMM based algorithm estimates the DOAs and quantifies the strengths of acoustic sources well. It reaches the same or even better DOA estimation accuracy than the one using the IPM based solver, and our ADMM based algorithm is distinctly faster.

1. Introduction

Compressive beamforming, or more precisely compressive sensing [1–3] based beamforming, is an emerging and powerful approach to direction-of-arrival (DOA) estimation and strength quantification of acoustic sources. Owing to advantages such as the small number of microphones required, strong resistance to interference and unambiguous source imaging, it has recently attracted much attention [4–8].

In conventional compressive beamforming, the DOA domain is discretized into a finite set of look directions, and all the sources are assumed to fall in these look directions. A linear system of equations relating the signals measured by an array of microphones to the source distribution is established. The source distribution is retrieved from the linear system of equations by imposing a sparsity constraint, i.e., minimizing the $\ell_1$-norm of the vector (for the single-snapshot case) or the $\ell_{2,1}$-norm of the matrix (for the multiple-snapshot case) composed by the source strengths in all the look directions. The results become inaccurate when the DOAs of the sources do not coincide with these look directions. This problem is called basis mismatch [4, 9] and is often encountered in practical applications. Adopting finer grids mitigates basis mismatch at the cost of increased computational complexity. More seriously, grid refinement increases the coherence of the measuring process, which can cause offsets in the estimates [4]. To overcome the problem fundamentally, the idea of utilizing a continuous grid-free setting [10–12] has been proposed. In 2015, Xenaki and Gerstoft [13] presented a one-dimensional grid-free compressive beamforming strategy for measurements with linear microphone arrays, based on the minimization of the atomic norm of source strength and the polynomial rooting method. In 2017 and 2018, the authors [14, 15] presented a two-dimensional grid-free compressive beamforming strategy for measurements with planar microphone arrays, based on the minimization of the atomic norm of the microphone pressure induced by sources and the matrix enhancement and matrix pencil method [16]. Both strategies are based on the single-snapshot data model and are thus suitable for identifying transient or moving sources. In 2018, Park et al. [17] extended Xenaki and Gerstoft's strategy to the multiple-snapshot case through a group atomic norm. In the same year, the authors [18] realized the two-dimensional multiple-snapshot grid-free compressive beamforming based on the minimization of the atomic norm of the multiple-snapshot microphone pressure induced by sources and the matrix pencil and pairing (MaPP) method [19]. These works have all demonstrated that, for sources that are active across multiple snapshots, for example stationary sources, the multiple-snapshot grid-free compressive beamforming is distinctly superior to the single-snapshot one.
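The basis mismatch discussed above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a hypothetical 8-element uniform linear array with half-wavelength spacing and a coarse grid of 21 look directions in sine space, and it measures how well the best grid atom matches the true source atom.

```python
import cmath
import math

def steering(u, n_mics=8, spacing_wavelengths=0.5):
    """Far-field steering vector of a uniform linear array for direction sine u."""
    return [cmath.exp(2j * math.pi * spacing_wavelengths * m * u) for m in range(n_mics)]

def best_grid_match(u_source, grid):
    """Largest normalized correlation between the source atom and any grid atom."""
    a = steering(u_source)
    best = 0.0
    for u in grid:
        g = steering(u)
        inner = sum(x.conjugate() * y for x, y in zip(g, a))
        best = max(best, abs(inner) / len(a))
    return best

# Hypothetical grid: 21 look directions, spaced 0.1 apart in sine space.
grid = [-1.0 + 0.1 * k for k in range(21)]
on_grid = best_grid_match(grid[13], grid)   # source exactly on a look direction
off_grid = best_grid_match(0.35, grid)      # source midway between two look directions
print(on_grid, off_grid)
```

In this toy setting the on-grid source is matched perfectly, whereas the off-grid source is matched with a normalized correlation of only about 0.94, so part of its energy leaks into neighbouring look directions. The grid-free setting removes this effect by not discretizing the DOA domain at all.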

The atomic norm minimization in the grid-free compressive beamforming is a convex optimization problem [20] and can be characterized as a positive semidefinite programming. In Refs. [13–15, 17, 18], the positive semidefinite programming is solved by the off-the-shelf SDPT3 solver in the CVX toolbox [21]. The solver is based on the interior point method (IPM) [20] and becomes time-consuming for problems with large-dimensional matrices [22]. In the two-dimensional multiple-snapshot grid-free compressive beamforming with planar microphone arrays, the dimensions of the matrices are usually large because of the large number of microphones and snapshots. Therefore, it is necessary to develop a faster solving algorithm. A feasible algorithm based on the alternating direction method of multipliers (ADMM) [23] is proposed in this paper.

The remainder of this paper is organized as follows. Section 2 first outlines the theory of the two-dimensional multiple-snapshot grid-free compressive beamforming and then derives the formulations of ADMM that solve the positive semidefinite programming equivalent to the minimization of the atomic norm of the multiple-snapshot microphone pressure induced by sources. Sections 3 and 4 compare the performance of the two-dimensional multiple-snapshot grid-free compressive beamforming using the IPM based SDPT3 solver and using our ADMM based algorithm, with simulations and experiments respectively. Section 5 summarizes this paper.

2. Theory

2.1. Two-Dimensional Multiple-Snapshot Grid-Free Compressive Beamforming

The two-dimensional multiple-snapshot grid-free compressive beamforming can employ a standard uniform rectangular microphone array, such as the one in Figure 1(a), to measure the sound signal. Denote by $A$ and $B$ the numbers of microphone rows and columns respectively, by $L$ the number of snapshots, by $\mathbb{C}$ the set of complex numbers, by $\mathbf{P}^\star \in \mathbb{C}^{AB \times L}$ the matrix of the pressure measured by each microphone under each snapshot, by $\mathbf{P} \in \mathbb{C}^{AB \times L}$ the matrix of the pressure induced by sources at each microphone under each snapshot, and by $\mathbf{N} \in \mathbb{C}^{AB \times L}$ the matrix of the noise borne by each microphone under each snapshot. We have

$$\mathbf{P}^\star = \mathbf{P} + \mathbf{N}. \tag{1}$$

In the follow-up simulations, the noise is generated as independent and identically distributed complex Gaussian. The array signal-to-noise ratio (SNR) is defined as $\mathrm{SNR} = 20\log_{10}\left(\|\mathbf{P}\|_F / \|\mathbf{N}\|_F\right)$, where $\|\cdot\|_F$ denotes the Frobenius norm. The noise Frobenius norm can then be obtained as $\|\mathbf{N}\|_F = \|\mathbf{P}\|_F \, 10^{-\mathrm{SNR}/20}$.
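The noise generation described above can be sketched as follows. This is a minimal illustration rather than the authors' simulation code: it assumes the array SNR is 20 log10 of the ratio of the Frobenius norms of the source-induced pressure and the noise, and the function name `add_noise` and the random placeholder pressure matrix are hypothetical.

```python
import numpy as np

def add_noise(P, snr_db, rng):
    """Return P + N and N, with i.i.d. complex Gaussian N scaled to the requested array SNR."""
    N = rng.standard_normal(P.shape) + 1j * rng.standard_normal(P.shape)
    # Assumed definition: SNR = 20*log10(||P||_F / ||N||_F), so ||N||_F = ||P||_F * 10^(-SNR/20).
    target_norm = np.linalg.norm(P, "fro") * 10.0 ** (-snr_db / 20.0)
    N *= target_norm / np.linalg.norm(N, "fro")
    return P + N, N

rng = np.random.default_rng(0)
# Placeholder source-induced pressure: 64 microphones, 10 snapshots.
P = rng.standard_normal((64, 10)) + 1j * rng.standard_normal((64, 10))
P_star, N = add_noise(P, snr_db=20.0, rng=rng)
snr_check = 20.0 * np.log10(np.linalg.norm(P, "fro") / np.linalg.norm(N, "fro"))
print(snr_check)
```

Rescaling a single noise realization in this way fixes the SNR exactly in every Monte Carlo run, instead of only on average.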

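The first postprocessing step of the two-dimensional multiple-snapshot grid-free compressive beamforming is to denoise the measured pressure $\mathbf{P}^\star$ and thus reconstruct the source-induced pressure $\mathbf{P}$. This can be realized by imposing a sparsity constraint, i.e., minimizing the number of sources [18]. Establish a Cartesian coordinate system whose origin is at one vertex of the array, whose $xy$ plane is the array plane, and whose $z$ axis is perpendicular to the array plane. For a point in space, form a line segment between the point and the origin. The elevation angle $\theta$ of the point is defined as the angle from the positive $z$ axis to the line segment. The azimuth angle $\phi$ of the point is defined as the angle from the positive $x$ axis to the orthogonal projection of the line segment on the $xy$ plane. The hemisphere in which the positive $z$ axis falls is considered as the source region, i.e., $0^\circ \le \theta \le 90^\circ$ and $0^\circ \le \phi < 360^\circ$.

Under the grid-free setting, the atomic norm of $\mathbf{P}$ is utilized to measure the sparsity of sources. Its definition is

$$\|\mathbf{P}\|_{\mathcal{A}} = \inf\left\{ \sum_i s_i \;:\; \mathbf{P} = \sum_i s_i\, \mathbf{d}(\theta_i,\phi_i)\, \boldsymbol{\psi}_i,\; s_i \in \mathbb{R}^+,\; \mathbf{d}(\theta_i,\phi_i)\boldsymbol{\psi}_i \in \mathcal{A} \right\}, \tag{2}$$

where $i$ indexes the sources, $s_i = \|\mathbf{s}_i\|_2$ is the $\ell_2$-norm of the row vector $\mathbf{s}_i$ composed by the strength of the $i$-th source under each snapshot, $\boldsymbol{\psi}_i = \mathbf{s}_i / s_i$ is the corresponding unit-norm vector, $\mathbb{R}^+$ is the set of positive real numbers, $(\theta_i, \phi_i)$ indicates the DOA of the $i$-th source, $\mathbf{d}(\theta_i,\phi_i)$ is the transfer vector from the strength of the $i$-th source to the microphone pressure, $\mathcal{A}$ is the atomic set, and $\inf$ denotes the infimum. The reconstruction problem of $\mathbf{P}$ can be formulated as

$$\hat{\mathbf{P}} = \arg\min_{\mathbf{P}} \|\mathbf{P}\|_{\mathcal{A}} \quad \text{subject to} \quad \|\mathbf{P}^\star - \mathbf{P}\|_F \le \varepsilon, \tag{3}$$

where $\varepsilon$ is the noise control parameter. Normally, let $\varepsilon = \|\mathbf{N}\|_F$, with $\mathbf{N}$ the noise matrix. Equation (3) can be characterized as the following positive semidefinite programming [18]:

$$\{\hat{\mathbf{u}}, \hat{\mathbf{P}}, \hat{\mathbf{E}}\} = \arg\min_{\mathbf{u}, \mathbf{P}, \mathbf{E}} \frac{1}{2\sqrt{AB}} \left( \mathrm{tr}\left(T_b(\mathbf{u})\right) + \mathrm{tr}(\mathbf{E}) \right) \quad \text{subject to} \quad \begin{bmatrix} T_b(\mathbf{u}) & \mathbf{P} \\ \mathbf{P}^H & \mathbf{E} \end{bmatrix} \succeq 0, \quad \|\mathbf{P}^\star - \mathbf{P}\|_F \le \varepsilon, \tag{4}$$

where both $\mathbf{u}$ and $\mathbf{E}$ are auxiliary quantities, $AB$ is the total number of microphones, $\mathbf{E}$ is a Hermitian matrix, $\mathrm{tr}(\cdot)$ calculates the trace of a matrix, $T_b(\cdot)$ denotes the two-fold Toeplitz operator [14, 15, 18, 22], $T_b(\mathbf{u})$ is a Hermitian two-fold Toeplitz matrix, $(\cdot)^H$ denotes the conjugate transpose operator, and $\succeq 0$ means a positive semidefinite matrix.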
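To make the data model behind the atomic decomposition concrete, the sketch below builds a source-induced pressure matrix as a sum of rank-one terms, one per source. It is only an illustration under stated assumptions: an 8 × 8 array, a hypothetical microphone spacing of 0.035 m, a wavelength of 343/4000 m, a far-field plane-wave model and a particular phase sign convention, none of which are given in the text above. The function name `transfer_vector` is likewise hypothetical.

```python
import numpy as np

def transfer_vector(theta_deg, phi_deg, rows=8, cols=8, spacing=0.035, wavelength=0.08575):
    """Plane-wave transfer vector of a rows x cols rectangular array (assumed convention)."""
    theta, phi = np.deg2rad(theta_deg), np.deg2rad(phi_deg)
    t1 = np.sin(theta) * np.cos(phi) * spacing / wavelength  # phase slope along x
    t2 = np.sin(theta) * np.sin(phi) * spacing / wavelength  # phase slope along y
    a = np.arange(rows)[:, None]
    b = np.arange(cols)[None, :]
    return np.exp(2j * np.pi * (t1 * a + t2 * b)).reshape(-1)

doas = [(45, 90), (20, 200), (70, 160), (45, 270)]  # DOAs used in Section 3
n_snapshots = 10
rng = np.random.default_rng(0)
P = np.zeros((64, n_snapshots), dtype=complex)
for theta, phi in doas:
    # Random complex strength of this source under each snapshot (illustrative values).
    strengths = rng.standard_normal(n_snapshots) + 1j * rng.standard_normal(n_snapshots)
    P += np.outer(transfer_vector(theta, phi), strengths)  # one rank-one term per source
print(P.shape, np.linalg.matrix_rank(P))
```

With four sources, the resulting matrix has rank four regardless of the number of microphones or snapshots. This low-rank, few-atom structure is what the atomic norm promotes.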

The second postprocessing step of the two-dimensional multiple-snapshot grid-free compressive beamforming is to estimate the number and DOAs of the sources and quantify their strengths. This can be realized by using the MaPP method to process the two-fold Toeplitz matrix $T_b(\hat{\mathbf{u}})$ and the reconstructed pressure $\hat{\mathbf{P}}$ obtained by equation (4). The detailed procedure can be found in Ref. [18].

The array constructed by randomly choosing a subset of the microphones from the standard uniform rectangular array is referred to as a non-uniform or sparse rectangular array, as shown in Figure 1(b). Employing the pressures measured by the chosen microphones, the two-dimensional multiple-snapshot grid-free compressive beamforming can also obtain $T_b(\hat{\mathbf{u}})$, reconstruct the full microphone pressure induced by sources, and eventually realize the DOA estimation and strength quantification of acoustic sources [18]. In this case, the measured pressure $\mathbf{P}^\star$, the source-induced pressure $\mathbf{P}$ and the noise $\mathbf{N}$ in equation (1) become $\mathbf{P}^\star_\Omega$, $\mathbf{P}_\Omega$ and $\mathbf{N}_\Omega$, and $\|\mathbf{P}^\star - \mathbf{P}\|_F$ in equations (3) and (4) becomes $\|\mathbf{P}^\star_\Omega - \mathbf{P}_\Omega\|_F$, where $\Omega$ represents the set of the indices of the chosen microphones, $|\Omega|$ is the cardinality of $\Omega$, $\mathbf{P}^\star_\Omega$ represents the matrix of the pressures measured by the chosen microphones, $\mathbf{P}_\Omega$ represents the matrix of the pressures induced by sources at the chosen microphones, and $\mathbf{N}_\Omega$ represents the matrix of the noises borne by the chosen microphones.

2.2. Alternating Direction Method of Multipliers to Solve Positive Semidefinite Programming

In this section, we derive the formulations of ADMM that can solve the positive semidefinite programming in equation (4). A detailed survey of ADMM can be found in Ref. [23]. Because the full-array problem is a special case of the partial-array problem in which the set of chosen microphones includes the indices of all the microphones, we conduct the derivation on the latter. To apply ADMM, we reformulate equation (4) as

where is an auxiliary matrix and is a regularization parameter. The value of can be determined according to Ref. [24]. The augmented Lagrangian function of equation (5) is

where the Hermitian matrix is the Lagrangian multiplier, is the penalty parameter and represents the inner product. ADMM solves equation (5) iteratively. Initializing , the updates in the -th iteration are as follows:

Introduce the partitions

Denote by and , respectively, the matrices of the rows in and corresponding to the chosen microphones, the set of the indices of the unchosen microphones, the cardinality of , the matrix of the pressures induced by sources at the unchosen microphones, and , respectively, the matrices of the rows in and corresponding to the unchosen microphones, and and , both, the identity matrices. It can be derived that the variable updates in equation (7) have the following closed forms:

The derivation can be found in the Appendix. Let , where forms a square matrix whose off-diagonal elements are zeros and whose diagonal is the vector in the parentheses, represents the Kronecker product and is the set of real numbers. Denote by the adjoint of . For a given matrix , . Here, is an elementary Toeplitz matrix with ones on the -th diagonal and zeros elsewhere, and is a halfspace [25] of . means when and when . Then,

The update of in equation (8) can be reformulated as

which can be performed by conducting the eigenvalue decomposition of the Hermitian matrix and setting all negative eigenvalues to zero.
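The eigenvalue-based step described above is a projection onto the positive semidefinite cone. A minimal NumPy sketch is given below; the function name `project_psd` and the 2 × 2 test matrix are illustrative and not part of the paper.

```python
import numpy as np

def project_psd(M):
    """Project a (nearly) Hermitian matrix onto the positive semidefinite cone."""
    M = 0.5 * (M + M.conj().T)                 # remove any rounding asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)       # eigenvalue decomposition of a Hermitian matrix
    eigvals = np.clip(eigvals, 0.0, None)      # set all negative eigenvalues to zero
    return (eigvecs * eigvals) @ eigvecs.conj().T

M = np.array([[1.0, 2.0j], [-2.0j, 1.0]])      # Hermitian, eigenvalues -1 and 3
Z = project_psd(M)
print(Z)
```

For this test matrix only the eigenvalue 3 survives, so the projection is the rank-one matrix [[1.5, 1.5j], [-1.5j, 1.5]]. The eigenvalue decomposition is the dominant cost of each iteration.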

3. Simulations

In this section, we compare, with simulations, the performance of the two-dimensional multiple-snapshot grid-free compressive beamforming using the IPM based SDPT3 solver and using our ADMM based algorithm. In our ADMM based algorithm, is set to 1, and the iteration is terminated when the relative changes of and between two consecutive iterations, i.e., and , are both less than $10^{-4}$ or when the maximum number of iterations, set to 1000, is reached.

Four sources are assumed. Their DOAs are (45°, 90°), (20°, 200°), (70°, 160°) and (45°, 270°), in turn. The frequency is 4000 Hz. Ten snapshots are adopted. The root mean square values of the source strength under each snapshot are 100 dB, 98 dB, 96 dB and 95 dB (re $2 \times 10^{-5}$ Pa), in turn.

Define an average Frobenius norm error and a normalized -norm error to measure the DOA estimation accuracy and the strength quantification accuracy, respectively. Therein, is the number of sources, and are the vectors of the estimated and the true elevation angles respectively, and are the vectors of the estimated and the true azimuth angles respectively, and and are the vectors of the quantified and the true root mean square strengths respectively.

Simulations are conducted at different SNRs. At each SNR, these errors, together with the computation time of the IPM based SDPT3 solver and our ADMM based algorithm, are averaged over 100 Monte Carlo runs. In each run, the source strength and the noise are generated randomly. All the simulations are carried out in Matlab R2014a on a PC with a Windows 10 system and a 2.2 GHz Intel(R) Core(TM) i5-5200U CPU.
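For orientation, the source strengths quoted above in dB can be converted to root mean square pressures with the standard level definition. The short sketch below assumes only that definition and the stated reference of $2 \times 10^{-5}$ Pa; the function name is hypothetical.

```python
def spl_to_rms_pressure(level_db, p_ref=2e-5):
    """Convert a level in dB (re p_ref, in Pa) to a root mean square pressure in Pa."""
    return p_ref * 10.0 ** (level_db / 20.0)

levels_db = [100.0, 98.0, 96.0, 95.0]   # source strengths used in the simulations
rms_pa = [spl_to_rms_pressure(level) for level in levels_db]
print(rms_pa)
```

The strongest source thus corresponds to about 2 Pa and the weakest to about 1.12 Pa, so the four sources span less than a factor of two in amplitude.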

Figure 2 presents the curves of , and computation time vs. SNR when the standard uniform rectangular array with 64 microphones shown in Figure 1(a) is employed. As expected, a low SNR brings relatively large DOA estimation and strength quantification errors. As shown in Figure 2(a), compared with the two-dimensional multiple-snapshot grid-free compressive beamforming using the IPM based SDPT3 solver, the one using our ADMM based algorithm has smaller DOA estimation errors at low SNRs and almost the same DOA estimation errors at high SNRs. As shown in Figure 2(b), the two-dimensional multiple-snapshot grid-free compressive beamforming using our ADMM based algorithm has slightly larger strength quantification errors than the one using the IPM based SDPT3 solver. As shown in Figure 2(c), the computation time of our ADMM based algorithm is evidently shorter than that of the IPM based SDPT3 solver. Figure 3 presents the results when the non-uniform array with 30 microphones shown in Figure 1(b) is employed, which exhibit the same phenomena as in Figure 2. Similar phenomena are obtained when the simulations are repeated with other parameters, such as the number and DOAs of sources and the frequency. These phenomena demonstrate that, whether a standard uniform rectangular array or a non-uniform array constituted by a small number of microphones is employed, our ADMM based algorithm brings an evident efficiency improvement without sacrificing, and sometimes even improving, the DOA estimation accuracy, at the cost of a slight loss in strength quantification accuracy.

The reason why our ADMM based algorithm is faster can be explained as follows. The per-iteration computational complexity of the IPM based SDPT3 solver is for the problem in equation (4). Herein, is the number of variables and is the dimension of the positive semidefinite matrix. The per-iteration computational complexity of our ADMM based algorithm is because of the eigenvalue decomposition. The latter is smaller than the former.

4. Experiments

To validate the simulation conclusions and the effectiveness of the two-dimensional multiple-snapshot grid-free compressive beamforming using our ADMM based algorithm in practical applications, we perform an experimental measurement on two small loudspeakers in a semi-anechoic room with a rectangular array. Figure 4 presents the configuration. The array employs Brüel & Kjær Type 4958 microphones. The loudspeakers are excited by stationary white noise. From left to right, the Cartesian coordinates of the two loudspeakers are (2.24, 0, 5) m and (−2.24, 0, 5) m. Their DOAs are (24.13°, 0°) and (24.13°, 180°). In addition, there are two mirror sources because of the reflection from the ground. Their Cartesian coordinates are (2.24, −2.2, 5) m and (−2.24, −2.2, 5) m, and their DOAs are (32.13°, 315.52°) and (32.13°, 224.48°). The pressure signals captured by the microphones are acquired simultaneously by a Brüel & Kjær PULSE Type 3560D Data Acquisition System and then transferred to the Brüel & Kjær PULSE LABSHOP software, where their Fourier spectra are obtained. A sampling frequency of 16384 Hz is used. Each snapshot has a length of 1 s and thus contains $2^{14}$ samples. Ten snapshots are adopted. Finally, the two-dimensional multiple-snapshot grid-free compressive beamforming using the IPM based SDPT3 solver and the one using our ADMM based algorithm are applied to map these sources. The parameter setting in our ADMM based algorithm is the same as in Section 3.
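The DOAs quoted above follow from the source coordinates and the angle definitions of Section 2.1 (elevation measured from the positive axis perpendicular to the array plane, azimuth measured in the array plane). The sketch below reproduces them; the function name `doa_from_position` is hypothetical, and the third coordinate is taken as the one perpendicular to the array.

```python
import math

def doa_from_position(x, y, z):
    """Return (elevation, azimuth) in degrees for a source at (x, y, z) metres."""
    elevation = math.degrees(math.atan2(math.hypot(x, y), z))  # angle from the +z axis
    azimuth = math.degrees(math.atan2(y, x)) % 360.0           # angle from the +x axis in the xy plane
    return elevation, azimuth

sources = {
    "loudspeaker 1": (2.24, 0.0, 5.0),
    "loudspeaker 2": (-2.24, 0.0, 5.0),
    "mirror source 1": (2.24, -2.2, 5.0),
    "mirror source 2": (-2.24, -2.2, 5.0),
}
doas = {name: doa_from_position(*xyz) for name, xyz in sources.items()}
for name, (elevation, azimuth) in doas.items():
    print(f"{name}: ({elevation:.2f} deg, {azimuth:.2f} deg)")
```

The computed angles agree with the values stated in the text to within rounding, which confirms that the coordinates and the quoted DOAs are mutually consistent.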

Figure 5 presents the loudspeaker source maps at 2000 Hz, 3000 Hz and 4000 Hz when the standard uniform rectangular array with 64 microphones shown in Figure 1(a) is employed. Whether the IPM based SDPT3 solver (Figures 5(a), 5(c), 5(e)) or our ADMM based algorithm (Figures 5(b), 5(d), 5(f)) is used, the two-dimensional multiple-snapshot grid-free compressive beamforming estimates the DOAs of these sources accurately. Table 1 lists the DOA estimation errors corresponding to Figure 5 and the computation time of the IPM based SDPT3 solver and our ADMM based algorithm. The two-dimensional multiple-snapshot grid-free compressive beamforming using our ADMM based algorithm has almost the same DOA estimation error as the one using the IPM based SDPT3 solver, and all these errors are very small. The computation time of our ADMM based algorithm is only about 1/6 of that of the IPM based SDPT3 solver. Figure 6 and Table 2 present the results when the non-uniform array with 30 microphones shown in Figure 1(b) is employed, which exhibit the same phenomena as in Figure 5 and Table 1. These phenomena demonstrate that, in practical applications, the two-dimensional multiple-snapshot grid-free compressive beamforming using our ADMM based algorithm reaches the same DOA estimation accuracy as the one using the IPM based SDPT3 solver, while our ADMM based algorithm is distinctly faster. The experimental conclusion agrees with the simulated one, which supports the simulation conclusion and shows that the two-dimensional multiple-snapshot grid-free compressive beamforming using our ADMM based algorithm is effective in practical applications. Because the true strengths of the loudspeaker sources cannot be obtained accurately, the strength quantification accuracy is not compared here.

5. Conclusions

In the existing atomic norm minimization based two-dimensional multiple-snapshot grid-free compressive beamforming, the positive semidefinite programming equivalent to the atomic norm minimization is solved by an off-the-shelf IPM based solver. The solver tends to be time-consuming for large-dimensional problems. In this paper, we present an alternative algorithm based on ADMM and compare it against the solver both with simulations and with experiments. The following conclusions have been drawn. Firstly, like the two-dimensional multiple-snapshot grid-free compressive beamforming using the IPM based solver, the one using our ADMM based algorithm can also estimate the DOAs and quantify the strengths of acoustic sources well. Compared with the former, the latter has the same or even higher DOA estimation accuracy and slightly lower strength quantification accuracy. Secondly, compared with the IPM based solver, our ADMM based algorithm has a distinct efficiency advantage. Finally, the above conclusions hold whether a standard uniform rectangular array or a non-uniform array constituted by a small number of microphones is employed.

Appendix

Deducing Equations (11)–(14) in Section 2.2

For a given matrix , holds. If is Hermitian, . For two given matrices and with the same dimensions, holds. If is Hermitian, . For two given matrices and , if the number of rows in equals the number of columns in and the number of columns in equals the number of rows in , holds. According to these properties, we have

Combining equations (6) and (A.1)–(A.3) yields

According to the complex-valued matrix derivatives [26], we have

where is the identity matrix that has the same dimension as , is an exponent, and and denote the transpose and the conjugate operators. Then,

hold. The expressions in equation (A.6) all equal 0 when obtains its minimum. Consequently, equations (11)–(14) hold.

Data Availability

Datasets generated and analyzed in the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China, under Grant Nos. 11874096 and 11704040, and the Natural Science Foundation of Chongqing, under Grant No. cstc2019jcyj-msxmX0399.