An adaptive beamforming algorithm for sound source localisation via hybrid compressive sensing reconstruction

In this study, a hybrid compressive sensing reconstruction algorithm called SAMP-CoSaMP is proposed. The combination retains the speed advantage of CoSaMP and the adaptive sparsity-searching ability of SAMP. An improved beamforming algorithm for sound source localisation, named SC-DAMAS, is then created by integrating the hybrid algorithm with the classic DAMAS. The reconstruction accuracy of the SAMP-CoSaMP, SAMP, and CoSaMP algorithms is compared in different signal-to-noise ratio scenarios. The results show that SAMP-CoSaMP balances running efficiency and reconstruction error. In addition, comparative sound source localisation simulations and experiments are performed with SC-DAMAS, the conventional beamforming method, and an orthogonal matching pursuit algorithm-based deconvolution approach. SC-DAMAS outperforms both counterparts in localisation performance without the need to predetermine the sparsity value.


Introduction
Beamforming is a signal processing technique based on microphone array measurement. In acoustic beamforming, the sound source data from a sensor array are first spatially filtered by a beamforming algorithm. The sound source is then imaged by enhancing the output energy at the focal point. Owing to its high accuracy in long-distance sound source localisation, beamforming has become widely used in moving source identification, wind tunnel measurements, and interior acoustics [1][2].
Acoustic beamforming has two main components, namely, signal processing and source localisation. Concerning signal processing, compressive sensing (CS)-based beamforming has attracted considerable attention [3]. In CS theory, signals are typically acquired at sub-Nyquist rates [4]. CS-based beamforming thereby overcomes the requirement of traditional signal processing that the sampling frequency be at least twice the highest frequency of the original signal in order to accurately recover the information present in the signal [5]. In other words, CS requires fewer samples than the Shannon-Nyquist theorem demands. For instance, based on sparse recovery in CS, Wu et al. proposed a multichannel deconvolution algorithm to enhance the source signal [6]. In addition, based on sparse representation, Wang et al. [7] combined the variational Bayesian expectation maximisation method to solve the equation and studied sound source localisation in a reverberant environment. Building on these studies, a series of CS reconstruction algorithms have been developed for acoustic beamforming, such as orthogonal matching pursuit (OMP) [8], regularised OMP [9], compressive sampling matching pursuit (CoSaMP) [10], variable step-size gradient matching pursuit [11] and stochastic gradient matching pursuit [12]. In these conventional CS reconstruction techniques, the sparsity K is a necessary predetermined parameter, which corresponds to the number of sound sources. However, in engineering practice, the number of sound sources is generally unknown, which limits the application of compressive sensing to a certain extent. Therefore, Do et al. [13] proposed the sparsity adaptive matching pursuit (SAMP) algorithm, which adapts to the sparsity K, but its reconstruction accuracy is unstable and its running speed is sensitive to the iteration step size.
More specifically, a step size that is too small leaves the sparsity underestimated for many stages, which increases the number of iterations and thereby reduces the efficiency of the algorithm. Conversely, a step size that is too large causes the sparsity to be overestimated, which degrades the reconstruction accuracy of the signal. Improving the reconstruction algorithm for the case of unknown sparsity is therefore one of the purposes of this study.
Concerning source localisation, many well-established acoustic beamforming algorithms based on different beamforming strategies have been reported, including functional beamforming [14], orthogonal beamforming [15], generalised inverse beamforming [16], and deconvolution beamforming [17]. Among them, deconvolution beamforming has attracted growing interest in sound source localisation because it yields smaller sidelobes than the conventional beamforming method. In recent decades, many deconvolution techniques have been developed for sound source localisation. For instance, the CLEAN algorithm was first introduced by Högbom [18] for application in astronomy and was later applied to acoustic localisation. In addition, to reduce the computational burden of the original algorithm, CLEAN-SC was proposed by Sijtsma et al. [19], which is now widely recognised as one of the most important branches of deconvolution beamforming. Moreover, Brooks and Humphreys proposed the classic DAMAS via Gauss-Seidel iteration, which is considered a breakthrough in improving sound source localisation resolution [20]. Subsequently, based on DAMAS, a series of algorithms, such as DAMAS2, DAMAS3 [21], and FISTA-DAMAS [22], were proposed, which verified the effectiveness of the original algorithm. Therefore, DAMAS beamforming is used to address the sound source localisation issues in this study.
In summary, a hybrid CS reconstruction algorithm called SAMP-CoSaMP is proposed that retains the backtracking concept from the CoSaMP algorithm and the sparsity adaptability from the SAMP algorithm. Then, the hybrid algorithm is integrated with DAMAS beamforming for sound source localisation. The effectiveness and accuracy of the improved method are numerically analysed using the COMSOL Multiphysics software. The results show that the proposed method achieves high-resolution sound source localisation and is applicable over a wide frequency range.
The rest of the article is organised as follows. Section 2 describes the classic DAMAS and our hybrid reconstruction algorithms. Section 3 introduces the simulation model of sound source localisation. Section 4 and Section 5 present the comparative simulation and experimental results that prove the superiority of the hybrid and adaptive beamforming algorithms. Lastly, Section 6 discusses this study's conclusions.

Preliminary analysis of DAMAS
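By introducing the inverse solution of the point spread function (PSF), DAMAS can obtain the surface sound pressure distribution of the sound source with higher spatial resolution than that of conventional beamforming. Suppose that a regular array of $M$ microphones is on the measurement plane. As the number of sound source points is usually limited (assumed to be $K$), they are sparsely distributed on the discretised sound source surface. Each of the $N$ focal points on that surface is regarded as a potential sound source. The sound pressure obtained from the measurement surface is the sum of the products of the sound source intensities and the transfer matrix. The mathematical expression can be written as:

$$\mathbf{P} = \mathbf{G}\,\mathbf{q}. \tag{1}$$

The dimension of the matrix $[\mathbf{P}]$ is equal to the total number of microphones $M$ in the measuring array. The $M \times N$ dimensional transfer matrix $[\mathbf{G}]$ between the sound source surface and the measurement surface can be expressed as:

$$\mathbf{G} = \begin{bmatrix} g_{11} & \cdots & g_{1N} \\ \vdots & \ddots & \vdots \\ g_{M1} & \cdots & g_{MN} \end{bmatrix}. \tag{2}$$

The element $g_{mn}$ in Eq. (2) can be defined as follows:

$$g_{mn} = \frac{e^{-\mathrm{j} k r_{mn}}}{4\pi r_{mn}}, \tag{3}$$

where $k$ is the wavenumber and $r_{mn}$ is the distance between the $n$-th focal point and the $m$-th microphone. Then, the sound pressure cross-spectrum function can be defined as follows:

$$\mathbf{C} = \mathbf{P}\mathbf{P}^{H} = \mathbf{G}\,\mathbf{q}\mathbf{q}^{H}\,\mathbf{G}^{H}. \tag{4}$$

The specific expression of $\mathbf{q}\mathbf{q}^{H}$ can be described as:

$$\mathbf{q}\mathbf{q}^{H} = \begin{bmatrix} |q_1|^2 & \cdots & q_1 q_N^{*} \\ \vdots & \ddots & \vdots \\ q_N q_1^{*} & \cdots & |q_N|^2 \end{bmatrix}. \tag{5}$$

The nondiagonal elements in $\mathbf{q}\mathbf{q}^{H}$ can be neglected when the sound sources are incoherent. In that case, only the terms $|q_n|^2$ remain, and the cross-spectrum function can be expressed as:

$$\mathbf{C} = \sum_{n=1}^{N} |q_n|^2\, \mathbf{g}_n \mathbf{g}_n^{H}, \tag{6}$$

where $\mathbf{g}_n$ is the $n$-th column of $\mathbf{G}$. Based on the cross-spectrum function, the beamforming output, that is, the sound power distribution of each focal point, can be illustrated as:

$$b_i = \mathbf{w}_i^{H}\, \mathbf{C}\, \mathbf{w}_i, \tag{7}$$

where $\mathbf{w}_i$ represents the steering vector, which is associated with each microphone according to the chosen steering location [20]. Eq. (7) can be further rewritten as:

$$b_i = \sum_{n=1}^{N} A_{in}\, x_n, \qquad x_n = |q_n|^2, \tag{8}$$

where $A_{in}$ is defined as:

$$A_{in} = \left| \mathbf{w}_i^{H} \mathbf{g}_n \right|^2. \tag{9}$$

The above formula describes the relationship between the cross-spectrum-based beamforming output $\mathbf{b}$, the PSF $\mathbf{A}$, and the source intensity $\mathbf{x}$ of the sound source, which is a typical convolution process. The essence of deconvolution beamforming involves solving the equations $\mathbf{b} = \mathbf{A}\mathbf{x}$ in reverse. In the equations, $\mathbf{b}$ and $\mathbf{A}$ are the conventional beamforming output and the aforementioned PSF, respectively. When the parameters $\mathbf{b}$ and $\mathbf{A}$ are known, the original sound source $\mathbf{x}$ can be solved in reverse. In classic DAMAS, the Gauss-Seidel iterative algorithm is applied to perform deconvolution and thus obtain the spatial distribution of sound sources, which features strong anti-interference and is widely applicable to sound source localisation issues.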
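The Gauss-Seidel deconvolution described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the geometry is a hypothetical 11-microphone line array with an 11-point scan line 1 m away, the frequency (3000 Hz) and sound speed (343 m/s) are illustrative, the transfer function is a free-field monopole, and the steering vector is normalised so that the PSF has a unit diagonal. Here `A` denotes the PSF matrix, `b` the conventional beamforming output, and `x` the source power at each focal point; all function and variable names are chosen for this example only.

```python
import numpy as np

def transfer_matrix(mics, grid, k):
    """Free-field monopole transfer matrix G (M x N), assumed form exp(-jkr)/(4*pi*r)."""
    # r[m, n] is the distance from focal point n to microphone m
    r = np.linalg.norm(mics[:, None, :] - grid[None, :, :], axis=2)
    return np.exp(-1j * k * r) / (4 * np.pi * r)

def damas_gauss_seidel(A, b, n_iter=200):
    """Solve b = A x by Gauss-Seidel sweeps, clipping negative source powers to zero."""
    x = np.zeros(len(b))
    for _ in range(n_iter):
        for i in range(len(b)):
            # Remove the contribution of all other focal points, using their latest values
            s = b[i] - A[i] @ x + A[i, i] * x[i]
            x[i] = max(s / A[i, i], 0.0)
    return x

# Hypothetical test geometry (assumption): line array and scan line, 0.1 m spacing
c, f = 343.0, 3000.0
k = 2 * np.pi * f / c
xs = np.linspace(-0.5, 0.5, 11)
mics = np.column_stack([xs, np.zeros(11), np.zeros(11)])
grid = np.column_stack([xs, np.zeros(11), np.ones(11)])

G = transfer_matrix(mics, grid, k)
W = G / np.sum(np.abs(G) ** 2, axis=0)      # steering vectors w_n = g_n / ||g_n||^2
A = np.abs(W.conj().T @ G) ** 2             # PSF: A[i, n] = |w_i^H g_n|^2

x_true = np.zeros(11)
x_true[5] = 1.0                             # one incoherent source at the centre point
C = (G * x_true) @ G.conj().T               # cross-spectrum: sum_n x_n g_n g_n^H
b = np.real(np.einsum('mi,mk,ki->i', W.conj(), C, W))   # b_i = w_i^H C w_i

x_est = damas_gauss_seidel(A, b)
```

With noise-free data, the beamforming output `b` equals `A @ x_true`, so the sweeps only have to undo the blurring introduced by the PSF. Clipping at zero enforces the physical constraint that source powers are non-negative.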

SAMP-CoSaMP reconstruction algorithm
To combine the backtracking of the CoSaMP algorithm with the adaptive characteristics of SAMP, the two algorithms are integrated into the SAMP-CoSaMP reconstruction algorithm. The hybrid algorithm can automatically search for the number of sound source points, which removes the original algorithms' dependence on a preset sparsity. Hence, it is an adaptive CS reconstruction algorithm. The steps of SAMP-CoSaMP are as follows:
Step 1: Set the initial parameters $k$, $r$, and $F$ as $k = 1$, $r = y$, and $F = \emptyset$, respectively.
Step 2: Calculate the correlation coefficient by $u = \{u_j \mid u_j = |\langle r, \varphi_j \rangle|,\ j = 1, 2, \ldots, N\}$, extract the index values with respect to the maximum values and store them in set $F$.
Step 6: Calculate the correlation coefficient by $u = \{u_j \mid u_j = |\langle r, \varphi_j \rangle|,\ j = 1, 2, \ldots, N\}$ and then store the corresponding index value in the index set $J$ if $u_j \geq 0.5\max|u|$.
Step 7: Combine the index value sets by $T = F \cup J$, and calculate the correlation coefficient of the index-pointed elements and the residual by $u = \{u_j \mid u_j = |\langle r, \varphi_j \rangle|,\ j \in T\}$. Then, extract the index values with respect to the $k$ maximum values and store them in the new set $F$.
Step 8: Calculate the estimated signal by $\hat{x} = \arg\min_{x} \|y - \Phi_F x\|_2$ and update the residual by $r_{\mathrm{new}} = y - \Phi_F \hat{x}$.
Step 9: Judge whether $\|r_{\mathrm{new}} - r\|_2 \leq \varepsilon$; stop the iteration if it is satisfied; otherwise, skip to step 10.
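Because only some of the steps are listed here, the following is a hedged sketch of how a sparsity-adaptive pursuit with thresholded candidate selection and backtracking might look, not a reproduction of the authors' SAMP-CoSaMP. Several choices are assumptions made for this example: the sparsity estimate `k` grows by `step` whenever the residual stops decreasing (the stage-switching idea of SAMP), pruning keeps the `k` largest least-squares coefficients (as in CoSaMP and SAMP) rather than the largest correlations, the stopping rule is a residual-norm threshold `tol`, and the data are real-valued. The 0.5 threshold is taken from Step 6.

```python
import numpy as np

def samp_cosamp_sketch(Phi, y, step=1, tol=1e-6, max_iter=500):
    """Recover a sparse x from y = Phi @ x without knowing the sparsity (illustrative)."""
    N = Phi.shape[1]
    k = step                                   # current sparsity estimate
    support = np.array([], dtype=int)          # current support set F
    coef = np.zeros(0)
    r = np.asarray(y, dtype=float).copy()      # residual
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:           # assumed stopping rule
            break
        u = np.abs(Phi.T @ r)                  # correlation of every atom with the residual
        cand = np.flatnonzero(u >= 0.5 * u.max())     # threshold rule from Step 6
        merged = np.union1d(support, cand)     # merge candidates with the current support
        c_all, *_ = np.linalg.lstsq(Phi[:, merged], y, rcond=None)
        # Backtracking: keep only the k strongest atoms of the merged set
        keep = np.sort(merged[np.argsort(np.abs(c_all))[-k:]])
        c_new, *_ = np.linalg.lstsq(Phi[:, keep], y, rcond=None)
        r_new = y - Phi[:, keep] @ c_new
        if np.linalg.norm(r_new) < np.linalg.norm(r):
            support, coef, r = keep, c_new, r_new     # progress: accept the new support
        else:
            k += step                          # stalled: enlarge the sparsity estimate
    x_hat = np.zeros(N)
    x_hat[support] = coef
    return x_hat
```

A call such as `samp_cosamp_sketch(Phi, y)` needs no sparsity argument, which is the property the hybrid algorithm is designed to provide. The `step` parameter plays the role of the SAMP step size: larger values reach the true sparsity in fewer stages but risk overshooting it.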

SC-DAMAS algorithm
Deconvolution beamforming for sound source imaging is the solution process of the equation $\mathbf{b} = \mathbf{A}\mathbf{x}$, where $\mathbf{b}$ is the conventional beamforming output and $\mathbf{A}$ is the PSF. Compressive sensing is a reverse solution process for $x$ in $y = \Phi x$ when the measurement matrix $\Phi$ and the observation vector $y$ are known. Both amount to the inverse solution of a linear system with two known quantities, and compressive sensing provides a variety of reconstruction algorithms for that purpose. Therefore, $\mathbf{A}$ and $\mathbf{b}$ can be treated as the input of compressive sensing. The original signal can then be restored via the compressive sensing reconstruction algorithm. The proposed SAMP-CoSaMP is thus combined with DAMAS, which yields an improved beamforming algorithm for sound source localisation named SC-DAMAS. The principle flow diagram of SC-DAMAS is depicted in Fig. 1.
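To illustrate how a PSF system can be handed to a compressive sensing solver, the sketch below applies plain orthogonal matching pursuit (OMP) to it. OMP is the baseline that requires the sparsity `K` in advance, so this is a stand-in for the idea of the coupling rather than SC-DAMAS itself. The PSF used here is a hypothetical toy kernel (a squared sinc of the focal-point offset), chosen only so that the example is deterministic; the function name `omp` and all parameter values are assumptions of this example.

```python
import numpy as np

def omp(A, b, K):
    """Orthogonal matching pursuit: K-sparse least-squares solution of b = A x."""
    norms = np.linalg.norm(A, axis=0)
    support = []
    coef = np.zeros(0)
    r = np.asarray(b, dtype=float).copy()
    for _ in range(K):
        corr = np.abs(A.T @ r) / norms         # normalised correlation with the residual
        if support:
            corr[support] = 0.0                # never select the same focal point twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected focal points jointly, then update the residual
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        r = b - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# Toy PSF (assumption): unit diagonal, sidelobes decaying as a squared sinc
N = 21
offsets = np.arange(N)[:, None] - np.arange(N)[None, :]
A = np.sinc(0.5 * offsets) ** 2

x_true = np.zeros(N)
x_true[5], x_true[14] = 1.0, 0.6               # two incoherent sources
b = A @ x_true                                 # simulated conventional beamforming map

x_hat = omp(A, b, K=2)                         # K must be supplied in advance
```

The explicit `K=2` argument is exactly the parameter that a sparsity-adaptive solver is meant to remove.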

Simulation model building
Simulation analysis is performed to demonstrate the effectiveness of SC-DAMAS in addressing sound source localisation issues. First, a free sound field area with a size of 3×4×3 m was constructed in the COMSOL Multiphysics simulation environment. Then, a monopole sound source with a volume flow rate of e-3 m³/s was set at the coordinates (1.7 m, 3 m, 1.7 m). The medium in the entire area was set as air with a relative humidity of 50 %, and the fluid model was set to atmospheric attenuation. To avoid the reflection and absorption of the sound waves propagating to the wall, a perfectly matched layer with a thickness of 0.2 m was added to the outside of the area. The measurement surface was placed 1 m away from the sound source surface. On the measurement surface of 1×1 m, the measurement points were 0.1 m apart, which gives an 11×11 array of measurement elements. To reduce the computational work, only the local area that contained the sound source and the measurement surface was considered as the target area. The mesh division and the schematic diagram of the measuring points of the redefined target area are shown in Fig. 2 and Fig. 3. For variable parameter analysis, Gaussian white noise with a signal-to-noise ratio (SNR) equal to 40 dB, 20 dB, and 10 dB was separately added to the sound pressure obtained from the measurement surface. For instance, the simulated surface sound pressure distribution under different SNRs and a sound source frequency of 2000 Hz is shown in Fig. 5. Fig. 5 shows that the sound pressure on the measurement surface is relatively concentrated under noise-free conditions. Nevertheless, further analysis is needed for precise sound source localisation because of the sidelobes. After adding noise (SNR = 40 dB), the aggregation characteristics of the sound pressure field are less clear. Under SNR = 20 dB and SNR = 10 dB, the sound pressure distribution becomes chaotic, and the sound centre cannot be determined.
In the following text, SC-DAMAS is applied for sound source localisation under the influence of white noise.
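The noise-addition step described above can be sketched as follows. This is a minimal illustration assuming complex-valued sound pressure samples and circular complex Gaussian noise; the function name `add_white_noise` is hypothetical, and the noise is rescaled so that the realised SNR matches the requested value exactly, which is a convenience of this sketch rather than something the source specifies.

```python
import numpy as np

def add_white_noise(p, snr_db, rng=None):
    """Add complex Gaussian white noise to pressure data p at a prescribed SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    p = np.asarray(p, dtype=complex)
    noise = rng.standard_normal(p.shape) + 1j * rng.standard_normal(p.shape)
    signal_power = np.mean(np.abs(p) ** 2)
    noise_power = np.mean(np.abs(noise) ** 2)
    # SNR_dB = 10 log10(signal_power / noise_power), solved for the target noise power
    target_noise_power = signal_power / 10 ** (snr_db / 10)
    noise *= np.sqrt(target_noise_power / noise_power)
    return p + noise
```

For an 11×11 measurement surface, `p` would hold 121 complex pressures, and the three noise levels correspond to `snr_db` values of 40, 20, and 10.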

Evaluation of the SAMP-CoSaMP algorithm
First, to intuitively evaluate the reconstruction performance of SAMP-CoSaMP, comparative simulations against SAMP and CoSaMP were performed. Assume a sparse signal of length 512 observed through 128 measurement points. In a noise-free environment and at SNR = 20 dB, the three algorithms are employed for reconstruction; each group of experiments is repeated 100 times and averaged using the Monte Carlo method. The running time and reconstruction error of each algorithm under variable sparsity are provided in Fig. 6 and Fig. 7. Fig. 6 and Fig. 7 show that, at the same sparsity, SAMP has the longest running time, SAMP-CoSaMP the next longest, and CoSaMP the shortest. Regarding the reconstruction error, a clear sparsity threshold (K = 45) exists. Up to this threshold, the reconstruction errors of all three algorithms are consistent and negligible (less than 0.01). When K > 45, the reconstruction error of each algorithm rises rapidly with sparsity, and the average reconstruction error follows the sequence CoSaMP > SAMP-CoSaMP > SAMP. In summary, in addition to being adaptive to sparsity, the proposed SAMP-CoSaMP achieves a respectable tradeoff between running speed and reconstruction error.
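The evaluation protocol above (128 measurements, signal length 512, 100 repetitions, optional noise at a given SNR) can be sketched as a small Monte Carlo harness. Several details are assumptions of this example because the source does not state them: the measurement matrix is Gaussian, the non-zero entries are Gaussian, and the error metric is the relative l2 error. The placeholder `min_norm_solver` is not one of the compared algorithms; it only shows the interface `solver(Phi, y)` into which a reconstruction routine would be plugged.

```python
import time
import numpy as np

def monte_carlo_stats(solver, K, M=128, N=512, snr_db=None, trials=100, seed=0):
    """Average relative reconstruction error and running time over random trials."""
    rng = np.random.default_rng(seed)
    errors, times = [], []
    for _ in range(trials):
        Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # assumed Gaussian measurements
        x = np.zeros(N)
        x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
        y = Phi @ x
        if snr_db is not None:
            noise = rng.standard_normal(M)
            # Scale the noise so that ||y|| / ||noise|| matches the requested SNR
            noise *= np.linalg.norm(y) / np.linalg.norm(noise) / 10 ** (snr_db / 20)
            y = y + noise
        t0 = time.perf_counter()
        x_hat = solver(Phi, y)
        times.append(time.perf_counter() - t0)
        errors.append(np.linalg.norm(x_hat - x) / np.linalg.norm(x))
    return float(np.mean(errors)), float(np.mean(times))

def min_norm_solver(Phi, y):
    """Placeholder baseline: minimum-norm least-squares solution (not sparse)."""
    return np.linalg.pinv(Phi) @ y

mean_error, mean_time = monte_carlo_stats(min_norm_solver, K=10, trials=5)
```

Sweeping `K` and calling the harness once per algorithm would produce curves of the kind reported in Fig. 6 and Fig. 7.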

Simulation of sound source localisation
In a practical environment, the target sound source signal can be submerged by noise, especially at low SNR, which negatively affects the information extraction and imaging of the target sound source. Therefore, sound source localisation under low SNR conditions is a priority in engineering practice. In this study, the conventional beamforming method (CBF), the orthogonal matching pursuit algorithm-based deconvolution approach (OMP-DAMAS), and SC-DAMAS are used to process the measured sound field data at different frequencies. First, to analyse efficiency, each of the three algorithms is run 10 times in each frequency scenario. The average running times are listed in Table 1. Table 1 shows that the sound frequency does not clearly affect the running time of the three algorithms. At every frequency, CBF has the shortest running time and SC-DAMAS has the longest, which can be explained by the fact that the SC-DAMAS algorithm must search for the sparsity value. However, the duration of each algorithm is still within the acceptable range. More detailed sound pressure distributions from the simulation are shown in Fig. 8, Fig. 9, and Fig. 10.
The centre coordinates of the sound source located by all three algorithms are (0.7 m, 0.7 m). Although all of the algorithms can be used for sound source localisation, the CBF and OMP-DAMAS methods still have certain defects. More specifically, the imaging size of CBF is greatly affected by sound frequency, since the sidelobes become larger as the sound frequency decreases. In other words, CBF can introduce a large source localisation error for low-frequency sound. The OMP-DAMAS method performs well over a wide frequency range without sidelobe interference. However, the sparsity must be entered as a necessary parameter for the solution, which limits its application in practice, as the number of sound sources is generally unknown before sound source localisation. SC-DAMAS is an adaptive method in which the sparsity does not need to be determined in advance. Moreover, its imaging results are similar to those of OMP-DAMAS over a wide frequency range, featuring strong robustness and high-quality sound source localisation without clear sidelobes. The characteristics of the three methods for sound source localisation are summarised in Table 2.

Validation of SC-DAMAS
To verify the effectiveness of the proposed SC-DAMAS in an actual low SNR environment, a sound source localisation experiment was conducted. A workshop, standard signal source noise, and air compressor noise were employed as the experimental site, research object, and background noise, respectively. Before the experiment, the sound pressure level of the standard signal source and the working noise of the air compressor were measured to determine the SNR of the entire experimental environment. Sound waves with frequencies of 1200 Hz and 2800 Hz were generated successively by an HD1910 standard signal generator with its sound pressure level set to the maximum. In that case, the sound pressure level was recorded as 69.2 dB. Afterwards, the signal generator was turned off and the air compressor was turned on. The sound pressure level was measured to be 74.8 dB.
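As a rough worked estimate of the resulting SNR, assuming that both levels were measured at the same position and that the SNR is taken as the difference between the source level and the background level (the text does not state the exact definition used):

```latex
\mathrm{SNR} \approx L_{\mathrm{signal}} - L_{\mathrm{noise}}
             = 69.2\,\mathrm{dB} - 74.8\,\mathrm{dB}
             = -5.6\,\mathrm{dB}
```

Under this assumption the background noise exceeds the target signal, which is consistent with describing the experiment as a low SNR environment.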
In the experimental measurement, the signal generator and the centre of the sensor array are placed on the same horizontal line; the horizontal distance between the two is 1.5 m, and both are 1.1 m above the ground. Therefore, the real centre of the sound source in the sensor array coordinates is (0, 0). The signal generator and the air compressor are turned on simultaneously. Next, the sound array device is turned on to start collecting sound data. The experimental setup and the imaging result site are shown in Fig. 11 and Fig. 12. After the experiment, the real-time sound pressure data measured by the sound array are exported to the host computer, the sound pressure collected under the stable sound field is selected as the input, and the CBF, OMP-DAMAS, and SC-DAMAS algorithms are adopted to perform sound source imaging at 1200 Hz and 2800 Hz. The imaging results are shown in Fig. 13, Fig. 14, and Table 3. They show that under the same SNR and sound frequency, the CBF imaging results are severely distorted: CBF fails to locate the sound source because a spurious interfering source appears in the image. The OMP-DAMAS method does not show other interfering sources, but it also cannot correctly locate the noise source. According to Table 3, large localisation errors exist in OMP-DAMAS for the experimental data. The imaging results of the SC-DAMAS method contain some error, but the error is within the acceptable range. In summary, under the same SNR and different sound source frequencies, clear sidelobes as well as disturbing sound centres emerge in the sound source imaging of CBF. Although the OMP-DAMAS algorithm does not suffer from sidelobes, its localisation error is large due to background noise. The imaging of the SC-DAMAS method shows no clear sidelobes and has high localisation accuracy.
Therefore, the applicability of the proposed algorithm to sound source localisation under low SNR conditions is verified.

Conclusions
In this study, a novel deconvolution beamforming method is derived to address sound source localisation under low SNR conditions. To make the method adaptive, a hybrid compressive sensing reconstruction algorithm called SAMP-CoSaMP is proposed first. Multiple comparative simulations demonstrate that the hybrid algorithm not only eliminates the dependence of the original CoSaMP on the sparsity but also reduces SAMP's inefficiency. Specifically, the algorithm efficiency ranks CoSaMP > SAMP-CoSaMP > SAMP, and the reconstruction accuracy ranks SAMP > SAMP-CoSaMP > CoSaMP. Then, the hybrid algorithm is combined with DAMAS to form the adaptive sound source localisation approach termed SC-DAMAS. Its sound source imaging has smaller sidelobes than the CBF algorithm in the low frequency range. In addition, it overcomes the limitation that the sparsity value must be preset when using the OMP-DAMAS technique. In summary, SC-DAMAS offers a decent tradeoff between operating efficiency and sound source imaging error. It can be used to localise sound sources with low SNRs over a wide frequency range, especially when the number of sound sources is unknown.