Extraction of Acoustic Sources Through the Processing of Sound Field Maps in the Ray Space

Our goal is to develop a model-based approach to acoustic source extraction from microphone array data that is suitable for both near-field and far-field sources. A signal representation based on plane-wave (PW) decomposition works well for acoustic sources in the far field, as the resulting spectrum turns out to be impulsive. When the source approaches the array, however, the curvature of the wavefront causes the spectrum of the PW components to depart from impulsive behavior, making source extraction harder. In this paper, we adopt a sound field representation based on the local estimation of the plenacoustic function along the array line. This approach consists of dividing the array into subarrays and applying PW analysis to the individual subarrays. This immediately extends the range of validity of the far-field hypothesis, since a source that enters the near-field range of the extended array is still in the far-field range of the subarrays. PW analysis on subarrays allows us to construct the so-called sound field map in a domain of acoustic visibility called the ray space. The desired source is extracted through spatial filtering of the sound field map, and the spatial filter is designed according to a linear minimum mean square error criterion defined on the sound field map. The effectiveness of the proposed methodology is demonstrated through an extensive simulation campaign as well as real-world experiments.
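The subarray idea can be illustrated with a small numerical sketch. It is not the paper's implementation: it assumes a single narrowband point source in free field, a uniform linear array on the y axis, and a plain delay-and-sum beamformer as the local PW analysis. All names and parameter values (16 microphones, 5 cm spacing, 2 kHz, 5-microphone subarrays, source at 1 m) are hypothetical choices made for illustration. With these values the source lies within the near-field range of the whole array (2L²/λ ≈ 6.6 m) but beyond that of each subarray (≈ 0.47 m).

```python
import cmath
import math

C = 343.0                    # speed of sound in m/s
FREQ = 2000.0                # analysis frequency in Hz (assumed value)
K = 2 * math.pi * FREQ / C   # wavenumber in rad/m


def mic_positions(num_mics, spacing):
    """y-coordinates of a uniform linear array centred on the origin (x = 0)."""
    offset = (num_mics - 1) / 2
    return [(m - offset) * spacing for m in range(num_mics)]


def point_source_field(mics_y, src_x, src_y):
    """Free-field pressure of a unit point source at each microphone."""
    field = []
    for y in mics_y:
        r = math.hypot(src_x, src_y - y)
        field.append(cmath.exp(-1j * K * r) / r)
    return field


def sound_field_map(field, mics_y, half_width, angles_deg):
    """Delay-and-sum PW analysis on sliding subarrays.

    Returns a list of (subarray_centre_y, magnitudes_over_angles) rows,
    one row per subarray; together they form a crude sound field map.
    """
    rows = []
    for c in range(half_width, len(mics_y) - half_width):
        members = range(c - half_width, c + half_width + 1)
        row = []
        for angle in angles_deg:
            s = math.sin(math.radians(angle))
            # Compensate the plane-wave phase relative to the subarray centre.
            acc = sum(
                field[m] * cmath.exp(-1j * K * (mics_y[m] - mics_y[c]) * s)
                for m in members
            )
            row.append(abs(acc) / len(members))
        rows.append((mics_y[c], row))
    return rows


# Hypothetical scene: source 1 m in front of the array, 0.3 m off-centre.
SRC_X, SRC_Y = 1.0, 0.3
angles = list(range(-90, 91))
mics = mic_positions(16, 0.05)
pressure = point_source_field(mics, SRC_X, SRC_Y)
sf_map = sound_field_map(pressure, mics, half_width=2, angles_deg=angles)

# Each subarray "sees" the source under its own direction of arrival.
peaks = []
for centre_y, row in sf_map:
    best = max(range(len(row)), key=row.__getitem__)
    peaks.append((centre_y, angles[best]))

for centre_y, est in peaks:
    true = math.degrees(math.atan2(SRC_Y - centre_y, SRC_X))
    print(f"centre y = {centre_y:+.3f} m  estimated = {est:+4d} deg  true = {true:+6.1f} deg")
```

Each row of the map peaks near the direction from that subarray's centre to the source, so the peaks trace a line in the (position, direction) plane. That line is the signature a point source leaves in the ray space, and it is what a spatial filter defined on the map would select.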
