Estimation of Object-Based Reverberation Using an Ad-Hoc Microphone Arrangement for Live Performance

We present a novel pipeline for estimating reverberant spatial audio object (RSAO) parameters from room impulse responses (RIRs) recorded by ad-hoc microphone arrangements. The pipeline performs three tasks: direct-to-reverberant ratio (DRR) estimation, microphone localization, and RSAO parametrization. We parametrized RIRs recorded at Bridgewater Hall by microphones arranged for a BBC Philharmonic Orchestra performance. Objective measures of the rendered RSAO reverberation were evaluated and compared with reverberation recorded by a Soundfield microphone. Together with informal listening tests, the results indicated that the rendered RSAO gave a plausible reproduction of the hall, comparable to the measured response. Representing the reverberation as objects derived from in-situ RIR measurements allows the experience to be customized and personalized for different audio systems, user preferences and playback environments.
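As an illustration of the first task only, the sketch below computes a DRR from a single RIR using a common textbook definition: the energy in a short window around the strongest peak divided by the energy of everything after that window. It is not the estimator used in the paper. The function name `estimate_drr_db`, the 2.5 ms half-window and the synthetic RIR are all assumptions made for the example.

```python
import numpy as np


def estimate_drr_db(rir, fs, direct_window_ms=2.5):
    """Direct-to-reverberant ratio in dB for one RIR (hypothetical helper).

    Assumes the direct sound is the largest-magnitude sample and occupies
    +/- direct_window_ms around it; everything later counts as reverberant.
    """
    rir = np.asarray(rir, dtype=float)
    peak = int(np.argmax(np.abs(rir)))                # direct-path arrival
    half = int(round(direct_window_ms * 1e-3 * fs))   # half-window in samples
    start = max(peak - half, 0)
    stop = min(peak + half + 1, rir.size)
    direct_energy = np.sum(rir[start:stop] ** 2)
    reverb_energy = np.sum(rir[stop:] ** 2)
    if reverb_energy == 0.0:
        return float("inf")                           # no reverberant tail
    return 10.0 * np.log10(direct_energy / reverb_energy)


# Synthetic RIR (assumption): unit direct impulse at 5 ms followed by an
# exponentially decaying noise tail.
fs = 48000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
rir = 0.05 * rng.standard_normal(fs) * np.exp(-t / 0.3)
onset = int(0.005 * fs)
rir[:onset] = 0.0
rir[onset] = 1.0
print(f"DRR: {estimate_drr_db(rir, fs):.1f} dB")
```

The window length trades off leakage of early reflections into the direct term against truncation of the direct sound; microphones placed close to an instrument would be expected to yield a higher DRR than distant ones.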
