On Building Immersive Audio Applications Using Robust Adaptive Beamforming and Joint Audio-Video Source Localization

This paper addresses several problems, strategies, and solutions involved in building truly immersive audio systems for future communication applications. The aim is a system in which the acoustic field of one room is recorded with a microphone array and then reconstructed, or rendered, in a different room using loudspeaker-array techniques. Our proposal explores the use of recent robust adaptive beamforming techniques to estimate the original source signals in the emitting room. A joint audio-video localization method, needed both in the estimation process and in the rendering engine, is also presented. The estimated source signals and the source localization information drive a wave field synthesis engine that renders the acoustic field in the receiving room. System performance is evaluated with MUSHRA-based subjective tests.
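The capture-estimate-render chain described above can be illustrated with a minimal sketch. It is not the system proposed in the paper: a fixed delay-and-sum beamformer stands in for the robust adaptive beamformer, the source position is assumed to be already known (in the paper it would come from the joint audio-video localization), propagation is free-field with whole-sample delays, and the rendering step is a plain delay-and-gain approximation rather than a true wave field synthesis driving function. All function names, the speed-of-sound constant, and the geometry are hypothetical choices made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def propagation_delays(source, points, fs):
    """Free-field delay, in whole samples, from `source` to each point."""
    return [round(math.dist(source, p) / SPEED_OF_SOUND * fs) for p in points]


def simulate_capture(signal, delays):
    """Toy recording: each microphone hears the source delayed, nothing else."""
    total = len(signal) + max(delays)
    return [[0.0] * d + list(signal) + [0.0] * (total - len(signal) - d)
            for d in delays]


def delay_and_sum(channels, delays):
    """Fixed beamformer: undo the relative delays, then average the channels."""
    ref = min(delays)
    shifts = [d - ref for d in delays]
    n = min(len(ch) - s for ch, s in zip(channels, shifts))
    return [sum(ch[i + s] for ch, s in zip(channels, shifts)) / len(channels)
            for i in range(n)]


def render_feeds(signal, virtual_source, speakers, fs):
    """Simplified rendering: one delayed, 1/r-attenuated copy per loudspeaker.

    This is an assumption-laden stand-in for a wave field synthesis engine.
    """
    delays = propagation_delays(virtual_source, speakers, fs)
    ref = min(delays)
    feeds = []
    for d, spk in zip(delays, speakers):
        gain = 1.0 / max(math.dist(virtual_source, spk), 1e-3)
        feeds.append([0.0] * (d - ref) + [gain * x for x in signal])
    return feeds
```

A typical call sequence would compute `propagation_delays` for the microphone array in the emitting room, pass the recorded channels and those delays to `delay_and_sum`, and hand the resulting estimate to `render_feeds` together with the loudspeaker positions in the receiving room. An adaptive beamformer would replace `delay_and_sum` wherever interfering sources or reverberation matter, which this toy setup ignores.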
