An efficient approach to dynamically weighted multizone wideband reproduction of speech soundfields

This paper proposes and evaluates an efficient approach for the practical reproduction of multizone soundfields for speech sources. The reproduction method, based on a previously proposed approach, utilises weighting parameters to control the soundfield reproduced in each zone whilst minimising the number of loudspeakers required. Computing these weighting parameter values for the multizone soundfield model otherwise requires significant computational effort, so an interpolation scheme is proposed here to predict them. It is shown that the initial computation time can be reduced by a factor of 1024 while introducing only -85 dB of error in the reproduced soundfield, relative to reproduction without interpolated weighting parameters. The perceptual impact on the quality of the speech reproduced using the method is also shown to be negligible. By using pre-saved soundfields determined with the proposed approach, practical reproduction of dynamically weighted multizone soundfields of wideband speech could be achieved in real time.
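The general idea of trading an expensive per-point computation for interpolation over a coarse pre-computed table can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the interpolation scheme, so bilinear interpolation is used here as a stand-in, and `expensive_weight`, the grid sizes, and the two axes (`f`, `w`) are hypothetical and do not come from the paper. The error measure is the energy of the difference relative to the energy of the reference, expressed in dB.

```python
import math

def expensive_weight(f, w):
    """Hypothetical stand-in for a costly per-point computation (not the paper's model)."""
    return math.sin(2.0 * f) * math.exp(-w) + 0.5 * w

def linspace(a, b, n):
    """Return n evenly spaced points from a to b inclusive."""
    step = (b - a) / (n - 1)
    return [a + i * step for i in range(n)]

def build_table(fs, ws):
    """Evaluate the expensive function on every node of the grid fs x ws."""
    return [[expensive_weight(f, w) for w in ws] for f in fs]

def bilinear(table, fs, ws, f, w):
    """Bilinearly interpolate a uniform-grid table at the point (f, w)."""
    # Locate the grid cell containing (f, w), clamped to the last cell.
    i = min(int((f - fs[0]) / (fs[1] - fs[0])), len(fs) - 2)
    j = min(int((w - ws[0]) / (ws[1] - ws[0])), len(ws) - 2)
    # Fractional position inside the cell along each axis.
    tf = (f - fs[i]) / (fs[i + 1] - fs[i])
    tw = (w - ws[j]) / (ws[j + 1] - ws[j])
    low = table[i][j] * (1 - tw) + table[i][j + 1] * tw
    high = table[i + 1][j] * (1 - tw) + table[i + 1][j + 1] * tw
    return low * (1 - tf) + high * tf

def relative_error_db(reference, approx):
    """Energy of (reference - approx) relative to energy of reference, in dB."""
    num = sum((r - a) ** 2 for r, a in zip(reference, approx))
    den = sum(r ** 2 for r in reference)
    if num == 0.0:
        return float("-inf")
    return 10.0 * math.log10(num / den)

# Coarse grid: the only points where the expensive function is evaluated up front.
fs = linspace(0.0, 1.0, 9)
ws = linspace(0.0, 1.0, 9)
coarse = build_table(fs, ws)

# Dense grid: points actually needed at reproduction time.
dense_f = linspace(0.0, 1.0, 33)
dense_w = linspace(0.0, 1.0, 33)

reference = [expensive_weight(f, w) for f in dense_f for w in dense_w]
approx = [bilinear(coarse, fs, ws, f, w) for f in dense_f for w in dense_w]

err_db = relative_error_db(reference, approx)
saving = (len(dense_f) * len(dense_w)) / (len(fs) * len(ws))
print(f"expensive evaluations reduced by a factor of about {saving:.1f}")
print(f"relative interpolation error: {err_db:.1f} dB")
```

The smoother the underlying quantity is across the grid, the coarser the table can be for a given error target, which is the trade-off the abstract quantifies for its own scheme.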
