A Quantitative Evaluation of Media Device Orchestration for Immersive Spatial Audio Reproduction

The challenge of installing and setting up dedicated spatial audio systems can make it difficult to deliver immersive listening experiences to the general public. However, the proliferation of smart mobile devices and the rise of the Internet of Things mean that there are increasing numbers of connected devices capable of producing audio in the home. "Media device orchestration" (MDO) is the concept of utilizing an ad hoc set of devices to deliver or augment a media experience. In this paper, the concept is evaluated by implementing MDO for augmented spatial audio reproduction using object-based audio with semantic metadata. A thematic analysis of positive and negative listener comments about the system revealed three main categories of response: perceptual, technical, and content-dependent aspects. MDO performed particularly well in terms of immersion/envelopment, but the quality of listening experience was partly dependent on loudspeaker quality and listener position. Suggestions for further development based on these categories are given.
