Influence of Visual Stimuli on Perceptual Attributes of Spatial Audio

Reproduced audio is often accompanied by visuals (e.g., in television, virtual reality, gaming, and cinema). However, the audio technology for these systems is often researched and evaluated in isolation from the visual component. Previous research indicates that the auditory and visual modalities are not processed separately: for example, visual stimuli can influence ratings of audio quality, and vice versa. This paper presents an experiment investigating the influence of visual stimuli on a set of attributes relevant to the perception of spatial audio. Eighteen participants took part in a paired-comparison listening test in which they judged pairs of stimuli rendered to fourteen-, five-, and two-channel systems on ten perceptual attributes. The stimuli were presented in audio-only and audio-visual conditions. The results show a large, significant main effect of loudspeaker configuration for all tested attributes except overall spectral balance and depth of field. The effect of visual stimuli was small but significant for the attributes realism, sense of space, and spatial clarity. These results suggest that evaluations of audio-visual technologies that aim to evoke a sense of realism or presence should consider the influence of both the audio and visual modalities.
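As a rough illustration of the size of such a paired-comparison design, the sketch below enumerates the trials implied by three loudspeaker configurations, two presentation conditions, and a set of attributes. It is an assumption-laden sketch rather than the procedure used in the experiment: the function name `build_trials` and the label strings are hypothetical, only the five attributes named in the abstract are listed (the remaining five are not given here), and the number of programme items and any repetitions are not modelled.

```python
from itertools import combinations, product

# Labels are illustrative; the abstract only states the channel counts.
SYSTEMS = ["14-channel", "5-channel", "2-channel"]
CONDITIONS = ["audio-only", "audio-visual"]
# Only the attributes named in the abstract; the study used ten in total.
ATTRIBUTES = [
    "overall spectral balance",
    "depth of field",
    "realism",
    "sense of space",
    "spatial clarity",
]


def build_trials(systems, conditions, attributes):
    """Return one trial per (condition, attribute, unordered system pair)."""
    pairs = list(combinations(systems, 2))  # 3 systems give 3 unordered pairs
    return [
        {"condition": c, "attribute": a, "pair": p}
        for c, a, p in product(conditions, attributes, pairs)
    ]


trials = build_trials(SYSTEMS, CONDITIONS, ATTRIBUTES)
print(len(trials))  # 2 conditions x 5 attributes x 3 pairs = 30
```

With all ten attributes the same enumeration would give 60 judgements per programme item, before any repeats or reversed presentation orders are added.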
