Perceptual audio rendering of complex virtual environments

We propose a real-time 3D audio rendering pipeline for complex virtual scenes containing hundreds of moving sound sources. The approach, based on auditory culling and spatial level-of-detail, can handle more than ten times the number of sources commonly available on consumer 3D audio hardware, with minimal decrease in audio quality. The method performs well for both indoor and outdoor environments. It leverages the limited capabilities of audio hardware for many applications, including interactive architectural acoustics simulations and automatic 3D voice management for video games.

Our approach dynamically eliminates inaudible sources and groups the remaining audible sources into a fixed budget of clusters. Each cluster is represented by one impostor sound source, positioned using perceptual criteria. Spatial audio processing is then performed only on the impostor sound sources rather than on every original source, which greatly reduces the computational cost.

A pilot validation study shows that both the degradation in audio quality and the localization impairment are limited and do not seem to vary significantly with the cluster budget. We conclude that our real-time perceptual audio rendering pipeline can generate spatialized audio for complex auditory environments without introducing disturbing changes in the perceived soundfield.
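
The cull-then-cluster idea can be illustrated with a minimal sketch. This is not the pipeline's actual algorithm: the abstract does not specify the audibility test, the clustering method, or the perceptual criteria used to place impostors. The sketch assumes a fixed level threshold for culling, a greedy farthest-point heuristic for choosing cluster centers, and a power-weighted centroid for each impostor. The function name `cull_and_cluster` and the parameters `budget` and `threshold_db` are hypothetical.

```python
import math

def cull_and_cluster(sources, budget, threshold_db=0.0):
    """Reduce many sources to at most `budget` impostor sources.

    sources: list of ((x, y, z), level_db) pairs, listener at the origin.
    Returns a list of ((x, y, z), level_db) impostors.
    All criteria here are illustrative assumptions, not the paper's.
    """
    # 1. Culling: drop sources below an assumed fixed audibility threshold.
    audible = [s for s in sources if s[1] >= threshold_db]
    if not audible:
        return []

    # 2. Pick cluster centers greedily: start at the loudest source, then
    #    repeatedly add the source farthest from all centers chosen so far.
    centers = [max(audible, key=lambda s: s[1])[0]]
    while len(centers) < min(budget, len(audible)):
        farthest = max(
            audible,
            key=lambda s: min(math.dist(s[0], c) for c in centers),
        )
        centers.append(farthest[0])

    # 3. Assign every audible source to its nearest center.
    groups = [[] for _ in centers]
    for pos, level in audible:
        idx = min(range(len(centers)), key=lambda i: math.dist(pos, centers[i]))
        groups[idx].append((pos, level))

    # 4. One impostor per non-empty cluster: power-weighted centroid for the
    #    position, summed acoustic power for the level.
    impostors = []
    for group in groups:
        if not group:
            continue
        weights = [10 ** (level / 10) for _, level in group]
        total = sum(weights)
        position = tuple(
            sum(w * p[k] for w, (p, _) in zip(weights, group)) / total
            for k in range(3)
        )
        impostors.append((position, 10 * math.log10(total)))
    return impostors


# Four sources, one below the threshold, reduced to a budget of two impostors.
scene = [((1.0, 0.0, 0.0), 60.0), ((1.1, 0.0, 0.0), 60.0),
         ((-5.0, 0.0, 0.0), 50.0), ((0.0, 9.0, 0.0), -20.0)]
impostors = cull_and_cluster(scene, budget=2)
```

In this sketch, spatial processing would run once per impostor rather than once per source, so the per-frame cost depends on the cluster budget and not on the number of sources in the scene.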
