An Empirical Study of Hear-Through Augmented Reality: Using Bone Conduction to Deliver Spatialized Audio

Augmented reality (AR) is the mixing of computer-generated stimuli with real-world stimuli. In this paper, we present results from a controlled, empirical study comparing three ways of delivering spatialized audio for AR applications: a speaker array, headphones, and a bone-conduction headset. Analogous to optical see-through AR in the visual domain, hear-through AR delivers computer-generated audio through a bone-conduction headset while real-world audio reaches the user's unoccluded ears. Our results show that subjects localized stationary sounds most accurately with a speaker array physically positioned around the listener. For moving sounds, however, there was no difference in accuracy between the speaker array and the bone-conduction headset, and both outperformed standard headphones. Subjects' comments following the experiment support these performance data.
