Optimally truncating head-related impulse response by dynamic programming with its applications

This paper proposes a method to optimally truncate head-related impulse responses (HRIRs). Each truncated HRIR consists of a portion of the original HRIR followed by a flat line, that is, a constant value. An algorithm based on dynamic programming selects the retained portions of the original HRIRs and the constants of the flat lines so as to minimize the modeling errors. The truncated HRIRs can be used to reproduce multi-channel sound over headphones at a significantly lower computational cost. The proposed method is compared with another approximation method, the common-acoustical-pole and zero (CAPZ) approach. Experimental results show that the proposed method yields lower composition errors and lower modeling errors for the same amount of computation. Compared with the direct implementation, the proposed approach requires about 35% of the computational cost while maintaining acceptable composition errors.
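The idea of choosing retained portions and flat-line constants by dynamic programming can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following is an assumption made for illustration only: each HRIR keeps its first k samples, the remaining tail is replaced by one constant (the tail mean, which minimizes squared error for a constant), and a shared budget of retained samples is divided among the HRIRs by a knapsack-style recurrence. The names `tail_cost`, `allocate_taps`, and `budget` are hypothetical and do not come from the paper.

```python
def tail_cost(h):
    """For every cut k, return the squared error and the constant obtained
    when h[:k] is kept and h[k:] is replaced by the mean of h[k:]."""
    n = len(h)
    cost = [0.0] * (n + 1)   # cost[n] = 0: nothing is truncated
    const = [0.0] * (n + 1)  # const[n] is unused (empty tail)
    s = 0.0  # running sum of the tail
    q = 0.0  # running sum of squares of the tail
    for k in range(n - 1, -1, -1):
        s += h[k]
        q += h[k] * h[k]
        m = n - k
        const[k] = s / m
        # sum((x - mean)^2) = sum(x^2) - (sum(x))^2 / m; clamp rounding noise
        cost[k] = max(q - s * s / m, 0.0)
    return cost, const


def allocate_taps(hrirs, budget):
    """Assumed formulation: split at most `budget` retained samples among
    the HRIRs so that the total squared modeling error is minimal."""
    tables = [tail_cost(h) for h in hrirs]
    # best[i][b]: minimal error for the first i HRIRs using at most b samples
    best = [[0.0] * (budget + 1)]
    choice = []
    for cost, _ in tables:
        prev = best[-1]
        row = [float("inf")] * (budget + 1)
        pick = [0] * (budget + 1)
        for b in range(budget + 1):
            for k in range(min(b, len(cost) - 1) + 1):
                total = prev[b - k] + cost[k]
                if total < row[b]:
                    row[b], pick[b] = total, k
        best.append(row)
        choice.append(pick)
    # Backtrack to recover how many samples each HRIR keeps.
    keep = [0] * len(hrirs)
    b = budget
    for i in range(len(hrirs) - 1, -1, -1):
        keep[i] = choice[i][b]
        b -= keep[i]
    consts = [tables[i][1][keep[i]] for i in range(len(hrirs))]
    return keep, consts, best[-1][budget]


# Toy data, not measured HRIRs.
hrirs = [[1.0, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0]]
keep, consts, err = allocate_taps(hrirs, budget=1)
```

In this toy case the single available sample goes to the first response, because the second response is already constant and is represented exactly by its flat line. The paper's actual cost function, the shape of the retained portion, and the error measure may differ from this sketch.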
