Theoretical Framework for the Optimization of Microphone Array Configuration for Humanoid Robot Audition

Audition is an important capability of a humanoid robot. Previous work has presented robot systems capable of sound localization and source segregation using microphone arrays of various configurations. However, no theoretical framework for the design of these arrays has been presented. This paper proposes a design framework based on a novel array quality measure. The measure is based on the effective rank of a matrix composed of generalized head-related transfer functions (GHRTFs), which extend the head-related transfer function to microphone positions other than the ears. The measure is shown to be theoretically related to standard array performance measures such as beamforming robustness and direction-of-arrival (DOA) estimation accuracy. The measure is then applied to produce sample microphone array designs, and their performance is investigated numerically, verifying the advantages of array design based on the proposed framework.
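To make the central quantity concrete, the following is a minimal sketch of an effective-rank computation. It assumes the entropy-based definition of Roy and Vetterli (the exponential of the Shannon entropy of the normalized singular values); the exact measure used in the paper may differ in normalization or in how frequencies are combined. The function name `effective_rank` and the random placeholder matrix standing in for a GHRTF matrix (rows as microphones, columns as source directions) are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def effective_rank(matrix):
    """Effective rank as exp(entropy of normalized singular values).

    Assumption: entropy-based definition (Roy and Vetterli); the paper's
    exact array quality measure may be defined differently.
    """
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    total = singular_values.sum()
    if total == 0:
        return 0.0  # all-zero matrix: no usable dimensions
    p = singular_values / total   # normalize to a probability distribution
    p = p[p > 0]                  # drop zeros so that log is well defined
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))


# Hypothetical stand-in for a GHRTF matrix at a single frequency:
# 6 microphones (rows) by 36 source directions (columns), complex-valued.
rng = np.random.default_rng(0)
ghrtf_placeholder = (rng.standard_normal((6, 36))
                     + 1j * rng.standard_normal((6, 36)))
print(effective_rank(ghrtf_placeholder))
```

Under this definition the value lies between 1 and the number of microphones: it approaches 1 when all microphones observe nearly the same spatial response and approaches the microphone count when their responses are close to orthogonal with equal energy. A design procedure in this spirit would compare candidate microphone placements by this score.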
