An Algorithm for Determining Talker Location using a Linear Microphone Array and Optimal Hyperbolic Fit

A problem common to all speech input systems is that the talker must be encumbered by a head-mounted, hand-held, or fixed-position microphone. An intelligent, electronically aimed unidirectional microphone would overcome this problem, and array techniques hold the best promise of making such a system practical. A robust algorithm for determining the location of a talker is fundamental to any microphone-array system. Here, a two-step talker-location algorithm is introduced. Step 1 is a rather conventional filtered cross-correlation method: the cross-correlation between some pair of microphones is determined to high accuracy using a somewhat novel, fast interpolation on the sampled data. Step 2 uses the fact that the delays for a point source should fit a hyperbola: a best hyperbolic fit is obtained using nonlinear optimization. A method that fits the hyperbola directly to peak-picked delays is shown to be far less robust than an algorithm that fits the hyperbola in the cross-correlation space. An efficient, global nonlinear optimization technique, Stochastic Region Contraction (SRC), is shown to yield highly accurate (>90%) and computationally efficient results under normal ambient conditions.
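
The hyperbolic constraint can be made concrete with a small sketch. A fixed time difference of arrival between two microphones places the source on a hyperbola whose foci are the two microphones, so a candidate source position predicts a full set of delays that can be compared with the measured ones. The Python below is only an illustration under stated assumptions, not the algorithm of the paper: the function names (`delays`, `misfit`, `contract_search`), the array geometry, the speed of sound, and the search parameters are all hypothetical. It fits directly to noise-free synthetic delays, which is the simpler of the two approaches and the one the abstract reports as less robust than fitting in the cross-correlation space. The search is a simplified random-sampling box contraction in the spirit of SRC, not the published SRC procedure.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s; assumed value for room-temperature air


def delays(source, mic_xs):
    """Delay (s) of each microphone relative to the first one.

    Microphones lie on the x-axis at positions mic_xs (m);
    source is an (x, y) point in metres.
    """
    sx, sy = source
    dists = [math.hypot(sx - mx, sy) for mx in mic_xs]
    return [(d - dists[0]) / SPEED_OF_SOUND for d in dists]


def misfit(source, mic_xs, measured):
    """Sum of squared differences between predicted and measured delays."""
    predicted = delays(source, mic_xs)
    return sum((p - m) ** 2 for p, m in zip(predicted, measured))


def contract_search(mic_xs, measured, box, n_samples=1500, n_keep=15,
                    n_iter=30, pad=0.1, seed=0):
    """Simplified contraction search (hypothetical, not the published SRC).

    Each pass samples the current box uniformly, keeps the n_keep points
    with the lowest misfit, and shrinks the box to their padded bounding
    box, clamped to the initial box.
    """
    rng = random.Random(seed)
    (x_min, x_max), (y_min, y_max) = box
    x_lo, x_hi, y_lo, y_hi = x_min, x_max, y_min, y_max
    best = None
    for _ in range(n_iter):
        pts = [(rng.uniform(x_lo, x_hi), rng.uniform(y_lo, y_hi))
               for _ in range(n_samples)]
        if best is not None:
            pts.append(best)  # never lose the best point found so far
        pts.sort(key=lambda p: misfit(p, mic_xs, measured))
        keep = pts[:n_keep]
        best = keep[0]
        xs = [p[0] for p in keep]
        ys = [p[1] for p in keep]
        dx = pad * (max(xs) - min(xs))
        dy = pad * (max(ys) - min(ys))
        x_lo, x_hi = max(x_min, min(xs) - dx), min(x_max, max(xs) + dx)
        y_lo, y_hi = max(y_min, min(ys) - dy), min(y_max, max(ys) + dy)
    return best


# Synthetic example: four microphones on a line, source in front of the array.
mic_xs = [-1.5, -0.5, 0.5, 1.5]
true_source = (0.8, 2.0)
measured = delays(true_source, mic_xs)
estimate = contract_search(mic_xs, measured, box=((-3.0, 3.0), (0.5, 5.0)))
print(estimate)
```

The search box is restricted to positive y because a linear array cannot distinguish a source from its mirror image behind the array. With real signals the measured delays would come from Step 1, and the objective would instead be evaluated on the cross-correlation functions themselves.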
