Paper information - An Algorithm for Determining Talker Location using a Linear Microphone Array and Optimal Hyperbolic Fit

An Algorithm for Determining Talker Location using a Linear Microphone Array and Optimal Hyperbolic Fit

One of the problems for all speech input is the necessity for the talker to be encumbered by a head-mounted, hand-held, or fixed-position microphone. An intelligent, electronically-aimed unidirectional microphone would overcome this problem. Array techniques hold the best promise to bring such a system to practicality. The development of a robust algorithm to determine the location of a talker is a fundamental issue for a microphone-array system. Here, a two-step talker-location algorithm is introduced. Step 1 is a rather conventional filtered cross-correlation method; the cross-correlation between some pair of microphones is determined to high accuracy using a somewhat novel, fast interpolation on the sampled data. Then, using the fact that the delays for a point source should fit a hyperbola, a best hyperbolic fit is obtained using nonlinear optimization. A method which fits the hyperbola directly to peak-picked delays is shown to be far less robust than an algorithm which fits the hyperbola in the cross-correlation space. An efficient, global nonlinear optimization technique, Stochastic Region Contraction (SRC), is shown to yield highly accurate (>90%), and computationally efficient, results for a normal ambient.
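The hyperbolic delay model mentioned in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names (`relative_delays`, `fit_cost`, `locate`), the eight-microphone geometry with 0.25 m spacing, the 343 m/s sound speed, and the noise-free delays. The search is a plain coarse-to-fine grid search used as a simple stand-in for a global optimizer; it is not Stochastic Region Contraction. It also fits the delays directly, which is the variant the abstract reports as less robust than fitting in the cross-correlation space.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def relative_delays(mic_xs, src_x, src_y, c=SPEED_OF_SOUND):
    """Delay at each microphone relative to the first one, in seconds.

    Microphones sit on the x-axis at positions mic_xs (metres); the talker
    is a point source at (src_x, src_y). The range r(x) to a microphone at x
    satisfies r**2 - (x - src_x)**2 = src_y**2, a hyperbola in (x, r).
    """
    ranges = [math.hypot(src_x - x, src_y) for x in mic_xs]
    return [(r - ranges[0]) / c for r in ranges]


def fit_cost(mic_xs, measured, src_x, src_y, c=SPEED_OF_SOUND):
    """Sum of squared range-difference errors, in metres squared."""
    model = relative_delays(mic_xs, src_x, src_y, c)
    return sum((c * (m - d)) ** 2 for m, d in zip(model, measured))


def locate(mic_xs, measured, x_bounds, y_bounds, grid=41, rounds=30, shrink=0.7):
    """Coarse-to-fine grid search for the best-fitting source position."""
    (x_lo, x_hi), (y_lo, y_hi) = x_bounds, y_bounds
    best = None
    for _ in range(rounds):
        candidates = [
            (x_lo + (x_hi - x_lo) * i / (grid - 1),
             y_lo + (y_hi - y_lo) * j / (grid - 1))
            for i in range(grid) for j in range(grid)
        ]
        best = min(candidates,
                   key=lambda p: fit_cost(mic_xs, measured, p[0], p[1]))
        # Shrink the search box and re-centre it on the current best point.
        half_w = 0.5 * shrink * (x_hi - x_lo)
        half_h = 0.5 * shrink * (y_hi - y_lo)
        x_lo, x_hi = best[0] - half_w, best[0] + half_w
        y_lo, y_hi = best[1] - half_h, best[1] + half_h
    return best


# Hypothetical setup: 8 microphones, 0.25 m apart, talker at (1.0, 2.0) m.
mics = [0.25 * k for k in range(8)]
true_x, true_y = 1.0, 2.0
delays = relative_delays(mics, true_x, true_y)  # noise-free "measurements"
est_x, est_y = locate(mics, delays, (-1.0, 3.0), (0.5, 5.0))
print(f"estimated talker position: ({est_x:.3f}, {est_y:.3f}) m")
```

Restricting the search to positive `y` sidesteps the mirror ambiguity of a linear array, which cannot distinguish a source in front of the array from one behind it. With real, noisy delay estimates the fit along the range direction would be far less well conditioned than in this noise-free sketch.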

Harvey F. Silverman | H. Silverman

[1] T.B. Martin, et al. Practical applications of voice input to machines, 1976, Proceedings of the IEEE.

[2] Roger Fletcher, et al. A Rapidly Convergent Descent Method for Minimization, 1963, Comput. J.

[3] William H. Press, et al. Numerical Recipes in FORTRAN - The Art of Scientific Computing, 2nd Edition, 1987.

[4] Ralph Otto Schmidt, et al. A signal subspace approach to multiple emitter location and spectral estimation, 1981.

[5] R. O. Schmidt, et al. Multiple emitter location and signal parameter estimation, 1986.

[6] Julius S. Bendat, et al. Engineering Applications of Correlation and Spectral Analysis, 1980.

[7] Stanislav B. Kesler, et al. Bias and resolution of the MUSIC and the modified FBLP algorithms in the presence of coherent plane waves, 1988, IEEE Trans. Acoust. Speech Signal Process.

[8] S. Biyiksiz, et al. Multirate digital signal processing, 1985, Proceedings of the IEEE.

[9] R. Schmidt, et al. Multiple source DF signal processing: An experimental system, 1986.

[10] James L. Flanagan. Bandwidth design for speech-seeking microphone arrays, 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[11] J. Flanagan, et al. Computer‐steered microphone arrays for sound transduction in large rooms, 1985.

[12] Harvey F. Silverman, et al. Microphone array optimization by stochastic region contraction, 1991, IEEE Trans. Signal Process.

[13] G. Carter, et al. The generalized correlation method for estimation of time delay, 1976.

[14] G. Carter. Coherence and time delay estimation, 1987, Proceedings of the IEEE.

[15] T. Kailath, et al. Optimum localization of multiple sources by passive arrays, 1983.

[16] V. M. Alvarado, et al. Talker Localization and Optimal Placement of Microphones for a Linear Microphone Array Using Stochastic Region Contraction, 1990.