A two-stage algorithm for determining talker location from linear microphone array data

Abstract: A microphone array system for speech data input must include a robust algorithm for determining the location of the desired talker. Here, a two-stage talker location algorithm based on filtered cross-correlation is introduced. At each stage, talker position is established by maximizing a sum-of-independent-cross-correlations functional. Suitable accuracy is obtained at low cost by using multirate interpolation. Experimental evidence shows that a two-stage procedure improves performance. In the first stage, closely spaced microphone pairs are used to determine the x location of the talker (x0). In the second stage, more broadly spaced pairs are used to find y0 over a restricted range of x. Results based on real data are presented to indicate performance. An efficient, global, non-linear optimization technique, stochastic region contraction (SRC), is briefly introduced and is shown to make this algorithm feasible in real time.
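The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several simplifying assumptions: the source is synthetic white noise, propagation delays are rounded to whole samples (so no filtering and no multirate interpolation is performed), and an exhaustive grid search stands in for SRC. The sampling rate, speed of sound, array geometry, pair-selection rule, and all function names (`delays`, `simulate`, `functional`, `locate`) are hypothetical choices made for illustration.

```python
import math
import random

C = 343.0      # speed of sound in m/s (assumed value)
FS = 20000.0   # sampling rate in Hz (assumed value)

def delays(mics, x, y):
    """Whole-sample propagation delays from (x, y) to mics on the line y = 0."""
    return [round(math.hypot(x - mx, y) * FS / C) for mx in mics]

def simulate(mics, x0, y0, n=400, seed=0):
    """Synthetic array data: one white-noise source, integer delays, no noise."""
    rng = random.Random(seed)
    d = delays(mics, x0, y0)
    m = max(d)
    src = [rng.uniform(-1.0, 1.0) for _ in range(n + m)]
    # Mic i at time t observes the source sample emitted d[i] samples earlier.
    return [[src[t + m - di] for t in range(n)] for di in d]

def functional(signals, mics, pairs, x, y):
    """Sum over mic pairs of the cross-correlation at the lag implied by (x, y)."""
    d = delays(mics, x, y)
    n = len(signals[0])
    total = 0.0
    for i, j in pairs:
        lag = d[i] - d[j]  # candidate says mic i lags mic j by this many samples
        lo, hi = max(0, -lag), min(n, n - lag)
        total += sum(signals[i][t + lag] * signals[j][t] for t in range(lo, hi))
    return total

def argmax_grid(signals, mics, pairs, xs, ys):
    """Exhaustive search; used here in place of SRC purely for simplicity."""
    return max(((x, y) for x in xs for y in ys),
               key=lambda p: functional(signals, mics, pairs, p[0], p[1]))

def locate(signals, mics, xs, ys, x_halfwidth=0.1):
    k = len(mics)
    # Stage 1: closely spaced (adjacent) pairs; only the x estimate is kept.
    close = [(i, i + 1) for i in range(k - 1)]
    x1, _ = argmax_grid(signals, mics, close, xs, ys)
    # Stage 2: widely spaced pairs, with x restricted to a band around x1.
    wide = [(i, j) for i in range(k) for j in range(i + 1, k) if j - i >= k // 2]
    near = [x for x in xs if abs(x - x1) <= x_halfwidth + 1e-9]
    return argmax_grid(signals, mics, wide, near, ys)

# Example: 8 mics spaced 0.2 m apart, talker placed at (0.9 m, 1.5 m).
mics = [0.2 * i for i in range(8)]
xs = [0.1 * i for i in range(15)]          # candidate x positions, 0.0 to 1.4 m
ys = [0.5 + 0.25 * i for i in range(11)]   # candidate y positions, 0.5 to 3.0 m
signals = simulate(mics, 0.9, 1.5)
x_est, y_est = locate(signals, mics, xs, ys)
print(x_est, y_est)
```

In this idealized setting the first stage already sees clean, exactly aligned data, so the benefit of splitting the search is not visible. The sketch only shows the structure: a pairwise-correlation functional evaluated at candidate positions, first with short baselines to fix x, then with long baselines to resolve y within a narrow band of x.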
