Localization of Sound Sources by Means of Recurrent Neural Networks

This paper discusses the localization of sound sources for videoconferencing. A new algorithm for estimating speaker locations, based on recurrent neural networks (RNNs), is introduced and described. The experiments, carried out in an acoustically adapted chamber using the proposed method, are described in detail.
