Nonlinear speech processing with oscillatory neural networks for speaker segregation

Nonlinear masking of space-time representations of speech is a widely applicable technique for speech processing. In the present work, we combine an amplitude modulation (AM) representation of cochlear filterbank signals with a mask derived from a network of oscillatory neurons. The proposed approach requires no training or learning, and the mask takes into account the dependence between points of the auditory-derived representation. A potential application is illustrated in the context of speaker segregation.
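
The masking step can be illustrated with a minimal sketch. It is not the method of this work. The two toy "channels" stand in for cochlear filterbank outputs, the function names `am_envelope` and `apply_mask` are hypothetical, and the mask is a plain amplitude threshold used only as a placeholder for the mask that the oscillatory network would provide. The sketch shows only how a channel-by-time AM representation might be computed and then multiplied by a binary mask.

```python
import numpy as np

def am_envelope(channel):
    """AM envelope of one band-limited channel, taken as the magnitude
    of the analytic signal (FFT-based Hilbert transform)."""
    n = len(channel)
    spectrum = np.fft.fft(channel)
    # Keep DC (and Nyquist), double positive frequencies, zero negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

def apply_mask(envelopes, mask):
    """Element-wise masking of a (channels x time) representation."""
    return envelopes * mask

# Toy input (assumption): two tones standing in for filterbank channels.
fs = 8000
t = np.arange(fs) / fs
channels = np.stack([
    (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t),
    0.1 * np.sin(2 * np.pi * 1500 * t),
])
envelopes = np.stack([am_envelope(c) for c in channels])

# Placeholder mask (assumption): a fixed threshold. In the work described
# here, the mask is derived from a network of oscillatory neurons instead.
mask = (envelopes > 0.3).astype(float)
masked = apply_mask(envelopes, mask)
```

In this toy case the strong, amplitude-modulated channel is kept and the weak channel is zeroed. A threshold treats every point independently, whereas the mask described above accounts for the dependence between points of the representation.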
