Speech intelligibility improvement using convolutive blind source separation assisted by denoising algorithms

The present study is concerned with the blind source separation (BSS) of speech and speech-shaped noise sources. All recordings were carried out in an anechoic chamber using a dummy head fitted with two microphones, one in each ear. The signals were separated with a program implementing the algorithm for BSS of convolutive mixtures introduced by Parra and Spence [Parra, L., Spence, C., 2000a. Convolutive blind separation of non-stationary sources. IEEE Trans. Speech Audio Process. 8(3), 320-327 (US Patent US6167417)]. In the post-processing phase two different denoising algorithms were applied. The first was based on a minimum mean-square error log-spectral amplitude estimator [Ephraim, Y., Malah, D., 1985. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process. ASSP-33(2), 443-445], while the second was based on a Wiener filter using the a priori signal-to-noise ratio estimation concept presented by Ephraim and Malah (as mentioned above) [Scalart, P., Filho, J.V., 1996. Speech enhancement based on a priori signal to noise estimation. IEEE Internat. Conf. Acoust. Speech Signal Process. 1, 629-632]. Nonsense-word tests served as the target speech in both cases, while one or two disturbing sources acted as interferers. Speech intelligibility before and after BSS was measured for three subjects with audiologically normal hearing; the BSS output was then denoised and presented to the same listeners. The results revealed some ambiguities caused by the number of microphones being insufficient relative to the number of sound sources. With a single disturbing source the intelligibility improvement was significant; when two disturbances accompanied the target speech, however, the separation was much poorer. As could be expected, the additional denoising raised intelligibility slightly. Although the BSS method requires further research on optimization, the results of the investigation imply that it may be applied to hearing aids in the future.
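For readers interested in the post-processing stage, the sketch below illustrates in Python how the two spectral-gain denoising rules named above are commonly realised: a Wiener gain driven by a decision-directed a priori SNR estimate (Scalart and Filho, 1996) and the Ephraim-Malah minimum mean-square error log-spectral amplitude gain. It is not the authors' implementation but a minimal sketch: the STFT framing, noise-PSD estimation and overlap-add resynthesis are assumed to exist elsewhere, and the function name and arguments are hypothetical.

```python
import numpy as np
from scipy.special import exp1  # exponential integral E1, used by the log-MMSE gain


def denoise_frames(stft_frames, noise_psd, alpha=0.98, method="logmmse"):
    """Apply a spectral gain per STFT frame of the separated (BSS) signal.

    stft_frames : complex ndarray, shape (n_frames, n_bins) -- hypothetical input
    noise_psd   : estimated noise power per frequency bin, shape (n_bins,)
    alpha       : decision-directed smoothing factor for the a priori SNR
    method      : "wiener" (a priori Wiener gain) or "logmmse" (Ephraim-Malah LSA gain)
    """
    enhanced = np.empty_like(stft_frames)
    prev_clean_power = np.zeros(stft_frames.shape[1])
    for m, frame in enumerate(stft_frames):
        # A posteriori SNR of the current frame
        gamma = np.maximum(np.abs(frame) ** 2 / noise_psd, 1e-10)
        # Decision-directed a priori SNR estimate (Ephraim-Malah, reused by Scalart & Filho)
        xi = alpha * prev_clean_power / noise_psd + (1.0 - alpha) * np.maximum(gamma - 1.0, 0.0)
        xi = np.maximum(xi, 1e-3)
        if method == "wiener":
            gain = xi / (1.0 + xi)                       # a priori Wiener gain
        else:
            v = xi / (1.0 + xi) * gamma
            gain = xi / (1.0 + xi) * np.exp(0.5 * exp1(v))  # log-spectral amplitude gain
        enhanced[m] = gain * frame
        prev_clean_power = np.abs(enhanced[m]) ** 2
    return enhanced
```

In both variants the smoothing factor alpha (close to 1) trades suppression of musical noise against responsiveness to rapid SNR changes, which is the main tuning parameter when such denoising is cascaded after the BSS stage.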

[1]  Pierre Comon,et al.  Blind separation of sources, part II: Problems statement , 1991, Signal Process..

[2]  Scott C. Douglas,et al.  Convolutive blind separation of speech mixtures using the natural gradient , 2003, Speech Commun..

[3]  Aleksander Sek,et al.  Speech intelligibility in various spatial configurations of background noise , 2005 .

[4]  Pierre Comon,et al.  Independent component analysis, A new concept? , 1994, Signal Process..

[5]  Pascal Scalart,et al.  Speech enhancement based on a priori signal to noise estimation , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[6]  S. Amari,et al.  Approximate maximum likelihood source separation using the natural gradient , 2001, 2001 IEEE Third Workshop on Signal Processing Advances in Wireless Communications (SPAWC'01). Workshop Proceedings (Cat. No.01EX471).

[7]  Paris Smaragdis,et al.  Blind separation of convolved mixtures in the frequency domain , 1998, Neurocomputing.

[8]  Jacob Benesty,et al.  Audio Signal Processing for Next-Generation Multimedia Communication Systems , 2004 .

[9]  Daniel W. E. Schobben,et al.  A frequency domain blind signal separation method based on decorrelation , 2002, IEEE Trans. Signal Process..

[10]  B. Moore An Introduction to the Psychology of Hearing , 1977 .

[11]  Hiroshi Sawada,et al.  Frequency-Domain Blind Source Separation , 2007, Blind Speech Separation.

[12]  H S Colburn,et al.  Binaural sluggishness in the perception of tone sequences and speech in noise. , 2000, The Journal of the Acoustical Society of America.

[13]  Yi Zhou,et al.  Blind source separation in frequency domain , 2003, Signal Process..

[14]  R Plomp,et al.  The effect of a hearing aid on the speech-reception threshold of hearing-impaired listeners in quiet and in noise. , 1983, The Journal of the Acoustical Society of America.

[15]  Christian Jutten,et al.  Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture , 1991, Signal Process..

[16]  Schuster,et al.  Separation of a mixture of independent signals using time delayed correlations. , 1994, Physical Review Letters.

[17]  Moeness G. Amin,et al.  New approach for blind source separation using time-frequency distributions , 1996, Optics & Photonics.

[18]  Gillian Davis,et al.  Noise Reduction in Speech Applications , 2002 .

[19]  Andrzej Cichocki,et al.  Second Order Nonstationary Source Separation , 2002, J. VLSI Signal Process..

[20]  Walter Kellermann,et al.  Blind Source Separation for Convolutive Mixtures: A Unified Treatment , 2004 .

[21]  Andreas Ziehe,et al.  Artifact Reduction in Magnetoneurography Based on Time-Delayed Second Order Correlations , 1998 .

[22]  Jean-Francois Cardoso,et al.  Eigen-structure of the fourth-order cumulant tensor with application to the blind source separation problem , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[23]  Kiyohiro Shikano,et al.  Blind Source Separation Combining Independent Component Analysis and Beamforming , 2003, EURASIP J. Adv. Signal Process..

[24]  Hiroshi Sawada,et al.  Blind Source Separation of Convolutive Mixtures of Speech in Frequency Domain , 2005, IEICE Trans. Fundam. Electron. Commun. Comput. Sci..

[25]  Paris Smaragdis,et al.  Efficient blind separation of convolved sound mixtures , 1997, Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics.

[26]  Shoko Araki,et al.  On-line time-domain blind source separation of nonstationary convolved signals , 2003 .

[27]  Lucas C. Parra,et al.  On-line Convolutive Blind Source Separation of Non-Stationary Signals , 2000, J. VLSI Signal Process..

[28]  Lucas C. Parra,et al.  Convolutive blind separation of non-stationary sources , 2000, IEEE Trans. Speech Audio Process..

[29]  Lucas C. Parra,et al.  On-line Blind Source Separation of Non-Stationary Signals , 2001 .

[30]  Noboru Ohnishi,et al.  A method of blind separation for convolved non-stationary signals , 1998, Neurocomputing.

[31]  E. Oja,et al.  Independent Component Analysis , 2013 .

[32]  Andrzej Cichocki,et al.  Adaptive Blind Signal and Image Processing - Learning Algorithms and Applications , 2002 .

[33]  Lucas C. Parra,et al.  An Adaptive Beamforming Perspective on Convolutive Blind Source Separation , 2002, Noise Reduction in Speech Applications.

[34]  Masataka Goto,et al.  Real-time sound source localization and separation system and its application to automatic speech recognition , 2001, INTERSPEECH.

[35]  S.C. Douglas,et al.  Multichannel blind deconvolution and equalization using the natural gradient , 1997, First IEEE Signal Processing Workshop on Signal Processing Advances in Wireless Communications.

[36]  Kiyotoshi Matsuoka,et al.  A neural net for blind separation of nonstationary signals , 1995, Neural Networks.

[37]  Soo-Young Lee Blind Source Separation and Independent Component Analysis: A Review , 2005 .

[38]  David Malah,et al.  Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1985, IEEE Trans. Acoust. Speech Signal Process..