Speech Analysis With the Strong Uncorrelating Transform

The strong uncorrelating transform (SUT) provides estimates of independent components from linear mixtures using only second-order information, provided that the components have distinct circularity coefficients. We propose a processing framework for generating complex-valued subbands from real-valued mixtures of speech and noise, where the objective is to control the likely values of the sample circularity coefficients of the underlying speech and noise components in each subband. We show how several processing parameters affect the noncircularity of speech-like and noise components in a subband, which informs parameter choices that allow each component in a subband to be estimated using the SUT. Additionally, because the speech and noise components will have distinct sample circularity coefficients, this statistic can be used to identify time-frequency regions that contain voiced speech. We give an example of recovering the circularity coefficients of a real speech signal from a two-channel noisy mixture at -25 dB SNR, which demonstrates how estimates of noncircularity can reveal the time-frequency structure of a speech signal in very high levels of noise. Finally, we present the results of a voice activity detection (VAD) experiment showing that two new circularity-based statistics, one of which is derived from the SUT processing, can achieve improved performance over state-of-the-art VADs in real-world recordings of noise.
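
As an illustration of the second-order separation described above, the following is a minimal sketch of one common way to compute the SUT: whiten the mixture with the inverse square root of its sample covariance, then apply a Takagi factorization to the whitened pseudo-covariance. It is not taken from the paper. The function name `strong_uncorrelating_transform`, the synthetic noncircular Gaussian sources, and the random mixing matrix `M` are assumptions made for the example, and the Takagi step is built from an SVD, which presumes that the sample circularity coefficients are distinct and nonzero.

```python
import numpy as np


def strong_uncorrelating_transform(x):
    """Sketch of the SUT for a (n_channels, n_samples) complex array.

    Returns (W, k): the transform W and the sample circularity
    coefficients k in descending order. Assumes the coefficients are
    distinct and nonzero (hypothetical helper, not the authors' code).
    """
    x = x - x.mean(axis=1, keepdims=True)
    n_samples = x.shape[1]
    C = x @ x.conj().T / n_samples  # covariance (Hermitian)
    P = x @ x.T / n_samples         # pseudo-covariance (complex symmetric)

    # Whitening matrix D = C^(-1/2) from the eigendecomposition of C.
    evals, E = np.linalg.eigh(C)
    D = E @ np.diag(evals ** -0.5) @ E.conj().T

    # Pseudo-covariance of the whitened data.
    P_w = D @ P @ D.T

    # Takagi factorization P_w = U diag(k) U^T, built from the SVD.
    A, k, Bh = np.linalg.svd(P_w)
    phases = np.diag(A.conj().T @ Bh.T)
    phases = phases / np.abs(phases)
    U = A * np.sqrt(phases)  # scale column j by the square root of phase j

    W = U.conj().T @ D
    return W, k


# Synthetic demo: two independent noncircular Gaussian sources whose
# real and imaginary parts have unequal variances (illustrative values).
rng = np.random.default_rng(0)
n = 20000
s1 = rng.normal(size=n) + 1j * 0.2 * rng.normal(size=n)
s2 = rng.normal(size=n) + 1j * 0.8 * rng.normal(size=n)
S = np.vstack([s1, s2])
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))  # mixing matrix
X = M @ S

W, k = strong_uncorrelating_transform(X)
Y = W @ (X - X.mean(axis=1, keepdims=True))
C_y = Y @ Y.conj().T / n  # should be close to the identity
P_y = Y @ Y.T / n         # should be close to diag(k)
print("sample circularity coefficients:", k)
```

In this sketch the transformed data `Y` has an identity sample covariance and a diagonal, real, nonnegative sample pseudo-covariance whose diagonal entries are the sample circularity coefficients. When two coefficients are nearly equal, the corresponding components are not reliably separated by second-order statistics alone, which is why the choice of subband processing parameters matters.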
