Estimating the noncircularity of latent components within complex-valued subband mixtures with applications to speech processing

This paper describes an approach that estimates the circularity coefficients of multiple underlying components within complex subbands of an additive mixture of voiced speech and noise via the strong uncorrelating transform (SUT). For the SUT to be effective, the latent source signals must have unique nonzero circularity coefficients; this requirement is satisfied by using narrow filters to impose a degree of noncircularity upon what would typically be circular noise. The circularity coefficient estimates are then used for voice activity detection, pitch tracking, and enhancement.
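
As a rough illustration of the quantity being estimated, the sketch below computes sample circularity coefficients for a multichannel complex signal. It is not the paper's implementation. It relies on the standard result that the circularity coefficients are the singular values of the pseudo-covariance of the whitened data, which are the diagonal entries the SUT produces. The function names `circularity_coefficients` and `simulate_mixture`, the synthetic Gaussian sources, and the random mixing matrix are all assumptions made for the example; no subband filtering or speech data is involved.

```python
import numpy as np


def circularity_coefficients(x):
    """Estimate circularity coefficients of a complex multichannel signal.

    x: complex array of shape (n_channels, n_samples).
    Returns the coefficients in descending order, each in [0, 1].
    """
    x = x - x.mean(axis=1, keepdims=True)      # remove the sample mean
    n = x.shape[1]
    cov = x @ x.conj().T / n                   # covariance        E[x x^H]
    pcov = x @ x.T / n                         # pseudo-covariance E[x x^T]

    # Inverse square root of the Hermitian covariance (whitening matrix).
    eigvals, eigvecs = np.linalg.eigh(cov)
    whiten = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.conj().T

    # Pseudo-covariance of the whitened data; its singular values are the
    # circularity coefficients.
    coherence = whiten @ pcov @ whiten.T
    return np.linalg.svd(coherence, compute_uv=False)


def simulate_mixture(coeffs, n_samples=200_000, seed=0):
    """Hypothetical helper: mix independent unit-variance noncircular
    Gaussian sources with the given circularity coefficients."""
    rng = np.random.default_rng(seed)
    k = np.asarray(coeffs, dtype=float)[:, None]
    a = np.sqrt((1.0 + k) / 2.0)               # real-part scale
    b = np.sqrt((1.0 - k) / 2.0)               # imaginary-part scale
    shape = (len(coeffs), n_samples)
    s = a * rng.standard_normal(shape) + 1j * b * rng.standard_normal(shape)
    m = len(coeffs)
    mixing = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    return mixing @ s


estimates = circularity_coefficients(simulate_mixture([0.8, 0.3]))
print(estimates)  # should be close to the source coefficients 0.8 and 0.3
```

Because circularity coefficients are invariant under invertible linear mixing, the estimates from the mixture should match those of the underlying sources up to sampling error, which is why distinct nonzero coefficients make the sources distinguishable.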
