Co-channel speaker identification using usable speech extraction based on multi-pitch tracking

Usable speech criteria have recently been proposed for extracting minimally corrupted speech for speaker identification (SID) in co-channel speech. In this paper, we propose a new usable speech extraction method that improves SID performance in co-channel conditions, based on pitch information obtained from a robust multi-pitch tracking algorithm [2]. The idea is to retain the speech segments in which only one pitch is detected and to remove all others. The system is evaluated on co-channel speech, and the results show a significant improvement in speaker identification across various target-to-interferer ratios (TIRs).
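The selection rule described above can be illustrated with a minimal sketch. It assumes a multi-pitch tracker has already produced, for each frame, a list of detected pitch values in Hz; that input format and the function names `select_usable_frames` and `group_into_segments` are hypothetical and are not taken from the paper. Frames with no detected pitch are dropped along with multi-pitch frames, which follows a literal reading of "only one pitch detected" and is an assumption.

```python
from typing import List, Sequence, Tuple


def select_usable_frames(pitches_per_frame: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of frames in which exactly one pitch was detected.

    Assumption: each entry holds the pitch estimates (Hz) that a
    multi-pitch tracker reported for that frame (hypothetical format).
    """
    return [i for i, pitches in enumerate(pitches_per_frame) if len(pitches) == 1]


def group_into_segments(frame_indices: Sequence[int]) -> List[Tuple[int, int]]:
    """Merge consecutive frame indices into (start, end) segments, end inclusive."""
    segments: List[Tuple[int, int]] = []
    for idx in frame_indices:
        if segments and idx == segments[-1][1] + 1:
            # Extend the current segment by one frame.
            segments[-1] = (segments[-1][0], idx)
        else:
            # Start a new segment at this frame.
            segments.append((idx, idx))
    return segments


# Illustrative tracker output: zero, one, or two pitches per frame.
tracks = [[], [120.0], [118.5], [119.0, 210.0], [205.0], []]
usable = select_usable_frames(tracks)
segments = group_into_segments(usable)
```

In this toy input, frames 1, 2, and 4 each carry a single pitch, so they form the retained segments; only those frames would then be passed on to the SID back end.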

[1] Stanley J. Wenndt, et al. Spectral autocorrelation ratio as a usability measure of speech segments under co-channel conditions, 2000.

[2] Robert E. Yantorno. Co-Channel Speech and Speaker Identification Study, 1998.

[3] Guy J. Brown, et al. A multi-pitch tracking algorithm for noisy speech. 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002.

[4] Stanley J. Wenndt, et al. Developing usable speech criteria for speaker identification technology. 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001.

[5] Steven M. Kay, et al. Cochannel speaker separation by harmonic enhancement and suppression. IEEE Trans. Speech Audio Process., 1997.

[6] Guy J. Brown, et al. Computational auditory scene analysis. Comput. Speech Lang., 1994.

[7] D. Reynolds. Automatic Speaker Recognition Using Gaussian Mixture Speaker Models, 1995.