CU2C: A Dual-condition Cantonese Speech Database for Speaker Recognition Applications

This paper describes the development of CU2C, a dual-condition Cantonese speech database for speaker recognition research. It is a task-oriented database: the speech contents comprise Hong Kong ID numbers, Cantonese digit strings and sentences, which support the development of speaker recognition systems for a range of applications. CU2C is distinctive in that it contains parallel data collected under two acoustic conditions, namely a public fixed-line telephone channel and a wideband desktop microphone. These data are useful for studying channel effects in speaker recognition. A total of 84 target speakers and 23 impostors were recorded. Each speaker has 18 recording sessions, collected over a period of 4 to 9 months. Preliminary evaluation results show that the baseline performance attained with CU2C is comparable to that attained with similar databases in other languages.
