An overview of the CAVE project research activities in speaker verification

This article presents an overview of the research activities carried out in the European CAVE project, which focused on text-dependent speaker verification over the telephone network using whole-word Hidden Markov Models. It documents in detail various aspects of the technology and the methodology used within the project. In particular, it addresses the issue of model estimation in the context of limited enrollment data and the problem of a priori decision threshold setting. Experiments are carried out on the realistic telephone speech database SESP. State-of-the-art performance levels are obtained, which validates both the technical approaches developed and assessed during the project and the working infrastructure that facilitated cooperation between the partners.
