Reliability Estimation of the Speaker Verification Decisions Using Bayesian Networks to Combine Information from Multiple Speech Quality Measures

In some situations, the quality of the signals involved in a speaker verification trial is too poor to support a reliable decision. In this work, we use Bayesian networks to model the relations between the speaker verification score, a set of speech quality measures, and the trial reliability. We use this model to detect and discard unreliable trials. We present results on the NIST SRE2010 dataset, artificially degraded with different types and levels of additive noise and reverberation. We show that a speaker verification system that is well calibrated for clean speech produces an unacceptable actual DCF on the degraded dataset, and that the proposed method can reduce the actual DCF to values lower than 1. We compare results using different quality measures and Bayesian network configurations.
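To make the "actual DCF lower than 1" criterion concrete, here is a minimal sketch of a normalized actual detection cost computed at the Bayes threshold, together with a simple rejection step. It is an illustration under stated assumptions, not the paper's implementation: the names `actual_dcf` and `discard_unreliable`, the `reliability` values, the `min_reliability` cut-off, and the default operating point are all hypothetical, and the sketch assumes that discarded trials are simply excluded from scoring. The reliability values would come from a model such as the Bayesian network described above; here they are plain inputs.

```python
import math


def actual_dcf(llrs, is_target, p_target=0.001, c_miss=1.0, c_fa=1.0):
    """Normalized actual DCF, deciding at the Bayes threshold on the LLRs.

    The default operating point is an assumption made for this sketch.
    """
    n_tar = sum(1 for t in is_target if t)
    n_non = len(is_target) - n_tar
    if n_tar == 0 or n_non == 0:
        raise ValueError("need at least one target and one non-target trial")

    # Bayes decision threshold for log-likelihood ratios.
    threshold = math.log((c_fa * (1.0 - p_target)) / (c_miss * p_target))

    misses = sum(1 for s, t in zip(llrs, is_target) if t and s < threshold)
    false_alarms = sum(1 for s, t in zip(llrs, is_target) if not t and s >= threshold)

    p_miss = misses / n_tar
    p_fa = false_alarms / n_non
    dcf = c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa

    # Normalize by the cost of the best system that ignores the speech,
    # so a value of 1 or more means the scores are not helping.
    return dcf / min(c_miss * p_target, c_fa * (1.0 - p_target))


def discard_unreliable(llrs, is_target, reliability, min_reliability=0.5):
    """Keep only trials whose (hypothetical) reliability reaches the cut-off."""
    kept = [(s, t) for s, t, r in zip(llrs, is_target, reliability)
            if r >= min_reliability]
    return [s for s, _ in kept], [t for _, t in kept]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    llrs = [5.0, -3.0, 2.0, -4.0]
    labels = [True, True, False, False]
    reliability = [0.9, 0.2, 0.1, 0.8]

    before = actual_dcf(llrs, labels, p_target=0.5)
    kept_llrs, kept_labels = discard_unreliable(llrs, labels, reliability)
    after = actual_dcf(kept_llrs, kept_labels, p_target=0.5)
    print(before, after)
```

In this toy example, the two trials with low reliability are exactly the ones that are misclassified, so removing them lowers the cost. On real data the gain depends on how well the reliability estimate tracks actual errors and on how many trials are rejected.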
