Confidence measures in multiple pronunciations modeling for speaker verification

This paper investigates the use of multiple pronunciations modeling for user-customized password speaker verification (UCP-SV). The main characteristic of UCP-SV is that the system has no a priori knowledge about the password used by the speaker. Our aim is to exploit information about how the speaker pronounces the password in the decision process. This information is extracted automatically using a speaker-independent speech recognizer. We investigate and compare several techniques, some of which combine confidence scores estimated by different models. In this context, we propose a new confidence measure, based on a log-likelihood ratio, that uses acoustic information extracted during speaker enrollment. These techniques yield a significant improvement (15.7% relative in terms of equal error rate) over a UCP-SV baseline system in which the speaker is modeled by a single model (corresponding to one utterance).
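For illustration only (the notation below is ours, not taken from the paper), a log-likelihood ratio confidence measure for an enrollment or test utterance $X = (x_1, \dots, x_T)$ typically compares a pronunciation-specific model $\lambda_{\mathrm{pron}}$ against a background (anti-)model $\bar{\lambda}$, normalized by the utterance length:

\[
\mathrm{CM}_{\mathrm{LLR}}(X) \;=\; \frac{1}{T}\Big(\log p(X \mid \lambda_{\mathrm{pron}}) \;-\; \log p(X \mid \bar{\lambda})\Big),
\]

where the utterance is accepted as the claimed speaker's password when $\mathrm{CM}_{\mathrm{LLR}}(X)$ exceeds a decision threshold. The specific models and normalization used in the proposed measure are defined in the body of the paper.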