Speaker verification using text-constrained Gaussian Mixture Models

In this paper, we present an approach to closing the gap between text-dependent and text-independent speaker verification performance. Text-constrained GMM-UBM systems are created using word segmentations produced by a large-vocabulary continuous speech recognition (LVCSR) system on conversational speech, which lets the verification system focus on speaker differences over a constrained set of acoustic units. Results on the 2001 NIST extended data task show that this approach can produce an equal error rate below 1%.
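
The text-constrained idea can be illustrated with a minimal sketch: keep only the feature frames that fall inside segments of a chosen word, then score them with a log-likelihood ratio between a speaker GMM and a UBM. Everything specific here is an assumption made for illustration and not taken from the paper: the function names, the `(label, start_sec, end_sec)` segment format, the 100 frames-per-second rate, diagonal covariances, and the average per-frame log-likelihood ratio as the score. How the speaker model is trained is left out.

```python
import numpy as np

def select_word_frames(features, segments, word, frame_rate=100.0):
    """Return the frames of `features` that lie inside segments labelled `word`.

    features: (T, D) array of acoustic feature vectors.
    segments: list of (label, start_sec, end_sec) tuples (hypothetical format).
    frame_rate: frames per second (assumed value, not from the paper).
    """
    mask = np.zeros(len(features), dtype=bool)
    for label, start, end in segments:
        if label == word:
            lo = int(round(start * frame_rate))
            hi = int(round(end * frame_rate))
            mask[lo:hi] = True
    return features[mask]

def gmm_loglik(x, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    x: (T, D); weights: (M,); means, variances: (M, D).
    """
    diff = x[:, None, :] - means[None, :, :]                  # (T, M, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_weighted = log_comp + np.log(weights)                 # (T, M)
    # Log-sum-exp over mixture components, stabilised by the row maximum.
    peak = log_weighted.max(axis=1, keepdims=True)
    total = peak + np.log(np.exp(log_weighted - peak).sum(axis=1, keepdims=True))
    return total[:, 0]

def llr_score(x, speaker, ubm):
    """Average per-frame log-likelihood ratio of speaker model versus UBM.

    `speaker` and `ubm` are (weights, means, variances) tuples.
    """
    return float(np.mean(gmm_loglik(x, *speaker) - gmm_loglik(x, *ubm)))
```

In use, one such speaker/UBM pair would be built per word in the constrained set, and a test utterance would be scored only on the frames returned by `select_word_frames` for that word. A higher score means the frames fit the speaker model better than the background model.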