RBM-PLDA subsystem for the NIST i-vector challenge

This paper presents the Speech Technology Center (STC) system submitted to the NIST i-vector challenge. The system combines subsystems based on TV-PLDA, TV-SVM, and RBM-PLDA. In this paper we focus on the third subsystem, RBM-PLDA, and present the RBM-based extractor of pseudo i-vectors that it uses. Experiments on the NIST-2014 test dataset show that, although the RBM-PLDA subsystem is inferior to the other two subsystems in terms of absolute minDCF, it contributes substantially to the final fusion: the resulting STC system reaches a minDCF of 0.241.
