Paper information - RBM-PLDA subsystem for the NIST i-vector challenge

This paper presents the Speech Technology Center (STC) system submitted to the NIST i-vector challenge. The system comprises several subsystems based on TV-PLDA, TV-SVM, and RBM-PLDA. In this paper we focus on the third of these, the RBM-PLDA subsystem, and present our RBM extractor of the pseudo i-vector. Experiments on the NIST-2014 test dataset show that, although the RBM-PLDA subsystem is inferior to the other two subsystems in terms of absolute minDCF, it contributes substantially to the final fusion: the resulting STC system reaches a minDCF of 0.241.

Sergey Novoselov | Timur Pekhovsky | Andrey Shulipa | Konstantin Simonchik
