论文信息 - Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification

Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification

MFCC is one of the most popular features used in speaker verification, it involves not only speaker information, but also information of contents and channels. A session-aware Fbank weighting approach has been proposed, where the Fbanks that are more sensitive to session variance are de-weighted so that speaker discriminative banks are given prominence. Most of the current researches on Fbank weighting are within the GMM-UBM framework. In this paper, we study the contribution of Fbank weighting in the state-of-the-art i-vector architecture. We found that, due to the unsupervised learned loading matrix in the i-vector model, Fbank weighting shows no advantages in i-vector systems, if the simple cosine-distance scoring is used. However, when discriminative models such as LDA/PLDA are applied, the advantage of Fbank weighting can be recovered, which leads to significant performance improvement. Meanwhile we verified that weighting parameters are well generalizable: the parameters trained with a small bilingual database can be applied successfully in another i-vector system trained with a large multi-channel database.

Thomas Fang Zheng | Jun Wang | Dong Wang | Lantian Li