Advanced Subspace Techniques for Modeling Channel and Session Variability in a Speaker Recognition System

Abstract: The robustness of a speaker recognition system depends on how well it manages variability in the recording environment. A better ability to quantify that variability may lead to improved methods for reducing non-speaker influences on performance. In this study, subspace decomposition combined with three pattern classification techniques was investigated to assess its suitability for speaker recognition on the MultiRoom8 corpus, a data set with several room and microphone conditions. A partial least squares decomposition of the Gaussian mixture model (GMM) supervector combined with a nearest neighbor classifier was consistently a top performer across the 100 experimental setups considered in this study. This result suggests that projecting to a lower-dimensional feature space may mitigate the effects of room and microphone variability in a speaker recognition system.
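
To make the pipeline named in the abstract concrete, the following is a minimal sketch of a partial least squares projection followed by a nearest neighbor classifier. It is an illustration under stated assumptions, not the study's implementation: the supervectors are synthetic stand-ins (a speaker mean plus a low-rank "channel" offset plus noise) rather than MultiRoom8 data, the PLS variant is the simple SVD-of-cross-covariance form (often called PLS-SVD) rather than whichever algorithm the study used, and all names (`fit_pls_svd`, `project`, `nearest_neighbor`, `make_sessions`) and sizes are hypothetical.

```python
import numpy as np


def fit_pls_svd(X, labels, n_components):
    """Learn a projection from the SVD of the cross-covariance X^T Y (PLS-SVD)."""
    classes = np.unique(labels)
    # One-hot speaker indicator matrix, one column per speaker.
    Y = (labels[:, None] == classes[None, :]).astype(float)
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - Y.mean(axis=0)
    # Left singular vectors are the directions in supervector space that
    # covary most strongly with speaker identity.
    U, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return x_mean, U[:, :n_components]


def project(X, x_mean, W):
    """Map supervectors into the low-dimensional PLS subspace."""
    return (X - x_mean) @ W


def nearest_neighbor(train_Z, train_labels, test_Z):
    """Label each test point with the label of its closest training point."""
    dists = np.linalg.norm(test_Z[:, None, :] - train_Z[None, :, :], axis=2)
    return train_labels[dists.argmin(axis=1)]


# Synthetic stand-in for GMM supervectors (assumed sizes, not from the study).
rng = np.random.default_rng(0)
n_speakers, dim = 4, 200
speaker_means = rng.normal(size=(n_speakers, dim))
channel_dirs = rng.normal(size=(2, dim))  # two hypothetical nuisance directions


def make_sessions(n_per_speaker):
    labels = np.repeat(np.arange(n_speakers), n_per_speaker)
    channel = rng.normal(scale=0.5, size=(labels.size, 2)) @ channel_dirs
    noise = rng.normal(scale=0.5, size=(labels.size, dim))
    return speaker_means[labels] + channel + noise, labels


X_train, y_train = make_sessions(12)
X_test, y_test = make_sessions(5)

# With 4 speakers the centered indicator matrix has rank 3, so 3 components.
x_mean, W = fit_pls_svd(X_train, y_train, n_components=3)
Z_train = project(X_train, x_mean, W)
Z_test = project(X_test, x_mean, W)

predicted = nearest_neighbor(Z_train, y_train, Z_test)
accuracy = float((predicted == y_test).mean())
print(f"projected dimension: {Z_train.shape[1]}, accuracy: {accuracy:.2f}")
```

The projection is fit on training sessions only and then reused for test sessions, so the classifier compares utterances in a subspace chosen for its covariance with speaker identity rather than in the full supervector space.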