Direct Modeling of Correlation Matrices for GMM in Speaker Identification

In this paper, we present a new framework for modeling the full covariance matrices of Gaussian components in a Gaussian mixture model (GMM). Rather than modeling the full covariance matrix, the framework models the full correlation matrix directly, since the correlation matrix directly describes the correlation between elements of the feature vector. To model full correlation matrices, we share linear transformations among the components' full correlation matrices. The full correlation matrix of each component is thus represented by a shared linear transformation and a component-specific diagonal correlation matrix. The transformation helps the diagonal correlation matrix model the correlation between feature-vector elements more precisely. We evaluate the framework on a Mandarin speaker identification task. Experiments show a reduction of more than 35% in speaker identification error rate compared with the best diagonal covariance models. Furthermore, our algorithm outperforms semi-tied covariance (STC) modeling.
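
As a rough illustration of the structure described above, the following NumPy sketch splits a covariance matrix into per-dimension standard deviations and a correlation matrix, then rebuilds a component covariance from a shared transformation and a component-specific diagonal. The abstract does not give the exact parameterization, so the form R = A diag(d) A^T, the rescaling to a unit diagonal, and the names cov_to_corr, build_covariance, shared_transform, and component_diag are all assumptions made for illustration, not the method as published.

```python
import numpy as np


def cov_to_corr(cov):
    """Split a covariance matrix into standard deviations and a correlation matrix."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    return std, corr


def build_covariance(std, shared_transform, component_diag):
    """Rebuild a full covariance from a shared transform and a per-component diagonal.

    ASSUMPTION: the correlation matrix is taken as A diag(d) A^T, rescaled so
    that its diagonal is exactly one. The source does not specify this form.
    """
    raw = shared_transform @ np.diag(component_diag) @ shared_transform.T
    scale = np.sqrt(np.diag(raw))
    corr = raw / np.outer(scale, scale)   # unit diagonal, symmetric
    cov = corr * np.outer(std, std)       # same as diag(std) @ corr @ diag(std)
    return corr, cov
```

Under this assumed form, every component sharing shared_transform stores only its standard deviations and one diagonal vector, so the per-component parameter count stays close to that of a diagonal covariance model while the resulting covariance is full.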