Paper Information - Analysis of Speaker Variability

Analysis of Speaker Variability

Analysis and modeling of speaker variability, such as gender, accent, age, speech rate, and phone realizations, are important issues in speech recognition. It is known that existing feature representations describing speaker variations can be of very high dimension. In this paper, we introduce two powerful multivariate statistical analysis methods, namely principal component analysis (PCA) and independent component analysis (ICA), as tools for analyzing such variability and extracting low-dimensional feature representations. Our findings are the following: (1) The first two principal components correspond to gender and accent, respectively. To the best of our knowledge, the correspondence between the second component and accent has not been reported before. (2) ICA-based features yield better classification performance than PCA-based ones. Using a 2-dimensional ICA representation, we achieved error rates of about 6.1% and 13.3% in gender and accent classification, respectively, for 980 speakers.
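To make the idea of a low-dimensional speaker representation concrete, here is a minimal sketch of PCA on toy data. It is not the paper's pipeline: the synthetic "speaker vectors", the hidden two-group label, and every function name (`make_toy_speakers`, `pca`, and so on) are assumptions made for illustration. ICA is not shown. The sketch only illustrates how a dominant source of variability can surface as the first principal component.

```python
import random

def make_toy_speakers(n_per_group=50, dim=6, seed=0):
    """Synthetic stand-in for per-speaker feature vectors (assumption, not real data).
    A hidden binary label shifts the first two dimensions of each vector."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label in (0, 1):
        offset = 3.0 if label else -3.0
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            row[0] += offset
            row[1] += offset
            rows.append(row)
            labels.append(label)
    return rows, labels

def mean_center(rows):
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iters=500):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    d = len(cov)
    v = [1.0 / (i + 1) for i in range(d)]  # deterministic starting vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient gives the matching eigenvalue (v has unit length).
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return eigval, v

def pca(rows, k=2):
    """Return the top-k principal directions and the k-dimensional projections."""
    centered = mean_center(rows)
    cov = covariance(centered)
    components = []
    for _ in range(k):
        eigval, v = top_component(cov)
        components.append(v)
        # Deflation: remove the found direction so the next pass finds the next one.
        d = len(cov)
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)] for i in range(d)]
    projected = [[sum(r[j] * c[j] for j in range(len(c))) for c in components]
                 for r in centered]
    return components, projected

def sign_accuracy(projected, labels):
    """Classify by the sign of the first coordinate; eigenvector sign is arbitrary."""
    hits = sum((p[0] > 0) == bool(y) for p, y in zip(projected, labels))
    acc = hits / len(labels)
    return max(acc, 1.0 - acc)

if __name__ == "__main__":
    rows, labels = make_toy_speakers()
    components, projected = pca(rows, k=2)
    print("first component:", [round(x, 3) for x in components[0]])
    print("accuracy from sign of PC1:", sign_accuracy(projected, labels))
```

In this toy setup the hidden label is, by construction, the largest source of variance, so thresholding the first projected coordinate separates the two groups. Real speaker data gives no such guarantee, which is part of why the abstract's finding about the first two components is an empirical result rather than a given.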
