Paper information - Line spectral frequencies modeling by a mixture of von Mises-Fisher distributions

Line spectral frequencies modeling by a mixture of von Mises-Fisher distributions

Efficient quantization of the linear predictive coding (LPC) parameters plays a key role in parametric speech coding. The line spectral frequency (LSF) representation of the LPC parameters is widely used in speech model quantization. In practical implementations of vector quantization (VQ), probability density function (PDF)-optimized VQ has been shown to be more efficient than VQ based on training data. In this paper, we represent the LSF parameters in a unit vector form, which has directional characteristics. The underlying distribution of this unit vector variable is modeled by a von Mises-Fisher mixture model (VMM). An optimal inter-component bit allocation strategy is proposed based on high rate theory, and a distortion-rate (D-R) relation is derived for the VMM-based VQ (VVQ). Experimental results show that the VVQ outperforms the recently introduced Dirichlet mixture model-based VQ and the conventional Gaussian mixture model-based VQ in terms of modeling performance and D-R relation.

Highlights

A new representation of LSF parameters, the square-root ΔLSF (SRΔLSF), is presented.

Based on the directional property, we model the SRΔLSF by a von Mises-Fisher mixture model (VMM).

A VMM-based vector quantization (VVQ) scheme is proposed.

The proposed VVQ outperforms Gaussian mixture model- and Dirichlet mixture model-based VQ.
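The unit vector form mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common ΔLSF convention that this page does not spell out: the K ordered LSFs lie in (0, π), the ΔLSF vector holds the K + 1 gaps between adjacent values (including the gaps to 0 and π) divided by π, and the SRΔLSF is its element-wise square root. The function names `lsf_to_sr_delta_lsf` and `sr_delta_lsf_to_lsf` and the sample values are hypothetical, chosen only for illustration.

```python
import numpy as np

def lsf_to_sr_delta_lsf(lsf):
    """Map K ordered LSFs in (0, pi) to a vector of length K + 1 (assumed convention)."""
    lsf = np.asarray(lsf, dtype=float)
    # Pad with the interval end points 0 and pi, then take adjacent differences.
    padded = np.concatenate(([0.0], lsf, [np.pi]))
    delta = np.diff(padded) / np.pi  # nonnegative entries that sum to 1
    # The square root turns the sum-to-one constraint into a unit L2 norm.
    return np.sqrt(delta)

def sr_delta_lsf_to_lsf(u):
    """Invert the mapping: square, drop the last gap, accumulate, rescale by pi."""
    u = np.asarray(u, dtype=float)
    return np.pi * np.cumsum(u[:-1] ** 2)

# Illustrative values only, not taken from the paper.
lsf = np.array([0.3, 0.7, 1.2, 1.9, 2.6])
u = lsf_to_sr_delta_lsf(lsf)
print(np.linalg.norm(u))       # numerically equal to 1
print(sr_delta_lsf_to_lsf(u))  # recovers the original LSFs
```

Under this assumed convention every LSF vector maps to a point on the unit hypersphere, which is the kind of directional data a von Mises-Fisher mixture is defined on.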

Jun Guo | Jalil Taghia | Arne Leijon | Zhanyu Ma | W. Bastiaan Kleijn
