On the Convergence of Extended Variational Inference for Non-Gaussian Statistical Models

Variational inference (VI) is a widely used framework in Bayesian estimation. For most non-Gaussian statistical models, it is infeasible to find an analytically tractable solution for the posterior distributions of the parameters. Recently, an improved framework, namely extended variational inference (EVI), has been introduced and applied to derive analytically tractable solutions by employing a lower-bound approximation to the variational objective function. Two conditions required for EVI implementation, namely the weak condition and the strong condition, are discussed and compared in this paper. In practical implementations, the convergence of EVI depends on the choice of the lower-bound approximation, regardless of whether the weak condition or the strong condition is used. In general, two approximation strategies can be applied to carry out the lower-bound approximation: the single lower-bound (SLB) approximation and the multiple lower-bounds (MLB) approximation. To clarify the differences between the SLB and the MLB, we also discuss the convergence properties of these two approximations. Extensive comparisons are made based on existing EVI-based non-Gaussian statistical models. Theoretical analyses are conducted to demonstrate the differences between the weak and the strong conditions. Qualitative and quantitative experimental results are presented to show the advantages of the SLB approximation.
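
As a rough illustration of the lower-bound idea described above, the standard VI objective and its EVI relaxation can be sketched as follows. The notation ($X$ for the observations, $Z$ for the unknown variables, $q$ for the variational posterior, $\tilde{p}$ for the tractable surrogate) is assumed here for illustration and is not taken from the paper.

```latex
% Standard VI: the evidence lower bound L(q) is maximized over q(Z).
\ln p(X) \;\ge\; \mathcal{L}(q)
  = \mathbb{E}_{q(Z)}\big[\ln p(X, Z)\big] - \mathbb{E}_{q(Z)}\big[\ln q(Z)\big]

% EVI (sketch): when E_q[ln p(X, Z)] has no closed form, it is replaced
% by a tractable lower bound built from an assumed surrogate \tilde{p}.
\mathbb{E}_{q(Z)}\big[\ln p(X, Z)\big] \;\ge\; \mathbb{E}_{q(Z)}\big[\ln \tilde{p}(X, Z)\big]

% The relaxed objective therefore sits below the original one.
\tilde{\mathcal{L}}(q)
  = \mathbb{E}_{q(Z)}\big[\ln \tilde{p}(X, Z)\big] - \mathbb{E}_{q(Z)}\big[\ln q(Z)\big],
\qquad
\ln p(X) \;\ge\; \mathcal{L}(q) \;\ge\; \tilde{\mathcal{L}}(q)
```

Read this way, an SLB scheme would presumably optimize one fixed $\tilde{\mathcal{L}}(q)$ across all updates, while an MLB scheme would rely on different bounds for different updates. This interpretation is inferred from the names used in the abstract and should be checked against the paper's own definitions.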
