Paper information - Parallel Stochastic Gradient Markov Chain Monte Carlo for Matrix Factorisation Models

For large matrix factorisation problems, we develop a distributed Markov Chain Monte Carlo (MCMC) method based on stochastic gradient Langevin dynamics (SGLD) that we call Parallel SGLD (PSGLD). PSGLD scales favourably with increasing data size, and its computational requirements are comparable to those of optimisation methods based on stochastic gradient descent. PSGLD achieves high performance by exploiting the conditional independence structure of matrix factorisation (MF) models to sub-sample data systematically, so as to allow parallelisation and distributed computation. We provide a convergence proof of the algorithm and verify its superior performance on various architectures such as graphics processing units (GPUs), shared-memory multi-core systems and multi-computer clusters.
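To make the sub-sampling idea concrete, here is a minimal sketch of a generic SGLD update for a matrix factorisation V ≈ WH that uses only one "part" of the data: a shifted block diagonal whose blocks share no rows or columns. It is not the paper's algorithm. The unit-variance Gaussian likelihood, the zero-mean Gaussian prior with precision `prior_prec`, the cyclic `shift` scheme, and the function name `sgld_mf_part_update` are all assumptions made for illustration.

```python
import numpy as np

def sgld_mf_part_update(V, W, H, n_blocks, shift, step, prior_prec, rng):
    """One SGLD step on W and H using a shifted block diagonal of V.

    Assumed model (illustrative only): V[i, j] ~ Normal((W @ H)[i, j], 1),
    with independent Normal(0, 1 / prior_prec) priors on entries of W and H.
    """
    n_rows, n_cols = V.shape
    row_groups = np.array_split(np.arange(n_rows), n_blocks)
    col_groups = np.array_split(np.arange(n_cols), n_blocks)

    # Number of observed entries in this part, used to rescale the
    # sub-sampled likelihood gradient to the size of the full data set.
    part_size = sum(
        len(row_groups[b]) * len(col_groups[(b + shift) % n_blocks])
        for b in range(n_blocks)
    )
    scale = V.size / part_size

    W_new, H_new = W.copy(), H.copy()
    # Each block touches a disjoint set of rows of W and columns of H, so
    # these iterations are independent and could run on separate workers.
    for b in range(n_blocks):
        rows = row_groups[b]
        cols = col_groups[(b + shift) % n_blocks]
        Wb, Hb = W[rows], H[:, cols]
        resid = V[np.ix_(rows, cols)] - Wb @ Hb

        # Gradient of the log posterior: rescaled likelihood term plus prior.
        grad_W = scale * resid @ Hb.T - prior_prec * Wb
        grad_H = scale * Wb.T @ resid - prior_prec * Hb

        # Langevin step: half-step along the gradient plus Gaussian noise
        # with variance equal to the step size.
        W_new[rows] = Wb + 0.5 * step * grad_W \
            + np.sqrt(step) * rng.standard_normal(Wb.shape)
        H_new[:, cols] = Hb + 0.5 * step * grad_H \
            + np.sqrt(step) * rng.standard_normal(Hb.shape)
    return W_new, H_new
```

Cycling `shift` over 0, ..., `n_blocks` - 1 visits every block of V once. The loop over `b` is written sequentially here only for clarity.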
