Stochastic Learning for Sparse Discrete Markov Random Fields with Controlled Gradient Approximation Error

We study the L1-regularized maximum likelihood estimation (MLE) problem for discrete Markov random fields (MRFs), where efficient and scalable learning requires both sparse regularization and approximate inference. To address these challenges, we consider a stochastic learning framework called stochastic proximal gradient (SPG; Honorio 2012a, Atchade et al. 2014, Miasojedow and Rejchel 2016). SPG is an inexact proximal gradient algorithm [Schmidt et al., 2011] whose inexactness stems from using a stochastic oracle (Gibbs sampling) to approximate the gradient; exact gradient evaluation is infeasible in general because inference for discrete MRFs is NP-hard [Koller and Friedman, 2009]. Theoretically, we provide novel verifiable bounds to inspect and control the quality of the gradient approximation. Empirically, we propose the tighten asymptotically (TAY) learning strategy, which builds on these verifiable bounds to boost the performance of SPG.
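To make the SPG idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a pairwise Ising model with states in {-1, +1}, a symmetric parameter matrix with zero diagonal, a fixed step size, and a fixed number of Gibbs sweeps per iteration; the names `soft_threshold`, `gibbs_sample`, and `spg_ising` are hypothetical. It does not implement the verifiable bounds or the TAY strategy described above.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of t * ||x||_1 (elementwise shrinkage toward zero).
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def gibbs_sample(theta, n_samples, n_sweeps, rng):
    # Approximate samples from p(x) proportional to exp(sum_{i<j} theta_ij x_i x_j),
    # x in {-1, +1}^p. Assumes theta is symmetric with zero diagonal.
    p = theta.shape[0]
    x = rng.choice([-1.0, 1.0], size=(n_samples, p))
    for _ in range(n_sweeps):
        for i in range(p):
            field = x @ theta[:, i]                    # sum_j theta_ij x_j
            prob_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
            x[:, i] = np.where(rng.random(n_samples) < prob_plus, 1.0, -1.0)
    return x

def spg_ising(data, lam, step=0.1, n_iters=50, n_samples=200, n_sweeps=5, seed=0):
    # Illustrative SPG loop: inexact gradient from Gibbs samples, then an L1 prox step.
    rng = np.random.default_rng(seed)
    n, p = data.shape
    data_moments = data.T @ data / n                   # empirical E[x_i x_j]
    theta = np.zeros((p, p))
    for _ in range(n_iters):
        samples = gibbs_sample(theta, n_samples, n_sweeps, rng)
        model_moments = samples.T @ samples / n_samples
        # Approximate gradient of the negative log-likelihood.
        grad = model_moments - data_moments
        theta = soft_threshold(theta - step * grad, step * lam)
        np.fill_diagonal(theta, 0.0)                   # no self-interactions
    return theta
```

In this sketch the gradient error is governed by `n_samples` and `n_sweeps`; a strategy that controls the approximation quality would adjust such quantities over the iterations rather than keep them fixed.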

[1] Mathieu Bastian, et al. Gephi: An Open Source Software for Exploring and Manipulating Networks, 2009, ICWSM.

[2] É. Moulines, et al. On stochastic proximal gradient algorithms, 2014.

[3] Massimiliano Pontil, et al. Empirical Bernstein Bounds and Sample-Variance Penalization, 2009, COLT.

[4] Wojciech Rejchel, et al. Sparse Estimation in Ising Model via Penalized Monte Carlo Methods, 2016, J. Mach. Learn. Res.

[5] David Page, et al. Temporal Poisson Square Root Graphical Models, 2018, ICML.

[6] Stephen P. Boyd, et al. Proximal Algorithms, 2013, Found. Trends Optim.

[7] David Page, et al. Genetic Variants Improve Breast Cancer Risk Prediction on Mammograms, 2013, AMIA.

[8] J. Lafferty, et al. High-dimensional Ising model selection using ℓ1-regularized logistic regression, 2010, 1010.0311.

[9] Yoshua Bengio, et al. Justifying and Generalizing Contrastive Divergence, 2009, Neural Computation.

[10] Ioannis Mitliagkas, et al. Improving Gibbs Sampler Scan Quality with DoGS, 2017, ICML.

[11] Michael I. Jordan, et al. Graphical Models, Exponential Families, and Variational Inference, 2008, Found. Trends Mach. Learn.

[12] Christian Igel, et al. Bounding the Bias of Contrastive Divergence Learning, 2011, Neural Computation.

[13] Justin Domke, et al. Projecting Markov Random Field Parameters for Fast Mixing, 2014, NIPS.

[14] Asja Fischer, et al. Training Restricted Boltzmann Machines, 2015, KI - Künstliche Intelligenz.

[15] Mark W. Schmidt, et al. Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization, 2011, NIPS.

[16] David Page, et al. Learning Heterogeneous Hidden Markov Random Fields, 2014, AISTATS.

[17] Sinong Geng, et al. An Efficient Pseudo-likelihood Method for Sparse Binary Pairwise Markov Network Estimation, 2017, 1702.08320.

[18] Jean Honorio. Lipschitz Parametrization of Probabilistic Graphical Models, 2011, UAI.

[19] Chunming Zhang, et al. Multiple testing under dependence via graphical models, 2016.

[20] Jean Honorio, et al. Convergence Rates of Biased Stochastic Optimization for Learning Sparse Ising Models, 2012, ICML.

[21] B. Schölkopf, et al. High-Dimensional Graphical Model Selection Using ℓ1-Regularized Logistic Regression, 2007.

[22] David Page, et al. Bayesian Estimation of Latently-grouped Parameters in Undirected Graphical Models, 2013, NIPS.

[23] Daphne Koller, et al. Efficient Structure Learning of Markov Networks using L1-Regularization, 2006, NIPS.

[24] Robert Tibshirani, et al. Estimation of Sparse Binary Pairwise Markov Networks using Pseudo-likelihoods, 2009, J. Mach. Learn. Res.

[25] Grégoire Rey, et al. Empirical comparison study of approximate methods for structure selection in binary graphical models, 2014, Biometrical journal. Biometrische Zeitschrift.

[26] Elizabeth L. Wilmer, et al. Markov Chains and Mixing Times, 2008.

[27] Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.

[28] E. Burnside, et al. New Genetic Variants Improve Personalized Breast Cancer Diagnosis, 2014, AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science.