Tensor Variable Elimination for Plated Factor Graphs

A wide class of machine learning algorithms can be reduced to variable elimination on factor graphs. While factor graphs provide a unifying notation for these algorithms, they lack the compact notation for repeated structure that plate diagrams give directed graphical models. To exploit efficient tensor algebra in graphs with plates of variables, we generalize undirected factor graphs to plated factor graphs and variable elimination to a tensor variable elimination algorithm that operates directly on plated factor graphs. Moreover, we generalize complexity bounds based on treewidth and characterize the class of plated factor graphs for which inference is tractable. As an application, we integrate tensor variable elimination into the Pyro probabilistic programming language to enable exact inference in discrete latent variable models with repeated structure. We validate our methods with experiments on both directed and undirected graphical models, including applications to polyphonic music modeling, animal movement modeling, and latent sentiment analysis.
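To make the idea of eliminating variables with tensor operations concrete, the following is a minimal NumPy sketch on a toy plated factor graph. It is an illustration under stated assumptions, not the paper's algorithm or Pyro's API: the model (one global variable x, one plate of size N holding a local variable z, a factor f(x) and a plated factor g(x, z)) and the function names `plated_sum_product` and `brute_force` are hypothetical. The plate is stored as a leading tensor dimension, so the local variable is summed out for all plate indices at once and the plate itself is then eliminated by a product reduction.

```python
import itertools
import numpy as np


def plated_sum_product(f, g):
    """Compute sum_x f[x] * prod_i sum_z g[i, x, z].

    f has shape (X,); g has shape (N, X, Z), where the leading
    dimension N is the plate (hypothetical toy model).
    """
    # Sum out the local variable z inside the plate, vectorized over
    # every plate index i and every value of x.
    local = np.einsum("nxz->nx", g)
    # Eliminate the plate with a product reduction along its dimension.
    plate = local.prod(axis=0)
    # Sum out the remaining global variable x.
    return np.einsum("x,x->", f, plate)


def brute_force(f, g):
    """Reference result by enumerating every joint assignment."""
    n_plate, n_x, n_z = g.shape
    total = 0.0
    for x in range(n_x):
        for zs in itertools.product(range(n_z), repeat=n_plate):
            term = f[x]
            for i, z in enumerate(zs):
                term *= g[i, x, z]
            total += term
    return total


rng = np.random.default_rng(0)
f = rng.random(2)          # factor over x, with x in {0, 1}
g = rng.random((4, 2, 3))  # plated factor: N = 4, x in {0, 1}, z in {0, 1, 2}

z_fast = plated_sum_product(f, g)
z_slow = brute_force(f, g)
```

In this toy case the tensor version touches N·X·Z entries, while the enumeration visits X·Z^N joint assignments; the two agree up to floating-point error. A practical implementation would typically work in log space for numerical stability, which this sketch omits.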
