Constructing Composite Likelihoods in General Random Fields

We propose a simple estimator based on composite likelihoods for parameter learning in random field models. The estimator can be applied to all discrete graphical models, such as Markov random fields and conditional random fields, including those with higher-order energies. It is computationally efficient because it requires inference only over tree-structured subgraphs of the original graph, and it is consistent: asymptotically, it yields the optimal parameter estimate in the model class. We verify these conceptual advantages in synthetic experiments and demonstrate the difficulties encountered by popular alternative estimation approaches.
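To make the idea of a composite likelihood concrete, the following is a minimal sketch, not the construction used in the paper. It assumes a hypothetical binary pairwise model on a 4-cycle with a single shared weight `w`, and it evaluates each conditional term log p(y_A | y_rest; w) by brute-force enumeration over the component A. In practice, tree-structured components would be handled with sum-product inference instead. The names `EDGES`, `score`, `cond_log_prob`, and `composite_log_lik` are illustrative only.

```python
import itertools
import math

# Hypothetical graph: a 4-cycle, so the full graph is not a tree.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]


def score(y, w):
    """Negative energy: w times the number of edges whose endpoints agree."""
    return w * sum(1 for i, j in EDGES if y[i] == y[j])


def cond_log_prob(y, component, w):
    """log p(y_A | y_rest; w), enumerating all binary states of component A.

    Enumeration stands in for sum-product on a tree-structured component.
    """
    log_terms = []
    for states in itertools.product([0, 1], repeat=len(component)):
        z = list(y)
        for node, s in zip(component, states):
            z[node] = s
        log_terms.append(score(z, w))
    m = max(log_terms)  # log-sum-exp for numerical stability
    log_norm = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return score(y, w) - log_norm


def composite_log_lik(data, components, w):
    """Average over samples of the sum of conditional log-likelihood terms."""
    total = sum(cond_log_prob(y, A, w) for y in data for A in components)
    return total / len(data)


if __name__ == "__main__":
    data = [(0, 0, 1, 1), (1, 1, 1, 0)]
    # Two edge-shaped (tree-structured) components covering all four nodes.
    print(composite_log_lik(data, [(0, 1), (2, 3)], w=0.7))
    # Singleton components recover the pseudolikelihood as a special case.
    print(composite_log_lik(data, [(0,), (1,), (2,), (3,)], w=0.7))
```

Under these assumptions, choosing one component that contains every variable reproduces the full log-likelihood, while singleton components give the pseudolikelihood. Components in between trade off inference cost against statistical efficiency.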
