Learning Graphical Model Parameters with Approximate Marginal Inference

Likelihood-based learning of graphical models faces challenges of computational complexity and robustness to model misspecification. This paper studies methods that fit parameters directly to maximize a measure of the accuracy of predicted marginals, taking into account both model and inference approximations at training time. Experiments on imaging problems suggest marginalization-based learning performs better than likelihood-based approximations on difficult problems where the model being fit is approximate in nature.
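The idea of fitting parameters to the accuracy of predicted marginals, with the inference approximation kept inside the training loop, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy binary chain model, the hypothetical parameters `w` and `c`, the use of a fixed number of mean-field sweeps as the approximate inference routine, the negative log of the predicted marginal at the true label as the accuracy measure, and finite-difference gradients in place of an analytic derivative.

```python
import math

# Toy conditional model on a binary chain, x_i in {-1, +1} (hypothetical):
#   p(x | f) is proportional to exp(sum_i w*f_i*x_i + c * sum_i x_i*x_{i+1})
# params = [w, c]; the model, names, and data are illustrative assumptions.

def approx_marginals(params, features, n_sweeps=3):
    """Approximate P(x_i = +1) using a fixed number of mean-field sweeps."""
    w, c = params
    n = len(features)
    mu = [0.5] * n  # start from uniform marginals
    for _ in range(n_sweeps):
        for i in range(n):
            field = w * features[i]
            if i > 0:
                field += c * (2.0 * mu[i - 1] - 1.0)  # mean of left neighbor
            if i < n - 1:
                field += c * (2.0 * mu[i + 1] - 1.0)  # mean of right neighbor
            mu[i] = 1.0 / (1.0 + math.exp(-2.0 * field))
    return mu

def marginal_loss(params, data):
    """Mean negative log of the approximate marginal at the true label."""
    total, count = 0.0, 0
    for features, labels in data:
        mu = approx_marginals(params, features)
        for p_plus, y in zip(mu, labels):
            p_true = p_plus if y == 1 else 1.0 - p_plus
            total -= math.log(max(p_true, 1e-12))  # guard against log(0)
            count += 1
    return total / count

def numeric_grad(f, params, eps=1e-5):
    """Central finite differences; differentiates through the truncated inference."""
    grad = []
    for k in range(len(params)):
        up = list(params)
        dn = list(params)
        up[k] += eps
        dn[k] -= eps
        grad.append((f(up) - f(dn)) / (2.0 * eps))
    return grad

def fit(data, steps=300, lr=0.1):
    """Plain gradient descent on the marginal-based loss."""
    params = [0.0, 0.0]
    for _ in range(steps):
        grad = numeric_grad(lambda p: marginal_loss(p, data), params)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

# Hypothetical training set: (noisy per-node features, true labels).
data = [
    ([0.9, 0.4, -0.2, -1.0], [1, 1, -1, -1]),
    ([-0.8, 0.3, -0.5, 0.7], [-1, -1, -1, 1]),
]

initial_loss = marginal_loss([0.0, 0.0], data)
learned = fit(data)
final_loss = marginal_loss(learned, data)
print("loss before:", initial_loss, "loss after:", final_loss)
```

Because the loss is evaluated on the marginals that the truncated inference routine actually produces, the learned parameters compensate for that routine's errors rather than assuming exact inference. A likelihood-based fit of the same toy model would instead require the partition function or an approximation to it.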
