Paper information - Structured Learning and Prediction in Computer Vision

Powerful statistical models that can be learned efficiently from large amounts of data are currently revolutionizing computer vision. These models possess a rich internal structure reflecting task-specific relations and constraints. This monograph introduces the reader to the most popular classes of structured models in computer vision. Our focus is on discrete undirected graphical models, which we cover in detail together with algorithms for both probabilistic inference and maximum a posteriori (MAP) inference. We separately discuss recently successful techniques for prediction in general structured models. In the second part of the monograph, we describe methods for parameter learning, distinguishing the classic maximum-likelihood-based methods from the more recent prediction-based parameter learning methods. We highlight developments that enhance current models and discuss kernelized models and latent variable models. To make the monograph more practical and to provide links to further study, we give examples of successful applications of many of these methods in the computer vision literature.

Sebastian Nowozin | Christoph H. Lampert
