Decision tree fields

This paper introduces a new formulation for discrete image labeling tasks, the Decision Tree Field (DTF), which combines and generalizes random forests and conditional random fields (CRFs), both of which have been widely used in computer vision. In a typical CRF model, the unary potentials are derived from sophisticated random forest or boosting-based classifiers; the pairwise potentials, however, are assumed (1) to have a simple parametric form with a pre-specified and fixed dependence on the image data, and (2) to be defined on a small, fixed neighborhood. In a DTF, by contrast, local interactions between multiple variables are determined by decision trees evaluated on the image data, so the interactions adapt to the image content. The result is a powerful graphical model that can represent complex label structure. Our key technical contribution is to show that the DTF model can be trained efficiently and jointly using a convex approximate likelihood function, enabling us to learn over a million free model parameters. We show experimentally that for applications with rich and complex label structure, our model achieves excellent results.
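
To make the idea of image-dependent interactions concrete, the following is a minimal sketch rather than the model as specified in the paper. It assumes a single pairwise factor between two neighboring pixels with binary labels, one scalar feature (the absolute intensity difference), and an energy obtained by summing a small table stored at every node on the root-to-leaf path. The names `Node` and `pairwise_energy`, the table layout, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    # Energy table attached to this node, indexed as table[label_a][label_b].
    # (Hypothetical layout; in a trained model these entries would be learned.)
    table: List[List[float]]
    # Split test: features[feature_index] < threshold goes left, else right.
    # feature_index is None for a leaf.
    feature_index: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def pairwise_energy(root: Node, features: Sequence[float],
                    label_a: int, label_b: int) -> float:
    """Sum the table entries of every node on the path selected by the features."""
    energy = 0.0
    node: Optional[Node] = root
    while node is not None:
        energy += node.table[label_a][label_b]
        if node.feature_index is None:  # reached a leaf
            break
        if features[node.feature_index] < node.threshold:
            node = node.left
        else:
            node = node.right
    return energy


# Assumed feature: absolute intensity difference between the two pixels.
low_contrast = Node(table=[[0.0, 1.0], [1.0, 0.0]])     # extra penalty for disagreeing
high_contrast = Node(table=[[0.0, -0.5], [-0.5, 0.0]])  # relax the penalty at edges
root = Node(table=[[0.0, 1.0], [1.0, 0.0]], feature_index=0, threshold=0.5,
            left=low_contrast, right=high_contrast)

print(pairwise_energy(root, [0.1], 0, 1))  # smooth region, labels differ
print(pairwise_energy(root, [0.9], 0, 1))  # strong edge, labels differ
```

In this toy setup, the same pair of labels receives a different energy depending on which leaf the image feature selects, which is the sense in which the interaction depends on the image content. A fixed parametric pairwise term would correspond to a tree consisting of the root alone.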
