(RF)^2 - Random Forest Random Field

We combine random forest (RF) and conditional random field (CRF) into a new computational framework, called random forest random field (RF)^2. Inference in (RF)^2 uses the Swendsen-Wang cut algorithm, which proceeds by Metropolis-Hastings jumps. Whether a jump from one state to another is accepted depends on the ratio of the proposal distributions and on the ratio of the posterior distributions of the two states. Prior work typically resorts to parametric estimation of these four distributions and then computes their ratios. Our key idea is to estimate these ratios directly using RF. In the leaf nodes of each decision tree, RF collects class histograms of the training examples. We use these class histograms for a non-parametric estimation of the distribution ratios. We derive theoretical error bounds for a two-class (RF)^2. We apply (RF)^2 to the challenging task of multiclass object recognition and segmentation over a random field of input image regions. In our empirical evaluation, we use only the visual information provided by image regions (e.g., color, texture, spatial layout), whereas the competing methods additionally use higher-level cues about the horizon location and the 3D layout of surfaces in the scene. Nevertheless, (RF)^2 outperforms the state of the art on benchmark datasets in terms of both accuracy and computation time.
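To make the role of the two ratios concrete, the following is a minimal Python sketch of a single Metropolis-Hastings jump whose posterior ratio is read off leaf class histograms. It is an illustration under stated assumptions, not the paper's estimator: the function names, the pooling of counts across trees, and the additive smoothing constant `eps` are all hypothetical choices, and the proposal ratio is simply passed in as a number rather than estimated by RF.

```python
import random


def histogram_ratio(leaf_histograms, label_new, label_old, eps=1.0):
    """Estimate p(label_new | x) / p(label_old | x) without fitting either
    distribution separately.

    leaf_histograms: one dict per tree, mapping class label -> number of
    training examples of that class in the leaf that x falls into.
    Assumption: counts are pooled over trees and smoothed by eps so the
    ratio stays finite when a class is absent from every leaf.
    """
    num = sum(h.get(label_new, 0) for h in leaf_histograms) + eps
    den = sum(h.get(label_old, 0) for h in leaf_histograms) + eps
    return num / den


def acceptance_probability(proposal_ratio, posterior_ratio):
    """Standard Metropolis-Hastings acceptance rate for a jump A -> B.

    proposal_ratio  = q(B -> A) / q(A -> B)
    posterior_ratio = p(B | data) / p(A | data)
    """
    return min(1.0, proposal_ratio * posterior_ratio)


def mh_jump(current, proposed, proposal_ratio, posterior_ratio, rng=random):
    """Return the proposed state with the acceptance probability,
    otherwise keep the current state."""
    alpha = acceptance_probability(proposal_ratio, posterior_ratio)
    return proposed if rng.random() < alpha else current


if __name__ == "__main__":
    # Hypothetical leaf histograms from two trees for one image region.
    leaves = [{"sky": 8, "grass": 2}, {"sky": 6, "grass": 4}]
    ratio = histogram_ratio(leaves, "sky", "grass")
    print(mh_jump("grass", "sky", proposal_ratio=1.0, posterior_ratio=ratio))
```

The point of the sketch is that only the ratio of counts is ever needed, so no normalized density has to be modeled for either state.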
