Learning Dynamic Hybrid Markov Random Field for Image Labeling

The use of shape information has attracted increasing attention in image labeling. In this paper, we present a dynamic hybrid Markov random field (DHMRF) that explicitly captures middle-level object shape and low-level visual appearance (e.g., texture and color) for image labeling. Each node in the DHMRF is described by a visual prototype, which is either a deformable template or an appearance model. The edges encode two types of interactions, co-occurrence and spatial layered context, with respect to the labels and prototypes of the connected nodes. To learn the DHMRF model, we design an iterative algorithm that automatically selects the most informative features and estimates the model parameters. The algorithm is computationally efficient because it uses a branch-and-bound scheme for parameter estimation. In contrast to previous methods, which usually employ implicit shape cues, our DHMRF model integrates color, texture, and shape cues to infer the labeling output, and thus produces more accurate and reliable results. Extensive experiments on two datasets, the MSRC 21-class dataset and the Lotus Hill Institute 15-class dataset, validate its superiority over other state-of-the-art methods in terms of recognition accuracy and implementation efficiency.
