Scene Labeling Through Knowledge-Based Rules Employing Constrained Integer Linear Programming

The scene labeling task is to segment an image into meaningful regions and assign each region to one of the object classes that compose the image. Commonly used methods compute local features for each segment and label the segments with classifiers. The labeling is then smoothed so that neighboring regions receive similar labels. However, these methods ignore expressive, non-local dependencies among regions because modeling them makes training and inference expensive. In this paper, we propose to use high-level knowledge, in the form of rules, during inference to incorporate dependencies among regions of the image and improve classification scores. To this end, we extract these rules from data and transform them into constraints for Integer Programming, which optimizes the structured problem of assigning labels to the super-pixels (and consequently the pixels) of an image. In addition, we propose to use soft constraints in some scenarios, which allow a constraint to be violated at the cost of a penalty, to make the model more flexible. We assessed our approach on three datasets and obtained promising results.
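To make the role of hard and soft constraints concrete, the following is a minimal sketch, not the paper's implementation. It assumes a hypothetical rule ("a boat requires water somewhere in the image"), invented classifier scores for three super-pixels, and a made-up function name `label_scene`. The tiny integer program is solved by exhaustive enumeration rather than by an ILP solver, which is only feasible at this toy scale.

```python
from itertools import product

# Hypothetical label set and per-super-pixel classifier scores (illustrative only).
LABELS = ["sky", "water", "boat", "road"]
SCORES = [
    {"sky": 0.90, "water": 0.05, "boat": 0.03, "road": 0.02},
    {"sky": 0.05, "water": 0.40, "boat": 0.05, "road": 0.50},
    {"sky": 0.02, "water": 0.08, "boat": 0.80, "road": 0.10},
]


def label_scene(scores, penalty=None):
    """Pick one label per super-pixel to maximize the total classifier score.

    Assumed rule: if any super-pixel is labeled "boat", at least one
    super-pixel must be labeled "water".
    penalty=None -> hard constraint (violating assignments are discarded).
    penalty=p    -> soft constraint (violating assignments lose p).
    """
    best_value, best_labels = float("-inf"), None
    # Enumerating one label per super-pixel enforces the "exactly one label" constraint.
    for assignment in product(LABELS, repeat=len(scores)):
        value = sum(s[label] for s, label in zip(scores, assignment))
        violated = "boat" in assignment and "water" not in assignment
        if violated:
            if penalty is None:
                continue  # hard constraint: infeasible assignment
            value -= penalty  # soft constraint: pay for the violation
        if value > best_value:
            best_value, best_labels = value, assignment
    return list(best_labels), best_value


hard_labels, _ = label_scene(SCORES)
weak_labels, _ = label_scene(SCORES, penalty=0.05)
strong_labels, _ = label_scene(SCORES, penalty=0.5)
print(hard_labels, weak_labels, strong_labels)
```

With the hard rule, the second super-pixel is relabeled from "road" to "water" even though its classifier score slightly favors "road". A small penalty lets the classifier's choice stand, while a larger penalty reproduces the hard-constraint labeling.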
