SuperPixel Based Angular Differences as a Mid-level Image Descriptor

This paper addresses the object recognition task and aims to improve accuracy by focusing on the feature extraction step, which is widely used as the initial stage of image classification pipelines. We revisit conventional feature extraction techniques from the perspective that mid-level information can be incorporated to obtain a superior scene description. We hypothesize that the commonly used pixel-based low-level descriptions are useful but can be improved by introducing mid-level region information. Hence, we investigate a superpixel-based image representation to acquire such mid-level information and thereby improve classification accuracy. Detailed experimental evaluations on classification and retrieval tasks are performed to validate this hypothesis. A consistent increase in the mean average precision (MAP) score is observed across different experimental scenarios and image categories.
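For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a mean average precision score can be computed from ranked results. It is not the paper's evaluation code. The function names `average_precision` and `mean_average_precision` are hypothetical, and the sketch assumes the non-interpolated definition of average precision with every relevant item present in each ranked list. The exact protocol used in the experiments is not stated in the abstract and may differ (for example, an interpolated variant).

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Non-interpolated AP for one query or category.

    ranked_relevance[i] is True when the item at rank i + 1 is relevant.
    Assumption: the ranked list contains all relevant items.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant item.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """MAP: the mean of the per-query (or per-category) AP values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Two toy ranked lists; True marks a relevant result.
    toy_rankings = [[True, False, True], [False, True]]
    print(mean_average_precision(toy_rankings))
```

Under this definition, a descriptor that moves relevant images toward the top of the ranked list raises the per-category AP, and MAP summarizes that effect across categories.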
