Paper Information - Computer Vision – ECCV 2012

Spatial pyramid matching (SPM) based pooling has been the dominant choice for state-of-the-art image classification systems. In contrast, we propose a novel object-centric spatial pooling (OCP) approach, following the intuition that knowing the location of the object of interest can be useful for image classification. OCP consists of two steps: (1) inferring the location of the objects, and (2) using the location information to pool foreground and background features separately to form the image-level representation. Step (1) is particularly challenging in a typical classification setting, where precise object location annotations are not available during training. To address this challenge, we propose a framework that learns object detectors using only image-level class labels, or so-called weak labels. We validate our approach on the challenging PASCAL07 dataset. Our learned detectors are comparable in accuracy with state-of-the-art weakly supervised detection methods. More importantly, the resulting OCP approach significantly outperforms SPM-based pooling in image classification.
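Step (2) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the object location from step (1) is already available as a single axis-aligned box, that local features have already been encoded as fixed-length vectors, and that max pooling is used (the abstract does not name the pooling operator). The names `max_pool` and `object_centric_pool` are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed box format


def max_pool(rows: List[Sequence[float]], dim: int) -> List[float]:
    """Element-wise max over feature vectors; all zeros if the region is empty."""
    if not rows:
        return [0.0] * dim
    return [max(column) for column in zip(*rows)]


def object_centric_pool(
    codes: List[Sequence[float]],
    positions: List[Tuple[float, float]],
    box: Box,
) -> List[float]:
    """Pool foreground and background features separately, then concatenate.

    codes:     one encoded feature vector per local descriptor (at least one)
    positions: (x, y) image location of each descriptor
    box:       inferred object location (assumed to come from step 1)
    """
    x0, y0, x1, y1 = box
    dim = len(codes[0])
    foreground, background = [], []
    for code, (x, y) in zip(codes, positions):
        inside = x0 <= x <= x1 and y0 <= y <= y1
        (foreground if inside else background).append(code)
    # Image-level representation: [foreground pool | background pool]
    return max_pool(foreground, dim) + max_pool(background, dim)
```

With three two-dimensional codes at positions (1, 1), (5, 5), and (9, 9) and a box of (0, 0, 6, 6), the first two codes are pooled as foreground and the third as background, so the output has twice the dimension of a single code.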