Paper Information - Exemplar-Specific Patch Features for Fine-Grained Recognition

Exemplar-Specific Patch Features for Fine-Grained Recognition

In this paper, we present a new approach for fine-grained recognition or subordinate categorization, tasks where an algorithm needs to reliably differentiate between visually similar categories, e.g., different bird species. While previous approaches aim at learning a single generic representation and models with increasing complexity, we propose an orthogonal approach that learns patch representations specifically tailored to every single test exemplar. Since we query a constant number of images similar to a given test image, we obtain very compact features and avoid large-scale training with all classes and examples. Our learned mid-level features are built on shape and color detectors estimated from discovered patches reflecting small highly discriminative structures in the queried images. We evaluate our approach for fine-grained recognition on the CUB-2011 birds dataset and show that high recognition rates can be obtained by model combination.
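The idea of querying a constant number of similar images and fitting a model only on them can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `query_neighbors` and `classify_locally`, the toy feature vectors, and the nearest-class-mean classifier are assumptions made for illustration. In particular, the class means stand in for the learned patch detectors described in the abstract, which are not reproduced here.

```python
import numpy as np

def query_neighbors(train_feats, test_feat, k):
    """Return indices of the k training images closest to the test image."""
    dists = np.linalg.norm(train_feats - test_feat, axis=1)
    return np.argsort(dists)[:k]

def classify_locally(train_feats, train_labels, test_feat, k=3):
    """Fit a tiny model on the k retrieved neighbors only, then predict."""
    idx = query_neighbors(train_feats, test_feat, k)
    local_feats, local_labels = train_feats[idx], train_labels[idx]
    classes = np.unique(local_labels)
    # Assumption: one mean vector per local class replaces the
    # exemplar-specific patch detectors of the paper.
    means = np.stack([local_feats[local_labels == c].mean(axis=0)
                      for c in classes])
    return classes[np.argmin(np.linalg.norm(means - test_feat, axis=1))]

# Toy global descriptors (hypothetical values) for five training images.
train_feats = np.array([[0.0, 0.0], [0.1, 0.0],
                        [1.0, 1.0], [1.1, 0.9],
                        [5.0, 5.0]])
train_labels = np.array([0, 0, 1, 1, 2])
test_feat = np.array([0.9, 1.0])

pred = classify_locally(train_feats, train_labels, test_feat, k=3)
print(pred)
```

Because the model is fit on only `k` retrieved images, its cost per test image does not grow with the number of classes in the full training set, which is the property the abstract highlights.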

Joachim Denzler | Trevor Darrell | Erik Rodner | Alexander Freytag

[1] Andrew Y. Ng, et al. The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization, 2011, ICML.

[2] Jonathan Krause, et al. Fine-Grained Crowdsourcing for Fine-Grained Recognition, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[3] C. V. Jawahar, et al. Blocks That Shout: Distinctive Parts for Scene Classification, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[4] Peter N. Belhumeur, et al. POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[5] Yong Jae Lee, et al. Style-Aware Mid-level Representation for Discovering Visual Connections in Space and Time, 2013, 2013 IEEE International Conference on Computer Vision.

[6] Alexei A. Efros, et al. Unsupervised Discovery of Mid-Level Discriminative Patches, 2012, ECCV.

[7] Trevor Darrell, et al. Part-Based R-CNNs for Fine-Grained Category Detection, 2014, ECCV.

[8] Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9] Cordelia Schmid, et al. Applying Color Names to Image Description, 2007, 2007 IEEE International Conference on Image Processing.

[10] Shimon Ullman, et al. Object recognition with informative features and linear classification, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[11] Pietro Perona, et al. The Caltech-UCSD Birds-200-2011 Dataset, 2011.

[12] Fahad Shahbaz Khan, et al. Portmanteau Vocabularies for Multi-Cue Image Representation, 2011, NIPS.

[13] Trevor Darrell, et al. Pooling-Invariant Image Feature Learning, 2013, ArXiv.

[14] Linda G. Shapiro, et al. Unsupervised Template Learning for Fine-Grained Object Recognition, 2012, NIPS.

[15] Jitendra Malik, et al. Discriminative Decorrelation for Clustering and Classification, 2012, ECCV.

[16] Kun Duan, et al. Discovering localized attributes for fine-grained recognition, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[17] Jitendra Malik, et al. Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[18] 智一吉田, et al. A Study of an Automatic Field Map Generation Method Using Efficient Graph-Based Image Segmentation, 2014.

[19] Léon Bottou, et al. Local Learning Algorithms, 1992, Neural Computation.

[20] Dan Roth, et al. Learning a Sparse Representation for Object Detection, 2002, ECCV.

[21] Radu Tudor Ionescu, et al. Objectness to improve the bag of visual words model, 2014, 2014 IEEE International Conference on Image Processing (ICIP).

[22] Larry S. Davis, et al. Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance, 2011, 2011 International Conference on Computer Vision.

[23] Joachim Denzler, et al. Nonparametric Part Transfer for Fine-Grained Recognition, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[24] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.

[25] Jitendra Malik, et al. SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[26] Matthieu Guillaumin, et al. Segmentation Propagation in ImageNet, 2012, ECCV.

[27] Radu Tudor Ionescu, et al. Local Learning to Improve Bag of Visual Words Model for Facial Expression Recognition, 2013.

[28] Arnold W. M. Smeulders, et al. Fine-Grained Categorization by Alignments, 2013, 2013 IEEE International Conference on Computer Vision.

[29] Pietro Perona, et al. Bird Species Categorization Using Pose Normalized Deep Convolutional Nets, 2014, ArXiv.

[30] David J. Fleet, et al. Computer Vision – ECCV 2014, 2014, Lecture Notes in Computer Science.

[31] Pietro Perona, et al. Improved Bird Species Recognition Using Pose Normalized Deep Convolutional Nets, 2014, BMVC.

[32] Forrest N. Iandola, et al. Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction, 2013, 2013 IEEE International Conference on Computer Vision.

[33] Mads Nielsen, et al. Computer Vision – ECCV 2002, 2002, Lecture Notes in Computer Science.

[34] Andrew Zisserman, et al. Efficient additive kernels via explicit feature maps, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[35] Andrew Zisserman, et al. Three things everyone should know to improve object retrieval, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[36] Frédéric Jurie, et al. Creating efficient codebooks for visual recognition, 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.