Portmanteau Vocabularies for Multi-Cue Image Representation

We describe a novel technique for feature combination in the bag-of-words model of image classification. Our approach builds discriminative compound words from primitive cues learned independently from training images. Our main observation is that modeling joint-cue distributions independently is more statistically robust for typical classification problems than attempting to empirically estimate the dependent, joint-cue distribution directly. We use Information theoretic vocabulary compression to find discriminative combinations of cues and the resulting vocabulary of portmanteau words is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. State-of-the-art results on both the Oxford Flower-102 and Caltech-UCSD Bird-200 datasets demonstrate the effectiveness of our technique compared to other, significantly more complex approaches to multi-cue image representation.

[1]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[2]  Manik Varma,et al.  Learning The Discriminative Power-Invariance Trade-Off , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[3]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[4]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[5]  Inderjit S. Dhillon,et al.  A Divisive Information-Theoretic Feature Clustering Algorithm for Text Classification , 2003, J. Mach. Learn. Res..

[6]  Yves Grandvalet,et al.  More efficiency in multiple kernel learning , 2007, ICML '07.

[7]  Naftali Tishby,et al.  Agglomerative Information Bottleneck , 1999, NIPS.

[8]  Fahad Shahbaz Khan,et al.  Top-down color attention for object recognition , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[9]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[10]  Francis R. Bach,et al.  Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning , 2008, NIPS.

[11]  Cordelia Schmid,et al.  Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[12]  Manik Varma,et al.  More generality in efficient multiple kernel learning , 2009, ICML '09.

[13]  Pietro Perona,et al.  Visual Recognition with Humans in the Loop , 2010, ECCV.

[14]  Cordelia Schmid,et al.  Learning Color Names for Real-World Applications , 2009, IEEE Transactions on Image Processing.

[15]  Fahad Shahbaz Khan,et al.  Discriminative compact pyramids for object and scene recognition , 2012, Pattern Recognition.

[16]  Garrison W. Cottrell,et al.  Robust classification of objects, faces, and flowers using natural image statistics , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Satoshi Ito,et al.  Object Classification Using Heterogeneous Co-occurrence Features , 2010, ECCV.

[18]  Koen E. A. van de Sande,et al.  Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Stefano Soatto,et al.  Localizing Objects with Smart Dictionaries , 2008, ECCV.

[20]  Andrew Zisserman,et al.  Scene Classification Via pLSA , 2006, ECCV.

[21]  Andrew Zisserman,et al.  Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[22]  Fei-Fei Li,et al.  Combining randomization and discrimination for fine-grained image categorization , 2011, CVPR 2011.

[23]  A Treisman,et al.  Feature binding, attention and object perception. , 1998, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[24]  Alexei A. Efros,et al.  Discovering object categories in image collections , 2005 .