When Na\"ive Bayes Nearest Neighbours Meet Convolutional Neural Networks

Since Convolutional Neural Networks (CNNs) have become the leading learning paradigm in visual recognition, Naive Bayes Nearest Neighbour (NBNN)-based classifiers have lost momentum in the community. This is because (1) such algorithms cannot use CNN activations as input features; (2) they cannot be used as final layer of CNN architectures for end-to-end training , and (3) they are generally not scalable and hence cannot handle big data. This paper proposes a framework that addresses all these issues, thus bringing back NBNNs on the map. We solve the first by extracting CNN activations from local patches at multiple scale levels, similarly to [1]. We address simultaneously the second and third by proposing a scalable version of Naive Bayes Non-linear Learning (NBNL, [2]). Results obtained using pre-trained CNNs on standard scene and domain adaptation databases show the strength of our approach, opening a new season for NBNNs.

[1]  Mario Fritz,et al.  The Pooled NBNN Kernel: Beyond Image-to-Class and Image-to-Image , 2012, ACCV.

[2]  Yihong Gong,et al.  Nonlinear Learning using Local Coordinate Coding , 2009, NIPS.

[3]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[4]  Luc Van Gool,et al.  Iterative Nearest Neighbors , 2015, Pattern Recognit..

[5]  David G. Lowe,et al.  Local Naive Bayes Nearest Neighbor for image classification , 2011, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Hidekazu Oiwa,et al.  Partition-wise Linear Models , 2014, NIPS.

[7]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, CVPR.

[8]  Mark W. Schmidt,et al.  A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets , 2012, NIPS.

[9]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[10]  Tong Zhang,et al.  Improved Local Coordinate Coding using Local Tangents , 2010, ICML.

[11]  Bolei Zhou,et al.  Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.

[12]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Prasoon Goyal,et al.  Local Deep Kernel Learning for Efficient Non-linear SVM Prediction , 2013, ICML.

[14]  Liang-Tien Chia,et al.  Laplacian Sparse Coding, Hypergraph Laplacian Sparse Coding, and Applications , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Jitendra Malik,et al.  Training Deformable Part Models with Decorrelated Features , 2013, 2013 IEEE International Conference on Computer Vision.

[16]  Trevor Darrell,et al.  Nearest-Neighbor Searching and Metric Space Dimensions , 2006 .

[17]  Trevor Hastie,et al.  The Elements of Statistical Learning , 2001 .

[18]  Hien Van Nguyen,et al.  Non-linear and Sparse Representations for Multi-Modal Recognition , 2013 .

[19]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[20]  Svetlana Lazebnik,et al.  Multi-scale Orderless Pooling of Deep Convolutional Activation Features , 2014, ECCV.

[21]  Eli Shechtman,et al.  In defense of Nearest-Neighbor based image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Ling Huang,et al.  Large-Margin Convex Polytope Machine , 2014, NIPS.

[23]  Trevor Darrell,et al.  The NBNN kernel , 2011, 2011 International Conference on Computer Vision.

[24]  Fei-Fei Li,et al.  What, where and who? Classifying events by scene and object recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[25]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[26]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[27]  Horst Bischof,et al.  Optimizing 1-Nearest Prototype Classifiers , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[29]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[30]  Rama Chellappa,et al.  Generalized Domain-Adaptive Dictionaries , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Alan L. Yuille,et al.  The Concave-Convex Procedure , 2003, Neural Computation.

[32]  Jorma Laaksonen,et al.  Convolutional Network Features for Scene Recognition , 2014, ACM Multimedia.

[33]  Anton van den Hengel,et al.  The treasure beneath convolutional layers: Cross-convolutional-layer pooling for image classification , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Barbara Caputo,et al.  Scene Recognition with Naive Bayes Non-linear Learning , 2014, 2014 22nd International Conference on Pattern Recognition.

[35]  Dieter Fox,et al.  Hierarchical Matching Pursuit for Image Classification: Architecture and Fast Algorithms , 2011, NIPS.

[36]  Barbara Caputo,et al.  Multiclass Latent Locally Linear Support Vector Machines , 2013, ACML.

[37]  Julien Mairal,et al.  Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization , 2013, NIPS.

[38]  Xiaodong Yang,et al.  EigenJoints-based action recognition using Naïve-Bayes-Nearest-Neighbor , 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[39]  Guillermo Sapiro,et al.  Online Learning for Matrix Factorization and Sparse Coding , 2009, J. Mach. Learn. Res..

[40]  Barbara Caputo,et al.  Learning to Learn, from Transfer Learning to Domain Adaptation: A Unifying Perspective , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Luc Van Gool,et al.  Naive Bayes Image Classification: Beyond Nearest Neighbors , 2012, ACCV.

[42]  David A. McAllester,et al.  A discriminatively trained, multiscale, deformable part model , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Cewu Lu,et al.  Learning Important Spatial Pooling Regions for Scene Classification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Gang Wang,et al.  Exemplar based Deep Discriminative and Shareable Feature Learning for scene image classification , 2015, Pattern Recognit..

[45]  Barbara Caputo,et al.  Frustratingly Easy NBNN Domain Adaptation , 2013, 2013 IEEE International Conference on Computer Vision.

[46]  G. Griffin,et al.  Caltech-256 Object Category Dataset , 2007 .

[47]  Kristen Grauman,et al.  Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation , 2013, ICML.

[48]  Philip H. S. Torr,et al.  Learning Anchor Planes for Classification , 2011, NIPS.

[49]  Xiang Zhang,et al.  OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.

[50]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[51]  Krista A. Ehinger,et al.  SUN database: Large-scale scene recognition from abbey to zoo , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[52]  Liang-Tien Chia,et al.  Image-to-Class Distance Metric Learning for Image Classification , 2010, ECCV.

[53]  Philip H. S. Torr,et al.  Locally Linear Support Vector Machines , 2011, ICML.