Paper information - When Naïve Bayes Nearest Neighbours Meet Convolutional Neural Networks

When Naïve Bayes Nearest Neighbours Meet Convolutional Neural Networks

Since Convolutional Neural Networks (CNNs) have become the leading learning paradigm in visual recognition, Naive Bayes Nearest Neighbour (NBNN)-based classifiers have lost momentum in the community. This is because (1) such algorithms cannot use CNN activations as input features; (2) they cannot be used as the final layer of CNN architectures for end-to-end training; and (3) they are generally not scalable and hence cannot handle big data. This paper proposes a framework that addresses all these issues, thus putting NBNNs back on the map. We solve the first by extracting CNN activations from local patches at multiple scale levels, similarly to [1]. We simultaneously address the second and third by proposing a scalable version of Naive Bayes Non-linear Learning (NBNL, [2]). Results obtained using pre-trained CNNs on standard scene and domain adaptation databases show the strength of our approach, opening a new season for NBNNs.
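For readers unfamiliar with the NBNN family, the following is a minimal sketch of the classic NBNN decision rule (sum of nearest-neighbour distances from an image's local descriptors to each class pool). It is not the scalable NBNL variant proposed in the paper. The function name `nbnn_classify`, the class labels, and the random vectors standing in for patch-level CNN activations are all assumptions made for illustration.

```python
import numpy as np


def nbnn_classify(query_descriptors, class_descriptors):
    """Classic NBNN rule: pick the class with the smallest image-to-class distance.

    query_descriptors: (n, d) array of local descriptors from one image
        (in the paper's setting these would be CNN activations of patches;
        here they are just arbitrary vectors).
    class_descriptors: dict mapping a class label to an (m_c, d) array of
        descriptors pooled from all training images of that class.
    """
    best_label, best_cost = None, np.inf
    for label, pool in class_descriptors.items():
        # Pairwise squared Euclidean distances, shape (n, m_c).
        diffs = query_descriptors[:, None, :] - pool[None, :, :]
        sq_dists = np.sum(diffs ** 2, axis=2)
        # Each query descriptor contributes the distance to its nearest
        # neighbour in the class pool; the contributions are summed.
        cost = sq_dists.min(axis=1).sum()
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label


# Toy data (hypothetical): two classes with well-separated descriptor clouds.
rng = np.random.default_rng(0)
classes = {
    "kitchen": rng.normal(loc=0.0, scale=0.1, size=(50, 8)),
    "street": rng.normal(loc=1.0, scale=0.1, size=(50, 8)),
}
query = rng.normal(loc=1.0, scale=0.1, size=(10, 8))
predicted = nbnn_classify(query, classes)
print(predicted)
```

The brute-force distance matrix here costs time and memory proportional to the number of query descriptors times the size of each class pool, which illustrates the scalability concern that the abstract raises in point (3).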

Ilja Kuzborskij | Fabio Maria Carlucci | Barbara Caputo

[1] Mario Fritz, et al. The Pooled NBNN Kernel: Beyond Image-to-Class and Image-to-Image, 2012, ACCV.

[2] Yihong Gong, et al. Nonlinear Learning using Local Coordinate Coding, 2009, NIPS.

[3] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.

[4] Luc Van Gool, et al. Iterative Nearest Neighbors, 2015, Pattern Recognit.

[5] David G. Lowe, et al. Local Naive Bayes Nearest Neighbor for image classification, 2011, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[6] Hidekazu Oiwa, et al. Partition-wise Linear Models, 2014, NIPS.

[7] Antonio Torralba, et al. Recognizing indoor scenes, 2009, CVPR.

[8] Mark W. Schmidt, et al. A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets, 2012, NIPS.

[9] S. T. Roweis, et al. Nonlinear dimensionality reduction by locally linear embedding, 2000, Science.

[10] Tong Zhang, et al. Improved Local Coordinate Coding using Local Tangents, 2010, ICML.

[11] Bolei Zhou, et al. Learning Deep Features for Scene Recognition using Places Database, 2014, NIPS.

[12] Yuan Shi, et al. Geodesic flow kernel for unsupervised domain adaptation, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[13] Prasoon Goyal, et al. Local Deep Kernel Learning for Efficient Non-linear SVM Prediction, 2013, ICML.

[14] Liang-Tien Chia, et al. Laplacian Sparse Coding, Hypergraph Laplacian Sparse Coding, and Applications, 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15] Jitendra Malik, et al. Training Deformable Part Models with Decorrelated Features, 2013, 2013 IEEE International Conference on Computer Vision.

[16] Trevor Darrell, et al. Nearest-Neighbor Searching and Metric Space Dimensions, 2006.

[17] Trevor Hastie, et al. The Elements of Statistical Learning, 2001.

[18] Hien Van Nguyen, et al. Non-linear and Sparse Representations for Multi-Modal Recognition, 2013.

[19] Michael I. Jordan, et al. Learning Transferable Features with Deep Adaptation Networks, 2015, ICML.

[20] Svetlana Lazebnik, et al. Multi-scale Orderless Pooling of Deep Convolutional Activation Features, 2014, ECCV.

[21] Eli Shechtman, et al. In defense of Nearest-Neighbor based image classification, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[22] Ling Huang, et al. Large-Margin Convex Polytope Machine, 2014, NIPS.

[23] Trevor Darrell, et al. The NBNN kernel, 2011, 2011 International Conference on Computer Vision.

[24] Fei-Fei Li, et al. What, where and who? Classifying events by scene and object recognition, 2007, 2007 IEEE 11th International Conference on Computer Vision.

[25] Andrew Zisserman, et al. Return of the Devil in the Details: Delving Deep into Convolutional Nets, 2014, BMVC.

[26] Trevor Darrell, et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition, 2013, ICML.

[27] Horst Bischof, et al. Optimizing 1-Nearest Prototype Classifiers, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[28] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.

[29] Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[30] Rama Chellappa, et al. Generalized Domain-Adaptive Dictionaries, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[31] Alan L. Yuille, et al. The Concave-Convex Procedure, 2003, Neural Computation.

[32] Jorma Laaksonen, et al. Convolutional Network Features for Scene Recognition, 2014, ACM Multimedia.

[33] Anton van den Hengel, et al. The treasure beneath convolutional layers: Cross-convolutional-layer pooling for image classification, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34] Barbara Caputo, et al. Scene Recognition with Naive Bayes Non-linear Learning, 2014, 2014 22nd International Conference on Pattern Recognition.

[35] Dieter Fox, et al. Hierarchical Matching Pursuit for Image Classification: Architecture and Fast Algorithms, 2011, NIPS.

[36] Barbara Caputo, et al. Multiclass Latent Locally Linear Support Vector Machines, 2013, ACML.

[37] Julien Mairal, et al. Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization, 2013, NIPS.

[38] Xiaodong Yang, et al. EigenJoints-based action recognition using Naïve-Bayes-Nearest-Neighbor, 2012, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[39] Guillermo Sapiro, et al. Online Learning for Matrix Factorization and Sparse Coding, 2009, J. Mach. Learn. Res.

[40] Barbara Caputo, et al. Learning to Learn, from Transfer Learning to Domain Adaptation: A Unifying Perspective, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[41] Luc Van Gool, et al. Naive Bayes Image Classification: Beyond Nearest Neighbors, 2012, ACCV.

[42] David A. McAllester, et al. A discriminatively trained, multiscale, deformable part model, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[43] Cewu Lu, et al. Learning Important Spatial Pooling Regions for Scene Classification, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[44] Gang Wang, et al. Exemplar based Deep Discriminative and Shareable Feature Learning for scene image classification, 2015, Pattern Recognit.

[45] Barbara Caputo, et al. Frustratingly Easy NBNN Domain Adaptation, 2013, 2013 IEEE International Conference on Computer Vision.

[46] G. Griffin, et al. Caltech-256 Object Category Dataset, 2007.

[47] Kristen Grauman, et al. Connecting the Dots with Landmarks: Discriminatively Learning Domain-Invariant Features for Unsupervised Domain Adaptation, 2013, ICML.

[48] Philip H. S. Torr, et al. Learning Anchor Planes for Classification, 2011, NIPS.

[49] Xiang Zhang, et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks, 2013, ICLR.

[50] Yihong Gong, et al. Locality-constrained Linear Coding for image classification, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[51] Krista A. Ehinger, et al. SUN database: Large-scale scene recognition from abbey to zoo, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[52] Liang-Tien Chia, et al. Image-to-Class Distance Metric Learning for Image Classification, 2010, ECCV.

[53] Philip H. S. Torr, et al. Locally Linear Support Vector Machines, 2011, ICML.