Visual Words Assignment Via Information-Theoretic Manifold Embedding

Codebook-based learning provides a flexible way to extract the contents of an image in a data-driven manner for visual recognition. One central task in such frameworks is codeword assignment, which allocates local image descriptors to the most similar codewords in the dictionary to generate histogram for categorization. Nevertheless, existing assignment approaches, e.g., nearest neighbors strategy (hard assignment) and Gaussian similarity (soft assignment), suffer from two problems: 1) too strong Euclidean assumption and 2) neglecting the label information of the local descriptors. To address the aforementioned two challenges, we propose a graph assignment method with maximal mutual information (GAMI) regularization. GAMI takes the power of manifold structure to better reveal the relationship of massive number of local features by nonlinear graph metric. Meanwhile, the mutual information of descriptor-label pairs is ultimately optimized in the embedding space for the sake of enhancing the discriminant property of the selected codewords. According to such objective, two optimization models, i.e., inexact-GAMI and exact-GAMI, are respectively proposed in this paper. The inexact model can be efficiently solved with a closed-from solution. The stricter exact-GAMI nonparametrically estimates the entropy of descriptor-label pairs in the embedding space and thus leads to a relatively complicated but still trackable optimization. The effectiveness of GAMI models are verified on both the public and our own datasets.

[1]  Chengjun Liu,et al.  Scene image classification: Some novel descriptors , 2012, 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC).

[2]  Jane Labadin,et al.  Feature selection based on mutual information , 2015, 2015 9th International Conference on IT in Asia (CITA).

[3]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[4]  Hansheng Wang,et al.  A note on tail dependence regression , 2013, J. Multivar. Anal..

[5]  Qionghai Dai,et al.  Low-Rank Structure Learning via Nonconvex Heuristic Recovery , 2010, IEEE Transactions on Neural Networks and Learning Systems.

[6]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[7]  Jian-Bo Yang,et al.  An Effective Feature Selection Method via Mutual Information Estimation , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[8]  Yuxiao Hu,et al.  Face recognition using Laplacianfaces , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[10]  Qionghai Dai,et al.  Differences Help Recognition: A Probabilistic Interpretation , 2014, PloS one.

[11]  Xuelong Li,et al.  Biologically Inspired Features for Scene Classification in Video Surveillance , 2011, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[12]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[13]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Wen Gao,et al.  Manifold-Manifold Distance with application to face recognition based on image set , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  C. L. Philip Chen,et al.  Hierarchical Feature Extraction With Local Neural Response for Image Recognition , 2013, IEEE Transactions on Cybernetics.

[16]  Qionghai Dai,et al.  Free-Viewpoint Video of Human Actors Using Multiple Handheld Kinects , 2013, IEEE Transactions on Cybernetics.

[17]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[18]  Eli Shechtman,et al.  In defense of Nearest-Neighbor based image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  George Casella,et al.  Statistical Inference Vol. 70 , 1990 .

[20]  Stephen P. Boyd,et al.  Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..

[21]  Qionghai Dai,et al.  Noisy Depth Maps Fusion for Multiview Stereo Via Matrix Completion , 2012, IEEE Journal of Selected Topics in Signal Processing.

[22]  R.G. Baraniuk,et al.  Compressive Sensing [Lecture Notes] , 2007, IEEE Signal Processing Magazine.

[23]  Ruiping Wang,et al.  Manifold Discriminant Analysis , 2009, CVPR.

[24]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[25]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.

[26]  Qingzhao Zhang,et al.  LOCAL LEAST ABSOLUTE RELATIVE ERROR ESTIMATING APPROACH FOR PARTIALLY LINEAR MULTIPLICATIVE MODEL , 2013 .

[27]  Edwin R. Hancock,et al.  Clustering and Embedding Using Commute Times , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Rajat Raina,et al.  Efficient sparse coding algorithms , 2006, NIPS.

[29]  Cor J. Veenman,et al.  Visual Word Ambiguity , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[31]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[32]  Stephen Lin,et al.  Graph Embedding and Extensions: A General Framework for Dimensionality Reduction , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[34]  V. A. Epanechnikov Non-Parametric Estimation of a Multivariate Probability Density , 1969 .

[35]  Fuhui Long,et al.  Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy , 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Qionghai Dai,et al.  Commute time guided transformation for feature extraction , 2012, Comput. Vis. Image Underst..

[37]  Qionghai Dai,et al.  Visual words assignment on a graph via minimal mutual information loss , 2012, BMVC.

[38]  Qionghai Dai,et al.  Graph Laplace for Occluded Face Completion and Recognition , 2011, IEEE Transactions on Image Processing.

[39]  Andrew Zisserman,et al.  Scene Classification Using a Hybrid Generative/Discriminative Approach , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[41]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[42]  Svetlana Lazebnik,et al.  Supervised Learning of Quantizer Codebooks by Information Loss Minimization , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Jun Yu,et al.  On Combining Multiple Features for Cartoon Character Retrieval and Clip Synthesis , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[44]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[45]  Andrew Zisserman,et al.  Scene Classification Via pLSA , 2006, ECCV.

[46]  Chong-Ho Choi,et al.  Input Feature Selection by Mutual Information Based on Parzen Window , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[47]  Cor J. Veenman,et al.  Kernel Codebooks for Scene Categorization , 2008, ECCV.

[48]  É. Slezak,et al.  Density estimation with non{parametric methods ? , 1997, astro-ph/9704096.