Compact Descriptors for Visual Search

To ensure application interoperability in visual object search technologies, the MPEG Working Group has made great efforts in standardizing visual search technologies. Moreover, extraction and transmission of compact descriptors are valuable for next-generation, mobile, visual search applications. This article reviews the significant progress of MPEG Compact Descriptors for Visual Search (CDVS) in standardizing technologies that will enable efficient and interoperable design of visual search applications. In addition, the article presents the location search and recognition oriented data collection and benchmark under the MPEG CDVS evaluation framework.

[1]  Jitendra Malik,et al.  Normalized Cuts and Image Segmentation , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[3]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..

[4]  Shuicheng Yan,et al.  Weakly-supervised hashing in kernel space , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Jun Wang,et al.  Self-taught hashing for fast similarity search , 2010, SIGIR.

[6]  Dragutin Petkovic,et al.  Query by Image and Video Content: The QBIC System , 1995, Computer.

[7]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[8]  Ling-yu Duan,et al.  Rate-adaptive Compact Fisher Codes for Mobile Visual Search , 2014, IEEE Signal Processing Letters.

[9]  Aggelos K. Katsaggelos,et al.  Virtual touring: A Content Based Image Retrieval application , 2013, 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW).

[10]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[11]  Tao Mei,et al.  Contextual Bag-of-Words for Visual Categorization , 2011, IEEE Transactions on Circuits and Systems for Video Technology.

[12]  Chris H. Q. Ding,et al.  A min-max cut algorithm for graph partitioning and data clustering , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[13]  Shuicheng Yan,et al.  Graph Embedding and Extensions: A General Framework for Dimensionality Reduction , 2007 .

[14]  Prateek Jain,et al.  Fast image search for learned metrics , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Michael Brady,et al.  Feature-based correspondence: an eigenvector approach , 1992, Image Vis. Comput..

[16]  Bernd Girod,et al.  Tree Histogram Coding for Mobile Image Matching , 2009, 2009 Data Compression Conference.

[17]  Gang Hua,et al.  Discriminant Embedding for Local Image Descriptors , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[18]  Meng Wang,et al.  Real-Time Video Copy-Location Detection in Large-Scale Repositories , 2011, IEEE MultiMedia.

[19]  Yuriy Reznik,et al.  On MPEG work towards a standard for visual search , 2011, Optical Engineering + Applications.

[20]  Piotr Indyk,et al.  Approximate nearest neighbors: towards removing the curse of dimensionality , 1998, STOC '98.

[21]  Jiawei Han,et al.  Orthogonal Laplacianfaces for Face Recognition , 2006, IEEE Transactions on Image Processing.

[22]  Jon Louis Bentley,et al.  Multidimensional binary search trees used for associative searching , 1975, CACM.

[23]  Cordelia Schmid,et al.  Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[24]  Aggelos K. Katsaggelos,et al.  Forest hashing: Expediting large scale image retrieval , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[25]  B. S. Manjunath,et al.  Cortina: a system for large-scale, content-based web image retrieval , 2004, MULTIMEDIA '04.

[26]  A. Hoffman,et al.  The variation of the spectrum of a normal matrix , 1953 .

[27]  Andrew B. Kahng,et al.  New spectral methods for ratio cut partitioning and clustering , 1991, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[28]  Aggelos K. Katsaggelos,et al.  A novel image retrieval framework exploring inter cluster distance , 2010, 2010 IEEE International Conference on Image Processing.

[29]  Dacheng Tao,et al.  Bregman Divergence-Based Regularization for Transfer Subspace Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[30]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[31]  Shih-Fu Chang,et al.  Sequential Projection Learning for Hashing with Compact Codes , 2010, ICML.

[32]  Alexandr Andoni,et al.  Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[33]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[34]  Xiaofei He,et al.  Locality Preserving Projections , 2003, NIPS.

[35]  Luc Van Gool,et al.  SURF: Speeded Up Robust Features , 2006, ECCV.

[36]  Zhigang Luo,et al.  Manifold Regularized Discriminative Nonnegative Matrix Factorization With Fast Gradient Descent , 2011, IEEE Transactions on Image Processing.

[37]  Matthew A. Brown,et al.  Learning Local Image Descriptors , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Xuelong Li,et al.  Geometric Mean for Subspace Selection , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[40]  Wen Gao,et al.  Location Discriminative Vocabulary Coding for Mobile Landmark Search , 2011, International Journal of Computer Vision.

[41]  Nenghai Yu,et al.  Complementary hashing for approximate nearest neighbor search , 2011, 2011 International Conference on Computer Vision.

[42]  H. C. Longuet-Higgins,et al.  An algorithm for associating the features of two images , 1991, Proceedings of the Royal Society of London. Series B: Biological Sciences.

[43]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[44]  Aggelos K. Katsaggelos,et al.  Large Visual Repository Search with Hash Collision Design Optimization , 2013, IEEE MultiMedia.

[45]  Jian Yang,et al.  Kernel ICA: An alternative formulation and its application to face recognition , 2005, Pattern Recognit..

[46]  Aggelos K. Katsaggelos,et al.  Spectral approximation to point set similarity metric , 2013, 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW).

[47]  Xuelong Li,et al.  General Tensor Discriminant Analysis and Gabor Features for Gait Recognition , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[48]  Pengfei Shi,et al.  Kernel Grassmannian distances and discriminant analysis for face recognition from image sets , 2009, Pattern Recognit. Lett..

[49]  Bernhard Schölkopf,et al.  A kernel view of the dimensionality reduction of manifolds , 2004, ICML.

[50]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[51]  Hongbin Zha,et al.  Optimizing kd-trees for scalable visual descriptor indexing , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[52]  Bernd Girod,et al.  CHoG: Compressed histogram of gradients A low bit-rate feature descriptor , 2009, CVPR.

[53]  Jitendra Malik,et al.  Blobworld: Image Segmentation Using Expectation-Maximization and Its Application to Image Querying , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[54]  Manik Varma,et al.  Learning The Discriminative Power-Invariance Trade-Off , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[55]  Vishal Chitkara Color-Based Image Retrieval Using Compact Binary Signatures , 2001 .

[56]  Shinji Umeyama,et al.  An Eigendecomposition Approach to Weighted Graph Matching Problems , 1988, IEEE Trans. Pattern Anal. Mach. Intell..

[57]  Bülent Sankur,et al.  Similarity Learning for 3D Object Retrieval Using Relevance Feedback and Risk Minimization , 2010, International Journal of Computer Vision.

[58]  Bernd Girod,et al.  Mobile Visual Search , 2011, IEEE Signal Processing Magazine.

[59]  Stefano Soatto,et al.  Proximity Distribution Kernels for Geometric Context in Category Recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[60]  Bernd Girod,et al.  Fast geometric re-ranking for image-based retrieval , 2010, 2010 IEEE International Conference on Image Processing.

[61]  Wen Gao,et al.  Towards low bit rate mobile visual search with multiple-channel coding , 2011, ACM Multimedia.

[62]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[63]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[64]  Cordelia Schmid,et al.  Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[65]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[66]  Shuicheng Yan,et al.  Common visual pattern discovery via spatially coherent correspondences , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[67]  Steven M. Seitz,et al.  Photo tourism: exploring photo collections in 3D , 2006, ACM Trans. Graph..

[68]  Xuelong Li,et al.  Patch Alignment for Dimensionality Reduction , 2009, IEEE Transactions on Knowledge and Data Engineering.

[69]  Zhe Wang,et al.  Modeling LSH for performance tuning , 2008, CIKM '08.

[70]  Jiri Matas,et al.  Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..

[71]  Andrew Zisserman,et al.  Representing shape with a spatial pyramid kernel , 2007, CIVR '07.

[72]  J. Friedman Special Invited Paper-Additive logistic regression: A statistical view of boosting , 2000 .

[73]  Trevor Darrell,et al.  Pyramid Match Hashing: Sub-Linear Time Indexing Over Partial Correspondences , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[74]  Bernd Girod,et al.  Memory-Efficient Image Databases for Mobile Visual Search , 2014, IEEE MultiMedia.

[75]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[76]  Yan Ke,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, CVPR 2004.

[77]  Daniel D. Lee,et al.  Grassmann discriminant analysis: a unifying view on subspace-based learning , 2008, ICML '08.

[78]  Bernhard Schölkopf,et al.  Nonlinear Component Analysis as a Kernel Eigenvalue Problem , 1998, Neural Computation.

[79]  Gianluca Francini,et al.  Statistical modelling of outliers for fast visual search , 2011, 2011 IEEE International Conference on Multimedia and Expo.

[80]  Qi Tian,et al.  Spatial coding for large scale partial-duplicate web image search , 2010, ACM Multimedia.