Laplacian embedding and key points topology verification for large scale mobile visual identification

Visual query-by-capture applications call for compact visual descriptors of minimal length. Preserving visual identification performance while minimising the bit rate is a focus of the ongoing MPEG-7 CDVS (Compact Descriptors for Visual Search) standardisation effort. In this paper we tackle this problem by adopting Laplacian embedding for SIFT feature compression and by employing topology verification based on a novel graph cut measure. In contrast to previous feature compression schemes, we seek a Laplacian embedding that preserves the nearest neighbour relations in feature space. We also develop an efficient yet effective topology verification (TV) scheme to perform spatial consistency checking. In contrast to previous work on geometric verification, which enumerates all possible combinations of coordinate alignments of an image pair, the TV solution verifies possibly misaligned coordinate sets with a learning method that acquires a proper boundary between the topology representations of matched and non-matched image pairs. The TV solution is invariant to in-plane rotation and scaling, and is resilient to a range of out-of-plane rotations. The proposed Laplacian embedding and topology verification schemes are tested on the CDVS dataset and are found to be effective.
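
As an illustration of what a neighbourhood-preserving Laplacian embedding can look like, the following is a minimal sketch, not the formulation used in this paper, which the abstract does not specify. It assumes a linear embedding in the style of Locality Preserving Projections, a binary k-nearest-neighbour affinity graph, and random vectors standing in for 128-dimensional SIFT descriptors. The function name `lpp_projection` and the parameters `n_neighbors`, `n_components` and `reg` are hypothetical.

```python
import numpy as np

def lpp_projection(X, n_neighbors=5, n_components=8, reg=1e-6):
    """Sketch of a linear Laplacian embedding (LPP-style, an assumption).

    X: (n_samples, n_features) descriptors.
    Returns P: (n_features, n_components) projection matrix.
    """
    n, d = X.shape

    # Pairwise squared Euclidean distances; exclude self-matches.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)

    # Binary k-nearest-neighbour affinity graph, symmetrised.
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)

    # Graph Laplacian L = D - W.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # Generalised eigenproblem (X^T L X) a = lambda (X^T D X) a,
    # solved by Cholesky whitening; reg keeps B positive definite.
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(d)
    C = np.linalg.cholesky(B)          # B = C C^T
    C_inv = np.linalg.inv(C)
    M = C_inv @ A @ C_inv.T
    M = 0.5 * (M + M.T)                # enforce symmetry numerically
    _, U = np.linalg.eigh(M)           # eigenvalues in ascending order

    # Directions with the smallest eigenvalues keep neighbours close.
    return C_inv.T @ U[:, :n_components]

# Random data standing in for SIFT descriptors (illustrative only).
rng = np.random.default_rng(0)
X = rng.random((300, 128))
P = lpp_projection(X)
Y = X @ P                              # compact 8-dimensional embedding
```

The directions associated with the smallest generalised eigenvalues are kept because they minimise the weighted distance between neighbouring descriptors after projection, which is the sense in which the embedding preserves nearest-neighbour relations.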
