Robust Spatial Matching for Object Retrieval and Its Parallel Implementation on GPU

Spatial matching for object retrieval is often time-consuming and susceptible to viewpoint changes. To address this problem, we propose a novel spatial matching method that is robust to viewpoint changes and implement it in parallel on a modern graphics processing unit (GPU) for real-time applications. Unlike previous spatial matching methods for object retrieval, which estimate the affine transformation under the gravity vector assumption, our method drops this strong assumption: it matches the affine covariant neighbors (ACNs) of corresponding local regions and estimates an affine transformation from each single pair of corresponding local regions. In the GPU implementation, computations are distributed evenly across threads with load balancing, and device memory accesses are optimized with a bitmap-based parallel scan. Experimental results demonstrate that our method is more robust and more efficient than previous methods, especially under viewpoint changes, and that the parallel GPU implementation achieves a tenfold speedup.
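The abstract does not spell out how the bitmap-based parallel scan is organized. One common reading is scan-based stream compaction: a 0/1 bitmap marks which entries survive, an exclusive prefix sum over the bitmap gives each survivor its output slot, and every entry can then be written independently. The sketch below is a sequential Python illustration of that general idea under this assumption, not the paper's kernel; the names `exclusive_scan` and `compact` are hypothetical.

```python
from itertools import accumulate


def exclusive_scan(flags):
    """Exclusive prefix sum: result[i] is the sum of flags[0..i-1]."""
    if not flags:
        return []
    inclusive = list(accumulate(flags))
    return [0] + inclusive[:-1]


def compact(items, bitmap):
    """Keep items[i] where bitmap[i] == 1, preserving order.

    On a GPU, the scan would run in parallel and each loop iteration
    below would map to one thread, since every write targets a distinct
    slot. Here everything runs sequentially for clarity.
    """
    if not bitmap:
        return []
    offsets = exclusive_scan(bitmap)
    total = offsets[-1] + bitmap[-1]  # number of surviving entries
    out = [None] * total
    for i, keep in enumerate(bitmap):
        if keep:
            out[offsets[i]] = items[i]
    return out


if __name__ == "__main__":
    matches = ["a", "b", "c", "d", "e"]
    valid = [1, 0, 1, 1, 0]
    print(compact(matches, valid))
```

Because the output slots are computed before any write happens, surviving entries end up contiguous in memory, which is the property that typically makes subsequent device memory accesses cheaper.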
