Visual instance mining from the graph perspective

In this paper, we address the problem of visual instance mining: automatically discovering frequently appearing visual instances in a large collection of images. We propose a scalable mining method that leverages a graph structure with images as vertices. Unlike most existing approaches, which focus on either instance-level similarities or image-level context properties, our method captures both kinds of information. In the proposed framework, instance-level information is integrated during the construction of a sparse instance graph based on the similarity between augmented local features, while image-level context is explored with a greedy breadth-first search algorithm that discovers clusters of visual instances from the graph. This framework can tackle the challenges posed by small visual instances, diverse intra-class variations, and noise in large-scale image databases. To further improve robustness, we integrate two techniques into the basic framework. First, to better cope with the increasing noise of large databases, weak geometric consistency is adopted to efficiently incorporate the geometric information of local matches into the construction of the instance graph. Second, we propose a layout embedding algorithm, which leverages an algorithm originally designed for graph visualization to fully explore the structure of the image database. The proposed method was evaluated on four annotated data sets with different characteristics, and the experimental results showed that it outperforms state-of-the-art algorithms on all of them. We also applied our framework to a one-million-image Flickr data set to demonstrate its scalability.
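
As a rough illustration of the graph-based clustering step, the following minimal sketch groups images by running a breadth-first search over a thresholded sparse graph, choosing seeds greedily by vertex degree. The abstract does not specify the seed order, the edge weighting, or the stopping rule, so the function name `greedy_bfs_clusters`, the parameters `min_weight` and `min_size`, and the degree-based seeding are all assumptions made for illustration and should not be read as the authors' algorithm.

```python
from collections import deque


def greedy_bfs_clusters(edges, min_weight=1.0, min_size=2):
    """Group vertices of a weighted graph by BFS over sufficiently strong edges.

    edges: iterable of (u, v, weight) tuples; weight stands in for a
           hypothetical instance-level similarity score between two images.
    min_weight, min_size: illustrative thresholds (assumptions, not from the paper).
    """
    # Build a sparse adjacency map, keeping only edges at or above the threshold.
    adj = {}
    for u, v, w in edges:
        if w >= min_weight:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    visited = set()
    clusters = []
    # Greedy seed order (assumed): highest-degree vertices first, ties by name.
    for seed in sorted(adj, key=lambda x: (-len(adj[x]), x)):
        if seed in visited:
            continue
        visited.add(seed)
        queue = deque([seed])
        cluster = []
        while queue:
            node = queue.popleft()
            cluster.append(node)
            for nb in sorted(adj[node]):
                if nb not in visited:
                    visited.add(nb)
                    queue.append(nb)
        # Discard groups that are too small to count as a recurring instance.
        if len(cluster) >= min_size:
            clusters.append(sorted(cluster))
    return clusters


# Toy graph: edge weights play the role of match scores between images.
toy_edges = [
    ("a", "b", 3.0),
    ("b", "c", 2.0),
    ("c", "a", 0.5),  # below threshold, dropped
    ("d", "e", 4.0),
    ("e", "f", 0.2),  # below threshold, dropped
]
result = greedy_bfs_clusters(toy_edges)
```

With this plain expansion rule the sketch reduces to finding connected components of the thresholded graph; a method that uses image-level context would presumably apply a stricter criterion when deciding whether to add a neighbor to the current cluster.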
