An efficient fast-response content-based image retrieval framework for big data

In this paper, an efficient fast-response content-based image retrieval (CBIR) framework based on Hadoop MapReduce is proposed to operate stably and with high performance on big data. It provides a novel bag of visual words (BOVW) technique, based on a proposed chain-clustering binary search-tree (CC-BST) algorithm, to build the visual statements that represent each image. It also introduces a methodology for creating representatives of these visual statements as a solution to the high dimensionality of big data. These representatives are then used in an indexing scheme that builds one large file as the input for Hadoop. Finally, an efficient MapReduce technique is presented that exploits the visual representatives of the images to retrieve the most relevant images for an input query. In empirical tests, the proposed techniques outperform the state-of-the-art techniques they are compared against.
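The retrieval flow described above can be illustrated with a minimal in-process sketch. It is not the paper's method: it assumes plain nearest-word quantization in place of CC-BST, raw BOVW histograms in place of the proposed representatives, cosine similarity as the relevance score, and a hypothetical index line format of `image_id<TAB>comma-separated histogram`. The names `bovw_histogram`, `map_phase`, and `reduce_phase` are illustrative only, and the map and reduce steps run as ordinary Python functions rather than on Hadoop.

```python
import heapq
import math
from collections import Counter


def bovw_histogram(descriptors, vocabulary):
    """Quantize each local descriptor to its nearest visual word and count hits.

    Assumption: brute-force nearest-word search stands in for CC-BST.
    """
    counts = Counter()
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda i: math.dist(d, vocabulary[i]))
        counts[nearest] += 1
    return [counts[i] for i in range(len(vocabulary))]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def map_phase(index_lines, query_hist):
    """Mapper: score every indexed image against the query histogram."""
    for line in index_lines:
        image_id, values = line.split("\t")  # hypothetical index line format
        hist = [float(v) for v in values.split(",")]
        yield cosine(query_hist, hist), image_id


def reduce_phase(scored, k):
    """Reducer: keep the k highest-scoring (score, image_id) pairs."""
    return heapq.nlargest(k, scored)


# Toy data: a two-word vocabulary and three indexed images.
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
query_hist = bovw_histogram([(1.0, 1.0), (9.0, 9.0), (11.0, 10.0)], vocabulary)
index_lines = ["a\t1,2", "b\t2,0", "c\t0,3"]
top = reduce_phase(map_phase(index_lines, query_hist), k=2)
top_ids = [image_id for _, image_id in top]
```

In a real deployment the single index file would be split across mappers, each mapper would emit only its local top-k candidates, and one reducer would merge them; the sketch collapses that into one process to keep the data flow visible.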
