Parallel image search application based on online hashing hierarchical ranking

Nearest neighbor search is an effective method for large-scale image search. This paper focuses on the design and implementation of a parallel image retrieval system based on the Hadoop architecture and binary hash codes. First, the overall system architecture and the complete image hash retrieval workflow are introduced. Then, image feature extraction on the MapReduce parallel data processing framework, the subsequent feature quantization, and the construction of the hash index table are described in detail. Finally, a comparison of retrieval performance on the INRIA data set shows that the proposed system outperforms a single-node system in both retrieval speed and the capacity to handle massive data.
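
The quantization, index construction, and retrieval steps named above can be illustrated with a minimal single-process sketch. The abstract does not specify the hash function, so sign-based random projection is used here purely as a stand-in; the function names, the `max_distance` parameter, and the toy feature vectors are hypothetical, and the sketch does not reproduce the Hadoop/MapReduce parallelization.

```python
import random
from collections import defaultdict


def make_projections(dim, n_bits, seed=0):
    # Assumption: random Gaussian hyperplanes stand in for the paper's hash functions.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def quantize(feature, projections):
    # Quantize a real-valued feature vector into an n_bits binary code (as an int).
    code = 0
    for row in projections:
        dot = sum(w * x for w, x in zip(row, feature))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def build_index(features, projections):
    # Hash index table: binary code -> list of image ids sharing that code.
    index = defaultdict(list)
    for image_id, feature in features.items():
        index[quantize(feature, projections)].append(image_id)
    return index


def hamming(a, b):
    # Number of differing bits between two binary codes.
    return bin(a ^ b).count("1")


def search(query, index, projections, max_distance=2):
    # Return image ids whose codes lie within max_distance bits of the query code,
    # ranked by Hamming distance (ties broken by image id).
    q = quantize(query, projections)
    hits = []
    for code, ids in index.items():
        d = hamming(q, code)
        if d <= max_distance:
            hits.extend((d, image_id) for image_id in ids)
    return [image_id for _, image_id in sorted(hits)]
```

In a MapReduce setting, `quantize` would plausibly run in the map step for each image, with the grouping by code done in the reduce step, but that mapping is an assumption and not taken from the paper.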
