CHCF: A Cloud-Based Heterogeneous Computing Framework for Large-Scale Image Retrieval

The last decade has witnessed dramatic growth in multimedia content and applications, which in turn demands ever more computational resources. Meanwhile, high-performance computing is trending toward heterogeneity. However, developing domain-specific applications on heterogeneous systems while maximizing system efficiency remains difficult. In this paper, a novel framework, the cloud-based heterogeneous computing framework (CHCF), is proposed; it provides a set of tools and techniques for the compilation, optimization, and execution of multimedia mining applications on heterogeneous systems. With the aid of the compiler and utility library provided by CHCF, users can develop multimedia mining applications rapidly and efficiently. The framework employs a number of techniques, including adaptive data partitioning, knowledge-based hierarchical scheduling, and performance estimation, to achieve high computing performance. Large-scale image retrieval, one of the most important multimedia mining applications, is investigated on top of CHCF. The scalability, computing performance, and programmability of CHCF for large-scale image retrieval are studied through case studies and experimental evaluations. The experimental results demonstrate that CHCF achieves good scalability and significant computing performance improvements for image retrieval.
