A linear-time approximation of the earth mover's distance

Color descriptors are among the most important features used in content-based image retrieval. The dominant color descriptor (DCD) represents a few perceptually dominant colors in an image through color quantization. For image retrieval based on the DCD, the earth mover's distance (EMD) and the optimal color composition distance have been proposed to measure the dissimilarity between two images. Although both methods provide good retrieval results, they are too time-consuming for use with large image databases. To solve this problem, we propose a new distance function that calculates an approximate earth mover's distance in linear time. To achieve linear time, the proposed approach employs a space-filling curve over the multidimensional color space. To improve accuracy, it uses multiple curves and adjusts the color positions. As a result, our approach achieves an order-of-magnitude speedup at the cost of small errors. We have performed extensive experiments to show the effectiveness and efficiency of the proposed approach. The results reveal that our approach achieves almost the same retrieval results as the EMD while running in linear time.
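The general idea of ordering colors along a space-filling curve and then comparing two signatures in one dimension can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the paper: it assumes a Z-order (Morton) curve over 8-bit RGB, a single curve, and signatures of equal total weight. The multiple curves and color-position adjustment mentioned in the abstract are not reproduced. The names `morton_key`, `emd_1d`, and `curve_distance` are hypothetical.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]            # assumed 8-bit (r, g, b)
Signature = List[Tuple[Color, float]]   # dominant colors with weights


def morton_key(color: Color, bits: int = 8) -> int:
    """Interleave the channel bits into one Z-order index (assumed curve)."""
    key = 0
    for i in range(bits):
        for axis, value in enumerate(color):
            key |= ((value >> i) & 1) << (3 * i + axis)
    return key


def emd_1d(a: List[Tuple[int, float]], b: List[Tuple[int, float]]) -> float:
    """Exact 1D EMD between two weighted point sets of equal total mass.

    Both inputs must already be sorted by position. A single merge pass
    accumulates |surplus mass| times the gap to the next position, so the
    pass itself is linear in the number of points.
    """
    i = j = 0
    surplus = 0.0   # mass of a minus mass of b seen so far
    cost = 0.0
    prev = None
    while i < len(a) or j < len(b):
        if j >= len(b) or (i < len(a) and a[i][0] <= b[j][0]):
            pos, delta = a[i][0], a[i][1]
            i += 1
        else:
            pos, delta = b[j][0], -b[j][1]
            j += 1
        if prev is not None:
            cost += abs(surplus) * (pos - prev)
        surplus += delta
        prev = pos
    return cost


def curve_distance(sig_a: Signature, sig_b: Signature) -> float:
    """Approximate dissimilarity: project onto the curve, then 1D EMD."""
    a = sorted((morton_key(c), w) for c, w in sig_a)
    b = sorted((morton_key(c), w) for c, w in sig_b)
    return emd_1d(a, b)
```

In this sketch the sort dominates the cost; if signatures were stored pre-sorted by curve key, only the linear merge would remain at query time. The result is measured in curve-index units rather than color-space units, and a single Z-order curve can place perceptually close colors far apart, which is one plausible reason to combine several curves.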
