An Efficient Earth Mover's Distance Algorithm for Robust Histogram Comparison

We propose EMD-L1, a fast and exact algorithm for computing the earth mover's distance (EMD) between a pair of histograms. The efficiency of the new algorithm enables its application to problems that were previously prohibitive because of their high time complexity. EMD-L1 significantly simplifies the original linear programming formulation of EMD. By exploiting the L1 metric structure, EMD-L1 reduces the number of unknown variables from the O(N^2) of the original EMD to O(N) for a histogram with N bins. In addition, the number of constraints is halved and the objective function of the linear program is simplified. We prove formally, without any approximation, that the EMD-L1 formulation is equivalent to the original EMD with an L1 ground distance. To perform the EMD-L1 computation, we propose an efficient tree-based algorithm, Tree-EMD. Tree-EMD exploits the fact that, when EMD-L1 is interpreted as a network flow optimization problem, a basic feasible solution of the simplex-based solver forms a spanning tree. We show empirically that the new algorithm has an average time complexity of O(N^2), a significant improvement over the best reported supercubic complexity of the original EMD. The accuracy of the proposed methods is evaluated experimentally on two computation-intensive problems: shape recognition and interest point matching using multidimensional histogram-based local features. For shape recognition, EMD-L1 is applied to compare shape contexts on the widely tested MPEG7 shape data set, as well as on an articulated shape data set. For interest point matching, SIFT, shape context, and spin image are tested on both synthetic and real image pairs with large geometrical deformation, illumination change, and heavy intensity noise. The results demonstrate that our EMD-L1-based solutions outperform previously reported state-of-the-art features and distance measures on both tasks.
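
To make the reduction in unknowns concrete, the following is a minimal sketch of the simplest special case: two one-dimensional histograms with equal total mass and an L1 ground distance. In that setting mass only needs to move between neighboring bins, so there are N - 1 flow variables rather than N^2, and each flow is fixed by a prefix sum of the bin-wise difference. This is an illustration only, not the Tree-EMD algorithm, which targets multidimensional histograms; the function name `emd_l1_1d` and the tolerance value are assumptions introduced here.

```python
from itertools import accumulate


def emd_l1_1d(p, q, tol=1e-9):
    """EMD with L1 ground distance between two equal-mass 1-D histograms.

    Hypothetical helper for illustration; covers only the 1-D special case.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > tol:
        raise ValueError("histograms must have equal total mass")

    # The k-th prefix sum of (p - q) is the net mass that has to cross the
    # boundary between bin k and bin k + 1. Each unit crossing a boundary
    # costs 1 under the L1 ground distance.
    boundary_flows = list(accumulate(a - b for a, b in zip(p, q)))

    # The last prefix sum is the total mass difference, which is zero for
    # equal-mass histograms, so only the N - 1 interior boundaries count.
    return sum(abs(f) for f in boundary_flows[:-1])


if __name__ == "__main__":
    # All mass moves from the first bin to the third bin: two steps of cost 1.
    print(emd_l1_1d([1, 0, 0], [0, 0, 1]))
```

The whole computation is a single pass over the bins, which is why the 1-D case needs no solver at all. For histograms of two or more dimensions the neighbor flows are no longer determined uniquely by prefix sums, and an optimization over them is required.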
