An Incremental Hausdorff Distance Calculation Algorithm

The Hausdorff distance is commonly used as a similarity measure between two point sets. Using this measure, a set X is considered similar to Y iff every point in X is close to at least one point in Y. Formally, the Hausdorff distance HausDist(X, Y) can be computed as the Max-Min distance from X to Y, i.e., find the maximum of the distance from an element in X to its nearest neighbor (NN) in Y. Although this is similar to the closest pair and farthest pair problems, computing the Hausdorff distance is a more challenging problem since its Max-Min nature involves both maximization and minimization rather than just one or the other. A traditional approach to computing HausDist(X, Y) performs a linear scan over X and utilizes an index to help compute the NN in Y for each x in X. We present a pair of basic solutions that avoid scanning X by applying the concept of aggregate NN search to searching for the element in X that yields the Hausdorff distance. In addition, we propose a novel method which incrementally explores the indexes of the two sets X and Y simultaneously. As an example application of our techniques, we use the Hausdorff distance as a measure of similarity between two trajectories (represented as point sets). We also use this example application to compare the performance of our proposed method with the traditional approach and the basic solutions. Experimental results show that our proposed method outperforms all competitors by one order of magnitude in terms of the tree traversal cost and total response time.

[1]  Hanan Samet,et al.  Incremental distance join algorithms for spatial databases , 1998, SIGMOD '98.

[2]  Günter Rote,et al.  Matching shapes with a reference point , 1994, SCG '94.

[3]  Hans-Peter Kriegel,et al.  The R*-tree: an efficient and robust access method for points and rectangles , 1990, SIGMOD '90.

[4]  Dimitrios Gunopulos,et al.  Discovering similar multidimensional trajectories , 2002, Proceedings 18th International Conference on Data Engineering.

[5]  Sukho Lee,et al.  Adaptive multi-stage distance join processing , 2000, SIGMOD '00.

[6]  Walid G. Aref,et al.  An extensible index for spatial databases , 2001, Proceedings Thirteenth International Conference on Scientific and Statistical Database Management. SSDBM 2001.

[7]  Lei Chen,et al.  On The Marriage of Lp-norms and Edit Distance , 2004, VLDB.

[8]  Wei-Ying Ma,et al.  Understanding mobility based on GPS data , 2008, UbiComp.

[9]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[10]  Yannis Manolopoulos,et al.  Closest pair queries in spatial databases , 2000, SIGMOD '00.

[11]  Bernhard Seeger,et al.  A revised r*-tree in comparison with related index structures , 2009, SIGMOD Conference.

[12]  Micha Sharir,et al.  Hausdorff distance under translation for points and balls , 2003, TALG.

[13]  Hanan Samet,et al.  Distance browsing in spatial databases , 1999, TODS.

[14]  Eamonn Keogh Exact Indexing of Dynamic Time Warping , 2002, VLDB.

[15]  Daniel P. Huttenlocher,et al.  Comparing Images Using the Hausdorff Distance , 1993, IEEE Trans. Pattern Anal. Mach. Intell..

[16]  Helmut Alt,et al.  Approximate Matching of Polygonal Shapes (Extended Abstract) , 1991, SCG.

[17]  J. Hershberger,et al.  Speeding Up the Douglas-Peucker Line-Simplification Algorithm , 1992 .

[18]  Hanan Samet,et al.  Foundations of multidimensional and metric data structures , 2006, Morgan Kaufmann series in data management systems.

[19]  Jon M. Kleinberg,et al.  On dynamic Voronoi diagrams and the minimum Hausdorff distance for point sets under Euclidean motion in the plane , 1992, SCG '92.

[20]  Kyriakos Mouratidis,et al.  Aggregate nearest neighbor queries in spatial databases , 2005, TODS.

[21]  Pierre-François Marteau,et al.  Time Warp Edit Distance with Stiffness Adjustment for Time Series Matching , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Franz Aurenhammer,et al.  Voronoi diagrams—a survey of a fundamental geometric data structure , 1991, CSUR.

[23]  Nick Roussopoulos,et al.  Nearest neighbor queries , 1995, SIGMOD '95.

[24]  Carlos Andújar,et al.  Topology-reducing surface simplification using a discrete solid representation , 2002, TOGS.

[25]  Lei Chen,et al.  Robust and fast similarity search for moving object trajectories , 2005, SIGMOD '05.

[26]  Beng Chin Ooi,et al.  Gorder: An Efficient Method for KNN Join Processing , 2004, VLDB.

[27]  Piotr Indyk,et al.  Approximate congruence in nearly linear time , 2000, SODA '00.

[28]  Xing Xie,et al.  Mining interesting locations and travel sequences from GPS trajectories , 2009, WWW '09.

[29]  Donald J. Berndt,et al.  Using Dynamic Time Warping to Find Patterns in Time Series , 1994, KDD Workshop.