SWAM: A Family of Access Methods for Similarity Search in Querical Data Networks

Querical Data Networks (QDNs), e.g., peer-to-peer and sensor networks, are large-scale, self-organizing, distributed query processing systems. We formalize the problem of similarity search in QDNs and propose a family of distributed access methods, termed Small-World Access Methods (SWAM), which, unlike LH* and (more recently) DHTs, does not control the assignment of data objects to QDN nodes. We propose a Voronoi-based instance of SWAM that indexes multi-attribute objects and, for a QDN with N nodes, has query time, communication cost, and computation cost of O(log N) for exact-match queries, and of O(log N + sN) and O(log N + k) for range queries (with selectivity s) and kNN queries, respectively.
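To make the idea of answering an exact-match query over Voronoi cells concrete, the following is a minimal Python sketch of greedy forwarding between Voronoi neighbors. It is an illustration under stated assumptions, not the SWAM algorithm itself: the nodes sit on a hypothetical integer grid (so each Voronoi cell is a unit square and its neighbors are the four adjacent grid points), and the names `build_grid` and `greedy_route` are invented for this example. The sketch uses local neighbor links only, so it does not reproduce the O(log N) bound stated above.

```python
import math


def build_grid(side):
    """Assumed toy topology: side*side nodes on integer grid points.

    Each node's Voronoi cell is the unit square centered on it, so its
    Voronoi neighbors are the up/down/left/right grid points.
    """
    offsets = ((1, 0), (-1, 0), (0, 1), (0, -1))
    nodes = {}
    for x in range(side):
        for y in range(side):
            nodes[(x, y)] = [
                (x + dx, y + dy)
                for dx, dy in offsets
                if 0 <= x + dx < side and 0 <= y + dy < side
            ]
    return nodes


def greedy_route(nodes, start, query):
    """Forward the query to the neighbor closest to it until none is closer.

    Returns the node where forwarding stops (the node whose cell contains
    the query point in this toy grid) and the number of hops taken.
    """
    current, hops = start, 0
    while True:
        best = min(nodes[current], key=lambda n: math.dist(n, query))
        if math.dist(best, query) >= math.dist(current, query):
            return current, hops
        current, hops = best, hops + 1


if __name__ == "__main__":
    grid = build_grid(8)
    owner, hops = greedy_route(grid, (0, 0), (5.2, 2.9))
    print(owner, hops)
```

In this toy setting the hop count grows with the grid distance between the start node and the query point, which is why a scheme aiming for logarithmic cost needs more than nearest-neighbor links.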