Processing Top-k Dominating Queries in Metric Spaces

<i>Top</i>-<i>k</i> <i>dominating queries</i> combine the natural idea of selecting the <i>k</i> best items with a comprehensive “goodness” criterion based on dominance. A point <i>p</i><sub>1</sub> dominates <i>p</i><sub>2</sub> if <i>p</i><sub>1</sub> is at least as good as <i>p</i><sub>2</sub> in all attributes and strictly better in at least one. Existing work addresses the problem in settings where data objects are multidimensional points. However, there are domains where we only have access to the distance between two objects. In such cases, attributes reflect distances from a set of input objects and are dynamically generated as the input objects change. Consequently, existing methods cannot be applied, even though the dominance relation remains meaningful and valid. In this work, we therefore present the first study of processing <i>top</i>-<i>k</i> <i>dominating queries</i> over distance-based dynamic attribute vectors, defined over a <i>metric space</i>. We propose four progressive algorithms that exploit the properties of the underlying metric space to solve the problem efficiently, and we present an extensive comparative evaluation on both synthetic and real-world datasets.
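
To make the problem definition concrete, the following is a minimal brute-force sketch of a top-k dominating query over distance-based attribute vectors. It is not one of the four progressive algorithms mentioned above, only a naive quadratic baseline. The function names (`dominates`, `top_k_dominating`), the tie-breaking rule, and the convention that a smaller distance to a query object counts as "better" are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if vector a dominates vector b.

    Assumption: smaller values (distances) are better, so a dominates b
    when a is no larger in every attribute and strictly smaller in one.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def top_k_dominating(
    objects: Sequence[T],
    queries: Sequence[T],
    dist: Callable[[T, T], float],
    k: int,
) -> List[Tuple[T, int]]:
    """Naive O(n^2 * m) baseline for n objects and m query objects.

    Each object's attribute vector holds its distances to the query
    objects, so the vectors change whenever the query set changes.
    The score of an object is the number of other objects it dominates.
    """
    vectors = [tuple(dist(o, q) for q in queries) for o in objects]
    scores = [sum(dominates(v, w) for w in vectors) for v in vectors]
    # Hypothetical tie-break: higher score first, then original position.
    order = sorted(range(len(objects)), key=lambda i: (-scores[i], i))
    return [(objects[i], scores[i]) for i in order[:k]]


if __name__ == "__main__":
    # Points on a line with absolute difference as the metric.
    points = [1, 4, 5, 9]
    query_points = [4, 6]
    print(top_k_dominating(points, query_points, lambda a, b: abs(a - b), k=2))
```

Only the `dist` callable touches the objects, which mirrors the setting where nothing but pairwise distances is available. The baseline computes every attribute vector and compares all pairs, which is the cost that index-based, progressive methods aim to avoid.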
