Matching dominance: capture the semantics of dominance for multi-dimensional uncertain objects

The dominance operator plays an important role in a wide spectrum of multi-criteria decision making applications. Generally speaking, a dominance operator is a <i>partial order</i> on a set O of objects, and we say the dominance operator has the monotonic property with respect to a family of ranking functions F if <i>o</i><sub>1</sub> <i>dominates</i> <i>o</i><sub>2</sub> implies <i>f</i>(<i>o</i><sub>1</sub>) ≥ <i>f</i>(<i>o</i><sub>2</sub>) for any ranking function <i>f</i> ∈ F and any objects <i>o</i><sub>1</sub>, <i>o</i><sub>2</sub> ∈ O. The dominance operator on multi-dimensional points is well defined and has the monotonic property with respect to any monotonic ranking (scoring) function. Due to the uncertain nature of data in many emerging applications, a variety of existing works have studied the semantics of ranking queries on uncertain objects. However, the problem of defining a dominance operator for multi-dimensional uncertain objects remains open. Although there have been several attempts to propose a dominance operator on multi-dimensional uncertain objects, none of them claims the monotonic property with respect to these ranking approaches. Motivated by this, in this paper we propose a novel <i>matching</i>-based <i>dominance</i> operator, namely <b>matching dominance</b>, to capture the semantics of dominance for multi-dimensional uncertain objects, so that the new dominance operator has the monotonic property with respect to the monotonic <i>parameterized ranking</i> function, which can unify other popular ranking approaches for uncertain objects. We then develop a layer indexing technique, Matching Dominance based Band (<b>MDB</b>), to facilitate top-<i>k</i> queries on multi-dimensional uncertain objects based on the proposed <i>matching dominance</i> operator. Efficient algorithms are proposed to compute the MDB index. Comprehensive experiments demonstrate the effectiveness and efficiency of our indexing techniques.
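To make the monotonic property concrete, the following minimal sketch covers only the classical case of certain (non-uncertain) multi-dimensional points, not the matching dominance operator itself, which the abstract does not define. It assumes that larger values are better in every dimension and uses a linear scoring function with non-negative weights as one example of a monotonic ranking function. The names `dominates`, `linear_score`, `points`, and `violations` are hypothetical and chosen for illustration.

```python
from itertools import product
from typing import Callable, Sequence

Point = Sequence[int]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p is at least as large as q in every dimension
    and strictly larger in at least one (assumes larger is better)."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def linear_score(weights: Sequence[int]) -> Callable[[Point], int]:
    """Build a linear scoring function; with non-negative weights it is
    monotonic, i.e. non-decreasing in every coordinate."""
    def f(p: Point) -> int:
        return sum(w * x for w, x in zip(weights, p))
    return f


# Illustrative data only; these points do not come from the paper.
points = [(3, 5), (2, 5), (4, 1), (1, 1)]
f = linear_score((2, 1))

# Monotonic property: whenever p dominates q, f(p) >= f(q) must hold.
# Collect every pair that would contradict it.
violations = [
    (p, q)
    for p, q in product(points, repeat=2)
    if dominates(p, q) and f(p) < f(q)
]
print(violations)
```

An empty `violations` list is what the monotonic property predicts for any monotonic scoring function. Note that (3, 5) and (4, 1) are incomparable under dominance, so their relative order depends on the chosen weights.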
