An experimental survey of regret minimization query and variants: bridging the best worlds between top-k query and skyline query

When faced with a database containing millions of tuples, a user may be interested only in a (typically much) smaller representative subset. The regret minimization query was recently proposed to create such a subset. Specifically, this query finds a set of tuples that minimizes the user's regret, measured by how far the user's favorite tuple in the selected set is from his/her favorite tuple in the whole database. The regret minimization query has been shown to combine the best of two well-known existing queries, top-k queries and skyline queries: like top-k queries, it lets the user control the total number of tuples returned, and like skyline queries, it does not require the user to specify any preference function. It has therefore attracted a lot of attention from researchers in the database community, and various methods have been proposed for regret minimization. Despite this abundant research effort, however, there has been no systematic comparison among the existing methods. This paper surveys this interesting and evolving research topic by broadly reviewing and comparing the state-of-the-art methods for regret minimization. Moreover, we study different variants of the regret minimization query that have garnered considerable attention in recent years and present some interesting problems that have not yet been addressed in the literature. We implemented 12 state-of-the-art methods for obtaining regret minimization sets, published in top-tier venues such as SIGMOD and VLDB from 2010 to 2018, and compared them experimentally under various parameter settings on both synthetic and real datasets. Our evaluation shows that the best choice of method for regret minimization depends on the application demands. This paper provides an empirical guideline for making such a decision.
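To make the notion of regret concrete, the following is a minimal sketch rather than any of the surveyed methods. It assumes that a user's preference is a linear utility function with non-negative weights and that regret is measured as a ratio, neither of which is fixed by the abstract. The names `utility`, `regret_ratio`, and `estimate_max_regret_ratio` are hypothetical, and the worst case over users is only estimated by random sampling.

```python
import random


def utility(weights, point):
    """Linear utility of a tuple (assumption: users have linear preferences)."""
    return sum(w * x for w, x in zip(weights, point))


def regret_ratio(database, subset, weights):
    """Relative loss of one user who sees only `subset` instead of `database`."""
    best_all = max(utility(weights, p) for p in database)
    best_sub = max(utility(weights, p) for p in subset)
    if best_all <= 0:
        return 0.0  # degenerate user who values nothing: no regret
    return (best_all - best_sub) / best_all


def estimate_max_regret_ratio(database, subset, num_samples=1000, seed=0):
    """Estimate the worst-case regret ratio by sampling random weight vectors.

    This is a sampling-based lower estimate, not an exact computation.
    """
    rng = random.Random(seed)
    dim = len(database[0])
    worst = 0.0
    for _ in range(num_samples):
        weights = [rng.random() for _ in range(dim)]
        worst = max(worst, regret_ratio(database, subset, weights))
    return worst


if __name__ == "__main__":
    # Toy 2-attribute database; larger values are better on both attributes.
    database = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
    subset = [(0.6, 0.6)]
    print(estimate_max_regret_ratio(database, subset))
```

A regret minimization query would then search for the subset of a given size with the smallest such worst-case value; the sketch only evaluates a fixed subset.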
