Efficient Processing of k-regret Minimization Queries with Theoretical Guarantees

Helping end users identify desired results in a large dataset is an important problem in multi-criteria decision making. Top-k and skyline queries have been widely adopted to address it, but each has an inherent drawback: a top-k query requires the user to provide a specific utility function, while a skyline query may return too many results. The k-regret minimization query was proposed to combine the merits of both. Because the problem is NP-hard, the k-regret minimization query is time consuming to process, and the greedy framework is widely adopted. However, a formal theoretical analysis of the quality of the results returned by greedy approaches is still lacking. In this paper, we first fill this gap with a nontrivial theoretical analysis of the approximation ratio of the returned results. To speed up query processing, we develop a sampling-based method, STOCPRESGREED, that reduces the evaluation cost. We also analyze the sample size required to bound the quality of the returned results. Finally, comprehensive experiments on both real and synthetic datasets demonstrate the efficiency and effectiveness of the proposed methods.
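To make the greedy framework mentioned in the abstract concrete, the following is a minimal Python sketch of a generic greedy selection that minimizes the maximum regret ratio over a finite sample of utility functions. It is not STOCPRESGREED and is not taken from the paper: the function names (`sample_utilities`, `max_regret_ratio`, `greedy_regret_set`), the restriction to linear utilities with nonnegative weights, and the `size` parameter for the number of returned tuples are all assumptions made for illustration.

```python
import random


def sample_utilities(dim, n, seed=0):
    """Draw n random nonnegative weight vectors (hypothetical linear utilities)."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(dim)) for _ in range(n)]


def score(weights, point):
    """Linear utility of a point under the given weights."""
    return sum(w * x for w, x in zip(weights, point))


def max_regret_ratio(points, subset, utilities):
    """Largest regret ratio of `subset` relative to `points` over the sample."""
    worst = 0.0
    for w in utilities:
        best_all = max(score(w, p) for p in points)
        # An empty subset scores 0, which gives a regret ratio of 1.
        best_sub = max((score(w, p) for p in subset), default=0.0)
        if best_all > 0:
            worst = max(worst, (best_all - best_sub) / best_all)
    return worst


def greedy_regret_set(points, size, utilities):
    """Repeatedly add the point that most reduces the maximum regret ratio."""
    chosen = []
    remaining = list(points)
    for _ in range(min(size, len(remaining))):
        best = min(
            remaining,
            key=lambda p: max_regret_ratio(points, chosen + [p], utilities),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


if __name__ == "__main__":
    data = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.2, 0.2)]
    utils = sample_utilities(dim=2, n=100, seed=1)
    picked = greedy_regret_set(data, 2, utils)
    print(picked, max_regret_ratio(data, picked, utils))
```

Each greedy step re-evaluates every remaining candidate against every sampled utility function, so the cost grows with both the dataset size and the sample size. This is the kind of evaluation cost that a sampling-based method aims to reduce, although the sketch makes no attempt to reproduce the paper's technique or its guarantees.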
