On m-Impact Regions and Standing Top-k Influence Problems

In this paper, we study the m-impact region problem (mIR). In a context where users look for available products with top-k queries, mIR identifies the part of the product space that attracts the most user attention. Specifically, mIR determines the attribute values that place a (new or existing) product in the top-k result for at least a given fraction of the user population. mIR has several applications, ranging from effective marketing to product improvement. Importantly, it also leads to exact and efficient solutions for standing top-k impact problems, which were previously solved only heuristically, or whose current solutions face serious scalability limitations. Our experiments include data mined from actual user reviews of real products, and they demonstrate the practicality and efficiency of our algorithms, both for mIR and for standing top-k impact problems.
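To make the membership condition behind mIR concrete, the following is a minimal brute-force sketch, not the algorithm of the paper. It assumes (none of this is stated in the abstract) that each user is modeled by a weight vector, that products are scored by a weighted sum of their attributes with higher scores ranking better, and that the threshold m is given as a minimum number of users. The names `score`, `in_top_k`, `impact_count`, and `in_m_impact_region` are hypothetical. The sketch only tests whether a single candidate point satisfies the condition; it does not compute the region itself.

```python
from typing import Sequence

Vector = Sequence[float]


def score(weights: Vector, product: Vector) -> float:
    """Assumed scoring model: weighted sum of attribute values (higher is better)."""
    return sum(w * x for w, x in zip(weights, product))


def in_top_k(candidate: Vector, products: Sequence[Vector], weights: Vector, k: int) -> bool:
    """True if fewer than k existing products score strictly higher than the candidate."""
    s = score(weights, candidate)
    better = sum(1 for p in products if score(weights, p) > s)
    return better < k


def impact_count(candidate: Vector, products: Sequence[Vector],
                 users: Sequence[Vector], k: int) -> int:
    """Number of users whose top-k result would contain the candidate."""
    return sum(1 for w in users if in_top_k(candidate, products, w, k))


def in_m_impact_region(candidate: Vector, products: Sequence[Vector],
                       users: Sequence[Vector], k: int, m: int) -> bool:
    """Assumed reading of the mIR condition: candidate is in the top-k of at least m users."""
    return impact_count(candidate, products, users, k) >= m


# Illustrative (made-up) data: three existing products, three users, two attributes.
products = [(0.9, 0.2), (0.2, 0.9), (0.6, 0.6)]
users = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
candidate = (0.7, 0.7)

print(impact_count(candidate, products, users, k=1))
print(in_m_impact_region(candidate, products, users, k=2, m=3))
```

Testing points one by one in this way costs time proportional to the number of users times the number of products per candidate, which is why an exact description of the whole region, as targeted by mIR, is the more useful object.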
