Query-Based Data Pricing

Data is increasingly being bought and sold online, and Web-based marketplace services have emerged to facilitate these transactions. However, current mechanisms for pricing data are very simple: buyers can choose only from a set of explicit views, each with a specific price. In this article, we propose a framework for pricing data on the Internet that, given the prices of a few views, allows the price of any query to be derived automatically. We call this capability query-based pricing. We first identify two important properties that the pricing function must satisfy: it must be arbitrage-free and discount-free. We then prove that there exists a unique function that satisfies these properties and extends the seller's explicit prices to all queries. Central to our framework is the notion of query determinacy, and in particular instance-based determinacy, for which we present several results on its complexity and properties. When both the views and the query are conjunctive queries or unions of conjunctive queries, we show that the complexity of computing the price is high. To ensure tractability, we restrict the explicit prices to be defined only on selection views (which is the common practice today). We give algorithms with polynomial-time data complexity for computing the price of two classes of queries: chain queries (by reducing the problem to network flow) and cyclic queries. Furthermore, we completely characterize the class of conjunctive queries without self-joins that have PTIME data complexity, and we prove that pricing all other queries is NP-complete, thus establishing a dichotomy on the complexity of the pricing problem when all views are selection queries.
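
As a rough illustration of deriving a query price from a few explicitly priced selection views, the following Python sketch takes the price of a query to be the cost of the cheapest set of priced views that determines it. It is a brute-force toy under stated assumptions, not the article's algorithm: the names `arbitrage_price` and `covers_full_relation` are hypothetical, the query is fixed to "return the whole binary relation R(X, Y)", the column domains are assumed known and finite, and the coverage test is a simplified stand-in for instance-based determinacy that applies only to this particular query.

```python
from itertools import combinations, product


def covers_full_relation(views, col_x, col_y):
    """Simplified determinacy test (assumption: the query is the whole
    relation R(X, Y)). A selection view is a pair (attribute, value).
    The purchased views pin down R exactly when every cell (a, b) of
    col_x x col_y is covered by a view on X = a or a view on Y = b."""
    xs = {value for attr, value in views if attr == "X"}
    ys = {value for attr, value in views if attr == "Y"}
    return all(a in xs or b in ys for a, b in product(col_x, col_y))


def arbitrage_price(price_points, determines):
    """Cheapest total price of a subset of priced views that determines
    the query. Exhaustive search over all subsets, so exponential in the
    number of price points; intended only for tiny examples."""
    views = list(price_points)
    best = None
    for k in range(len(views) + 1):
        for subset in combinations(views, k):
            if determines(subset):
                cost = sum(price_points[v] for v in subset)
                if best is None or cost < best:
                    best = cost
    return best  # None if no subset determines the query


# Hypothetical price list: one selection view per column value.
col_x = ["a1", "a2"]
col_y = ["b1", "b2", "b3"]
price_points = {
    ("X", "a1"): 3, ("X", "a2"): 4,
    ("Y", "b1"): 1, ("Y", "b2"): 2, ("Y", "b3"): 5,
}

price = arbitrage_price(
    price_points,
    lambda views: covers_full_relation(views, col_x, col_y),
)
print(price)
```

In this toy setting a buyer must purchase either every selection on X (total 7) or every selection on Y (total 8) to recover the whole relation, so the derived price is the cheaper of the two. Charging more would let a buyer obtain the same answer more cheaply through the views, which is the intuition behind the arbitrage-free requirement.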

[27]  M. Balazinska,et al.  Query-based data pricing , 2012, PODS '12.