View Maintenance in Web Data Platforms

Modern Web Data Platforms (WDPs) handle large amounts of data and activity through massively distributed infrastructures. To achieve performance and availability at Internet scale, WDPs restrict querying capability and provide weaker consistency guarantees than traditional ACID transactions. The sheer volume of parallel processing without ACID transaction guarantees and the large number of independent components in WDPs pose special challenges for view maintenance with respect to concurrent update propagation and correct execution of non-idempotent view updates in the presence of failures. In this paper, we introduce a novel consistency framework for deferred view maintenance that embodies the weaker consistency primitives prevalent in modern WDPs. Based on this model, we identify techniques to achieve consistent view maintenance for different classes of views. Our analysis covers aggregate, key-foreign-key join, and select-project views.
