Quaestor: Query Web Caching for Database-as-a-Service Providers

Today, web performance is primarily governed by round-trip latencies between end devices and cloud services. To improve performance, services need to minimize the delay of accessing data. In this paper, we propose a novel approach to low latency that relies on existing content delivery and web caching infrastructure. The main idea is to enable application-independent caching of query results and records with tunable consistency guarantees, in particular bounded staleness. Quaestor (Query Store) employs two key concepts to incorporate both expiration-based and invalidation-based web caches: (1) an Expiring Bloom Filter data structure to indicate potentially stale data, and (2) statistically derived cache expiration times to maximize cache hit rates. Through a distributed query invalidation pipeline, changes to cached query results are detected in real-time. The proposed caching algorithms offer a new means for data-centric cloud services to trade latency against staleness bounds, e.g. in a database-as-a-service. Quaestor is the core technology of the backend-as-a-service platform Baqend, a cloud service for low-latency websites. We provide empirical evidence for Quaestor's scalability and performance through both simulation and experiments. The results indicate that for read-heavy workloads, up to tenfold speed-ups can be achieved through Quaestor's caching.
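To make the first of the two concepts more concrete, the following is a minimal sketch of an expiring Bloom filter. It is not Quaestor's implementation. It assumes a counting Bloom filter in which each reported write carries a deadline (for example, the latest expiration time issued for a cached copy), after which the key is removed again. The class name, method names, and parameters (`ExpiringBloomFilter`, `report_write`, `maybe_stale`, `stale_until`) are hypothetical and chosen only for illustration.

```python
import hashlib
import heapq


class ExpiringBloomFilter:
    """Illustrative counting Bloom filter whose entries expire at a deadline."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.counters = [0] * num_bits  # counters allow removal of keys
        self.expirations = []           # min-heap of (deadline, key)

    def _positions(self, key):
        # Derive the k counter positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def report_write(self, key, stale_until):
        """Mark `key` as potentially stale until time `stale_until`."""
        for pos in self._positions(key):
            self.counters[pos] += 1
        heapq.heappush(self.expirations, (stale_until, key))

    def _expire(self, now):
        # Remove every key whose deadline has passed.
        while self.expirations and self.expirations[0][0] <= now:
            _, key = heapq.heappop(self.expirations)
            for pos in self._positions(key):
                self.counters[pos] -= 1

    def maybe_stale(self, key, now):
        """True if `key` may be stale (false positives are possible)."""
        self._expire(now)
        return all(self.counters[pos] > 0 for pos in self._positions(key))


ebf = ExpiringBloomFilter()
ebf.report_write("query:top10", stale_until=30.0)
print(ebf.maybe_stale("query:top10", now=10.0))  # True
print(ebf.maybe_stale("query:top10", now=31.0))  # False
```

Under these assumptions, a client that holds a copy of the filter could serve a cached result directly when `maybe_stale` returns `False` and revalidate it otherwise. A false positive only costs an unnecessary revalidation, never a stale read beyond the deadline.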
