Oort: User-Centric Cloud Storage with Global Queries

In principle, the web should provide the perfect stage for user-generated content, allowing users to share their data seamlessly with other users across services and applications. In practice, the web fragments a user’s data over many sites, each exposing only limited APIs for sharing. This paper describes Oort, a new cloud storage system that organizes data primarily by user rather than by application or web site. Oort allows users to choose which web software to use with their data and which other users to share it with, while giving applications powerful tools to query that data. Users rent space from providers that cooperate to provide a global, federated, general-purpose storage system. To support large-scale, multi-user applications such as Twitter and e-mail, Oort provides global queries that find and combine data from relevant users across all providers. Oort makes global query execution efficient by recognizing and merging similar queries issued by many users’ application instances, largely eliminating the per-user factor in the global complexity of queries. Our evaluation predicts that an Oort implementation could handle traffic similar to that seen by Twitter using a hundred cooperating Oort servers, and that applications with other sharing patterns, like e-mail, can also be executed efficiently.
