Toward a cloud computing research agenda

The 2008 LADIS workshop on Large-Scale Distributed Systems brought together leaders from the commercial cloud computing community with researchers working on a variety of topics in distributed computing. The dialog yielded some surprises: some hot research topics seem to be of limited near-term importance to the cloud builders, while some of their practical challenges seem to pose new questions to us as systems researchers. This brief note summarizes our impressions.
