Quatrain: Accelerating Data Aggregation between Multiple Layers

Composing multiple layers (or components or services) has been a dominant practice in building distributed systems, and aggregation has become a typical pattern of data flow. However, the efficiency of data aggregation is usually impaired when it crosses multiple layers, because delay is amplified at each one. Current solutions based on data-flow or execution-flow optimization mostly sacrifice the flexibility, reusability, and isolation that the layer abstraction provides. Otherwise, programmers must optimize communication by hand, which is error-prone and especially complicated in a multithreaded environment. To resolve this dilemma, we propose a new style of inter-process communication that not only optimizes data aggregation but also retains the advantages of a layered (or component-based or service-oriented) architecture. Our approach relaxes the traditional definition of a procedure and allows a procedure to return multiple times. Specifically, we implement Quatrain, an extended remote procedure call framework, to support the new multireturn paradigm. In this paper, we establish the importance of multiple returns, introduce our simple semantics, and present a new synchronization protocol that frees programmers from multireturn-related thread coordination. We build several practical applications with Quatrain, and the evaluation shows an average 56% reduction in response time compared with the traditional calling paradigm in realistic environments.
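
To make the idea of a procedure that returns multiple times concrete, the following is a minimal in-process sketch. It is not Quatrain's API or its synchronization protocol: the names `multi_return_call`, `aggregate`, and the `reply` callback are hypothetical, and a thread-safe queue stands in for the network. It only illustrates how a caller can consume partial results as soon as each one is ready, instead of waiting for the slowest backend.

```python
import queue
import threading
import time

_DONE = object()  # sentinel: the callee will return no more values


def multi_return_call(handler, *args):
    """Run handler in a worker thread and yield each value it returns.

    The handler receives a hypothetical `reply` callback and may call it
    any number of times; the caller sees each value as soon as it is sent.
    """
    results = queue.Queue()

    def run():
        try:
            handler(results.put, *args)
        finally:
            results.put(_DONE)  # always signal completion, even on error

    threading.Thread(target=run, daemon=True).start()
    while True:
        item = results.get()
        if item is _DONE:
            return
        yield item


def aggregate(reply, delays):
    """Toy middle layer: query several simulated backends in parallel."""

    def backend(delay):
        time.sleep(delay)  # simulated backend latency
        reply(delay)       # forward this partial result immediately

    workers = [threading.Thread(target=backend, args=(d,)) for d in delays]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # the call ends only after every backend has replied


if __name__ == "__main__":
    for value in multi_return_call(aggregate, [0.3, 0.1, 0.2]):
        print("partial result:", value)
```

In this sketch the caller needs no locks or condition variables of its own; the queue and the sentinel hide the coordination, which loosely mirrors the abstract's goal of freeing programmers from multireturn-related thread coordination.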
