A Workflow-Inspired, Modular and Robust Approach to Experiments in Distributed Systems

Experimentation in large-scale distributed systems research is challenging because of the size and complexity of modern systems and applications, which span domains such as high-performance computing, P2P networks, and cloud computing. Researchers face three recurring obstacles: the difficulty of structuring complex experiments properly, the inflexibility of existing methodologies and tools, and the scalability problems that result from the size of the systems under study. In this paper, we propose a novel method of representing and executing experiments that addresses these problems. We present an interdisciplinary approach to the control of large-scale experiments in distributed systems research that draws its foundations from workflow management and scientific workflows. This workflow-inspired approach distinguishes itself by its representation of experiments, its modular architecture, and its robust error handling. We show how our approach solves the aforementioned problems in an exemplary performance study of an HTTP server.
