Managing Large Scale Experiments in Distributed Testbeds

Performing experiments that involve a large number of resources or a complex configuration is a hard task. In this paper we present Expo, a tool for conducting experiments on distributed platforms. First, the tool is described along with the concepts of resource and task sets, which abstract away some of the complexity of conducting an experiment. Next, the tool is compared with similar solutions based on qualitative criteria, scalability and expressiveness tests, and feedback from its use on dedicated testbeds. The paper finishes with an evaluation of Expo's scalability and some use cases on the Grid'5000 and PlanetLab testbeds. Our experience shows that Expo is a promising tool for helping the user with two primary concerns: performing a large-scale experiment efficiently and easily, and making it reproducible.
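To make the resource-set and task-set concepts concrete, here is a minimal, hypothetical sketch of the idea in Python. It is not Expo's actual API (Expo's engine and its interface are described in the paper itself); the class and method names (`ResourceSet`, `TaskSet`, `select`, `run`) are illustrative assumptions only. The point is that the user manipulates whole sets of nodes and attaches commands to sets, rather than scripting each node individually.

```python
# Hypothetical illustration of the resource-set / task-set abstraction.
# None of these names come from Expo; they only sketch the concept.

class ResourceSet:
    """A named collection of experiment resources (e.g. testbed nodes)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def select(self, predicate):
        # Derive a sub-set of resources, e.g. nodes of one cluster.
        return ResourceSet(n for n in self.nodes if predicate(n))

    def __len__(self):
        return len(self.nodes)


class TaskSet:
    """A command bound to a resource set, to be run on every member."""

    def __init__(self, resources, command):
        self.resources = resources
        self.command = command

    def run(self):
        # A real engine would execute the command remotely (e.g. over SSH)
        # and collect outputs; here we just return the per-node plan.
        return [(node, self.command) for node in self.resources.nodes]


# Usage: describe the experiment over sets, not individual machines.
cluster = ResourceSet(["node-1", "node-2", "node-3"])
subset = cluster.select(lambda n: n != "node-3")
plan = TaskSet(subset, "hostname").run()
print(plan)
```

Because tasks are attached to declaratively selected sets, the same experiment description can be replayed on a different reservation, which is one way such abstractions support reproducibility.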
