Integrated Scientific Workflow Management for the Emulab Network Testbed

The main forces that shaped current network testbeds were the needs for realism and scale. Now that several testbeds support large and complex experiments, managing experimentation processes and results has become more difficult, and it is now a barrier to high-quality systems research. Because network testbeds are popular, new tools for managing experiment workflows can reach a ready-made base of testbed users and have a significant impact. We are now evolving Emulab, our large and popular network testbed, to support experiments that are organized around scientific workflows. This paper summarizes the opportunities in this area, the new approaches we are taking, our implementation in progress, and the challenges in adapting scientific workflow concepts for testbed-based research. With our system, we expect to demonstrate that a network testbed with integrated scientific workflow management can be an important tool to aid research in networking and distributed systems.
