Introducing PRECIP: An API for Managing Repeatable Experiments in the Cloud

Cloud computing, with its on-demand access to resources, has emerged as a tool used by researchers from a wide range of domains to run computer-based experiments. In this paper we introduce a flexible experiment management API, written in Python, that simplifies and formalizes the execution of scientific experiments on cloud infrastructures. We describe the features and functionality of PRECIP (Pegasus Repeatable Experiments for the Cloud in Python) and show how it can be used to set up experiments on academic clouds such as OpenStack, Eucalyptus, and Nimbus, and on commercial clouds such as Amazon EC2.
