Scientific workflows and clouds

In recent years, empirical science has been evolving from physical experimentation to computation-based research. In astronomy, researchers seldom spend time at a telescope, but instead access the large number of image databases that are created and curated by the community [42]. In bioinformatics, data repositories hosted by entities such as the National Institutes of Health [29] provide the data gathered by Genome-Wide Association Studies and enable researchers to link particular genotypes to a variety of diseases.

[1] Kim B. Olsen, et al. Ground motion environment of the Los Angeles region, 2006.

[2] Li Zhao, et al. Managing Large-Scale Workflow Execution from Resource Provisioning to Provenance Tracking: The CyberShake Example, 2006, 2006 Second IEEE International Conference on e-Science and Grid Computing (e-Science'06).

[3] Dmitrii Zagorodnov, et al. Eucalyptus: A Technical Report on an Elastic Utility Computing Architecture Linking Your Programs to Useful Systems, 2008.

[4] M. Kunze, et al. The Cumulus project: Build a scientific cloud for a data center, 2009.

[5] Eli M. Dow, et al. Xen and the Art of Repeated Research, 2004, USENIX Annual Technical Conference, FREENIX Track.

[6] B. Barish, et al. LIGO and the Detection of Gravitational Waves, 1999.

[7] Borja Sotomayor, et al. Capacity Leasing in Cloud Systems using the OpenNebula Engine, 2008.

[8] Ian J. Taylor, et al. Distributed P2P computing within Triana: a galaxy visualization test case, 2003, Proceedings International Parallel and Distributed Processing Symposium.

[9] Miron Livny, et al. Condor - a hunter of idle workstations, 1988, [1988] Proceedings. The 8th International Conference on Distributed.

[10] James Liebert, et al. The Two Micron All Sky Survey (2MASS): Overview and Status, 1997.

[11] Jorge Luis Rodriguez, et al. The Open Science Grid, 2005.

[12] Chandra Krintz, et al. Paravirtualization for HPC Systems, 2006, ISPA Workshops.

[13] Ian T. Foster, et al. The anatomy of the grid: enabling scalable virtual organizations, 2001, Proceedings First IEEE/ACM International Symposium on Cluster Computing and the Grid.

[14] Xian-He Sun, et al. Lattice QCD Workflows: A Case Study, 2008, 2008 IEEE Fourth International Conference on eScience.

[15] Daniel S. Katz, et al. Montage: a grid-enabled engine for delivering custom science-grade mosaics on demand, 2004, SPIE Astronomical Telescopes + Instrumentation.

[16] Keith Beattie, et al. Reducing Time-to-Solution Using Distributed High-Throughput Mega-Workflows - Experiences from SCEC CyberShake, 2008, 2008 IEEE Fourth International Conference on eScience.

[17] Eugene Ciurana, et al. Google App Engine, 2009.

[18] Daniel S. Katz, et al. Pegasus: A framework for mapping complex scientific workflows onto distributed systems, 2005, Sci. Program.

[19] Robert Ross, et al. Implementation and performance of a parallel file system for high performance distributed applications, 1996, Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing.

[20] Thomas Hess, et al. Software as a Service, 2008, Wirtschaftsinf.

[21] Carole A. Goble, et al. Taverna/myGrid: Aligning a Workflow System with the Life Sciences Community, 2007, Workflows for e-Science, Scientific Workflows for Grids.

[22] Rhys Newman, et al. Performance implications of virtualization and hyper-threading on high energy physics applications in a grid environment, 2005, 19th IEEE International Parallel and Distributed Processing Symposium.

[23] A. Zahariev. Google App Engine, 2009.

[24] Jeffrey S. Vetter, et al. Xen-Based HPC: A Parallel I/O Perspective, 2008, 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID).

[25] Richard Wolski, et al. The Eucalyptus Open-Source Cloud-Computing System, 2009, 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid.

[26] Katarzyna Keahey, et al. Contextualization: Providing One-Click Virtual Clusters, 2008, 2008 IEEE Fourth International Conference on eScience.

[27] Frank B. Schmuck, et al. GPFS: A Shared-Disk File System for Large Computing Clusters, 2002, FAST.

[28] William Gropp, et al. Using MPI: Portable Parallel Programming with the Message-Passing Interface, 1994.

[29] Junwei Cao, et al. A Case Study on the Use of Workflow Technologies for Scientific Analysis: Gravitational Wave Data Analysis, 2007, Workflows for e-Science, Scientific Workflows for Grids.

[30] Borja Sotomayor, et al. Virtual Clusters for Grid Communities, 2006, Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06).

[31] Carole A. Goble, et al. myGrid: personalised bioinformatics on the information grid, 2003, ISMB.

[32] Jianting Zhang, et al. Data Integration and Workflow Solutions for Ecology, 2005, DILS.