Life science application support in an interoperable e-science environment

In the last decade, life science applications have become increasingly integrated into e-Science environments. They are typically very demanding in terms of both computational capability and data capacity. Access to life science applications embedded in such environments via Grid clients still constitutes a major hurdle for scientists without an IT background. Life science applications often comprise a whole set of small programs instead of a single executable. Many graphical Grid clients are poorly suited to this type of application, because they often assume that a Grid job runs a single executable rather than a set of chained executions (i.e., a sequence). Consequently, to execute a sequence of multiple programs on a single Grid resource, piping data from one program to the next, the user has to write a shell script by hand. Otherwise, each program is scheduled independently as a Grid job, which causes unnecessary file transfers between the jobs even if they are scheduled on the same resource. We present a generic solution to this problem and provide a reference implementation that integrates seamlessly with the Grid middleware UNICORE. Our approach focuses on a comfortable user interface for creating such program sequences and has been validated in UNICORE-driven HPC-based Grids. We applied our approach to support the use of the AMBER package (a widely used collection of programs for molecular dynamics simulations) within Grid workflows. Finally, we present a scientific use case that leverages the interoperability of two different scientific infrastructures and represents an instance of the infrastructure interoperability reference model.
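
The idea of running a chained sequence inside one job, instead of scheduling each program separately, can be illustrated with a minimal sketch. It is not the UNICORE reference implementation and uses no Grid middleware API. The function name `run_sequence` and the two stand-in programs are hypothetical, chosen only to show data being piped from one step to the next on a single machine without intermediate file transfers.

```python
import subprocess
import sys
from typing import List, Sequence


def run_sequence(commands: Sequence[List[str]], input_data: bytes = b"") -> bytes:
    """Run the commands one after another on the local machine.

    Each program receives the previous program's standard output on its
    standard input, so no intermediate files are staged between steps.
    Returns the standard output of the last program.
    """
    data = input_data
    for argv in commands:
        # check=True aborts the whole sequence if one step fails.
        completed = subprocess.run(
            argv, input=data, stdout=subprocess.PIPE, check=True
        )
        data = completed.stdout
    return data


# Hypothetical stand-ins for the small programs of a real package.
UPPER = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
COUNT = [sys.executable, "-c",
         "import sys; sys.stdout.write(str(len(sys.stdin.read())))"]

if __name__ == "__main__":
    # Both steps run in one process context, as one "job".
    print(run_sequence([UPPER, COUNT], b"amber").decode())
```

In a Grid setting, the whole sequence would be submitted as one job to one resource, which is what avoids the transfers that occur when each step is scheduled independently.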
