Scientific middleware for abstracted parallelisation

In this paper we introduce a class of problems that arise when analysing data that has been split into an unknown number of pieces. Such analysis falls under the definition of Grid computing, but it is not addressed by current Grid computing projects, which do not provide the appropriate abstractions. We then describe a distributed, web-service-based middleware platform that solves these problems by supporting the construction of parallel data analysis functions for datasets with an unknown level of distribution. The platform combines Martlet, a work-flow language that uses constructs from functional programming to abstract the parallelisation of computations away from the user, with supporting middleware. Constructing this middleware requires the capability to reason about the data structures it holds without restricting their nature. Issues addressed in its development include distributed data transfer and management, and function deployment and execution.
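The idea of writing an analysis function once and applying it to data split into an unknown number of pieces can be illustrated with a small sketch. The code below is plain Python, not Martlet syntax, and is not the paper's implementation; the names `summarise`, `combine` and `distributed_mean` are hypothetical and chosen only for illustration. Each piece is reduced independently to a partial result, and the partial results are then folded together, so the caller never states how many pieces there are.

```python
from functools import reduce
from typing import Iterable, List, Tuple

# Hypothetical partial result: (sum of values, number of values).
Partial = Tuple[float, int]


def summarise(piece: List[float]) -> Partial:
    """Map step: reduce one piece to a (sum, count) pair, independently of the others."""
    return (sum(piece), len(piece))


def combine(a: Partial, b: Partial) -> Partial:
    """Fold step: merge two partial results. The operation is associative,
    so partial results may be merged in any grouping."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(pieces: Iterable[List[float]]) -> float:
    """Mean over a dataset split into however many pieces happen to exist."""
    total, count = reduce(combine, map(summarise, pieces), (0.0, 0))
    if count == 0:
        raise ValueError("mean of an empty dataset is undefined")
    return total / count


# The same function is used whether the data arrives in three pieces or one.
three_pieces = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
one_piece = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]]
mean_of_three = distributed_mean(three_pieces)
mean_of_one = distributed_mean(one_piece)
```

Carrying a (sum, count) pair rather than a per-piece mean is deliberate: averaging the per-piece means gives the wrong answer when the pieces differ in size, whereas the pair can be merged correctly regardless of how the data was split.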
