Supporting the construction of workflows for biodiversity problem-solving accessing secure, distributed resources

In the Biodiversity World (BDW) project we have created a flexible and extensible Web Services-based Grid environment in which biodiversity researchers can solve problems and analyse biodiversity patterns. In this environment, heterogeneous and globally distributed biodiversity-related resources, such as data sets and analytical tools, are made available for users to access and assemble into workflows that perform complex scientific experiments. One such experiment is bioclimatic modelling of the geographical distribution of individual species, which uses climate variables to explain past and future climate-related changes in species distribution. The data sources and analytical tools required for this kind of analysis are widely dispersed, run on heterogeneous platforms, present data in different formats and lack inherent interoperability. The BDW system brings these disparate units together so that the user can combine tools with little thought as to their original availability, data formats and interoperability. The new prototype BDW system architecture not only brings together heterogeneous resources but also enables the use of computational resources and provides secure access to BDW resources via a federated security model. We describe the features of the new BDW system and its security model that enable user authentication from a workflow application as part of workflow execution.
