High-performance GRID computing in chemoinformatics

Grid computing provides access to a scalable pool of computing resources and promises to make tractable many chemometric problems that were previously computationally prohibitive. This chapter describes the concepts and details behind the Grid paradigm and presents a number of chemometric investigations that have adopted the Grid to address their computational and data access needs.
