Data webs for earth science data

We describe high-performance data webs for earth science data, designed both for interactively analyzing small to moderate-sized remote data sets and for mining distributed data sets. Achieving high performance required developing specialized high-performance transport services, as well as specialized middleware services for merging multiple data streams. Data webs complement data grids, which are grid-based infrastructures designed to support arbitrary distributed computation over distributed data using a trusted computing model.
