Exploiting Latent I/O Asynchrony in Petascale Science Applications

Current and emerging large-scale HPC applications face daunting I/O challenges. In existing codes, problems arise both from large data volumes and from the need to perform complex online data manipulations, including data staging, reorganization, and transformation. We describe three related techniques for enabling, encouraging, and exploiting latent I/O asynchrony in HPC applications: data taps, IOgraphs, and Metabots.
