Extending scalability of collective IO through Nessie and staging

The increasing fidelity of scientific simulations as they scale toward exascale is straining the proven IO techniques championed throughout terascale computing. Chief among these is collective IO, in which processes coordinate and exchange data before writing to storage in order to reduce the number of small, independent IO operations. Although collective IO works well for efficiently creating a data set in canonical order, 3-D domain decompositions prove troublesome because of the volume of data exchanged before anything reaches storage. When each process holds only a tiny piece of the 3-D simulation space rather than a complete 'pencil' or 'plane' (a 2-D or 1-D domain decomposition, respectively), the communication overhead of rearranging the data can dwarf the time spent actually writing to storage [27]. Our approach seeks to transparently increase scalability and performance while preserving both the IO routines in the application and the final data format in the storage system. We accomplish this with the Nessie [23] RPC framework and a staging area running staging services. Through these tools we apply a variety of data processing operations before invoking the native API to write data to storage, yielding as much as a 3X performance improvement over the native calls.
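
For concreteness, the sketch below shows the kind of collective write the abstract refers to: each MPI rank owns a small 3-D block of the global field, and a single collective MPI-IO call places every block into the canonical file layout, letting the IO layer aggregate the many small, strided pieces into large requests. This is a minimal illustration under assumed details (the 16x16x16 block size, the field.dat file name, and raw MPI-IO rather than the higher-level library used in the paper); it is not the paper's implementation, and the Nessie/staging path discussed above is not shown.

    /* Minimal collective IO sketch (assumed details, not the paper's code). */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Hypothetical 3-D process grid and per-rank block size. */
        int pgrid[3] = {0, 0, 0};
        MPI_Dims_create(nprocs, 3, pgrid);

        int local[3]  = {16, 16, 16};                  /* block owned by this rank */
        int global[3] = {pgrid[0] * local[0],
                         pgrid[1] * local[1],
                         pgrid[2] * local[2]};         /* full simulation domain   */
        int coords[3] = {  rank / (pgrid[1] * pgrid[2]),
                          (rank /  pgrid[2]) % pgrid[1],
                           rank %  pgrid[2] };
        int start[3]  = { coords[0] * local[0],
                          coords[1] * local[1],
                          coords[2] * local[2] };      /* block offset in the file */

        /* Describe where this rank's block lives in the canonical file layout. */
        MPI_Datatype filetype;
        MPI_Type_create_subarray(3, global, local, start,
                                 MPI_ORDER_C, MPI_DOUBLE, &filetype);
        MPI_Type_commit(&filetype);

        size_t nlocal = (size_t)local[0] * local[1] * local[2];
        double *field = malloc(nlocal * sizeof(double));
        for (size_t i = 0; i < nlocal; i++)
            field[i] = (double)rank;                   /* placeholder data         */

        /* Collective write: all ranks participate, so MPI-IO can aggregate
         * the small, strided pieces into large contiguous file requests.    */
        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "field.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);
        MPI_File_write_all(fh, field, (int)nlocal, MPI_DOUBLE, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        free(field);
        MPI_Type_free(&filetype);
        MPI_Finalize();
        return 0;
    }

The cost the abstract highlights is visible here: with a fully 3-D decomposition, the per-rank blocks are scattered throughout the file, so the collective layer must shuffle data among aggregators before it can issue large writes, and that exchange grows with process count.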

[1] M. Polte et al., Fast log-based concurrent writing of checkpoints, 3rd Petascale Data Storage Workshop, 2008.

[2] Keith D. Underwood et al., An Evaluation of the Impacts of Network Bandwidth and Dual-Core Processors on Scalability, 2006.

[3] Karsten Schwan et al., Six degrees of scientific data: reading patterns for extreme scale science IO, HPDC '11, 2011.

[4] Mark Frederick Hoemmen et al., An Overview of Trilinos, 2003.

[5] Jeffrey S. Vetter et al., Performance characterization and optimization of parallel I/O on the Cray XT, IEEE International Symposium on Parallel and Distributed Processing, 2008.

[6] James W. Hurrell, The Community Earth System Model, 2013.

[7] Rajeev Thakur et al., Improving collective I/O performance using threads, Proceedings of the 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP), 1999.

[8] Onkar Sahni et al., Scalable parallel I/O alternatives for massively parallel partitioned solver systems, IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010.

[9] Karsten Schwan et al., LIVE data workspace: A flexible, dynamic and extensible platform for petascale applications, IEEE International Conference on Cluster Computing, 2007.

[10] Jianwei Li et al., Parallel netCDF: A High-Performance Scientific I/O Interface, ACM/IEEE SC 2003 Conference (SC '03), 2003.

[11] Wei-keng Liao et al., Scaling parallel I/O performance through I/O delegate and caching system, SC - International Conference for High Performance Computing, Networking, Storage and Analysis, 2008.

[12] Mariana Vertenstein et al., An application-level parallel I/O library for Earth system models, Int. J. High Perform. Comput. Appl., 2012.

[13] Karsten Schwan et al., PreDatA – preparatory data analytics on peta-scale machines, IEEE International Symposium on Parallel & Distributed Processing (IPDPS), 2010.

[14] Rolf Riesen et al., Portals 3.0: protocol building blocks for low overhead communication, Proceedings of the 16th International Parallel and Distributed Processing Symposium, 2002.

[15] Jeffrey S. Vetter et al., ParColl: Partitioned Collective I/O on the Cray XT, 37th International Conference on Parallel Processing, 2008.

[16] Scott Klasky et al., Terascale direct numerical simulations of turbulent combustion using S3D, 2008.

[17] Patrick M. Widener et al., Efficient Data-Movement for Lightweight I/O, IEEE International Conference on Cluster Computing, 2006.

[18] Scott Klasky et al., DataSpaces: an interaction and coordination framework for coupled simulation workflows, HPDC '10, 2012.

[19] Robert Latham et al., Scalable I/O forwarding framework for high-performance computing systems, IEEE International Conference on Cluster Computing and Workshops, 2009.

[20] Karsten Schwan et al., Adaptable, metadata rich IO methods for portable high performance IO, IEEE International Symposium on Parallel & Distributed Processing, 2009.

[21] 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10), San Francisco, CA, USA, 7-9 August 2001.

[22] Rolf Riesen et al., Lightweight I/O for Scientific Applications, IEEE International Conference on Cluster Computing, 2006.

[23] Karsten Schwan et al., Flexible IO and integration for scientific codes through the adaptable IO system (ADIOS), CLADE '08, 2008.