In Situ Storage Layout Optimization for AMR Spatio-temporal Read Accesses

Analyses of large simulation data often concentrate on regions in space and time that contain important information. As simulations adopt Adaptive Mesh Refinement (AMR), the data records belonging to a region of interest can be widely scattered across storage devices, so reading such a region incurs many non-contiguous accesses and significantly reduces I/O performance. In this work, we study how to organize block-structured AMR data on storage to improve the performance of spatio-temporal data accesses. AMR has a complex, hierarchical, multi-resolution data structure that does not fit easily with existing layout approaches, which focus on uniform mesh data. To enable efficient AMR read accesses, we develop an in situ data layout optimization framework. The framework automatically selects from a set of candidate layouts using a performance model, and reorganizes the data before writing it to storage. We evaluate the framework with three AMR datasets and with access patterns derived from scientific applications. The performance model identifies the best layout scheme and yields up to a 3x read performance improvement over the original layout. Although not all read accesses can be turned into contiguous reads, the optimized layouts achieve, on average, 90% of contiguous read throughput.
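
The idea of choosing a layout with a performance model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the framework itself: the two candidate layouts (`level_major` and `z_order`), the 2D block representation, the cost model (a fixed `seek_cost` per contiguous run plus a `transfer_cost` per block), and all function names are hypothetical. The sketch only shows the general shape of the approach: estimate the read cost of each candidate layout for the expected access patterns, then pick the cheapest.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical block record: (refinement level, x, y), where x and y are the
# block's lower-left corner expressed in finest-level units.
Block = Tuple[int, int, int]
Layout = Callable[[Sequence[Block]], List[Block]]


def morton2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def level_major(blocks: Sequence[Block]) -> List[Block]:
    """Candidate 1: store each level separately, row-major within a level."""
    return sorted(blocks, key=lambda b: (b[0], b[2], b[1]))


def z_order(blocks: Sequence[Block]) -> List[Block]:
    """Candidate 2: order all levels together along a Z-order curve."""
    return sorted(blocks, key=lambda b: (morton2d(b[1], b[2]), b[0]))


def count_runs(layout: Sequence[Block], wanted: Set[Block]) -> int:
    """Number of contiguous runs of wanted blocks in the on-disk order."""
    runs, prev_hit = 0, False
    for block in layout:
        hit = block in wanted
        if hit and not prev_hit:
            runs += 1
        prev_hit = hit
    return runs


def read_cost(layout: Sequence[Block], wanted: Set[Block],
              seek_cost: float = 10.0, transfer_cost: float = 1.0) -> float:
    """Assumed toy model: one seek per run plus a per-block transfer cost."""
    return seek_cost * count_runs(layout, wanted) + transfer_cost * len(wanted)


def select_layout(blocks: Sequence[Block], queries: Sequence[Set[Block]],
                  candidates: Dict[str, Layout]) -> Tuple[str, Dict[str, float]]:
    """Return the candidate with the lowest total modeled cost."""
    totals = {}
    for name, make_layout in candidates.items():
        ordered = make_layout(blocks)
        totals[name] = sum(read_cost(ordered, q) for q in queries)
    return min(totals, key=totals.get), totals


# Example: a 4x4 coarse grid (level 0) with its lower-left quadrant refined
# into a 4x4 grid of finer blocks (level 1).
coarse = [(0, x, y) for y in range(0, 8, 2) for x in range(0, 8, 2)]
fine = [(1, x, y) for y in range(4) for x in range(4)]
blocks = coarse + fine

# Query: every block, at any level, inside the region [0, 4) x [0, 4).
query = {b for b in blocks if b[1] < 4 and b[2] < 4}

best, totals = select_layout(
    blocks, [query], {"level_major": level_major, "z_order": z_order})
print(best, totals)  # z_order {'level_major': 50.0, 'z_order': 30.0}
```

In this toy case the region query touches three separate runs under the level-major order but a single run under the Z-order curve, so the model prefers the latter. A different access pattern, such as reading one refinement level in full, would favor the level-major order, which is why the selection depends on the expected queries.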
