AMRZone: A Runtime AMR Data Sharing Framework for Scientific Applications

Frameworks that facilitate runtime data sharing across multiple applications are of great importance for scientific data analytics. Although existing frameworks work well over uniform mesh data, they cannot effectively handle adaptive mesh refinement (AMR) data. The challenges in constructing an AMR-capable framework include: (1) designing an architecture that facilitates online AMR data management, (2) achieving a load-balanced AMR data distribution for the data staging space at runtime, and (3) building an effective online index to support the unique spatial data retrieval requirements of AMR data. To address these challenges and support runtime AMR data sharing across scientific applications, we present the AMRZone framework. Experiments over real-world AMR datasets demonstrate AMRZone's effectiveness at achieving a balanced workload distribution, reading and writing large-scale datasets with thousands of parallel processes, and satisfying queries with spatial constraints. Moreover, AMRZone's performance and scalability are comparable with existing state-of-the-art work when tested over uniform mesh data with up to 16384 cores; in the best case, our framework achieves a 46% performance improvement.
