Buffering IO for data management in multi-physics simulations

A library for parallel IO and data management has been developed for multi-physics simulations. The goal of the library is to provide sustainable, interoperable, efficient, scalable, and convenient tools for parallel IO and data management of high-level data structures in numerical simulations, and to provide tools for connecting applications. The high-level data structures include one- and multi-dimensional arrays, structured meshes, unstructured meshes, meshes generated through block-based, patch-based, and cell-based adaptive mesh refinement (AMR), variables associated with these meshes, and data defined on particles in particle simulations. The IO mechanism can be either collective or non-collective. The library is typically used for restart files, visualization files, and files that connect different applications. The library is built on MPI-IO. Compared with the IO performance of MPI-IO itself, the overhead of writing users' explicit high-level data structures is less than five percent. To further improve IO performance, the library can buffer problem-size data, in addition to the bookkeeping data, before calling MPI-IO, while preserving users' explicit high-level data structures. The buffering mechanism improves IO performance by a factor of 10 to 20 in multi-physics simulations involving AMR and unstructured meshes.
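
The buffering idea can be illustrated with a minimal, single-process sketch. It is not the library's implementation and does not call MPI-IO: `CountingSink` is a hypothetical stand-in for the underlying IO layer, `BufferedWriter` is a hypothetical aggregator, and the block count, block size, and buffer capacity are arbitrary assumptions. The sketch only shows how many small writes, such as those produced by AMR blocks, can be coalesced into a few large ones without changing the bytes that are written.

```python
import io


class CountingSink:
    """Hypothetical stand-in for the underlying IO layer (not MPI-IO).

    It stores the bytes in memory and counts how many write calls it receives.
    """

    def __init__(self):
        self.stream = io.BytesIO()
        self.calls = 0

    def write(self, data):
        self.calls += 1
        self.stream.write(data)


class BufferedWriter:
    """Hypothetical aggregator: collects small writes and forwards large ones."""

    def __init__(self, sink, capacity):
        self.sink = sink
        self.capacity = capacity  # flush threshold in bytes (assumed value)
        self.buffer = bytearray()

    def write(self, data):
        self.buffer.extend(data)
        # Forward to the sink only once enough data has accumulated.
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink.write(bytes(self.buffer))
            self.buffer.clear()


# Assumed workload: 1000 small blocks of 64 bytes each.
blocks = [bytes([i % 256]) * 64 for i in range(1000)]

# Unbuffered path: one call to the IO layer per block.
direct_sink = CountingSink()
for block in blocks:
    direct_sink.write(block)

# Buffered path: blocks are aggregated into 16 KiB chunks first.
buffered_sink = CountingSink()
writer = BufferedWriter(buffered_sink, capacity=16 * 1024)
for block in blocks:
    writer.write(block)
writer.flush()  # write out whatever remains in the buffer

print("unbuffered calls:", direct_sink.calls)
print("buffered calls:  ", buffered_sink.calls)
```

With these assumed sizes the buffered path issues 4 calls to the sink instead of 1000, and both paths produce identical bytes. The call counts illustrate the mechanism only and say nothing about the performance figures reported above.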