BPAR: A Bundle-Based Parallel Aggregation Framework for Decoupled I/O Execution

In today's "Big Data" era, developers have adopted parallel I/O techniques such as MPI-IO, Parallel NetCDF, and HDF5 to attain the performance needed to manage the vast amounts of data that scientific applications produce. These techniques offer parallel access to shared datasets and provide optimizations such as data sieving and two-phase I/O to boost I/O throughput. However, most of them optimize the access pattern within a single file or file extent; few consider cross-file I/O optimizations. This paper explores the potential benefit of cross-file I/O aggregation. We propose a Bundle-based PARallel Aggregation framework (BPAR) and design three partitioning schemes under this framework that target the I/O performance of a mission-critical application, GEOS-5, as well as a broad range of other scientific applications. Our experimental results reveal that BPAR achieves on average a 2.1× performance improvement over the baseline GEOS-5.
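To make the idea of cross-file aggregation concrete, the following is a minimal, hypothetical sketch (not the authors' implementation): rather than every process issuing many small per-variable writes, processes are partitioned into groups, and one aggregator per group bundles its members' variables into a single large contiguous write. The function names, the round-robin partitioning scheme, and the in-memory modeling of writes are all illustrative assumptions.

```python
# Hypothetical sketch of bundle-based cross-file aggregation.
# Assumption: each rank holds several small per-variable buffers that
# would otherwise each go to a separate file.

from typing import Dict, List


def partition(ranks: List[int], num_groups: int) -> List[List[int]]:
    """Split ranks round-robin into aggregation groups (one scheme of many)."""
    groups: List[List[int]] = [[] for _ in range(num_groups)]
    for i, r in enumerate(ranks):
        groups[i % num_groups].append(r)
    return groups


def aggregate_and_write(per_rank_vars: Dict[int, Dict[str, bytes]],
                        num_groups: int) -> List[bytes]:
    """Each group's aggregator concatenates its members' variable buffers
    and issues one bundled write (modeled here as one bytes object)."""
    groups = partition(sorted(per_rank_vars), num_groups)
    writes = []
    for group in groups:
        bundle = b"".join(buf
                          for rank in group
                          for buf in per_rank_vars[rank].values())
        writes.append(bundle)  # one large write replaces many small ones
    return writes
```

With four ranks and two groups, the eight small per-variable writes collapse into two bundled writes, which is the kind of request consolidation a cross-file aggregation framework relies on to improve throughput.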
