Improving the Average Response Time in Collective I/O

In collective I/O, MPI processes exchange requests so that the rearranged requests yield the shortest file system access time. The order in which these exchanges are scheduled determines the response time of each participating process. Existing implementations simply follow the increasing order of file offsets, which does not necessarily produce the best performance. To minimize the average response time, we propose three scheduling algorithms that take into account the number of processes per file stripe and the number of accesses per process. In experiments with two synthetic benchmarks and a high-resolution climate application, these algorithms improve the average response time by up to 50%.
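To illustrate why the exchange order matters, the following is a minimal sketch under a deliberately simplified model, not one of the three proposed algorithms. It assumes that file stripes are serviced one per time step and that a process's response time is the step at which its last stripe completes. The function names, the toy access pattern, and the "fewest accesses first" heuristic are all hypothetical and chosen only for illustration.

```python
from statistics import mean

def average_response_time(stripes, order):
    """Average finish step over all processes (toy model: one stripe per step).

    stripes[i] is the set of process ranks that access stripe i;
    order is the sequence in which stripes are serviced.
    """
    finish = {}
    for step, stripe in enumerate(order, start=1):
        for rank in stripes[stripe]:
            finish[rank] = step  # a later step overwrites an earlier one
    return mean(finish.values())

def fewest_accesses_first(stripes):
    """Hypothetical heuristic: service first the stripes whose least-loaded
    process has the fewest total accesses; break ties by file offset."""
    accesses = {}
    for procs in stripes:
        for rank in procs:
            accesses[rank] = accesses.get(rank, 0) + 1
    return sorted(
        range(len(stripes)),
        key=lambda s: (min(accesses[r] for r in stripes[s]), s),
    )

# Toy pattern (assumed): process 0 touches three stripes, processes 1 and 2 one each.
stripes = [{0}, {0}, {0}, {1}, {2}]

offset_order = list(range(len(stripes)))        # increasing file offsets
heuristic_order = fewest_accesses_first(stripes)

offset_avg = average_response_time(stripes, offset_order)
heuristic_avg = average_response_time(stripes, heuristic_order)
print(offset_avg, heuristic_avg)
```

In this toy pattern, following file offsets makes the two single-access processes wait behind the process with three accesses, whereas servicing their stripes first lowers the average even though the last process finishes at the same step.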
