Pre-execution Data Prefetching with Inter-thread I/O Scheduling

Because computing power is growing much faster than storage I/O performance, parallel applications increasingly suffer from I/O latency. I/O prefetching is effective in hiding this latency, but existing prefetching techniques are conservative, which limits their effectiveness. Recently, a more aggressive approach named pre-execution prefetching [19] has been proposed. In this paper, we first identify a drawback of pre-execution prefetching, and then propose a new method that overcomes it by scheduling the I/O operations between the main thread and the prefetching thread. With careful I/O scheduling, our approach further extends the concurrency of computation and I/O and avoids I/O competition within a single process. Extensive experiments, including experiments on real-life applications such as large matrix manipulation and Hill encryption, demonstrate the benefits of the proposed approach.
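The general idea of coordinating a main thread and a prefetching thread can be illustrated with a toy model. The sketch below is not the paper's implementation: the class `ScheduledPrefetcher`, the `read_block` callback, and the scheduling rule (the prefetching thread does not start a new read while the main thread has a demand read pending, and a block is never fetched twice) are all assumptions made for illustration.

```python
import threading


class ScheduledPrefetcher:
    """Toy model (hypothetical): one main thread and one prefetching thread
    that coordinate their reads through a shared condition variable."""

    def __init__(self, read_block, access_order):
        self._read_block = read_block      # assumed blocking read: block_id -> data
        self._order = list(access_order)   # block ids the prefetching thread runs ahead on
        self._cache = {}                   # block_id -> data already fetched
        self._in_flight = set()            # block ids currently being read
        self._demand_pending = 0           # demand reads the main thread is performing
        self._cond = threading.Condition()
        self._thread = threading.Thread(target=self._prefetch_loop, daemon=True)

    def start(self):
        self._thread.start()

    def _prefetch_loop(self):
        for block_id in self._order:
            with self._cond:
                # Assumed scheduling rule: do not start a new prefetch while
                # the main thread has a demand read pending.
                while self._demand_pending > 0:
                    self._cond.wait()
                if block_id in self._cache or block_id in self._in_flight:
                    continue               # already fetched or being fetched
                self._in_flight.add(block_id)
            data = self._read_block(block_id)   # I/O happens outside the lock
            with self._cond:
                self._cache[block_id] = data
                self._in_flight.discard(block_id)
                self._cond.notify_all()

    def read(self, block_id):
        """Called by the main thread when it actually needs a block."""
        with self._cond:
            # Wait for an in-flight prefetch of this block instead of
            # issuing a duplicate read.
            while block_id in self._in_flight:
                self._cond.wait()
            if block_id in self._cache:
                return self._cache[block_id]
            self._demand_pending += 1
            self._in_flight.add(block_id)
        try:
            data = self._read_block(block_id)
        finally:
            with self._cond:
                self._demand_pending -= 1
                self._in_flight.discard(block_id)
                self._cond.notify_all()
        with self._cond:
            self._cache[block_id] = data   # lets the prefetching thread skip this block
        return data
```

In use, the main thread would construct the object with its expected access order, call `start()`, and then call `read(block_id)` wherever it would otherwise read directly. A real system would also bound the cache and derive the access order by pre-executing the I/O-related part of the program, which this sketch does not attempt.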

[1] Jim Zelenka, et al. Informed prefetching and caching, 1995, SOSP.

[2] Rajeev Thakur, et al. LACIO: A New Collective I/O Strategy for Parallel I/O Systems, 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.

[3] R. Ross, et al. 20 Parallel I/O and the Parallel Virtual File System, 2022.

[4] Chen Jin, et al. Adaptive IO System (ADIOS), 2008.

[5] Mahmut T. Kandemir, et al. A compiler-directed data prefetching scheme for chip multiprocessors, 2009, PPoPP '09.

[6] Robert B. Ross, et al. Improving I/O Forwarding Throughput with Data Compression, 2011, 2011 IEEE International Conference on Cluster Computing.

[7] Jack Dongarra, et al. Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, Dublin, Ireland, September 7-10, 2008. Proceedings, 2008, PVM/MPI.

[8] Rajeev Thakur, et al. Data sieving and collective I/O in ROMIO, 1998, Proceedings. Frontiers '99. Seventh Symposium on the Frontiers of Massively Parallel Computation.

[9] John Bent, et al. PLFS: a checkpoint filesystem for parallel applications, 2009, Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis.

[10] Yue Zhao, et al. Hiding I/O Latency with Parallel Pre-execution Prefetching, 2012.

[11] Surendra Byna, et al. Exploring Parallel I/O Concurrency with Speculative Prefetching, 2008, 2008 37th International Conference on Parallel Processing.

[12] Daniel A. Reed. Scalable Input/Output: Achieving System Balance, 2003.

[13] Russel Hugo Patterson, et al. Informed Prefetching and Caching (CMU-CS-97-204), 1997.

[14] Ming Wu, et al. Scalability of heterogeneous computing, 2005, 2005 International Conference on Parallel Processing (ICPP'05).

[15] Michael L. Scott, et al. Aggressive Prefetching: An Idea Whose Time Has Come, 2005, HotOS.

[16] Frank B. Schmuck, et al. GPFS: A Shared-Disk File System for Large Computing Clusters, 2002, FAST.

[17] Jehan-François Pâris, et al. Making Early Predictions of File Accesses, 2005.

[18] Carla Schlatter Ellis, et al. Prefetching in File Systems for MIMD Multiprocessors, 1990, IEEE Trans. Parallel Distributed Syst.

[19] Surendra Byna, et al. Hiding I/O latency with pre-execution prefetching for parallel applications, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.

[20] Karsten Schwan, et al. Flexible IO and integration for scientific codes through the adaptable IO system (ADIOS), 2008, CLADE '08.

[21] Thomas Ludwig, et al. Using Non-blocking I/O Operations in High Performance Computing to Reduce Execution Times, 2009, PVM/MPI.

[22] Xiaoning Ding, et al. DiskSeen: Exploiting Disk Layout and Access History to Enhance I/O Prefetch, 2007, USENIX Annual Technical Conference.

[23] Todd C. Mowry, et al. Compiler-based I/O prefetching for out-of-core applications, 2001, TOCS.

[24] Angelos Bilas, et al. Using transparent compression to improve SSD-based I/O caches, 2010, EuroSys '10.

[25] John May, et al. Parallel I/O for High Performance Computing, 2000.