As scientists expand their models to describe physical phenomena of increasingly large extent, I/O becomes crucial, and a system with limited I/O capacity can severely constrain the performance of the entire program. We provide experimental results, obtained on an Intel Touchstone Delta and an nCUBE 2 I/O system, showing that the performance of existing parallel I/O systems can vary by several orders of magnitude as a function of the data access pattern of the parallel program. We then propose a two-phase access strategy, to be implemented in a runtime system, in which the data distribution on computational nodes is decoupled from the storage distribution. Our experimental results show that performance improvements of several orders of magnitude over direct-access-based data distribution methods can be obtained, and that performance for most data access patterns can be improved to within a factor of 2 of the best performance. Further, the cost of redistribution is a very small fraction of the overall access cost.
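The two-phase idea can be illustrated with a small simulation. This is a minimal sketch, not the runtime system described above: it assumes a row-major file held in a Python list, a target column-block distribution, and array dimensions divisible by the processor count. The function names `direct_read` and `two_phase_read` are hypothetical, and the request count stands in for I/O cost only as a rough proxy.

```python
def direct_read(file_data, n_rows, n_cols, n_procs):
    """Each processor reads its column block straight from the row-major file.

    Every row contributes one short contiguous run per processor, so the
    number of file requests grows with n_rows * n_procs.
    """
    cols_per_proc = n_cols // n_procs  # assumes n_cols % n_procs == 0
    col_blocks = [[] for _ in range(n_procs)]
    requests = 0
    for p in range(n_procs):
        for r in range(n_rows):
            start = r * n_cols + p * cols_per_proc
            col_blocks[p].append(file_data[start:start + cols_per_proc])
            requests += 1  # one small strided request
    return col_blocks, requests


def two_phase_read(file_data, n_rows, n_cols, n_procs):
    """Phase 1 reads in the layout the file favors; phase 2 redistributes."""
    rows_per_proc = n_rows // n_procs  # assumes n_rows % n_procs == 0
    cols_per_proc = n_cols // n_procs  # assumes n_cols % n_procs == 0

    # Phase 1: one large contiguous request per processor (row blocks).
    row_blocks = []
    for p in range(n_procs):
        start = p * rows_per_proc * n_cols
        row_blocks.append(file_data[start:start + rows_per_proc * n_cols])
    requests = n_procs

    # Phase 2: in-memory redistribution to the target column blocks.
    # On a real machine this step would be interprocessor communication.
    col_blocks = [[] for _ in range(n_procs)]
    for src in range(n_procs):
        for r in range(rows_per_proc):
            row = row_blocks[src][r * n_cols:(r + 1) * n_cols]
            for dst in range(n_procs):
                piece = row[dst * cols_per_proc:(dst + 1) * cols_per_proc]
                col_blocks[dst].append(piece)
    return col_blocks, requests
```

For an 8 x 8 array on 4 processors, both functions return the same column blocks, but the direct version issues 32 file requests while the two-phase version issues 4. The sketch counts requests only; it does not model disk or network timing.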