The performance of parallel distributed file systems suffers when many clients execute a large number of operations in parallel, because the I/O subsystem can easily be overwhelmed by the sheer number of incoming I/O operations. Many optimizations exist that try to alleviate this problem. Client-side optimizations perform preprocessing to minimize the amount of work the file servers have to do; server-side optimizations use server-internal knowledge to improve performance. The HD Trace framework contains components to simulate, trace and visualize applications. It is used as a test bed to evaluate optimizations that could later be implemented in real-life projects. This paper compares existing client-side optimizations with newly implemented server-side optimizations and evaluates their usefulness for I/O patterns commonly found in HPC. Server-directed I/O chooses the order in which non-contiguous I/O operations are serviced and tries to aggregate as many of them as possible, decreasing the load on the I/O subsystem and improving overall performance. The results show that server-side optimizations beat client-side optimizations in terms of performance for many use cases. Integrating such optimizations into parallel distributed file systems could alleviate the need for sophisticated client-side optimizations. Due to their additional knowledge of internal workflows, server-side optimizations may be better suited to provide high performance in general.
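To illustrate the core idea behind server-directed I/O described above, the following is a minimal sketch, not taken from the paper: the server reorders incoming non-contiguous requests by file offset and merges adjacent or overlapping extents into fewer, larger operations before they reach the I/O subsystem. The request representation as (offset, length) tuples and the function name are illustrative assumptions.

```python
from typing import List, Tuple

def coalesce_requests(requests: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge (offset, length) extents that touch or overlap after sorting by offset."""
    if not requests:
        return []
    # The server chooses the service order itself instead of honoring client arrival order.
    ordered = sorted(requests, key=lambda r: r[0])
    merged = [ordered[0]]
    for offset, length in ordered[1:]:
        last_offset, last_length = merged[-1]
        if offset <= last_offset + last_length:            # contiguous or overlapping
            end = max(last_offset + last_length, offset + length)
            merged[-1] = (last_offset, end - last_offset)  # grow the aggregated extent
        else:
            merged.append((offset, length))                # gap: start a new operation
    return merged

# Example: four scattered client requests collapse into three server-side operations.
print(coalesce_requests([(4096, 1024), (0, 512), (512, 512), (8192, 4096)]))
# -> [(0, 1024), (4096, 1024), (8192, 4096)]
```

A real file server would additionally bound the size of aggregated operations and account for data residing on different storage targets, but the reordering-plus-aggregation step shown here is what reduces the number of operations hitting the I/O subsystem.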