Topic 2 - Performance Prediction and Evaluation

Parallel computing enables solutions to computational problems that are impossible on sequential systems due to their limited performance. To meet this objective, it is critical that users can both measure performance on a given system and predict performance on other systems. Achieving high performance on parallel computer systems is the product of an intimate combination of hardware architecture (processor, memory, interconnection network), system software, runtime environment, algorithms, and application design. Performance evaluation is the science of understanding the factors that contribute to the overall expression of parallel performance on real machines and on systems yet to be realized. Benchmarking and performance characterization methodologies and tools provide an empirical foundation for performance evaluation. Performance prediction techniques provide a means to model performance behaviors and properties as system, algorithm, and software features change, particularly in the context of large-scale parallelism. These two areas are closely related, since most prediction requires data gathered from measured runs of a program, either to identify application signatures or to understand the performance characteristics of current machines.

A total of eighteen papers were submitted to the performance prediction and evaluation topic area. The submissions covered a broad range of prediction and evaluation topics and reflect a high level of current interest in the parallel computing community. The eight papers accepted (44%) represent state-of-the-art results from leading parallel performance researchers in the field today.

The papers cover two general themes in performance prediction and evaluation. The first theme considers methods to explore performance properties from different evaluation contexts: data access, processor, and interconnect. Three papers investigate performance issues on shared-memory machines (IBM Cyclops-64, SGI Altix 3700, and Sun Fire E25K). Another three center on the analysis of applications on distributed-memory architectures (IBM BlueGene/L, Linux multi-clusters, and clusters with InfiniBand interconnect). The second theme concerns advances in performance prediction, with two papers: one on tools for predicting the performance of multiprocessor system-on-chip (MPSoC) designs, and one on a system for hierarchical model validation.

Finally, we would like to thank all contributing authors as well as all reviewers for their work.

Allen D. Malony | Thomas Fahringer | Luís Silva | Allan Snavely