A statistical approach for the analysis of the relation between low-level performance information, the code, and the environment

This paper presents a methodology that helps scientific programmers evaluate the performance of parallel programs on advanced architectures. It applies well-defined design-of-experiments methods to identify relations among the different levels involved in mapping computational operations to high-performance computing systems. Statistical analysis is used to study the factors that affect how scientific computing algorithms map to these architectures. The methodology was tested in a case study: the numerical solution of a finite element method for the analysis of conformal antennas in electromagnetic radiation applications. Using statistics to identify relationships among factors formalizes the solution of the problem, and this novel approach supports unbiased conclusions about the results. Subset selection based on principal components was used to determine the subset of metrics required to explain the behavior of the system.
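
The abstract does not spell out the subset selection procedure, so the following is only a minimal sketch of one common heuristic, not the authors' algorithm. It assumes the performance metrics are arranged as a matrix with one row per experimental run and one column per metric. For each of the leading principal components, it keeps the metric with the largest absolute loading that has not already been chosen. The function name `select_metrics_by_pca` and the synthetic data in the demo are hypothetical.

```python
import numpy as np


def select_metrics_by_pca(X, k):
    """Pick k metric columns of X (runs x metrics) guided by principal components.

    Heuristic sketch (an assumption, not the paper's exact method): for each of
    the k leading principal components, keep the not-yet-selected metric with
    the largest absolute loading.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each metric so that differing units do not dominate.
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant metrics carry no variance; avoid division by zero
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    selected = []
    for component in Vt[:k]:
        # Walk the metrics from largest to smallest absolute loading.
        for idx in np.argsort(-np.abs(component)):
            if idx not in selected:
                selected.append(int(idx))
                break
    # Fraction of total variance carried by each of the k components used.
    explained = (s ** 2) / np.sum(s ** 2)
    return selected, explained[:k]


if __name__ == "__main__":
    # Synthetic example: five metrics driven by two hidden factors plus small noise.
    rng = np.random.default_rng(0)
    a = rng.normal(size=200)
    b = rng.normal(size=200)
    noise = 0.1 * rng.normal(size=(200, 5))
    X = np.column_stack([a, a, a, b, b]) + noise
    chosen, explained = select_metrics_by_pca(X, k=2)
    print("selected metric columns:", chosen)
    print("variance explained by the components used:", explained)
```

In this synthetic setup, columns 0 to 2 follow one hidden factor and columns 3 to 4 follow another, so two retained metrics (one per group) are enough to describe most of the variation. Real counter data would call for a more careful choice of `k`, for example from the cumulative explained variance.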
