Paper Information - A Machine Learning Framework for Performance Coverage Analysis of Proxy Applications


Proxy applications are written to represent subsets of the performance behaviors of larger, more complex applications that often have distribution restrictions. They enable easy evaluation of these behaviors across systems, e.g., for procurement or co-design purposes. However, the intended correlation between the performance behaviors of proxy applications and those of their parent codes is often based solely on the developer's intuition. In this paper, we present novel machine learning techniques to methodically quantify how well proxy applications cover the performance behaviors of their parent codes. We have developed a framework, VERITAS, to answer two questions in the context of on-node performance: a) which hardware resources are covered by a proxy application, and how well, and b) which resources are important but not covered. We present our techniques in the context of two benchmarks, STREAM and DGEMM, and two production applications, OpenMC and CMTnek, and their respective proxy applications.
