Multi-Parameter Performance Modeling Based on Machine Learning with Basic Block Features

Given the increasing complexity and scale of HPC architectures and software, performance modeling of parallel applications on large-scale HPC platforms has become increasingly important. It plays a key role in many areas, such as performance analysis, job management, and resource estimation. In this work, we propose MPerfPred, a multi-parameter performance modeling and prediction framework that uses basic block frequencies as features and applies machine learning algorithms to automatically construct multi-parameter performance models with high generalization ability. To reduce prediction overhead, we propose feature-filtering strategies that shrink the feature set in the training stage, and we build a serial program, called the BBF collector, for each target application to quickly collect feature values in the prediction stage. We demonstrate MPerfPred on the TianHe-2 supercomputer with six parallel applications. Results show that MPerfPred with support vector regression (SVR) achieves better predictions than other input-parameter-based modeling methods, with an average prediction error of 8.42% and an average standard deviation of prediction errors of 6.09%. In the prediction stage, the average prediction overhead of MPerfPred is less than 0.13% of the total execution time.
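As a rough illustration of the modeling step described above, the following sketch trains an SVR model on basic-block-frequency features to predict execution time. The function name, feature-filtering step, hyperparameters, and synthetic data are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of an MPerfPred-style modeling step:
# rows = profiled runs, columns = basic block execution frequencies,
# target = measured execution time. Names and thresholds are illustrative.
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def train_perf_model(bb_frequencies: np.ndarray, exec_times: np.ndarray):
    """Fit an SVR performance model from basic block frequency features."""
    model = make_pipeline(
        VarianceThreshold(threshold=0.0),  # drop constant features (a simple form of feature filtering)
        StandardScaler(),                  # scale frequencies before kernel SVR
        SVR(kernel="rbf", C=10.0, epsilon=0.1),
    )
    model.fit(bb_frequencies, exec_times)
    return model


# Example usage with synthetic data standing in for profiled runs.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 1000, size=(40, 200)).astype(float)  # 40 runs, 200 basic blocks
    y = X @ rng.random(200) * 1e-4                            # synthetic execution times
    model = train_perf_model(X, y)
    print("Predicted time for a new run:", model.predict(X[:1])[0])
```

In a real deployment, the feature matrix would come from the BBF collector and the filtering stage would be replaced by the paper's feature-filtering strategies.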
