Influence of Noisy Environments on Behavior of HPC Applications
