The HPCG benchmark: analysis, shared memory preliminary improvements and evaluation on an Arm-based platform
Jesús Labarta | Marc Casas | Filippo Spiga | Filippo Mantovani | Daniel Ruiz
[1] Jack Dongarra,et al. Introduction to the HPCChallenge Benchmark Suite , 2004 .
[2] Jean-François Méhaut,et al. The Mont-Blanc prototype: an alternative approach for high-performance computing systems , 2016 .
[3] Yousef Saad,et al. Iterative methods for sparse linear systems , 2003 .
[4] Takeshi Iwashita,et al. Algebraic multicolor ordering for parallelized ICCG solver in finite-element analyses , 2002 .
[5] George Bosilca,et al. UCX: An Open Source Framework for HPC Network APIs and Beyond , 2015, 2015 IEEE 23rd Annual Symposium on High-Performance Interconnects.
[6] Massimiliano Fatica,et al. A CUDA Implementation of the High Performance Conjugate Gradient Benchmark , 2014, PMBS@SC.
[7] Alejandro Rico,et al. ARM HPC Ecosystem and the Reemergence of Vectors: Invited Paper , 2017, Conf. Computing Frontiers.
[8] Jack Dongarra,et al. Sunway TaihuLight supercomputer makes its appearance , 2016 .
[9] Christoph Hagleitner,et al. Boosting the Efficiency of HPCG and Graph500 with Near-Data Processing , 2017, 2017 46th International Conference on Parallel Processing (ICPP).
[10] Hiroaki Kobayashi,et al. Performance and Power Analysis of SX-ACE Using HP-X Benchmark Programs , 2017, 2017 IEEE International Conference on Cluster Computing (CLUSTER).
[11] Hiroshi Nakashima,et al. Algebraic Block Multi-Color Ordering Method for Parallel Multi-Threaded Sparse Triangular Solver in ICCG Method , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium.
[12] Mateo Valero,et al. Supercomputing with commodity CPUs: Are mobile SoCs ready for HPC? , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[13] Enrico Calore,et al. Performance and Power Analysis of HPC Workloads on Heterogenous Multi-Node Clusters , 2018 .
[14] C. W. Glass,et al. Performance Modeling of the HPCG Benchmark , 2014, PMBS@SC.
[15] Paul Walker,et al. The ARM Scalable Vector Extension , 2017, IEEE Micro.
[16] Gene H. Golub,et al. Matrix computations , 1983 .
[17] Naoya Maruyama,et al. High-performance conjugate gradient performance improvement on the K computer , 2016, Int. J. High Perform. Comput. Appl..
[18] Ananta Tiwari,et al. Characterizing the Performance-Energy Tradeoff of Small ARM Cores in HPC Computation , 2014, Euro-Par.
[19] Karl W. Schulz,et al. Cluster Computing with OpenHPC , 2016 .
[20] Filippo Mantovani,et al. Is Arm software ecosystem ready for HPC , 2017 .
[21] Michael A. Heroux,et al. HPCG Technical Specification , 2013, Sandia Report.
[22] Pradeep Dubey,et al. Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices , 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[23] Steven Skiena,et al. The Algorithm Design Manual , 2020, Texts in Computer Science.
[24] Jun Zhou,et al. Use of SIMD Vector Operations to Accelerate Application Code Performance on Low-Powered ARM and Intel Platforms , 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[25] Chao Yang,et al. Optimizing and Scaling HPCG on Tianhe-2: Early Experience , 2014, ICA3PP.
[26] Jack Dongarra,et al. Toward a New Metric for Ranking High Performance Computing Systems , 2013, Sandia Report.