Lattice-CSC: Optimizing and Building an Efficient Supercomputer for Lattice-QCD and to Achieve First Place in Green500
Volker Lindenstruth | David Rohr | Owe Philipsen | Matthias Bach | Christopher Pinke | Gvozden Neskovic
[1] Volker Lindenstruth et al. A Flexible and Portable Large-Scale DGEMM Library for Linpack on Next-Generation Multi-GPU Systems, 2015, 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[2] Alan Gara et al. QCDOC: A 10 Teraflops Computer for Tightly-Coupled Calculations, 2004, Proceedings of the ACM/IEEE SC2004 Conference.
[3] Rajan Gupta. Introduction to lattice QCD, 1998, hep-lat/9807028.
[4] Volker Lindenstruth et al. A Comprehensive Approach for a Power Efficient General Purpose Supercomputer, 2013, 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.
[5] Steven A. Gottlieb et al. Scaling lattice QCD beyond 100 GPUs, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[6] Claude Gomez et al. QPACE - a QCD parallel computer based on Cell processors, 2009, ArXiv.
[7] Kipton Barros et al. Solving lattice QCD systems of equations using mixed precision solvers on GPUs, 2009, Comput. Phys. Commun.
[8] A. Sciarra et al. Nature of the Roberge-Weiss transition in N_f = 2 QCD with Wilson fermions, 2014, 1402.0838.
[9] Volker Lindenstruth et al. An Energy-Efficient Multi-GPU Supercomputer, 2014, 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS).
[10] Thomas Sterling et al. How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters 2nd Printing, 1999.
[11] Volker Lindenstruth et al. Multi-GPU DGEMM and High Performance Linpack on Highly Energy-Efficient Clusters, 2011, IEEE Micro.
[12] Volker Lindenstruth et al. Optimized HPL for AMD GPU and multi-core CPU usage, 2011, Computer Science - Research and Development.
[13] Pradeep Dubey et al. High-performance lattice QCD for multi-core based parallel systems using a cache-friendly hybrid threaded-MPI approach, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[14] Bálint Joó et al. A Framework for Lattice QCD Calculations on GPUs, 2014, 2014 IEEE 28th International Parallel and Distributed Processing Symposium.
[15] Pier Stanislao Paolucci et al. The APE-100 Computer: (I) the Architecture, 1993, Int. J. High Speed Comput.
[16] Volker Lindenstruth et al. Lattice QCD based on OpenCL, 2012, Comput. Phys. Commun.
[17] Jack J. Dongarra et al. The LINPACK Benchmark: past, present and future, 2003, Concurr. Comput. Pract. Exp.
[18] V. Lindenstruth et al. Twisted-Mass Lattice QCD using OpenCL, 2014.
[19] Wu-chun Feng et al. Making a case for a Green500 list, 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.
[20] M. Bach et al. CL2QCD – Lattice QCD based on OpenCL, 2014, 1411.5219.