Lattice QCD on Intel Xeon Phi
Victor W. Lee | Kiran Pamnany | Mikhail Smelyanskiy | Dhiraj D. Kalamkar | Bálint Joó | K. Vaidyanathan | William Watson | Pradeep Dubey
[1] Pradeep Dubey, et al. 3.5-D Blocking Optimization for Stencil Computations on Modern CPUs and GPUs, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[2] Dong Chen, et al. QCDSP machines: design, performance and cost, 1998, SC '98.
[3] Henk A. van der Vorst, et al. Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems, 1992, SIAM J. Sci. Comput.
[4] Bálint Joó. SciDAC-2 software infrastructure for lattice QCD, 2007.
[5] Kipton Barros, et al. Solving lattice QCD systems of equations using mixed precision solvers on GPUs, 2009, Comput. Phys. Commun.
[6] Jun Doi. Peta-scale Lattice Quantum Chromodynamics on a Blue Gene/Q supercomputer, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.
[7] Peter A. Boyle, et al. The BlueGene/Q supercomputer, 2012.
[8] Craig Pelissier, et al. Efficient Implementation of the Overlap Operator on Multi-GPUs, 2011, 2011 Symposium on Application Accelerators in High-Performance Computing.
[9] Antonino Zichichi, et al. New phenomena in subnuclear physics, 1977.
[10] Xipeng Shen, et al. Implementing the Dslash Operator in OpenCL, 2010.
[11] M. Hestenes, et al. Methods of conjugate gradients for solving linear systems, 1952.
[12] Peter A. Boyle, et al. The BAGEL assembler generation library, 2009, Comput. Phys. Commun.
[13] Jie Chen, et al. GMH: A Message Passing Toolkit for GPU Clusters, 2010, 2010 IEEE 16th International Conference on Parallel and Distributed Systems.
[14] Volker Lindenstruth, et al. Lattice QCD based on OpenCL, 2012, Comput. Phys. Commun.
[15] Andrew Pochinsky, et al. Writing Efficient QCD Code Made Simpler: qa0, 2009.
[16] M. A. Clark, et al. High-efficiency Lattice QCD computations on the Fermi architecture, 2012, 2012 Innovative Parallel Computing (InPar).
[17] Pradeep Dubey, et al. High-performance lattice QCD for multi-core based parallel systems using a cache-friendly hybrid threaded-MPI approach, 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[18] Bálint Joó, et al. Parallelizing the QUDA Library for Multi-GPU Calculations in Lattice Quantum Chromodynamics, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[19] Message Passing Interface Forum. MPI: A Message-Passing Interface Standard, 1994.
[20] Philip Heidelberger, et al. The BlueGene/L supercomputer and quantum ChromoDynamics, 2006, SC.
[21] Pradeep Dubey, et al. Design and Implementation of the Linpack Benchmark for Single and Multi-node Systems Based on Intel® Xeon Phi Coprocessor, 2013, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing.
[22] Kenneth G. Wilson, et al. Quarks and Strings on a Lattice, 1977.
[23] Michael Lang, et al. The reverse-acceleration model for programming petascale hybrid systems, 2009, IBM J. Res. Dev.
[24] Barbara Horner-Miller, et al. Proceedings of the 2006 ACM/IEEE conference on Supercomputing, 2006.
[25] Robert Strzodka, et al. Pipelined Mixed Precision Algorithms on FPGAs for Fast and Accurate PDE Solvers from Low Precision Components, 2006, 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.