David A. Ham | J. Ramanujam | Paul H. J. Kelly | Ana Lucia Varbanescu | Gheorghe-Teodor Bercea | Fabio Luporini | Florian Rathgeber
[1] Paul H. J. Kelly,et al. Optimized code generation for finite element local assembly using symbolic manipulation , 2013, TOMS.
[2] Garth N. Wells,et al. Optimizations for quadrature representations of finite element tensors through automated code generation , 2011, TOMS.
[3] Chun Chen,et al. Speeding up Nek5000 with autotuning and specialization , 2010, ICS '10.
[4] Richard Veras,et al. A stencil compiler for short-vector SIMD architectures , 2013, ICS '13.
[5] I. Z. Reguly,et al. Vectorizing Unstructured Mesh Computations for Many-core Architectures , 2014, PMAM'14.
[6] Uday Bondhugula,et al. A practical automatic polyhedral parallelizer and locality optimizer , 2008, PLDI '08.
[7] Robert Michael Kirby,et al. From h to p efficiently: Implementing finite and spectral/hp element methods to achieve optimal performance for low- and high-order discretisations , 2010, J. Comput. Phys.
[8] Franz Franchetti,et al. SPIRAL: Code Generation for DSP Transforms , 2005, Proceedings of the IEEE.
[9] Sriram Krishnamoorthy,et al. Performance optimization of tensor contraction expressions for many-body methods in quantum chemistry , 2009, The Journal of Physical Chemistry A.
[10] Matthew G. Knepley,et al. Finite Element Integration on GPUs , 2013, TOMS.
[11] Nikolaus A. Adams,et al. 11 PFLOP/s simulations of cloud cavitation collapse , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[12] Robert J. Harrison,et al. Model-Driven SIMD Code Generation for a Multi-resolution Tensor Kernel , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[13] David A. Ham,et al. Towards generating optimised finite element solvers for GPUs from high-level specifications , 2010, ICCS.
[14] Matthew G. Knepley,et al. Optimizing the Evaluation of Finite Element Matrices , 2005, SIAM J. Sci. Comput..
[15] Lawrence Mitchell,et al. PyOP2: A High-Level Framework for Performance-Portable Simulations on Unstructured Meshes , 2012, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis.
[16] Krzysztof Banas,et al. Vectorized OpenCL implementation of numerical integration for higher order finite elements , 2013, Comput. Math. Appl..
[17] Anders Logg,et al. A compiler for variational forms , 2006, TOMS.
[18] Jack J. Dongarra,et al. Automatically Tuned Linear Algebra Software , 1998, Proceedings of the IEEE/ACM SC98 Conference.
[19] Anders Logg,et al. Unified form language: A domain-specific language for weak formulations of partial differential equations , 2012, TOMS.
[20] Anders Logg,et al. Automated Solution of Differential Equations by the Finite Element Method: The FEniCS Book , 2012.
[21] Bradley C. Kuszmaul,et al. The pochoir stencil compiler , 2011, SPAA '11.
[22] Eric Darve,et al. Liszt: A domain specific language for building portable mesh-based PDE solvers , 2011, 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[23] Steven G. Johnson,et al. The Design and Implementation of FFTW3 , 2005, Proceedings of the IEEE.
[24] Markus Püschel,et al. A Basic Linear Algebra Compiler , 2014, CGO '14.
[25] Lawrence Mitchell,et al. Performance-Portable Finite Element Assembly Using PyOP2 and FEniCS , 2013, ISC.