Bridging the Gap Between OpenMP and Task-Based Runtime Systems for the Fast Multipole Method
Emmanuel Agullo | Bérenger Bramas | Olivier Coulaud | Olivier Aumage | Samuel Pitoiset
[1] Samuel Williams, et al. The Landscape of Parallel Computing Research: A View from Berkeley, 2006.
[2] Alejandro Duran, et al. The Design of OpenMP Tasks, 2009, IEEE Transactions on Parallel and Distributed Systems.
[3] Hatem Ltaief, et al. Data‐driven execution of fast multipole methods, 2012, Concurr. Comput. Pract. Exp.
[4] Bronis R. de Supinski, et al. A ROSE-Based OpenMP 3.0 Research Compiler Supporting Multiple Runtime Libraries, 2010, IWOMP.
[5] Alexander Aiken, et al. Regent: a high-productivity programming language for HPC with logical regions, 2015, SC15: International Conference for High Performance Computing, Networking, Storage and Analysis.
[6] Matteo Frigo, et al. The implementation of the Cilk-5 multithreaded language, 1998, PLDI.
[7] Laxmikant V. Kale, et al. Programming heterogeneous clusters with accelerators using object-based programming, 2011.
[8] Benoit Lange, et al. Parallel Dual Tree Traversal on Multi-core and Many-core Architectures for Astrophysical N-body Simulations, 2014, Euro-Par.
[9] Leslie Greengard, et al. A fast algorithm for particle simulations, 1987.
[10] Eduard Ayguadé, et al. Implementing OmpSs support for regions of data in architectures with multiple address spaces, 2013, ICS '13.
[11] Eduard Ayguadé, et al. OpenMP tasks in IBM XL compilers, 2008, CASCON '08.
[12] Samuel Williams, et al. Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).
[13] James LaGrone, et al. OpenMP 3.0 Tasking Implementation in OpenUH, 2009.
[14] Eric F. Darve, et al. Fast hierarchical algorithms for generating Gaussian random fields, 2015.
[15] Richard W. Vuduc, et al. Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method, 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[16] Thomas Hérault, et al. DAGuE: A Generic Distributed DAG Engine for High Performance Computing, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[17] Henri Casanova, et al. Parallel Algorithms, 2019, Design and Analysis of Algorithms.
[18] Lorena A. Barba, et al. A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems, 2011, Int. J. High Perform. Comput. Appl.
[19] Bruno Raffin, et al. Locality-Aware Work Stealing on Multi-CPU and Multi-GPU Architectures, 2013.
[20] Emmanuel Agullo, et al. Task-Based FMM for Multicore Architectures, 2014, SIAM J. Sci. Comput.
[21] Cédric Augonnet, et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, 2011, Concurr. Comput. Pract. Exp.
[22] Emmanuel Agullo, et al. Task‐based FMM for heterogeneous architectures, 2016, Concurr. Comput. Pract. Exp.
[23] Alejandro Duran, et al. Mercurium: Design Decisions for a S2S Compiler, 2011.