A Portability Layer of an All-pairs Operation for Hierarchical N-Body Algorithm Framework Tapas
Motohiko Matsuda | Keisuke Fukuda | Naoya Maruyama
[1] W. Paul Cockshott,et al. Array languages and the N‐body problem , 2014, Concurr. Comput. Pract. Exp..
[2] Simon L. Peyton Jones,et al. Harnessing the Multicores: Nested Data Parallelism in Haskell , 2008, FSTTCS.
[3] Zhenjiang Hu,et al. A library of constructive skeletons for sequential style of parallel programming , 2006, InfoScale '06.
[4] Hans-Wolfgang Loidl,et al. Parallel Haskell implementations of the N‐body problem , 2014, Concurr. Comput. Pract. Exp..
[5] Christoph W. Kessler,et al. SkePU: a multi-backend skeleton programming library for multi-GPU systems , 2010, HLPP '10.
[6] Ade Miller,et al. C++ AMP: Accelerated Massive Parallelism with Microsoft Visual C++ , 2012 .
[7] Chee Keong Kwoh,et al. Pairwise Distance Matrix Computation for Multiple Sequence Alignment on the Cell Broadband Engine , 2009, ICCS.
[8] David Tarditi,et al. Accelerator: using data parallelism to program GPUs for general-purpose uses , 2006, ASPLOS XII.
[9] Rio Yokota,et al. An FMM Based on Dual Tree Traversal for Many-Core Architectures , 2012, ArXiv.
[10] Srinivas Aluru,et al. All-pairs computations on many-core graphics processors , 2013, Parallel Comput..
[11] Simon L. Peyton Jones,et al. Work efficient higher-order vectorisation , 2012, ICFP '12.
[12] Clemens Grelck,et al. Merging Compositions of Array Skeletons in SAC , 2005, PARCO.
[13] Spencer Rugaber,et al. Programming with idioms in APL , 1979, APL '79.
[14] Clemens Grelck,et al. SAC—A Functional Array Language for Efficient Multi-threaded Execution , 2006, International Journal of Parallel Programming.
[15] Satoshi Matsuoka,et al. Tapas: An Implicitly Parallel Programming Framework for Hierarchical N-Body Algorithms , 2016, 2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS).
[16] H. Carter Edwards,et al. Kokkos: Enabling Performance Portability Across Manycore Architectures , 2013, 2013 Extreme Scaling Workshop (xsw 2013).
[17] Horacio González-Vélez,et al. N‐body computations using skeletal frameworks on multicore CPU/graphics processing unit architectures: an empirical performance evaluation , 2014, Concurr. Comput. Pract. Exp..
[18] Nathan Bell,et al. Thrust: A Productivity-Oriented Library for CUDA , 2012 .
[19] Kenneth E. Iverson,et al. A programming language , 1962, AIEE-IRE '62 (Spring).
[20] Laxmikant V. Kalé,et al. Scaling Hierarchical N-body Simulations on GPU Clusters , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.
[21] Michael S. Warren,et al. A parallel hashed oct-tree N-body algorithm , 1993, Supercomputing '93. Proceedings.
[22] Clemens Grelck,et al. SaC/C formulations of the all‐pairs N‐body problem and their performance on SMPs and GPGPUs , 2014, Concurr. Comput. Pract. Exp..
[23] David C. Cann,et al. A Report on the Sisal Language Project , 1990, J. Parallel Distributed Comput..
[24] Sergei Gorlatch,et al. Introducing and Implementing the Allpairs Skeleton for Programming Multi-GPU Systems , 2013, International Journal of Parallel Programming.
[25] Daniel Sunderland,et al. Kokkos Array performance-portable manycore programming model , 2012, PMAM '12.
[26] Richard S. Bird,et al. Two exercises found in a book on algorithmics , 1987 .
[27] Ming Ouyang,et al. Compute Pairwise Manhattan Distance and Pearson Correlation Coefficient of Data Points with GPU , 2009, 2009 10th ACIS International Conference on Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing.