DuctTeip: A task-based parallel programming framework for distributed memory architectures
[1] Jack J. Dongarra, et al. Implementing Linear Algebra Routines on Multi-core Processors with Pipelining and a Look Ahead, 2006, PARA.
[2] Cédric Augonnet, et al. StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators, 2012, EuroMPI.
[3] Katherine A. Yelick, et al. Hybrid PGAS runtime support for multicore nodes, 2010, PGAS '10.
[4] Cédric Augonnet, et al. StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, 2011, Concurr. Comput. Pract. Exp.
[5] David Padua, et al. Encyclopedia of Parallel Computing, 2011.
[6] Thomas Hérault, et al. PaRSEC: Exploiting Heterogeneity to Enhance Scalability, 2013, Computing in Science & Engineering.
[7] Pavol Bauer, et al. Fast event-based epidemiological simulations on national scales, 2015, Int. J. High Perform. Comput. Appl.
[8] Martin Tillenius, et al. SuperGlue: A Shared Memory Framework Using Data Versioning for Dependency-Aware Task-Based Parallelization, 2015, SIAM J. Sci. Comput.
[9] Elisabeth Larsson, et al. Resource-Aware Task Scheduling, 2015, ACM Trans. Embed. Comput. Syst.
[10] Thomas Hérault, et al. Algorithm-based fault tolerance for dense matrix factorizations, 2012, PPoPP '12.
[11] Elisabeth Larsson, et al. Programming Models Based on Data Versioning for Dependency-aware Task-based Parallelisation, 2012, 2012 IEEE 15th International Conference on Computational Science and Engineering.
[12] Thomas Hérault, et al. Algorithm-Based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy, 2015, ACM Trans. Parallel Comput.
[13] David Black-Schaffer, et al. Towards more efficient execution: a decoupled access-execute approach, 2013, ICS '13.
[14] Thomas Hérault, et al. DAGuE: A Generic Distributed DAG Engine for High Performance Computing, 2011, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum.
[15] Laxmikant V. Kalé, et al. CHARM++: a portable concurrent object oriented system based on C++, 1993, OOPSLA '93.
[16] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPOPP '95.
[17] George Almási. PGAS (Partitioned Global Address Space) Languages, 2011, Encyclopedia of Parallel Computing.
[18] Jesús Labarta, et al. A dependency-aware task-based programming environment for multi-core architectures, 2008, 2008 IEEE International Conference on Cluster Computing.
[19] Jesús Labarta, et al. ClusterSs: a task-based programming model for clusters, 2011, HPDC '11.
[20] Guillaume Mercier, et al. hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing.
[21] Sverker Holmgren, et al. Dynamic Autotuning of Adaptive Fast Multipole Methods on Hybrid Multicore CPU and GPU Systems, 2013, SIAM J. Sci. Comput.
[22] Emanuel H. Rubensson, et al. Chunks and Tasks: A programming model for parallelization of dynamic algorithms, 2012, Parallel Comput.
[23] Elisabeth Larsson, et al. A scalable RBF-FD method for atmospheric flow, 2015, J. Comput. Phys.