A LIGHTWEIGHT RUN-TIME SUPPORT FOR FAST DENSE LINEAR ALGEBRA ON MULTI-CORE
Marco Danelutto | Tiziano De Matteis | Gabriele Mencagli | Massimo Torquati | Daniele Buono