libKOMP, an Efficient OpenMP Runtime System for Both Fork-Join and Data Flow Paradigms
Thierry Gautier | Vincent Danjean | François Broquedis
[1] Jesper Larsson Träff, et al. Euro-Par 2010 Parallel Processing Workshops - HeteroPar, HPCC, HiBB, CoreGrid, UCHPC, HPCF, PROPER, CCPI, VHPC, Ischia, Italy, August 31-September 3, 2010, Revised Selected Papers, 2011, Euro-Par Workshops.
[2] Alejandro Duran, et al. Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP, 2009, 2009 International Conference on Parallel Processing.
[3] Thierry Gautier, et al. The X-Kaapi's Application Programming Interface. Part I: Data Flow Programming, 2011.
[4] Stephen L. Olivier, et al. Scheduling task parallelism on multi-socket multicore systems, 2011, ROSS '11.
[5] Alejandro Duran, et al. Evaluation of OpenMP Task Scheduling Strategies, 2008, IWOMP.
[6] Jesús Labarta, et al. Parallelizing dense and banded linear algebra libraries using SMPSs, 2009, Concurr. Comput. Pract. Exp.
[7] Jack Dongarra, et al. QUARK Users' Guide: QUeueing And Runtime for Kernels, 2011.
[8] Barbara M. Chapman, et al. A Runtime Implementation of OpenMP Tasks, 2011, IWOMP.
[9] Emilio Luque, et al. Euro-Par 2008 - Parallel Processing, 14th International Euro-Par Conference, Las Palmas de Gran Canaria, Spain, August 26-29, 2008, Proceedings, 2008, Euro-Par.
[10] Jack Dongarra, et al. Scheduling dense linear algebra operations on multicore processors, 2010.
[11] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPOPP '95.
[12] Maged M. Michael, et al. Idempotent work stealing, 2009, PPoPP '09.
[13] Bronis R. de Supinski, et al. Evolving OpenMP in an Age of Extreme Parallelism, 5th International Workshop on OpenMP, IWOMP 2009, Dresden, Germany, June 3-5, 2009, Proceedings, 2009, IWOMP.
[14] William Gropp, et al. OpenMP in the Petascale Era - 7th International Workshop on OpenMP, IWOMP 2011, Chicago, IL, USA, June 13-15, 2011. Proceedings, 2011, IWOMP.
[15] Alejandro Duran, et al. A Proposal to Extend the OpenMP Tasking Model for Heterogeneous Architectures, 2009, IWOMP.
[16] Jesús Labarta, et al. Parallelizing dense and banded linear algebra libraries using SMPSs, 2009.
[17] Nir Shavit, et al. Flat combining and the synchronization-parallelism tradeoff, 2010, SPAA '10.
[18] Bogdan Dumitrescu, et al. Two-dimensional block partitionings for the parallel sparse Cholesky factorization, 2004, Numerical Algorithms.
[19] Thierry Gautier, et al. X-Kaapi C programming interface, 2011.
[20] Vivek Sarkar, et al. X10: an object-oriented approach to non-uniform cluster computing, 2005, OOPSLA '05.
[21] Michael Voss, et al. Optimization via Reflection on Work Stealing in TBB, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[22] Jérémie Allard, et al. Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations, 2010, Euro-Par.
[23] Nir Shavit, et al. Non-blocking steal-half work queues, 2002, PODC '02.
[24] Denis Trystram, et al. A Tighter Analysis of Work Stealing, 2010, ISAAC.
[25] Gerson G. H. Cavalheiro, et al. Athapascan-1: On-line building data flow graph in a parallel language, 1998, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192).
[26] Allan Porterfield, et al. OpenMP task scheduling strategies for multicore NUMA systems, 2012, Int. J. High Perform. Comput. Appl.
[27] Julien Langou, et al. A Class of Parallel Tiled Linear Algebra Algorithms for Multicore Architectures, 2007, Parallel Comput.
[28] Thierry Gautier, et al. KAAPI: A thread scheduling runtime system for data flow computations on cluster of multi-processors, 2007, PASCO '07.
[29] Bradford L. Chamberlain, et al. Parallel Programmability and the Chapel Language, 2007, Int. J. High Perform. Comput. Appl.
[30] Jack Dongarra, et al. Dynamically scheduled Cholesky factorization on multicore architectures with GPU accelerators, 2010, HiPC 2010.
[31] C. Greg Plaxton, et al. Thread Scheduling for Multiprogrammed Multiprocessors, 1998, SPAA '98.
[32] Domenico Talia, et al. Euro-Par 2010 - Parallel Processing, 2010, Lecture Notes in Computer Science.
[33] Thierry Gautier, et al. Fine Grain Distributed Implementation of a Dataflow Language with Provable Performances, 2007, International Conference on Computational Science.
[34] Bruno Raffin, et al. A Work Stealing Scheduler for Parallel Loops on Shared Cache Multicores, 2010, Euro-Par Workshops.
[35] Spiros N. Agathos, et al. Design and Implementation of OpenMP Tasks in the OMPi Compiler, 2011, 2011 15th Panhellenic Conference on Informatics.
[36] Matteo Frigo, et al. The implementation of the Cilk-5 multithreaded language, 1998, PLDI.
[37] Thierry Gautier, et al. Deque-Free Work-Optimal Parallel STL Algorithms, 2008, Euro-Par.
[38] Alejandro Duran, et al. Extending the OpenMP Tasking Model to Allow Dependent Tasks, 2008, IWOMP.
[39] Bronis R. de Supinski, et al. OpenMP in a New Era of Parallelism, 4th International Workshop, IWOMP 2008, West Lafayette, IN, USA, May 12-14, 2008, Proceedings, 2008, IWOMP.