An optimized task-based runtime system for resource-constrained parallel accelerators
[1] Spiros N. Agathos, et al. Design and Implementation of OpenMP Tasks in the OMPi Compiler, 2011, 2011 15th Panhellenic Conference on Informatics.
[2] Alejandro Duran, et al. An adaptive cut-off for task parallelism, 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.
[3] Barbara M. Chapman, et al. Implementing OpenMP on a high performance embedded multicore MPSoC, 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.
[4] Mats Brorsson, et al. A comparative performance study of common and popular task-centric programming frameworks, 2015, Concurr. Comput. Pract. Exp.
[5] Alejandro Duran, et al. Evaluation of OpenMP Task Scheduling Strategies, 2008, IWOMP.
[6] Alejandro Duran, et al. The Design of OpenMP Tasks, 2009, IEEE Transactions on Parallel and Distributed Systems.
[7] Cheng Wang, et al. libEOMP: a portable OpenMP runtime library based on MCA APIs for embedded systems, 2013, PMAM '13.
[8] Kazuki Sakamoto, et al. Grand Central Dispatch, 2012.
[9] Luca Benini, et al. Simplifying Many-Core-Based Heterogeneous SoC Programming With Offload Directives, 2015, IEEE Transactions on Industrial Informatics.
[10] Karl-Filip Faxén, et al. Wool-A work stealing library, 2008, CARN.
[11] Luca Benini, et al. VirtualSoC: A Full-System Simulation Environment for Massively Parallel Heterogeneous System-on-Chip, 2013, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum.
[12] Luca Benini, et al. Platform 2012, a many-core computing accelerator for embedded SoCs: Performance evaluation of visual analytics applications, 2012, DAC Design Automation Conference 2012.
[13] Bradley C. Kuszmaul, et al. Cilk: an efficient multithreaded runtime system, 1995, PPOPP '95.
[14] James Reinders, et al. Intel® threading building blocks, 2008.
[15] Luca Benini, et al. Enabling fine-grained OpenMP tasking on tightly-coupled shared memory clusters, 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[16] Chris D. Marlin. Coroutines: A Programming Methodology, a Language Design and an Implementation, 1980, Lecture Notes in Computer Science.
[17] Christopher J. Hughes, et al. Carbon: architectural support for fine-grained parallelism on chip multiprocessors, 2007, ISCA '07.