OpenMP Device Offloading to FPGAs Using the Nymble Infrastructure
Kentaro Sano | Andreas Koch | Jens Huthmann | Artur Podobas | Lukas Sommer
[1] Jan Langer,et al. OmpSs@Zynq all-programmable SoC ecosystem , 2014, FPGA.
[2] Thomas Steinke,et al. OpenMP to FPGA Offloading Prototype Using OpenCL SDK , 2019, 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[3] John L. Gustafson,et al. Beating Floating Point at its Own Game: Posit Arithmetic , 2017, Supercomput. Front. Innov..
[4] Andreas Koch,et al. Synthesis of interleaved multithreaded accelerators from OpenMP loops , 2017, 2017 International Conference on ReConFigurable Computing and FPGAs (ReConFig).
[5] M. Mitchell Waldrop,et al. The chips are down for Moore’s law , 2016, Nature.
[6] Peter Lindstrom. Universal Coding of the Reals using Bisection , 2019, CoNGA'19.
[7] Andreas Koch,et al. Hardware/software co-compilation with the Nymble system , 2013, 2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC).
[8] R. C. Whaley,et al. Minimizing development and maintenance costs in supporting persistently optimized BLAS , 2005, Softw. Pract. Exp..
[9] Artur Podobas. Accelerating Parallel Computations with OpenMP-Driven System-on-Chip Generation for FPGAs , 2014, 2014 IEEE 8th International Symposium on Embedded Multicore/Manycore SoCs.
[10] Eduard Ayguadé,et al. Application Acceleration on FPGAs with OmpSs@FPGA , 2018, 2018 International Conference on Field-Programmable Technology (FPT).
[11] Ben H. H. Juurlink,et al. Nexus#: A Distributed Hardware Task Manager for Task-Based Programming Models , 2015, 2015 IEEE International Parallel and Distributed Processing Symposium.
[12] Woody Sherman,et al. Molecular Dynamics Range-Limited Force Evaluation Optimized for FPGAs , 2019, 2019 IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[13] Andreas Koch,et al. Optimized high-level synthesis of SMT multi-threaded hardware accelerators , 2015, 2015 International Conference on Field Programmable Technology (FPT).
[14] Weng-Fai Wong,et al. Generating hardware from OpenMP programs , 2006, 2006 IEEE International Conference on Field Programmable Technology.
[15] Satoshi Matsuoka,et al. Designing and accelerating spiking neural networks using OpenCL for FPGAs , 2017, 2017 International Conference on Field Programmable Technology (ICFPT).
[16] Satoshi Matsuoka,et al. Hardware Implementation of POSITs and Their Application in FPGAs , 2018, 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[17] Hiroyuki Takizawa,et al. Scaling Performance for N-Body Stream Computation with a Ring of FPGAs , 2019, HEART.
[18] Mats Brorsson,et al. Empowering OpenMP with automatically generated hardware , 2016, 2016 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS).
[19] Michael Philippsen,et al. OpenMP on FPGAs - A Survey , 2019, IWOMP.
[20] Daniel D. Gajski,et al. High-Level Synthesis: Introduction to Chip and System Design , 1992 .
[21] Satoshi Matsuoka,et al. Evaluating and Optimizing OpenCL Kernels for High Performance Computing with FPGAs , 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[22] Toni Cortes,et al. PARAVER: A Tool to Visualize and Analyze Parallel Code , 2007 .
[23] Tian Jin,et al. Offloading Support for OpenMP in Clang and LLVM , 2016, 2016 Third Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC).
[24] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[25] John Freeman,et al. From OpenCL to high-performance hardware on FPGAs , 2012, 22nd International Conference on Field Programmable Logic and Applications (FPL).
[26] Satoshi Matsuoka,et al. High-Performance High-Order Stencil Computation on FPGAs Using OpenCL , 2018, 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[27] Jason Helge Anderson,et al. From software threads to parallel hardware in high-level synthesis for FPGAs , 2013, 2013 International Conference on Field-Programmable Technology (FPT).
[28] Eduard Ayguadé,et al. OpenMP extensions for FPGA accelerators , 2009, 2009 International Symposium on Systems, Architectures, Modeling, and Simulation.
[29] Minh N. Do,et al. Youn-Long Steve Lin , 1992 .
[30] Guido Araujo,et al. Automatic Offloading of Cluster Accelerators , 2018, 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[31] Vikram S. Adve,et al. LLVM: a compilation framework for lifelong program analysis & transformation , 2004, International Symposium on Code Generation and Optimization, 2004. CGO 2004..
[32] Alessandro Cilardo,et al. Efficient and scalable OpenMP-based system-level design , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[33] Andreas Koch,et al. OpenMP device offloading to FPGA accelerators , 2017, 2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP).