Torsten Hoefler | Tiziano De Matteis | Johannes de Fine Licht
[1] Satoshi Matsuoka, et al. Combined Spatial and Temporal Blocking for High-Performance Stencil Computation on FPGAs Using OpenCL, 2018, FPGA.
[2] Bingsheng He, et al. Multikernel Data Partitioning With Channel on OpenCL-Based FPGAs, 2017, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[3] Veljko M. Milutinovic, et al. FPGA accelerator for floating-point matrix multiplication, 2012, IET Comput. Digit. Tech.
[4] Eriko Nurvitadhi, et al. Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks?, 2017, FPGA.
[5] Pingfan Meng, et al. Spector: An OpenCL FPGA benchmark suite, 2016, 2016 International Conference on Field-Programmable Technology (FPT).
[6] Karthikeyan Sankaralingam, et al. Dark Silicon and the End of Multicore Scaling, 2012, IEEE Micro.
[7] M. Mitchell Waldrop, et al. The chips are down for Moore’s law, 2016, Nature.
[8] Yun Liang, et al. Lin-Analyzer: A high-level performance analysis tool for FPGA-based accelerators, 2016, 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC).
[9] Yao-Wen Chang, et al. FPGA placement and routing, 2017, 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[10] Ernst Houtgast, et al. High Performance Streaming Smith-Waterman Implementation with Implicit Synchronization on Intel FPGA using OpenCL, 2017, 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering (BIBE).
[11] Marco D. Santambrogio, et al. Architectural optimizations for high performance and energy efficient Smith-Waterman implementation on FPGAs using OpenCL, 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.
[12] Mark Horowitz, et al. 1.1 Computing's energy problem (and what we can do about it), 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[13] Eriko Nurvitadhi, et al. Customizable FPGA OpenCL matrix multiply design template for deep neural networks, 2017, 2017 International Conference on Field Programmable Technology (ICFPT).
[14] Peng Zhang, et al. HLScope+: Fast and accurate performance estimation for FPGA HLS, 2017, 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[15] H. T. Kung, et al. Systolic Arrays (for VLSI), 1978.
[16] Marco Danelutto, et al. Mammut: High-level management of system knobs and sensors, 2017, SoftwareX.
[17] Eriko Nurvitadhi, et al. A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform: A Deep Learning Case Study, 2018, FPGA.
[18] Viktor K. Prasanna, et al. High-Performance Designs for Linear Algebra Operations on Reconfigurable Hardware, 2008, IEEE Transactions on Computers.
[19] Torsten Hoefler, et al. Flexible Communication Avoiding Matrix Multiplication on FPGA with High-Level Synthesis, 2019, FPGA.
[20] Erik H. D'Hollander. High-Level Synthesis Optimization for Blocked Floating-Point Matrix Multiplication, 2017, CARN.
[21] Olivier Giroux, et al. Volta: Performance and Programmability, 2018, IEEE Micro.
[22] John D. Davis, et al. BLAS Comparison on FPGA, CPU and GPU, 2010, 2010 IEEE Computer Society Annual Symposium on VLSI.
[23] Mikhail J. Atallah, et al. Algorithms and Theory of Computation Handbook, 2009, Chapman & Hall/CRC Applied Algorithms and Data Structures series.
[24] Christian Plessl, et al. Flexible FPGA design for FDTD using OpenCL, 2017, 2017 27th International Conference on Field Programmable Logic and Applications (FPL).
[25] Torsten Hoefler, et al. Transformations of High-Level Synthesis Codes for High-Performance Computing, 2018, IEEE Transactions on Parallel and Distributed Systems.
[26] Christian Plessl, et al. OpenCL Implementation of Cannon’s Matrix Multiplication Algorithm on Intel Stratix 10 FPGAs, 2019, 2019 International Conference on Field-Programmable Technology (ICFPT).
[27] Ralph Wittig, et al. OpenCL library of stream memory components targeting FPGAs, 2015, 2015 International Conference on Field Programmable Technology (FPT).
[28] J. Demmel, et al. Sun Microsystems, 1996.
[29] David A. Patterson, et al. A new golden age for computer architecture, 2019, Commun. ACM.
[30] Philip Brisk, et al. HLSPredict: Cross Platform Performance Prediction for FPGA High-Level Synthesis, 2018, 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[31] Yoshiki Yamaguchi, et al. A Block-Based Systolic Array on an HBM2 FPGA for DNA Sequence Alignment, 2020, ARC.