MKPipe: a compiler framework for optimizing multi-kernel workloads in OpenCL for FPGA
Ji Liu | Huiyang Zhou | Xipeng Shen | Abdullah-Al Kafi
[1] Mike Hutton. Stratix® 10: 14nm FPGA delivering 1GHz, 2015, 2015 IEEE Hot Chips 27 Symposium (HCS).
[2] Martin C. Herbordt, et al. An Empirically Guided Optimization Framework for FPGA OpenCL, 2018, 2018 International Conference on Field-Programmable Technology (FPT).
[3] Alfred V. Aho, et al. Compilers: Principles, Techniques, and Tools, 1986, Addison-Wesley series in computer science / World student series edition.
[4] Kevin Skadron, et al. Pannotia: Understanding irregular GPGPU graph applications, 2013, 2013 IEEE International Symposium on Workload Characterization (IISWC).
[5] Wei Zheng, et al. Design of FPGA based high-speed data acquisition and real-time data processing system on J-TEXT tokamak, 2014.
[6] Jason Cong, et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks, 2015, FPGA.
[7] Dieter Schmalstieg, et al. Whippletree, 2014, ACM Trans. Graph.
[8] Wu-chun Feng, et al. Accelerating Workloads on FPGAs via OpenCL: A Case Study with OpenDwarfs, 2016.
[9] Timo Aila, et al. Understanding the efficiency of ray traversal on GPUs, 2009, High Performance Graphics.
[10] Jing Li, et al. Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network, 2017, FPGA.
[11] Alfred V. Aho, et al. Compilers: Principles, Techniques, and Tools (2nd Edition), 2006.
[12] Dong Wang, et al. PipeCNN: An OpenCL-based open-source FPGA accelerator for convolution neural networks, 2017, 2017 International Conference on Field Programmable Technology (ICFPT).
[13] Pingfan Meng, et al. Spector: An OpenCL FPGA benchmark suite, 2016, 2016 International Conference on Field-Programmable Technology (FPT).
[14] Alan D. George, et al. Comparative analysis of OpenCL vs. HDL with image-processing kernels on Stratix-V FPGA, 2015, 2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP).
[15] Adel A. El-Zoghabi, et al. Optimized implementation of OpenCL kernels on FPGAs, 2019, J. Syst. Archit.
[16] Hari Angepat, et al. Serving DNNs in Real Time at Datacenter Scale with Project Brainwave, 2018, IEEE Micro.
[17] Wei Zhang, et al. A performance analysis framework for optimizing OpenCL applications on FPGAs, 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[18] Wen-mei W. Hwu, et al. Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing, 2012.
[19] Wei Zhang, et al. A study of data partitioning on OpenCL-based FPGAs, 2015, 2015 25th International Conference on Field Programmable Logic and Applications (FPL).
[20] Chen Yang, et al. OpenCL for HPC with FPGAs: Case study in molecular electrostatics, 2017, 2017 IEEE High Performance Extreme Computing Conference (HPEC).
[21] Satoshi Matsuoka, et al. Evaluating and Optimizing OpenCL Kernels for High Performance Computing with FPGAs, 2016, SC16: International Conference for High Performance Computing, Networking, Storage and Analysis.
[22] Satoshi Matsuoka, et al. Combined Spatial and Temporal Blocking for High-Performance Stencil Computation on FPGAs Using OpenCL, 2018, FPGA.
[23] Chris Lattner, et al. LLVM: An Infrastructure for Multi-Stage Optimization, 2000.
[24] Jeff A. Stuart, et al. A study of Persistent Threads style GPU programming for GPGPU workloads, 2012, 2012 Innovative Parallel Computing (InPar).
[25] Huiyang Zhou, et al. Tuning Stencil codes in OpenCL for FPGAs, 2016, 2016 IEEE 34th International Conference on Computer Design (ICCD).
[26] Wenguang Chen, et al. VersaPipe: A Versatile Programming Framework for Pipelined Computing on GPU, 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).