Towards a Multi-array Architecture for Accelerating Large-scale Matrix Multiplication on FPGAs
[1] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[2] Robert D. Blumofe, et al. Scheduling multithreaded computations by work stealing, 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.
[3] Siddharth Joshi, et al. FPGA Based High Performance Double-Precision Matrix Multiplication, 2009, VLSI Design.
[4] Shijie Li, et al. Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks, 2017, ACM Trans. Reconfigurable Technol. Syst.
[5] Viktor K. Prasanna, et al. Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on Reconfigurable Computing Systems, 2007, IEEE Transactions on Parallel and Distributed Systems.
[6] Viktor K. Prasanna, et al. A Library of Parameterizable Floating-Point Cores for FPGAs and Their Application to Scientific Computing, 2005, ERSA.
[7] Viktor K. Prasanna, et al. Energy- and time-efficient matrix multiplication on FPGAs, 2005, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[8] Veljko M. Milutinovic, et al. FPGA accelerator for floating-point matrix multiplication, 2012, IET Comput. Digit. Tech.
[9] Jason Cong, et al. Minimizing Computation in Convolutional Neural Networks, 2014, ICANN.
[10] Yong Dou, et al. 64-bit floating-point FPGA matrix multiplication, 2005, FPGA '05.
[11] Viktor K. Prasanna, et al. Area and time efficient implementations of matrix multiplication on FPGAs, 2002, 2002 IEEE International Conference on Field-Programmable Technology, 2002. (FPT). Proceedings.
[12] Yong Dou, et al. An FPGA Implementation for Solving the Large Single-Source-Shortest-Path Problem, 2016, IEEE Transactions on Circuits and Systems II: Express Briefs.
[13] Viktor K. Prasanna, et al. Scalable and modular algorithms for floating-point matrix multiplication on FPGAs, 2004, 18th International Parallel and Distributed Processing Symposium, 2004. Proceedings.
[14] Yong Dou, et al. High performance and memory efficient implementation of matrix multiplication on FPGAs, 2010, 2010 International Conference on Field-Programmable Technology.