MALMM: A multi-array architecture for large-scale matrix multiplication on FPGA
[1] Veljko M. Milutinovic, et al. FPGA accelerator for floating-point matrix multiplication, 2012, IET Comput. Digit. Tech.
[2] Yong Dou, et al. 64-bit floating-point FPGA matrix multiplication, 2005, FPGA '05.
[3] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[4] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[5] Dumitru Erhan, et al. Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Jason Cong, et al. Minimizing Computation in Convolutional Neural Networks, 2014, ICANN.
[7] Junzhong Shen, et al. Towards a Multi-array Architecture for Accelerating Large-scale Matrix Multiplication on FPGAs, 2018 IEEE International Symposium on Circuits and Systems (ISCAS).
[8] Siddharth Joshi, et al. FPGA Based High Performance Double-Precision Matrix Multiplication, 2009, VLSI Design.
[9] Viktor K. Prasanna, et al. Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on Reconfigurable Computing Systems, 2007, IEEE Transactions on Parallel and Distributed Systems.
[10] Junzhong Shen, et al. FPGA-accelerated deep convolutional neural networks for high throughput and energy efficiency, 2017, Concurr. Comput. Pract. Exp.
[11] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).