Feedforward-Cutset-Free Pipelined Multiply–Accumulate Unit for the Machine Learning Accelerator
[1] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Tung Thanh Hoang, et al. A High-Speed, Energy-Efficient Two-Cycle Multiply-Accumulate (MAC) Architecture and Its Application to a Double-Throughput MAC Unit, 2010, IEEE Transactions on Circuits and Systems I: Regular Papers.
[3] Yoshua Bengio, et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations, 2015, NIPS.
[4] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[5] Vojin G. Oklobdzija, et al. Implementing multiply-accumulate operation in multiplication time, 1997, Proceedings 13th IEEE Symposium on Computer Arithmetic.
[6] Igor Carron, et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, 2016.
[7] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[8] Christoforos E. Kozyrakis, et al. TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory, 2017, ASPLOS.
[9] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[10] Earl E. Swartzlander, et al. A comparison of Dadda and Wallace multiplier delays, 2003, SPIE Optics + Photonics.
[11] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[12] Keshab K. Parhi, et al. VLSI digital signal processing systems, 1999.
[13] Marian Verhelst, et al. 14.5 Envision: A 0.26-to-10TOPS/W subword-parallel dynamic-voltage-accuracy-frequency-scalable Convolutional Neural Network processor in 28nm FDSOI, 2017, 2017 IEEE International Solid-State Circuits Conference (ISSCC).
[14] D. H. Jacobsohn, et al. A Suggestion for a Fast Multiplier, 1964, IEEE Trans. Electron. Comput.