DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator
Hiroshi Inoue | Swagath Venkataramani | Moriyoshi Ohara | Vijayalakshmi Srinivasan | Wei Wang | Kazuaki Ishizaki | Leland Chang | Jungwook Choi | Marcel Schaal | Mauricio J. Serrano | Eri Ogawa | Jintao Zhang | Kailash Gopalakrishnan
[1] Haichen Shen, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, 2018, OSDI.
[2] Joel Silberman, et al. A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference, 2018, 2018 IEEE Symposium on VLSI Circuits.
[3] Eric S. Chung, et al. A Configurable Cloud-Scale DNN Processor for Real-Time AI, 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[4] Pradeep Dubey, et al. SCALEDEEP: A scalable compute architecture for learning and evaluating deep networks, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[5] Swagath Venkataramani, et al. POSTER: Design Space Exploration for Performance Optimization of Deep Neural Networks on Shared Memory Accelerators, 2017, 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT).
[6] David A. Patterson, et al. In-datacenter performance analysis of a tensor processing unit, 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[7] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[8] D. Scott Cyphers, et al. Intel® nGraph™, 2018.
[9] Bertrand A. Maher, et al. Glow: Graph Lowering Compiler Techniques for Neural Networks, 2018, ArXiv.