TVM: End-to-End Compilation Stack for Deep Learning
Eddie Q. Yan | Carlos Guestrin | Tianqi Chen | A. Krishnamurthy | T. Moreau | Ziheng Jiang | Haichen Shen | Leyuan Wang | Yuwei Hu | L. Ceze
[1] Lane Schwartz, et al. DLVM: A modern compiler infrastructure for deep learning systems, 2017, ICLR.
[2] Shoaib Kamil, et al. The tensor algebra compiler, 2017, Proc. ACM Program. Lang.
[3] Samuel Madden, et al. Weld: Rethinking the Interface Between Data-Intensive Applications, 2017, ArXiv.
[4] Martin Elsman, et al. Futhark: purely functional GPU-programming with nested parallelism and in-place array updates, 2017, PLDI.
[5] Bo Chen, et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, ArXiv.
[6] David A. Patterson, et al. In-datacenter performance analysis of a tensor processing unit, 2017, ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[7] Michel Steuwer, et al. LIFT: A functional data-parallel IR for high-performance GPU code generation, 2017, IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[8] Vivienne Sze, et al. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks, 2016, ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[9] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[10] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[11] Zheng Zhang, et al. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems, 2015, ArXiv.
[12] Elnar Hajiyev, et al. PENCIL: A Platform-Neutral Compute Intermediate Language for Accelerator Programming, 2015, International Conference on Parallel Architecture and Compilation (PACT).
[13] Jia Wang, et al. DaDianNao: A Machine-Learning Supercomputer, 2014, 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[14] Geoffrey Zweig, et al. An introduction to computational networks and the computational network toolkit (invited talk), 2014, INTERSPEECH.
[15] Frédo Durand, et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines, 2013, PLDI.
[16] Francky Catthoor, et al. Polyhedral parallel code generation for CUDA, 2013, TACO.
[17] Razvan Pascanu, et al. Theano: new features and speed improvements, 2012, ArXiv.