Paper information - EcoRNN: Efficient Computing of LSTM RNN on GPUs - 字舞流文

EcoRNN: Efficient Computing of LSTM RNN on GPUs

Gennady Pekhimenko | Bojian Zheng

[1] Zheng Zhang, et al. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems, 2015, ArXiv.

[2] Ken Kennedy, et al. Automatic data layout for distributed-memory machines, 1998, TOPL.

[3] John Tran, et al. cuDNN: Efficient Primitives for Deep Learning, 2014, ArXiv.

[4] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[5] Phil Blunsom, et al. Optimizing Performance of Recurrent Neural Networks on GPUs, 2016, ArXiv.

[6] Haichen Shen, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, 2018.

[7] Albert Cohen, et al. Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions, 2018, ArXiv.

[8] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.

[9] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.