Ilya Sutskever | Alec Radford | Scott Gray | Rewon Child