Geoffrey E. Hinton | Lukasz Kaiser | George Tucker | Jan Chorowski | Gabriel Pereyra
[1] E. Jaynes. Information Theory and Statistical Mechanics, 1957.
[2] Jing Peng, et al. Function Optimization using Connectionist Reinforcement Learning Algorithms, 1991.
[3] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[4] Kenneth Rose, et al. A global optimization technique for statistical classifier design, 1996, IEEE Trans. Signal Process.
[5] Adam L. Berger, et al. A Maximum Entropy Approach to Natural Language Processing, 1996, CL.
[6] K. Rose. Deterministic annealing for clustering, compression, classification, regression, and related optimization problems, 1998, Proc. IEEE.
[7] Rich Caruana, et al. Model compression, 2006, KDD '06.
[8] Jürgen Schmidhuber, et al. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks, 2006, ICML.
[9] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[10] Geoffrey E. Hinton, et al. Acoustic Modeling Using Deep Belief Networks, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[12] Geoffrey E. Hinton, et al. Speech recognition with deep recurrent neural networks, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[13] Navdeep Jaitly, et al. Towards End-To-End Speech Recognition with Recurrent Neural Networks, 2014, ICML.
[14] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[15] László Tóth, et al. Combining time- and frequency-domain convolution in convolutional neural network-based phone recognition, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Kenneth Heafield, et al. N-gram Counts and Language Models from the Common Crawl, 2014, LREC.
[17] Wojciech Zaremba, et al. Recurrent Neural Network Regularization, 2014, ArXiv.
[18] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[19] Yoshua Bengio, et al. On Using Very Large Target Vocabulary for Neural Machine Translation, 2014, ACL.
[20] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[21] Dumitru Erhan, et al. Training Deep Neural Networks on Noisy Labels with Bootstrapping, 2014, ICLR.
[22] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[23] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[24] Shin Ishii, et al. Distributional Smoothing with Virtual Adversarial Training, 2015, ICLR 2016.
[25] Quoc V. Le, et al. Listen, Attend and Spell, 2015, ArXiv.
[26] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[28] Zoubin Ghahramani, et al. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks, 2015, NIPS.
[29] Dale Schuurmans, et al. Reward Augmented Maximum Likelihood for Neural Structured Prediction, 2016, NIPS.
[30] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Yonghui Wu, et al. Exploring the Limits of Language Modeling, 2016, ArXiv.
[33] Shin Ishii, et al. Distributional Smoothing by Virtual Adversarial Examples, 2015, ICLR.
[34] Kilian Q. Weinberger, et al. Deep Networks with Stochastic Depth, 2016, ECCV.
[35] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[36] Qi Tian, et al. DisturbLabel: Regularizing CNN on the Loss Layer, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[37] Yoshua Bengio, et al. End-to-end attention-based large vocabulary speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[38] Xinyun Chen. Delving into Transferable Adversarial Examples and Black-box Attacks, 2016, ICLR 2017.
[39] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[40] Wei Xu, et al. Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation, 2016, TACL.
[41] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, ArXiv.
[42] Yu Zhang, et al. Latent Sequence Decompositions, 2016, ICLR.
[43] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Lior Wolf, et al. Using the Output Embedding to Improve Language Models, 2016, EACL.
[45] Jürgen Schmidhuber, et al. Recurrent Highway Networks, 2016, ICML.
[46] Navdeep Jaitly, et al. Learning online alignments with continuous rewards policy gradient, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[47] Gregory Shakhnarovich, et al. FractalNet: Ultra-Deep Neural Networks without Residuals, 2016, ICLR.
[48] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.