Fast Decoding in Sequence Models using Discrete Latent Variables
Aurko Roy | Samy Bengio | Ashish Vaswani | Jakob Uszkoreit | Noam Shazeer | Niki Parmar | Łukasz Kaiser
[1] Ruslan Salakhutdinov, et al. Importance Weighted Autoencoders, 2015, ICLR.
[2] Maximilian Lam, et al. Word2Bits - Quantized Word Vectors, 2018, ArXiv.
[3] Phil Blunsom, et al. Recurrent Continuous Translation Models, 2013, EMNLP.
[4] Xuedong Huang, et al. Unified techniques for vector quantization and hidden Markov modeling using semi-continuous models, 1989, ICASSP.
[5] Alexander H. Waibel, et al. Learning state-dependent stream weights for multi-codebook HMM speech recognition systems, 1994, ICASSP.
[6] Samy Bengio, et al. Discrete Autoencoders for Sequence Models, 2018, ArXiv.
[7] Karol Gregor, et al. Neural Variational Inference and Learning in Belief Networks, 2014, ICML.
[8] José L. Pérez-Córdoba, et al. Discriminative codebook design using multiple vector quantization in HMM-based speech recognizers, 1996, IEEE Trans. Speech Audio Process.
[9] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[10] Samy Bengio, et al. Can Active Memory Replace Attention?, 2016, NIPS.
[11] Lukasz Kaiser, et al. Neural GPUs Learn Algorithms, 2015, ICLR.
[12] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[13] Geoffrey E. Hinton, et al. Reducing the Dimensionality of Data with Neural Networks, 2006, Science.
[14] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[15] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[16] Lukasz Kaiser, et al. Generating Wikipedia by Summarizing Long Sequences, 2018, ICLR.
[17] Zhiting Hu, et al. Improved Variational Autoencoders for Text Modeling using Dilated Convolutions, 2017, ICML.
[18] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[19] David Duvenaud, et al. Backpropagation through the Void: Optimizing control variates for black-box gradient estimation, 2017, ICLR.
[20] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[21] Geoffrey E. Hinton, et al. Deep Boltzmann Machines, 2009, AISTATS.
[22] Geoffrey E. Hinton, et al. Grammar as a Foreign Language, 2014, NIPS.
[23] Mei-Yuh Hwang, et al. The SPHINX speech recognition system, 1989, ICASSP.
[24] Cordelia Schmid, et al. Product Quantization for Nearest Neighbor Search, 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[25] Andriy Mnih, et al. Variational Inference for Monte Carlo Objectives, 2016, ICML.
[26] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[27] Jascha Sohl-Dickstein, et al. REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models, 2017, NIPS.
[28] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[29] Alex Graves, et al. Neural Machine Translation in Linear Time, 2016, ArXiv.
[30] Pascal Vincent, et al. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion, 2010, J. Mach. Learn. Res.
[31] Samy Bengio, et al. Generating Sentences from a Continuous Space, 2015, CoNLL.
[32] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, ArXiv.
[33] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[34] Lukasz Kaiser, et al. Depthwise Separable Convolutions for Neural Machine Translation, 2017, ICLR.
[35] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[36] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[37] Hsiao-Wuen Hon, et al. Multiple codebook semi-continuous hidden Markov models for speaker-independent continuous speech recognition, 1989.
[38] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[39] David J. Fleet, et al. Cartesian K-Means, 2013, CVPR.
[40] Geoffrey E. Hinton, et al. Semantic hashing, 2009, Int. J. Approx. Reason.
[41] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[42] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.
[43] Qun Liu, et al. Encoding Source Language with Convolutional Neural Network for Machine Translation, 2015, ACL.
[44] Hideki Nakayama, et al. Compressing Word Embeddings via Deep Compositional Code Learning, 2017, ICLR.
[45] Oluwasanmi Koyejo, et al. Learning the Base Distribution in Implicit Generative Models, 2018, ArXiv.