Moustapha Cissé | Hervé Jégou | Edouard Grave | Armand Joulin | David Grangier
[1] George Kingsley Zipf,et al. Human behavior and the principle of least effort , 1949 .
[2] Lalit R. Bahl,et al. A Maximum Likelihood Approach to Continuous Speech Recognition , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[3] Slava M. Katz,et al. Estimation of probabilities from sparse data for the language model component of a speech recognizer , 1987, IEEE Trans. Acoust. Speech Signal Process..
[4] Paul J. Werbos,et al. Backpropagation Through Time: What It Does and How to Do It , 1990, Proc. IEEE.
[5] Jing Peng,et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.
[6] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[7] Renato De Mori,et al. A Cache-Based Natural Language Model for Speech Recognition , 1990, IEEE Trans. Pattern Anal. Mach. Intell..
[8] Robert L. Mercer,et al. Class-Based n-gram Models of Natural Language , 1992, CL.
[9] Hermann Ney,et al. Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[10] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[11] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..
[12] Joshua Goodman,et al. A bit of progress in language modeling , 2001, Comput. Speech Lang..
[13] Joshua Goodman,et al. Classes for fast maximum entropy training , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[14] Yoshua Bengio,et al. Quick Training of Probabilistic Neural Nets by Importance Sampling , 2003 .
[15] Philipp Koehn,et al. Europarl: A Parallel Corpus for Statistical Machine Translation , 2005, MTSUMMIT.
[16] Yoshua Bengio,et al. Hierarchical Probabilistic Neural Network Language Model , 2005, AISTATS.
[17] Yoshua Bengio,et al. Neural Probabilistic Language Models , 2006 .
[18] Holger Schwenk,et al. Continuous space language models , 2007, Comput. Speech Lang..
[19] Yoshua Bengio,et al. Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model , 2008, IEEE Transactions on Neural Networks.
[20] Geoffrey E. Hinton,et al. A Scalable Hierarchical Distributed Language Model , 2008, NIPS.
[21] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..
[22] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[23] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[24] Lukás Burget,et al. Extensions of recurrent neural network language model , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Alexandre Allauzen,et al. Structured Output Layer neural network language model , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Lukás Burget,et al. Empirical Evaluation and Combination of Advanced Language Modeling Techniques , 2011, INTERSPEECH.
[27] Lukás Burget,et al. Strategies for training large scale neural network language models , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[28] Tara N. Sainath,et al. Fundamental Technologies in Modern Speech Recognition , 2012, DOI 10.1109/MSP.2012.2205597 .
[29] Yee Whye Teh,et al. A fast and simple algorithm for training neural probabilistic language models , 2012, ICML.
[30] Geoffrey Zweig,et al. Context dependent recurrent neural network language model , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[31] Holger Schwenk,et al. Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation , 2012, WLM@NAACL-HLT.
[32] Geoffrey Zweig,et al. Speed regularization and optimality in word classing , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[33] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[34] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[35] Ashish Vaswani,et al. Decoding with Large-Scale Neural Language Models Improves Translation , 2013, EMNLP.
[36] Richard M. Schwartz,et al. Fast and Robust Neural Network Joint Models for Statistical Machine Translation , 2014, ACL.
[37] Thorsten Brants,et al. One billion word benchmark for measuring progress in statistical language modeling , 2013, INTERSPEECH.
[38] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[39] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[40] Yoshua Bengio,et al. On Using Very Large Target Vocabulary for Neural Machine Translation , 2014, ACL.
[41] Joris Pelemans,et al. Sparse non-negative matrix language modeling for skip-grams , 2015, INTERSPEECH.
[42] Dan Klein,et al. When and why are log-linear models self-normalizing? , 2015, NAACL.
[43] Xavier Bouthillier,et al. Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets , 2014, NIPS.
[44] Marc'Aurelio Ranzato,et al. Learning Longer Memory in Recurrent Neural Networks , 2014, ICLR.
[45] Geoffrey E. Hinton,et al. A Simple Way to Initialize Recurrent Networks of Rectified Linear Units , 2015, ArXiv.
[46] Pradeep Dubey,et al. BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies , 2015, ICLR.
[47] Yonghui Wu,et al. Exploring the Limits of Language Modeling , 2016, ArXiv.
[48] Wenlin Chen,et al. Strategies for Training Large Vocabulary Neural Language Models , 2015, ACL.
[49] Allan Jabri,et al. Learning Visual Features from Large Weakly Supervised Data , 2015, ECCV.
[50] Xinyun Chen,et al. Delving into Transferable Adversarial Examples and Black-box Attacks , 2016, under review at ICLR 2017.