[1] R. E. Wengert,et al. A simple automatic derivative evaluation program , 1964, Commun. ACM.
[2] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[3] James L. McClelland,et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .
[4] Kunihiko Fukushima,et al. Neocognitron: A hierarchical neural network capable of visual pattern recognition , 1988, Neural Networks.
[5] Geoffrey E. Hinton,et al. Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..
[6] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[7] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[8] Jordan B. Pollack,et al. Recursive Distributed Representations , 1990, Artif. Intell..
[9] Kiyohiro Shikano,et al. Neural Network Approach to Word Category Prediction for English Texts , 1990, COLING.
[10] Renato De Mori,et al. A Cache-Based Natural Language Model for Speech Recognition , 1990, IEEE Trans. Pattern Anal. Mach. Intell..
[11] Ian H. Witten,et al. The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression , 1991, IEEE Trans. Inf. Theory.
[12] Lonnie Chrisman,et al. Learning Recursive Distributed Representations for Holistic Computation , 1991 .
[13] Robert L. Mercer,et al. Class-Based n-gram Models of Natural Language , 1992, CL.
[14] Robert L. Mercer,et al. The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.
[15] G. Kane. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol 1: Foundations, vol 2: Psychological and Biological Models , 1994 .
[16] A. Griewank,et al. Automatic differentiation of algorithms : theory, implementation, and application , 1994 .
[17] Ronald Rosenfeld,et al. A maximum entropy approach to adaptive statistical language modelling , 1996, Comput. Speech Lang..
[18] C. Bendtsen. FADBAD, a flexible C++ package for automatic differentiation - using the forward and backward method , 1996 .
[19] Adam L. Berger,et al. A Maximum Entropy Approach to Natural Language Processing , 1996, CL.
[20] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[21] Mikel L. Forcada,et al. Recursive Hetero-associative Memories for Translation , 1997, IWANN.
[22] Philip Resnik,et al. Selectional Preference and Sense Disambiguation , 1997 .
[23] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[24] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[25] Jürgen Schmidhuber,et al. Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.
[26] Ronald Rosenfeld,et al. A survey of smoothing techniques for ME models , 2000, IEEE Trans. Speech Audio Process..
[27] Joshua Goodman,et al. A bit of progress in language modeling , 2001, Comput. Speech Lang..
[28] Joshua Goodman,et al. Classes for fast maximum entropy training , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[29] Ronald Rosenfeld,et al. Whole-sentence exponential language models: a vehicle for linguistic-statistical integration , 2001, Comput. Speech Lang..
[30] Samy Bengio,et al. Torch: a modular machine learning software library , 2002 .
[31] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[32] Michael I. Jordan,et al. Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..
[33] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[34] Brian Roark,et al. Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm , 2004, ACL.
[35] Jerome R. Bellegarda,et al. Statistical language model adaptation: review and perspectives , 2004, Speech Commun..
[36] Yoshua Bengio,et al. Neural Probabilistic Language Models , 2006 .
[37] Jun'ichi Tsujii,et al. A discriminative language model with pseudo-negative samples , 2007, ACL.
[38] Thorsten Brants,et al. Large Language Models in Machine Translation , 2007, EMNLP.
[39] Jinxi Xu,et al. A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model , 2008, ACL.
[40] Yoshua Bengio,et al. Adaptive Importance Sampling to Accelerate Training of a Neural Probabilistic Language Model , 2008, IEEE Transactions on Neural Networks.
[41] Thorsten Brants,et al. Randomized Language Models via Perfect Hash Functions , 2008, ACL.
[42] Stanley F. Chen,et al. Shrinking Exponential Language Models , 2009, NAACL.
[43] Patrick Pantel,et al. From Frequency to Meaning: Vector Space Models of Semantics , 2010, J. Artif. Intell. Res..
[44] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..
[45] Razvan Pascanu,et al. Theano: A CPU and GPU Math Compiler in Python , 2010, SciPy.
[46] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[47] Yoshua Bengio,et al. Word Representations: A Simple and General Method for Semi-Supervised Learning , 2010, ACL.
[48] Dan Klein,et al. Faster and Smaller N-Gram Language Models , 2011, ACL.
[49] Kenneth Heafield,et al. KenLM: Faster and Smaller Language Model Queries , 2011, WMT@EMNLP.
[50] Andrew Y. Ng,et al. Parsing Natural Scenes and Natural Language with Recursive Neural Networks , 2011, ICML.
[51] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..
[52] Tom M. Mitchell,et al. Learning Effective and Interpretable Semantic Models using Non-Negative Sparse Embedding , 2012, COLING.
[53] Yee Whye Teh,et al. A fast and simple algorithm for training neural probabilistic language models , 2012, ICML.
[54] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[55] Ashish Vaswani,et al. Decoding with Large-Scale Neural Language Models Improves Translation , 2013, EMNLP.
[56] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[57] Phil Blunsom,et al. Recurrent Continuous Translation Models , 2013, EMNLP.
[58] Andrew Y. Ng,et al. Parsing with Compositional Vector Grammars , 2013, ACL.
[59] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
[60] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[61] Phil Blunsom,et al. A Convolutional Neural Network for Modelling Sentences , 2014, ACL.
[62] Georgiana Dinu,et al. Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors , 2014, ACL.
[63] Robin J. Hogan,et al. Fast Reverse-Mode Automatic Differentiation using Expression Templates in C++ , 2014, ACM Trans. Math. Softw..
[64] Yoshua Bengio,et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, ArXiv.
[65] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[66] Quoc V. Le,et al. Addressing the Rare Word Problem in Neural Machine Translation , 2014, ACL.
[67] Christopher D. Manning,et al. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.
[68] Fei-Fei Li,et al. Visualizing and Understanding Recurrent Networks , 2015, ArXiv.
[69] Quoc V. Le,et al. Semi-supervised Sequence Learning , 2015, NIPS.
[70] Jason Weston,et al. A Neural Attention Model for Abstractive Sentence Summarization , 2015, EMNLP.
[71] Kenta Oono,et al. Chainer : a Next-Generation Open Source Framework for Deep Learning , 2015 .
[72] Xavier Bouthillier,et al. Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets , 2014, NIPS.
[73] Quoc V. Le,et al. A Neural Conversational Model , 2015, ArXiv.
[74] Christopher D. Manning,et al. Effective Approaches to Attention-based Neural Machine Translation , 2015, EMNLP.
[75] Alex Graves,et al. DRAW: A Recurrent Neural Network For Image Generation , 2015, ICML.
[76] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[77] Regina Barzilay,et al. Molding CNNs for text: non-linear, non-consecutive convolutions , 2015, EMNLP.
[78] Noah A. Smith,et al. Transition-Based Dependency Parsing with Stack Long Short-Term Memory , 2015, ACL.
[79] Hang Li,et al. Neural Responding Machine for Short-Text Conversation , 2015, ACL.
[80] Samy Bengio,et al. Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[81] Samy Bengio,et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks , 2015, NIPS.
[82] Eduard H. Hovy,et al. When Are Tree Structures Necessary for Deep Learning of Representations? , 2015, EMNLP.
[83] Fei-Fei Li,et al. Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.
[84] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[85] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[86] Dale Schuurmans,et al. Reward Augmented Maximum Likelihood for Neural Structured Prediction , 2016, NIPS.
[87] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[88] Jan Niehues,et al. Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder , 2016, IWSLT.
[89] Quoc V. Le,et al. Listen, attend and spell: A neural network for large vocabulary conversational speech recognition , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[90] Yang Liu,et al. Modeling Coverage for Neural Machine Translation , 2016, ACL.
[91] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[92] Diyi Yang,et al. Hierarchical Attention Networks for Document Classification , 2016, NAACL.
[93] Regina Barzilay,et al. Rationalizing Neural Predictions , 2016, EMNLP.
[94] Charles A. Sutton,et al. A Convolutional Attention Network for Extreme Summarization of Source Code , 2016, ICML.
[95] Alexander M. Rush,et al. Sequence-to-Sequence Learning as Beam-Search Optimization , 2016, EMNLP.
[96] Yoav Goldberg,et al. A Primer on Neural Network Models for Natural Language Processing , 2015, J. Artif. Intell. Res..
[97] Qun Liu,et al. Memory-enhanced Decoder for Neural Machine Translation , 2016, EMNLP.
[98] Noah A. Smith,et al. Recurrent Neural Network Grammars , 2016, NAACL.
[99] Sebastian Ruder,et al. An overview of gradient descent optimization algorithms , 2016, Vestnik komp'iuternykh i informatsionnykh tekhnologii.
[100] Graham Neubig,et al. Generalizing and Hybridizing Count-based and Neural Language Models , 2016, EMNLP.
[101] Alex Graves,et al. Neural Machine Translation in Linear Time , 2016, ArXiv.
[102] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[103] Yoshua Bengio,et al. Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism , 2016, NAACL.
[104] Yang Liu,et al. Minimum Risk Training for Neural Machine Translation , 2015, ACL.
[105] Yoshua Bengio,et al. A Character-level Decoder without Explicit Segmentation for Neural Machine Translation , 2016, ACL.
[106] Gholamreza Haffari,et al. Incorporating Structural Alignment Biases into an Attentional Neural Translation Model , 2016, NAACL.
[107] Zhiguo Wang,et al. Supervised Attentions for Neural Machine Translation , 2016, EMNLP.
[108] Xing Shi,et al. Does String-Based Neural MT Learn Source Syntax? , 2016, EMNLP.
[109] Yoshimasa Tsuruoka,et al. Tree-to-Sequence Attentional Neural Machine Translation , 2016, ACL.
[110] George Kurian,et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation , 2016, ArXiv.
[111] Satoshi Nakamura,et al. Incorporating Discrete Translation Lexicons into Neural Machine Translation , 2016, EMNLP.
[112] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[113] Lei Yu,et al. Online Segment to Segment Neural Transduction , 2016, EMNLP.
[114] Zhiguo Wang,et al. Coverage Embedding Models for Neural Machine Translation , 2016, EMNLP.
[115] Xinlei Chen,et al. Visualizing and Understanding Neural Models in NLP , 2015, NAACL.
[116] Alexander M. Rush,et al. Structured Attention Networks , 2017, ICLR.
[117] Kevin Duh,et al. DyNet: The Dynamic Neural Network Toolkit , 2017, ArXiv.
[118] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[119] Alexander J. Smola,et al. Neural Machine Translation with Recurrent Attention Modeling , 2016, EACL.
[120] Martin Wattenberg,et al. Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation , 2016, TACL.
[121] Jürgen Schmidhuber,et al. LSTM: A Search Space Odyssey , 2015, IEEE Transactions on Neural Networks and Learning Systems.
[122] Graham Neubig,et al. Learning to Translate in Real-time with Neural Machine Translation , 2016, EACL.