Paper Information - A Primer on Neural Network Models for Natural Language Processing

Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models started to be applied also to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques. The tutorial covers input encoding for natural language tasks, feed-forward networks, convolutional networks, recurrent networks and recursive networks, as well as the computation graph abstraction for automatic gradient computation.

Yoav Goldberg
