Deep Learning for NLP (without Magic) References
[1] Dong Yu, et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[2] Noah A. Smith, et al. Contrastive Estimation: Training Log-Linear Models on Unlabeled Data, 2005, ACL.

[3] Jason Weston, et al. Learning Structured Embeddings of Knowledge Bases, 2011, AAAI.

[4] Yoshua Bengio, et al. Large-Scale Learning of Embeddings with Reconstruction Sampling, 2011, ICML.

[5] Alex Acero, et al. Hidden conditional random fields for phone classification, 2005, INTERSPEECH.

[6] Quoc V. Le, et al. On optimization methods for deep learning, 2011, ICML.

[7] Jason Weston, et al. Natural Language Processing (Almost) from Scratch, 2011, J. Mach. Learn. Res.

[8] Pascal Vincent, et al. Contractive Auto-Encoders: Explicit Invariance During Feature Extraction, 2011, ICML.

[9] Xavier Carreras, et al. Simple Semi-supervised Dependency Parsing, 2008, ACL.

[10] Pascal Vincent, et al. A Generative Process for Contractive Auto-Encoders, 2012, ICML.

[11] Andrew Y. Ng, et al. Parsing Natural Scenes and Natural Language with Recursive Neural Networks, 2011, ICML.

[12] Jeffrey Pennington, et al. Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection, 2011, NIPS.

[13] Dan Klein, et al. Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network, 2003, NAACL.

[14] Geoffrey E. Hinton, et al. Acoustic Modeling Using Deep Belief Networks, 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[15] Quoc V. Le, et al. Measuring Invariances in Deep Networks, 2009, NIPS.

[16] Jeffrey Pennington, et al. Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions, 2011, EMNLP.

[17] Yoshua Bengio, et al. Hierarchical Probabilistic Neural Network Language Model, 2005, AISTATS.

[18] Geoffrey E. Hinton, et al. A Scalable Hierarchical Distributed Language Model, 2008, NIPS.

[19] Jean-Luc Gauvain, et al. Connectionist language modeling for large vocabulary continuous speech recognition, 2002, IEEE International Conference on Acoustics, Speech, and Signal Processing.

[20] Jordan B. Pollack, et al. Recursive Distributed Representations, 1990, Artif. Intell.

[21] Lukás Burget, et al. Empirical Evaluation and Combination of Advanced Language Modeling Techniques, 2011, INTERSPEECH.

[22] Peter Glöckner, et al. Why Does Unsupervised Pre-training Help Deep Learning?, 2013.

[23] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.

[24] Jason Weston, et al. Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing, 2012, AISTATS.

[25] Robert L. Mercer, et al. Class-Based n-gram Models of Natural Language, 1992, CL.

[26] Tong Zhang, et al. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data, 2005, J. Mach. Learn. Res.

[27] Stephen Gould, et al. Decomposing a scene into geometric and semantically consistent regions, 2009, IEEE 12th International Conference on Computer Vision.

[28] Christoph Goller, et al. Learning task-dependent distributed representations by backpropagation through structure, 1996, Proceedings of the International Conference on Neural Networks (ICNN'96).

[29] Dong Yu, et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks, 2012, ICML.

[30] Geoffrey E. Hinton. Mapping Part-Whole Hierarchies into Connectionist Networks, 1990, Artif. Intell.

[31] Yoshua Bengio, et al. A Neural Probabilistic Language Model, 2003, J. Mach. Learn. Res.

[32] Yoshua Bengio, et al. Practical Recommendations for Gradient-Based Training of Deep Architectures, 2012, Neural Networks: Tricks of the Trade.

[33] Marc'Aurelio Ranzato, et al. Building high-level features using large scale unsupervised learning, 2013, IEEE International Conference on Acoustics, Speech and Signal Processing.

[34] Alexander Clark, et al. Combining Distributional and Morphological Information for Part of Speech Induction, 2003, EACL.

[35] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.

[36] Yoshua Bengio, et al. Why Does Unsupervised Pre-training Help Deep Learning?, 2010, AISTATS.

[37] Giovanni Soda, et al. Towards Incremental Parsing of Natural Language Using Recursive Neural Networks, 2003, Applied Intelligence.

[38] Yoshua Bengio, et al. Extracting and composing robust features with denoising autoencoders, 2008, ICML.

[39] Chris Callison-Burch, et al. Paraphrasing with Bilingual Parallel Corpora, 2005, ACL.

[40] Yoshua Bengio, et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.

[41] Léon Bottou, et al. From machine learning to machine reasoning, 2011, Machine Learning.

[42] Andrew Y. Ng, et al. Improving Word Representations via Global Context and Multiple Word Prototypes, 2012, ACL.

[43] John Blitzer, et al. Hierarchical Distributed Representations for Statistical Language Modeling, 2004, NIPS.

[44] Holger Schwenk, et al. Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation, 2012, WLM@NAACL-HLT.

[45] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.

[46] Yoshua Bengio, et al. Word Representations: A Simple and General Method for Semi-Supervised Learning, 2010, ACL.

[47] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML.

[48] Nathan Ratliff, et al. (Online) Subgradient Methods for Structured Prediction, 2007.

[49] Haizhou Li, et al. 13th Annual Conference of the International Speech Communication Association (Interspeech 2012), 2012, ISCA.

[50] Geoffrey E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines, 2012, Neural Networks: Tricks of the Trade.

[51] Sanda M. Harabagiu, et al. UTD: Classifying Semantic Relations by Combining Lexical and Semantic Resources, 2010, *SEMEVAL.

[52] J. R. Firth, et al. A Synopsis of Linguistic Theory, 1930-1955, 1957.

[53] Raymond J. Mooney, et al. Multi-Prototype Vector-Space Models of Word Meaning, 2010, NAACL.

[54] Honglak Lee, et al. Unsupervised feature learning for audio classification using convolutional deep belief networks, 2009, NIPS.

[55] Christopher D. Manning, et al. Learning Continuous Phrase Representations and Syntactic Parsing with Recursive Neural Networks, 2010.

[56] Honglak Lee, et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, 2009, ICML.

[57] Trevor Darrell, et al. Conditional Random Fields for Object Recognition, 2004, NIPS.

[58] Holger Schwenk, et al. Continuous space language models, 2007, Comput. Speech Lang.

[59] Paolo Frasconi, et al. Wide coverage natural language processing using kernel methods and neural networks for structured data, 2005, Pattern Recognit. Lett.

[60] Geoffrey E. Hinton, et al. Three new graphical models for statistical language modelling, 2007, ICML.

[61] Fernando Pereira, et al. Shallow Parsing with Conditional Random Fields, 2003, NAACL.

[62] Hermann Ney, et al. Algorithms for bigram and trigram word clustering, 1995, Speech Commun.