Natural Language Processing (Almost) from Scratch
Ronan Collobert | Jason Weston | Léon Bottou | Michael Karlen | Koray Kavukcuoglu | Pavel P. Kuksa
[1] Claude E. Shannon,et al. Prediction and Entropy of Printed English , 1951 .
[2] Zellig S. Harris,et al. Mathematical structures of language , 1968, Interscience tracts in pure and applied mathematics.
[3] F. Jelinek,et al. Continuous speech recognition by statistical methods , 1976, Proceedings of the IEEE.
[4] Thomas M. Cover,et al. A convergent gambling estimate of the entropy of English , 1978, IEEE Trans. Inf. Theory.
[5] Martin F. Porter,et al. An algorithm for suffix stripping , 1997, Program.
[6] Zellig S. Harris,et al. A Grammar of English on Mathematical Principles , 1982 .
[7] Yann LeCun,et al. Une procedure d'apprentissage pour reseau a seuil asymmetrique (A learning scheme for asymmetric threshold networks) , 1985 .
[8] D. Rumelhart. Learning internal representations by back-propagating errors , 1986 .
[9] Geoffrey E. Hinton,et al. Learning sets of filters using back-propagation , 1987 .
[10] N. Littlestone. Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).
[11] Judea Pearl,et al. Probabilistic reasoning in intelligent systems , 1988 .
[12] John Scott Bridle,et al. Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition , 1989, NATO Neurocomputing.
[13] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[14] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[15] Geoffrey E. Hinton,et al. Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..
[16] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[17] Patrick Gallinari,et al. A Framework for the Cooperation of Learning Algorithms , 1990, NIPS.
[18] L. Bottou. Stochastic Gradient Learning in Neural Networks , 1991 .
[19] Steven C. Suddarth,et al. Symbolic-Neural Systems and the Use of Hints for Developing Complex Systems , 1991, Int. J. Man Mach. Stud..
[20] Robert L. Mercer,et al. Class-Based n-gram Models of Natural Language , 1992, CL.
[21] Robert L. Mercer,et al. An Estimate of an Upper Bound for the Entropy of English , 1992, CL.
[22] Hinrich Schütze. Distributional Part-of-Speech Tagging , 1995, EACL.
[23] Geoffrey E. Hinton,et al. Bayesian Learning for Neural Networks , 1995 .
[24] John G. Cleary,et al. The entropy of English using PPM-based models , 1996, Proceedings of Data Compression Conference - DCC '96.
[25] Adwait Ratnaparkhi,et al. A Maximum Entropy Model for Part-Of-Speech Tagging , 1996, EMNLP.
[26] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[27] Yoram Singer,et al. Learning to Order Things , 1997, NIPS.
[28] Yoshua Bengio,et al. Global training of document processing systems using graph transformer networks , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[29] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[30] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[31] Ralph Grishman,et al. A Maximum Entropy Approach to Named Entity Recognition , 1999 .
[32] Thorsten Joachims,et al. Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.
[33] David Maxwell Chickering,et al. Dependency Networks for Inference, Collaborative Filtering, and Data Visualization , 2000, J. Mach. Learn. Res..
[34] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res..
[35] Eugene Charniak,et al. A Maximum-Entropy-Inspired Parser , 2000, ANLP.
[36] Yuji Matsumoto,et al. Use of Support Vector Learning for Chunk Identification , 2000, CoNLL/LLL.
[37] Daniel Gildea,et al. Automatic Labeling of Semantic Roles , 2000, ACL.
[38] Scott Miller,et al. A Novel Use of Statistical Parsing to Extract Information from Text , 2000, ANLP.
[39] Dan Klein,et al. Natural Language Grammar Induction Using a Constituent-Context Model , 2001, NIPS.
[40] Yuji Matsumoto,et al. Chunking with Support Vector Machines , 2001, NAACL.
[41] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.
[42] Daniel Gildea,et al. The Necessity of Parsing for Predicate Argument Recognition , 2002, ACL.
[43] Jean-Luc Gauvain,et al. Connectionist language modeling for large vocabulary continuous speech recognition , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[44] Hwee Tou Ng,et al. Named Entity Recognition with a Maximum Entropy Approach , 2003, CoNLL.
[45] Michael Collins,et al. Head-Driven Statistical Models for Natural Language Parsing , 2003, CL.
[46] Fernando Pereira,et al. Shallow Parsing with Conditional Random Fields , 2003, NAACL.
[47] Wei Li,et al. Early results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-Enhanced Lexicons , 2003, CoNLL.
[48] Tong Zhang,et al. Named Entity Recognition through Classifier Combination , 2003, CoNLL.
[49] Dan Klein,et al. Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network , 2003, NAACL.
[50] Andrew McCallum,et al. Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data , 2004, J. Mach. Learn. Res..
[51] Yiming Yang,et al. RCV1: A New Benchmark Collection for Text Categorization Research , 2004, J. Mach. Learn. Res..
[52] Lluís Màrquez i Villodre,et al. SVMTool: A general POS Tagger Generator Based on Support Vector Machines , 2004, LREC.
[53] Daniel Jurafsky,et al. Shallow Semantic Parsing using Support Vector Machines , 2004, NAACL.
[54] Scott Miller,et al. Name Tagging with Word Clusters and Discriminative Training , 2004, NAACL.
[55] Dan Roth,et al. Generalized Inference with Multiple Semantic Role Labeling Systems , 2005, CoNLL.
[56] Dan Roth,et al. The Necessity of Syntactic Parsing for Semantic Role Labeling , 2005, IJCAI.
[57] Percy Liang,et al. Semi-Supervised Learning for Natural Language , 2005 .
[58] Andrew McCallum,et al. Joint Parsing and Semantic Role Labeling , 2005, CoNLL.
[59] Koby Crammer,et al. Flexible Text Segmentation with Structured Multilabel Classification , 2005, HLT.
[60] Christopher D. Manning,et al. A Joint Model for Semantic Role Labeling , 2005, CoNLL.
[61] Hong Shen,et al. Voting Between Multiple Data Representations for Text Chunking , 2005, Canadian AI.
[62] Brian Roark,et al. Comparing and Combining Finite-State and Context-Free Parsers , 2005, HLT/EMNLP.
[63] Phil Blunsom,et al. Semantic Role Labelling with Tree Conditional Random Fields , 2005, CoNLL.
[64] Tong Zhang,et al. A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data , 2005, J. Mach. Learn. Res..
[65] Noah A. Smith,et al. Contrastive Estimation: Training Log-Linear Models on Unlabeled Data , 2005, ACL.
[66] Daniel Gildea,et al. The Proposition Bank: An Annotated Corpus of Semantic Roles , 2005, CL.
[67] Andrew McCallum,et al. Composition of Conditional Random Fields for Transfer Learning , 2005, HLT.
[68] Daniel Jurafsky,et al. Semantic Role Chunking Combining Complementary Syntactic Views , 2005, CoNLL.
[69] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[70] Bernhard Schölkopf,et al. Semi-Supervised Learning (Adaptive Computation and Machine Learning) , 2006 .
[71] Noam Chomsky. Three models for the description of language , 1956, IRE Trans. Inf. Theory.
[72] Eugene Charniak,et al. Effective Self-Training for Parsing , 2006, NAACL.
[73] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[74] Gabriele Musillo,et al. Robust Parsing of the Proposition Bank , 2006, Workshop On ROMAND Robust Methods In Analysis Of Natural Language Data.
[75] Quoc V. Le,et al. Learning to Rank with Nonsmooth Cost Functions , 2006, NIPS.
[76] Gholamreza Haffari,et al. Transductive learning for statistical machine translation , 2007, ACL.
[77] Jun'ichi Tsujii,et al. A discriminative language model with pseudo-negative samples , 2007, ACL.
[78] Ronen Feldman,et al. Using Corpus Statistics on Entities to Improve Semi-supervised Relation Extraction from the Web , 2007, ACL.
[79] Andrew McCallum,et al. Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data , 2007 .
[80] Giorgio Satta,et al. Guided Learning for Bidirectional Sequence Classification , 2007, ACL.
[81] Stéphan Clémençon,et al. Ranking the Best Instances , 2006, J. Mach. Learn. Res..
[82] Geoffrey E. Hinton,et al. Three new graphical models for statistical language modelling , 2007, ICML '07.
[83] Yehuda Koren,et al. The BellKor solution to the Netflix Prize , 2007 .
[84] Dan Klein,et al. Structure compilation: trading structure for features , 2008, ICML '08.
[85] Xavier Carreras,et al. Simple Semi-supervised Dependency Parsing , 2008, ACL.
[86] Jun Suzuki,et al. Semi-Supervised Sequential Labeling and Segmentation Using Giga-Word Scale Unlabeled Data , 2008, ACL.
[87] Jason Weston,et al. Deep learning via semi-supervised embedding , 2008, ICML '08.
[88] Xu Sun,et al. Modeling Latent-Dynamic in Shallow Parsing: A Latent Conditional Model with Improved Inference , 2008, COLING.
[89] Jason Weston,et al. Curriculum learning , 2009, ICML '09.
[90] Dan Roth,et al. Design Challenges and Misconceptions in Named Entity Recognition , 2009, CoNLL.
[91] Dekang Lin,et al. Phrase Clustering for Discriminative Learning , 2009, ACL.
[92] Alexander Yates,et al. Distributional Representations for Handling Sparsity in Supervised Sequence-Labeling , 2009, ACL.
[93] Yoshua Bengio,et al. Word Representations: A Simple and General Method for Semi-Supervised Learning , 2010, ACL.
[94] Ronan Collobert,et al. Deep Learning for Efficient Discriminative Parsing , 2011, AISTATS.
[95] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.