Natural Language Processing, Moving from Rules to Data

Over the last decade, we have witnessed a major change in the direction of the theoretical models used in natural language processing: a move from rule-based systems to corpus-oriented paradigms. In this paper, we analyze several generative formalisms together with newer statistical and data-oriented linguistic methodologies. We review existing methods, based on deep or shallow learning, that are applied in various subfields of computational linguistics. The continuous, rapid improvements achieved by applied machine learning techniques may in turn lead to new theoretical developments in the classic models. We discuss several scenarios for future approaches.
