Learning Computational Grammars

This paper reports on the LEARNING COMPUTATIONAL GRAMMARS (LCG) project, a postdoc network devoted to studying the application of machine learning techniques to grammars suitable for computational use. We aimed for a systematic survey of how various factors affect the success of learning, in particular the availability of annotated data, the kind of dependencies in the data, and the availability of knowledge bases (grammars). We focused on syntax, especially noun phrase (NP) syntax.
