A corpus-based connectionist architecture for large-scale natural language parsing

We describe a deterministic shift-reduce parsing model that combines the advantages of connectionism with those of traditional symbolic models for parsing realistic sub-domains of natural language. It is a modular system that learns to annotate natural language texts with syntactic structure, acquiring its linguistic knowledge directly from pre-parsed sentence examples extracted from an annotated corpus. The connectionist modules enable the automatic learning of linguistic constraints and provide a distributed representation of linguistic information that tolerates grammatical variation. The inputs and outputs of the connectionist modules represent symbolic information that can be easily manipulated and interpreted and provide the basis for organizing the parse. Performance is evaluated using labelled precision and recall: on a test set of 4128 words, the parser achieved 75% precision and 69% recall. This work is a significant step towards demonstrating that broad-coverage parsing of natural language can be achieved with simple hybrid connectionist architectures that approximate shift-reduce parsing behaviours. Crucially, the model adapts to the grammatical framework of the training corpus and so is not predisposed to a particular grammatical formalism.
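As an illustration of the evaluation measure named above, the following minimal sketch computes labelled precision and recall over constituents. It assumes the common bracket-matching convention in which a predicted constituent counts as correct only if its label and span both match a gold constituent; the abstract does not state the exact scoring procedure used, so the function name `labelled_precision_recall`, the `(label, start, end)` tuple representation, and the toy bracketings are all hypothetical.

```python
from collections import Counter


def labelled_precision_recall(predicted, gold):
    """Return (precision, recall) over labelled constituents.

    Each constituent is a (label, start, end) tuple. This representation
    is an assumption made for illustration, not the paper's own format.
    """
    pred_counts = Counter(predicted)
    gold_counts = Counter(gold)
    # Multiset intersection: each gold constituent can be matched only once.
    matched = sum((pred_counts & gold_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    return precision, recall


# Toy bracketings for a five-word sentence (hypothetical data).
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
predicted = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5), ("NP", 4, 5)]

precision, recall = labelled_precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the five predicted constituents match the gold bracketing, and three of the four gold constituents are recovered. Precision penalises spurious constituents and recall penalises missed ones, which is why the two figures are reported separately.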
