Combining Maps and Distributed Representations for Shift-Reduce Parsing

Simple Recurrent Networks (SRNs) have been widely used in natural language processing tasks. However, their ability to handle long-term dependencies between sentence constituents is limited. NARX networks have recently been shown to outperform SRNs by preserving past information in explicit delays from the network’s prior output. Determining the number of delays, however, is a problem in itself. In this study of a shift-reduce parsing task, we demonstrate a hybrid localist-distributed approach that yields comparable performance in a more concise manner. A SARDNET self-organizing map is used to represent the details of the input sequence, in addition to the recurrent distributed representations of the SRN and NARX networks. The resulting architectures can represent arbitrarily long sequences and are cognitively more plausible.
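As a rough illustration of the localist part of the hybrid, the sketch below shows one common way to describe a SARDNET-style sequence encoding: each input activates the closest map unit that has not already won, that unit is removed from later competition, and earlier winners decay so that activation level reflects recency. The function name `sardnet_encode`, the decay value 0.9, the toy weights, and the error raised when the map runs out of free units are all assumptions made for this example; they are not taken from the paper.

```python
import math


def sardnet_encode(sequence, weights, decay=0.9):
    """Return map activations after presenting `sequence` item by item.

    sequence: list of input vectors (lists of floats)
    weights:  list of map-unit weight vectors, assumed already trained
    decay:    factor applied to earlier winners at each step (assumed value)
    """
    activations = [0.0] * len(weights)
    used = set()  # units that have already won for this sequence
    for x in sequence:
        free = [i for i in range(len(weights)) if i not in used]
        if not free:
            # Simplification of this sketch: the map must have a free unit.
            raise ValueError("sequence is longer than the number of map units")
        # Winner: the closest weight vector among units not yet used.
        winner = min(free, key=lambda i: math.dist(weights[i], x))
        # Decay earlier winners, then switch the new winner fully on.
        activations = [a * decay for a in activations]
        activations[winner] = 1.0
        used.add(winner)
    return activations


# Toy map with three units and a three-item input sequence.
toy_weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_sequence = [[0.1, 0.0], [0.0, 0.1], [0.9, 0.1]]
pattern = sardnet_encode(toy_sequence, toy_weights)
```

In the toy run, the second input is closest to unit 0, but that unit has already won, so the input is mapped to unit 2 instead. In a hybrid of the kind the abstract describes, an activation pattern like `pattern` would be supplied to the parsing network alongside its recurrent distributed representation; how the two are wired together here is left unspecified because the abstract does not say.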
