A Stochastic Parser Based on a Structural Word Prediction Model

In this paper, we present a stochastic language model based on dependency structure. The model treats a sentence as a word sequence and predicts each word from left to right. The history at each prediction step is a sequence of partial parse trees covering the preceding words. The model first predicts which of these partial parse trees have a dependency relation with the next word, and then predicts the next word from only those trees. Because the model is generative, it can be used not only as a parser but also as the language model of a speech recognizer. In our experiment, we estimated the model parameters from about 1,000 syntactically annotated Japanese sentences extracted from a financial newspaper. We built a parser based on the model and tested it on approximately 100 sentences from the same newspaper. The accuracy of the dependency relations was 89.9%, the highest accuracy obtained by Japanese stochastic parsers.
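The two-step prediction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each partial parse tree is summarised by its root word alone, that the trees selected in the first step become dependents of the newly predicted word, and that the probability functions `p_attach` and `p_word` are supplied by the caller. All names here are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a partial parse tree is represented only by its root word.
History = Tuple[str, ...]


def advance(history: History, attached: Sequence[int], word: str) -> History:
    """Merge the trees at positions `attached` under `word`; keep the others."""
    attached_set = set(attached)
    kept = tuple(r for i, r in enumerate(history) if i not in attached_set)
    return kept + (word,)


def derivation_probability(
    words: Sequence[str],
    attachments: Sequence[Sequence[int]],
    p_attach: Callable[[History, Tuple[int, ...]], float],
    p_word: Callable[[Tuple[str, ...], str], float],
) -> float:
    """Probability of a word sequence together with one dependency structure.

    attachments[k] lists the positions (in the current history) of the
    partial trees that have a dependency relation with words[k].
    """
    history: History = ()
    prob = 1.0
    for word, attached in zip(words, attachments):
        attached = tuple(attached)
        # Step 1: predict which trees in the history relate to the next word.
        prob *= p_attach(history, attached)
        # Step 2: predict the next word from the selected trees only.
        context = tuple(history[i] for i in attached)
        prob *= p_word(context, word)
        history = advance(history, attached, word)
    return prob
```

Under these assumptions, a parser would search over the possible `attachments` for the highest-probability structure, while a language model would sum `derivation_probability` over them to score the word sequence itself.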
