Learning random walk models for inducing word dependency distributions

Many NLP tasks rely on accurately estimating word dependency probabilities P(w1|w2), where the words w1 and w2 stand in a particular relationship (such as verb-object). Because counts of such dependencies are sparse, smoothing and the ability to use multiple sources of knowledge are important challenges. For example, if the probability P(N|V) of noun N being the subject of verb V is high, and V takes similar objects to V', and V' is synonymous with V", then we want to conclude that P(N|V") should also be reasonably high, even when those words did not co-occur in the training data. To capture these higher-order relationships, we propose a Markov chain model whose stationary distribution is used to give word probability estimates. Unlike the manually defined random walks used in some link analysis algorithms, we show how to automatically learn a rich set of parameters for the Markov chain's transition probabilities. We apply this model to the task of prepositional phrase attachment, obtaining an accuracy of 87.54%.
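
As a rough illustration of the central idea (reading word probabilities off the stationary distribution of a Markov chain), the following minimal sketch runs power iteration on a toy three-state chain. The states, the transition probabilities, and the names `TOY_CHAIN` and `stationary_distribution` are assumptions made up for this example; they are not the paper's learned parameters, and the paper's method for learning the transition probabilities is not shown.

```python
def stationary_distribution(transition, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution pi (pi = pi P) by power iteration.

    `transition` maps each state to a dict of next-state probabilities;
    every row is assumed to sum to 1 and the chain is assumed to be
    irreducible and aperiodic, so the iteration converges.
    """
    states = list(transition)
    pi = {s: 1.0 / len(states) for s in states}  # start from uniform
    for _ in range(max_iter):
        nxt = {s: 0.0 for s in states}
        for src, row in transition.items():
            for dst, p in row.items():
                nxt[dst] += pi[src] * p  # one step of pi <- pi P
        if max(abs(nxt[s] - pi[s]) for s in states) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical toy chain over three word states; the numbers are invented.
TOY_CHAIN = {
    "hang":   {"hang": 0.50, "fasten": 0.30, "nail": 0.20},
    "fasten": {"hang": 0.30, "fasten": 0.50, "nail": 0.20},
    "nail":   {"hang": 0.25, "fasten": 0.25, "nail": 0.50},
}

pi = stationary_distribution(TOY_CHAIN)
for word, prob in pi.items():
    print(f"{word}: {prob:.4f}")
```

In this toy setting, probability mass flows between states that are linked only indirectly, which is the kind of higher-order relationship the abstract describes. A hand-set chain like this one corresponds to the manually defined random walks that the paper contrasts with its learned transition parameters.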
