An Unsupervised Approach to Prepositional Phrase Attachment using Contextually Similar Words

Prepositional phrase attachment is a common source of ambiguity in natural language processing. We present an unsupervised corpus-based approach to prepositional phrase attachment that achieves similar performance to supervised methods. Unlike previous unsupervised approaches in which training data is obtained by heuristic extraction of unambiguous examples from a corpus, we use an iterative process to extract training data from an automatically parsed corpus. Attachment decisions are made using a linear combination of features and low frequency events are approximated using contextually similar words.