Verb Sense and Subcategorization: Using Joint Inference to Improve Performance on Complementary Tasks

We propose a general model for joint inference in correlated natural language processing tasks when fully annotated training data is not available, and apply this model to the dual tasks of word sense disambiguation and verb subcategorization frame determination. The model uses the EM algorithm to simultaneously complete partially annotated training sets and learn a generative probabilistic model over multiple annotations. When applied to the word sense and verb subcategorization frame determination tasks, the model learns sharp joint probability distributions which correspond to linguistic intuitions about the correlations of the variables. Use of the joint model leads to error reductions over competitive independent models on these tasks.
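To make the abstract's approach concrete, the following is a minimal sketch (not the paper's implementation) of the core idea: running EM over a joint categorical distribution P(sense, frame) for a single verb when each training instance is only partially annotated, so that the E-step fills in the missing annotation in expectation and the M-step re-estimates the joint table. The sense inventory, frame inventory, and observations below are illustrative assumptions.

```python
from collections import defaultdict
from itertools import product

SENSES = ["fire.1", "fire.2"]          # hypothetical sense inventory for one verb
FRAMES = ["NP", "NP_PP", "NP_NP"]      # hypothetical subcategorization frames

# Partially annotated data: (sense, frame), with None marking a missing annotation.
observations = [
    ("fire.1", None), (None, "NP_PP"), ("fire.2", "NP"),
    ("fire.1", "NP_NP"), (None, "NP"), ("fire.2", None),
]

# Initialize the joint distribution P(sense, frame) uniformly.
joint = {sf: 1.0 / (len(SENSES) * len(FRAMES)) for sf in product(SENSES, FRAMES)}

for _ in range(50):                    # EM iterations
    expected = defaultdict(float)      # expected counts for each (sense, frame) cell

    # E-step: spread each instance's unit of mass over the cells consistent with
    # its observed annotations, in proportion to the current joint probabilities.
    for sense, frame in observations:
        cells = [(s, f) for s, f in joint
                 if (sense is None or s == sense) and (frame is None or f == frame)]
        total = sum(joint[c] for c in cells)
        for c in cells:
            expected[c] += joint[c] / total

    # M-step: re-estimate the joint distribution from the expected counts
    # (add-one smoothing keeps all cells strictly positive).
    denom = sum(expected.values()) + len(joint)
    joint = {c: (expected[c] + 1.0) / denom for c in joint}

# The learned joint table can then be used for either task; for example, the most
# probable frame given a particular sense:
best_frame = max(FRAMES, key=lambda f: joint[("fire.1", f)])
print(best_frame)
```

In this toy setting the joint table plays the role of the generative model described above: each partially labeled instance sharpens the distribution over (sense, frame) pairs, which is what lets evidence about one task inform predictions on the other.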