Probabilistic Tree-Adjoining Grammar as a Framework for Statistical Natural Language Processing

In this paper, I argue for the use of a probabilistic form of tree-adjoining grammar (TAG) in statistical natural language processing. I first discuss two previous statistical approaches: one that concentrates on the probabilities of structural operations, and another that emphasizes co-occurrence relationships between words. I argue that a purely structural approach, exemplified by probabilistic context-free grammar, lacks sufficient sensitivity to lexical context, and, conversely, that lexical co-occurrence analyses require a richer notion of locality that is best provided by importing some notion of structure. I then propose probabilistic TAG as a framework for statistical language modelling, arguing that it provides an advantageous combination of structure, locality, and lexical sensitivity. Issues in the acquisition of probabilistic TAG and parameter estimation are briefly considered.
