Constructing Linguistically Motivated Structures from Statistical Grammars

This paper presents two Hidden Markov Models (HMMs) for linking the linguistically motivated XTAG grammar with the automatically extracted LTAG used by the MICA parser. The former is a detailed LTAG enriched with feature structures. The latter is a very large LTAG whose statistical nature makes it well suited to statistical approaches. The main obstacles to using these grammars are the lack of an efficient parser for XTAG and the sparseness of the supertag set for MICA. The models were trained with the standard HMM training algorithm, Baum-Welch. To help the training algorithm converge to a better local optimum, the initial state of the models was also estimated using two semi-supervised EM-based algorithms. The resulting accuracy (about 91%) suggests that the models provide a satisfactory way of linking these grammars so that each can draw on the other's capabilities.
