Paper information - An Unsupervised Approach for Linking Automatically Extracted and Manually Crafted LTAGs

An Unsupervised Approach for Linking Automatically Extracted and Manually Crafted LTAGs

Although the lack of semantic representation in automatically extracted LTAGs is an obstacle to using these formalisms, the advent of powerful statistical parsers trained on them has drawn more attention to these grammars than before. In contrast to this class of grammars, there are widely used manually crafted LTAGs that are enriched with semantic representation but lack efficient parsers. The semantic representation available in the latter grammars, together with the statistical capabilities of the former, encouraged us to construct a link between them. Here, focusing on the automatically extracted LTAG used by MICA [4] and the manually crafted English LTAG, namely the XTAG grammar [32], we propose a statistical approach based on a hidden Markov model (HMM) that maps each sequence of elementary trees from the former onto a sequence of elementary trees from the latter. To keep the HMM training algorithm from converging to a local optimum, an EM-based learning process for initializing the HMM parameters is also proposed. Experimental results show that the mapping method provides a satisfactory way to cover the deficiencies of one grammar with the capabilities of the other.
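To make the sequence-mapping idea concrete, the following is a minimal sketch of HMM decoding with the Viterbi algorithm. It is not the authors' implementation. It assumes that the manually crafted (XTAG-side) trees play the role of hidden states and the automatically extracted (MICA-side) trees play the role of observations. The tree names (`X_det`, `m1`, and so on) and all probabilities are hypothetical toy values, and the EM-based initialization mentioned in the abstract is not shown.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations."""

    def log(p):
        # Work in log space; a zero probability becomes -infinity.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p].get(s, 0.0))) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: three XTAG-side trees (hidden), three MICA-side trees (observed).
states = ["X_det", "X_np", "X_verb"]
start_p = {"X_det": 0.6, "X_np": 0.3, "X_verb": 0.1}
trans_p = {
    "X_det": {"X_det": 0.05, "X_np": 0.9, "X_verb": 0.05},
    "X_np": {"X_det": 0.1, "X_np": 0.2, "X_verb": 0.7},
    "X_verb": {"X_det": 0.5, "X_np": 0.4, "X_verb": 0.1},
}
emit_p = {
    "X_det": {"m1": 0.9, "m2": 0.1},
    "X_np": {"m2": 0.8, "m3": 0.2},
    "X_verb": {"m3": 0.9, "m1": 0.1},
}

decoded = viterbi(["m1", "m2", "m3"], states, start_p, trans_p, emit_p)
print(decoded)
```

In a real setting the transition and emission tables would be estimated from data rather than written by hand. The sketch only shows how, once such parameters exist, one sequence of trees can be mapped onto another.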

Heshaam Faili | Ali Basirat

[1] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[2] Martha Palmer, et al. Integrating compositional semantics into a verb lexicon, 2000, COLING.

[3] Nizar Habash, et al. Extracting a Tree Adjoining Grammar from the Penn Arabic Treebank, 2004.

[4] Richard C. Waters, et al. Tree insertion grammar, 1995.

[5] Srinivas Bangalore, et al. Supertagging: An Approach to Almost Parsing, 1999, CL.

[6] Aravind K. Joshi, et al. Incremental LTAG Parsing, 2005, HLT/EMNLP.

[7] Heshaam Faili. From Partial toward Full Parsing, 2009, RANLP.

[8] George R. Doddington, et al. The ATIS Spoken Language Systems Pilot Corpus, 1990, HLT.

[9] Anne Abeillé, et al. A Lexicalized Tree Adjoining Grammar for English, 1990.

[10] Vijay K. Shanker, et al. Towards efficient statistical parsing using lexicalized grammatical information, 2002.

[11] Srinivas Bangalore, et al. New Models for Improving Supertag Disambiguation, 1999, EACL.