A Categorial Variation Database for English

We describe our approach to the construction and evaluation of a large-scale database called "CatVar" which contains categorial variations of English lexemes. Due to the prevalence of cross-language categorial variation in multilingual applications, our categorial-variation resource may serve as an integral part of a diverse range of natural language applications. Thus, the research reported herein overlaps heavily with that of the machine-translation, lexicon-construction, and information-retrieval communities. We apply the information-retrieval metrics of precision and recall to evaluate the accuracy and coverage of our database with respect to a human-produced gold standard. This evaluation reveals that the categorial database achieves a high degree of precision and recall. Additionally, we demonstrate that the database improves on the linkability of the Porter stemmer by over 30%.
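As a rough illustration of the precision and recall metrics named in the abstract, the sketch below compares a set of categorial variants returned by a database against a gold-standard set. The function name `precision_recall`, the `word_TAG` notation, and the toy word sets are assumptions made up for this example; they are not taken from CatVar or from the paper's evaluation.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of a predicted set against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted items that are in the gold standard.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold-standard items that were predicted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for illustration only (hypothetical, not from CatVar).
gold_variants = {"hunger_V", "hunger_N", "hungry_AJ", "hungrily_AV"}
database_variants = {"hunger_V", "hunger_N", "hungry_AJ", "hungriness_N"}

precision, recall = precision_recall(database_variants, gold_variants)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the four predicted variants appear in the gold set, and three of the four gold variants were found, so both metrics come out equal in this toy case. An actual evaluation would aggregate such scores over many clusters of variants.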

Nizar Habash | Bonnie J. Dorr
