Combining Contextual and Structural Information for Supersense Tagging of Chinese Unknown Words

Supersense tagging classifies unknown words into semantic categories defined by lexicographers and inserts them into a thesaurus. Previous studies on supersense tagging show that context-based methods perform well for English unknown words, while structure-based methods perform well for Chinese unknown words. The challenge is how to combine contextual and structural information for supersense tagging of Chinese unknown words. We propose a simple yet effective approach to address this challenge. In this approach, contextual information is used to measure contextual similarity between words, while structural information is used to filter candidate synonyms and adjust the contextual similarity scores. Experimental results show that the proposed approach outperforms the state-of-the-art context-based and structure-based methods.
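The combination described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the structural filter, or the adjustment rule, so the choices here (cosine similarity over context counts, a shared-character filter, a boost for a matching final character, and similarity-weighted voting over the top k candidates) are all assumptions made for illustration. The function names, the `head_boost` parameter, and the toy lexicon are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse context-count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def tag_supersense(unknown, unknown_ctx, lexicon, k=3, head_boost=1.5):
    """Guess a supersense for `unknown` from a lexicon of known words.

    lexicon maps each known word to (context Counter, supersense).
    All rules below are illustrative assumptions, not the paper's method.
    """
    scored = []
    for word, (ctx, sense) in lexicon.items():
        # Structural filter (assumed): keep only candidates that share
        # at least one character with the unknown word.
        if not set(unknown) & set(word):
            continue
        # Contextual similarity between the unknown word and the candidate.
        score = cosine(unknown_ctx, ctx)
        # Structural adjustment (assumed): boost candidates whose final
        # character matches that of the unknown word.
        if unknown[-1] == word[-1]:
            score *= head_boost
        scored.append((score, sense))

    # Similarity-weighted vote over the k best candidates.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter()
    for score, sense in scored[:k]:
        votes[sense] += score
    return votes.most_common(1)[0][0] if votes else None


# Toy data, invented purely for this example.
toy_lexicon = {
    "火车": (Counter({"坐": 3, "开": 2}), "artifact"),
    "汽车": (Counter({"坐": 2, "开": 3}), "artifact"),
    "车站": (Counter({"到": 3, "去": 2}), "location"),
    "苹果": (Counter({"吃": 4}), "food"),
}
toy_context = Counter({"坐": 1, "开": 2, "骑": 1})
print(tag_supersense("电动车", toy_context, toy_lexicon))
```

In this toy run, "苹果" is discarded by the structural filter because it shares no character with "电动车", and the two candidates ending in "车" receive the boost, so the structural signal both narrows the candidate set and reweights the contextual scores.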
