Learning Chinese Attribute Nouns Using Lexico-Syntactic Patterns

Lexical knowledge sources, or lexical ontologies, are important to the knowledge grid and to semantic computing. Attribute information is a key element in defining concepts in such lexical sources. This paper explores the idea of creating corpus-based attribute classifiers using lexico-syntactic patterns. Two novel attribute classifiers are proposed: one is a likelihood ratio classifier that uses hand-coded lexico-syntactic patterns; the other is a maximum entropy (ME) classifier that exploits automatically extracted patterns. The method is compared against both a direct pattern-matching approach and human performance, and the results indicate that the proposed method for attribute learning is effective.
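The abstract does not specify the form of the likelihood ratio test, so the following is only an illustrative sketch of how a pattern-based likelihood ratio classifier might look, not the paper's implementation. It assumes a Dunning-style binomial log-likelihood ratio over pattern-match counts; the function names, the count arguments, and the 3.84 threshold (the chi-square critical value at p = 0.05 with one degree of freedom) are all assumptions introduced for this example.

```python
import math


def _binomial_log_likelihood(k, n, p):
    """Log-likelihood of k matches in n occurrences with match probability p.

    The binomial coefficient is omitted because it cancels in the ratio.
    """
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll


def log_likelihood_ratio(k1, n1, k2, n2):
    """Dunning-style G^2 statistic (assumed form, not taken from the paper).

    k1 of the candidate noun's n1 occurrences match an attribute pattern;
    k2 of the n2 occurrences of all other nouns match the same pattern.
    """
    p1 = k1 / n1
    p2 = k2 / n2
    p = (k1 + k2) / (n1 + n2)
    return 2.0 * (
        _binomial_log_likelihood(k1, n1, p1)
        + _binomial_log_likelihood(k2, n2, p2)
        - _binomial_log_likelihood(k1, n1, p)
        - _binomial_log_likelihood(k2, n2, p)
    )


def is_attribute(k1, n1, k2, n2, threshold=3.84):
    """Label the candidate an attribute noun if it matches the pattern
    significantly more often than other nouns do (hypothetical decision rule)."""
    return (k1 / n1) > (k2 / n2) and log_likelihood_ratio(k1, n1, k2, n2) >= threshold
```

Under these assumptions, a noun that matches the attribute pattern in 50 of its 100 occurrences, while other nouns match in 10 of 1000, would be accepted by `is_attribute(50, 100, 10, 1000)`; a noun whose match rate is no higher than the background rate would be rejected.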
