Class Label Enhancement via Related Instances

Class-instance label propagation algorithms have been successfully used to fuse information from multiple sources in order to enrich a set of unlabeled instances with class labels. However, the relationships between the instances themselves have not been explored as a means of enhancing an initial set of class-instance pairs. We propose two graph-theoretic methods (centrality and regularization), which start with a small set of labeled class-instance pairs and use the instance-instance network to extend the class labels to all instances in the network. We carry out a comparative study with a state-of-the-art knowledge harvesting algorithm and show that our approach can learn additional class labels while maintaining high accuracy. We also compare class-instance and instance-instance graphs as the basis for propagating class labels and show that the latter achieves higher accuracy.
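To make the idea of extending seed labels over an instance-instance network concrete, the following is a minimal sketch of one generic regularization-style propagation scheme (iterative label spreading over a symmetrically normalized graph). It is an illustration only, not the formulation used in the paper: the function name `propagate_labels`, the parameters `alpha` and `iterations`, and the toy graph are all assumptions introduced here.

```python
from math import sqrt


def propagate_labels(edges, seeds, alpha=0.5, iterations=50):
    """Spread seed class labels over an undirected instance-instance graph.

    Hypothetical helper for illustration; not the paper's algorithm.
    edges: iterable of (instance, instance) pairs
    seeds: dict mapping a few instances to their known class label
    alpha: weight given to neighbor scores versus the initial seed scores
    """
    # Build an undirected, unweighted adjacency map.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    for node in seeds:
        neighbors.setdefault(node, set())

    classes = sorted(set(seeds.values()))

    # Initial scores: 1.0 for a seed's own class, 0.0 everywhere else.
    initial = {
        n: {c: 1.0 if seeds.get(n) == c else 0.0 for c in classes}
        for n in neighbors
    }
    scores = {n: dict(s) for n, s in initial.items()}

    for _ in range(iterations):
        updated = {}
        for n in neighbors:
            updated[n] = {}
            for c in classes:
                # Neighbor scores, normalized by the degrees of both endpoints.
                spread = sum(
                    scores[m][c] / sqrt(len(neighbors[n]) * len(neighbors[m]))
                    for m in neighbors[n]
                )
                updated[n][c] = alpha * spread + (1 - alpha) * initial[n][c]
        scores = updated

    # Assign the highest-scoring class; None if no label ever reached the node.
    labels = {}
    for n in neighbors:
        best = max(classes, key=lambda c: scores[n][c])
        labels[n] = best if scores[n][best] > 0 else None
    return labels


# Toy instance-instance network with one labeled seed per group.
edges = [
    ("paris", "london"), ("london", "berlin"), ("paris", "berlin"),
    ("lion", "tiger"), ("tiger", "zebra"), ("lion", "zebra"),
]
seeds = {"paris": "city", "lion": "animal"}
labels = propagate_labels(edges, seeds)
```

In this toy network the two groups are disconnected, so each unlabeled instance can only receive the label of the seed in its own group. On a real instance-instance graph, edge weights and the choice of `alpha` would matter considerably more.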
