SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble

Corpus-based set expansion (i.e., finding the “complete” set of entities belonging to the same semantic class, based on a given corpus and a tiny set of seeds) is a critical task in knowledge discovery. It may facilitate numerous downstream applications, such as information extraction, taxonomy induction, question answering, and web search.
