Lightweight Lexical and Semantic Evidence for Detecting Classes Among Wikipedia Articles

A supervised method uses simple, lightweight lexical and semantic features to distinguish Wikipedia articles that are classes (e.g., Shield volcano) from other articles (e.g., Kilauea). Experimental results in multiple languages and over multiple evaluation sets show that the proposed method outperforms previous work.
