Barbecued Opakapaka: Using Semantic Preferences for Ontology Population

This paper investigates the use of semantic preferences for ontology population. It draws on a new resource, the Pattern Dictionary of English Verbs, which lists the semantic categories expected in each syntactic slot of a verb pattern. Knowledge of semantic preferences is used to drive and control bootstrapped pattern extraction techniques on the EnClueWeb09 corpus, with the aim of identifying common nouns belonging to twelve semantic types. Evaluation shows that syntactic patterns perform better than lexical and surface patterns. It also raises issues about assessing ontology population candidates out of context.
