Learning Semantic Relations from Text

Every non-trivial text describes interactions and relations between people, institutions, activities, events and so on. What we know about the world consists in large part of such relations, and that knowledge contributes to the understanding of what texts refer to. Newly found relations can in turn become part of this knowledge, stored for future use.

To grasp a text’s semantic content, an automatic system must be able to recognize relations in texts and reason about them. This may be done by applying and updating previously acquired knowledge. We focus here in particular on semantic relations which describe the interactions among nouns and compact noun phrases, and we present such relations from both a theoretical and a practical perspective. The theoretical exploration sketches the historical path which has brought us to the contemporary view and interpretation of semantic relations. We discuss a wide range of relation inventories proposed by linguists and by language processing researchers. Such inventories vary by domain, granularity and suitability for downstream applications.

On the practical side, we investigate the recognition and acquisition of relations from texts. In a look at supervised learning methods, we present available datasets, the variety of features which can describe relation instances, and learning algorithms found appropriate for the task. Next, we present weakly supervised and unsupervised learning methods for acquiring relations from large corpora with little or no previously annotated data. We show how enduring the bootstrapping algorithm based on seed examples or patterns has proved to be, and how it has been adapted to tackle Web-scale text collections. We also show a few machine learning techniques which can perform fast and reliable relation extraction by taking advantage of data redundancy and variability.
