Relation Extraction: A Survey

With the advent of the Internet, a large amount of digital text is generated every day in the form of news articles, research publications, blogs, question answering forums and social media. It is important to develop techniques for extracting information automatically from these documents, as a lot of important information is hidden within them. This extracted information can be used to improve access to and management of the knowledge hidden in large text corpora. Several applications, such as Question Answering and Information Retrieval, would benefit from this information. Entities, such as persons and organizations, form the most basic unit of this information. Occurrences of entities in a sentence are often linked through well-defined relations; e.g., occurrences of a person and an organization in a sentence may be linked through relations such as "employed at". The task of Relation Extraction (RE) is to identify such relations automatically. In this paper, we survey several important supervised, semi-supervised and unsupervised RE techniques. We also cover the paradigms of Open Information Extraction (OIE) and Distant Supervision. Finally, we describe some of the recent trends in RE techniques and possible future research directions. This survey would be useful for three kinds of readers: i) newcomers to the field who want to quickly learn about RE; ii) researchers who want to know how the various RE techniques evolved over time and what the possible future research directions are; and iii) practitioners who just need to know which RE technique works best in various settings.
