A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach

Relation extraction has the potential to support large-scale knowledge graph construction, but current methods do not consider the qualifier attributes of each relation triplet, such as time, quantity, or location. The qualifiers form hyper-relational facts that better capture the rich and complex structure of knowledge graphs. For example, the relation triplet (Leonard Parker, Educated At, Harvard University) can be factually enriched by including the qualifier (End Time, 1967). Hence, we propose the task of hyper-relational extraction to extract more specific and complete facts from text. To support the task, we construct HyperRED, a large-scale and general-purpose dataset. Existing models cannot perform hyper-relational extraction because it requires a model to consider the interaction between three entities. Hence, we propose CubeRE, a cube-filling model inspired by table-filling approaches that explicitly considers the interaction between relation triplets and qualifiers. To improve model scalability and reduce negative class imbalance, we further propose a cube-pruning method. Our experiments show that CubeRE outperforms strong baselines, and we identify possible directions for future research. Our code and data are available at github.com/declare-lab/HyperRED.
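
To make the target data structure concrete, the sketch below shows one way to represent a hyper-relational fact in Python: a base (head, relation, tail) triplet enriched with a single (qualifier, value) pair, mirroring the Leonard Parker example above. This is a minimal illustration only; the class and field names are assumptions for exposition, not the dataset's actual schema or the released code's API.

```python
from dataclasses import dataclass


@dataclass
class HyperRelationalFact:
    """Illustrative container: a relation triplet plus one qualifier pair."""
    head: str       # head entity, e.g. "Leonard Parker"
    relation: str   # relation label, e.g. "Educated At"
    tail: str       # tail entity, e.g. "Harvard University"
    qualifier: str  # qualifier attribute, e.g. "End Time"
    value: str      # qualifier value, e.g. "1967"


# The example fact from the abstract, expressed with this sketch.
fact = HyperRelationalFact(
    head="Leonard Parker",
    relation="Educated At",
    tail="Harvard University",
    qualifier="End Time",
    value="1967",
)
print(fact)
```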
