Paper Information - Uncovering Main Causalities for Long-tailed Information Extraction


Information Extraction (IE) aims to extract structured information from unstructured text. In practice, long-tailed distributions caused by dataset selection bias may lead conventional likelihood models to learn incorrect (spurious) correlations between entities and labels. This motivates us to propose counterfactual IE (CFIE), a novel framework that aims to uncover the main causalities behind the data from the perspective of causal inference. Specifically, 1) we first introduce a unified structural causal model (SCM) for various IE tasks, describing the relationships among variables; 2) with our SCM, we then generate counterfactuals based on an explicit language structure to better calculate the direct causal effect during the inference stage; 3) we further propose a novel debiasing approach to yield more robust predictions. Experiments on three IE tasks across five public datasets show the effectiveness of our CFIE model in mitigating the spurious correlation issues.
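The idea of estimating a direct causal effect at inference time can be illustrated with a small sketch. This is not the CFIE implementation; it only shows the generic counterfactual-subtraction pattern, assuming a model that yields one set of logits for the factual input and another for a counterfactual input in which part of the evidence has been masked. The label names, the logit values, the `alpha` scaling factor, and the helper names `direct_effect` and `argmax` are all hypothetical.

```python
from typing import List


def direct_effect(factual: List[float],
                  counterfactual: List[float],
                  alpha: float = 1.0) -> List[float]:
    """Subtract (scaled) counterfactual logits from factual logits.

    Whatever score survives the subtraction is attributed to the evidence
    that was present in the factual input but masked in the counterfactual.
    `alpha` is a hypothetical knob controlling how much bias is removed.
    """
    if len(factual) != len(counterfactual):
        raise ValueError("logit vectors must have the same length")
    return [f - alpha * c for f, c in zip(factual, counterfactual)]


def argmax(scores: List[float]) -> int:
    """Return the index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical label set and logits, for illustration only.
labels = ["PER", "ORG", "LOC"]
factual_logits = [2.0, 1.5, 0.1]         # model output on the real input
counterfactual_logits = [1.8, 0.2, 0.1]  # output with part of the evidence masked

debiased = direct_effect(factual_logits, counterfactual_logits)

biased_label = labels[argmax(factual_logits)]
debiased_label = labels[argmax(debiased)]
```

In this toy case the head class ("PER") scores highly even on the counterfactual input, so most of its score is treated as bias and removed, and the prediction shifts to "ORG".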
