Table-based Fact Verification with Salience-aware Learning

Tables provide valuable knowledge that can be used to verify textual statements. While a number of works have considered table-based fact verification, direct alignments between tabular data and tokens in textual statements are rarely available. Moreover, training a generalized fact verification model requires abundant labeled training data. In this paper, we propose a novel system to address these problems. Inspired by counterfactual causality, our system identifies token-level salience in the statement with probing-based salience estimation. Salience estimation enhances the learning of fact verification from two perspectives. First, our system conducts masked salient token prediction to strengthen alignment and reasoning between the table and the statement. Second, our system applies salience-aware data augmentation to generate a more diverse set of training instances by replacing non-salient terms. Experimental results on TabFact show that the proposed salience-aware learning techniques yield consistent improvements, leading to new state-of-the-art performance on the benchmark.
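To make the counterfactual, probing-based salience estimation concrete, the sketch below shows one plausible way token-level salience could be computed: mask each statement token in turn and measure the drop in a verifier's entailment probability, treating that drop as the token's counterfactual effect. This is a minimal illustrative sketch under stated assumptions, not the paper's exact procedure; the names `verify_prob`, `token_salience`, and the toy verifier are hypothetical.

```python
# Minimal sketch of counterfactual, probing-based salience estimation.
# `verify_prob` is a hypothetical stand-in for a table-based fact
# verification model (e.g., a TaPas-style classifier): it maps a
# (table, statement) pair to the probability that the statement is entailed.
from typing import Callable, List, Tuple

MASK = "[MASK]"


def token_salience(
    table: str,
    statement_tokens: List[str],
    verify_prob: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Estimate per-token salience as the counterfactual effect of masking.

    For each token, replace it with a mask symbol and measure how much the
    verifier's entailment probability changes; a large drop suggests the
    token was salient for the verdict.
    """
    base = verify_prob(table, " ".join(statement_tokens))
    saliences = []
    for i, tok in enumerate(statement_tokens):
        probed = statement_tokens[:i] + [MASK] + statement_tokens[i + 1:]
        p = verify_prob(table, " ".join(probed))
        saliences.append((tok, base - p))  # counterfactual effect of token i
    return saliences


if __name__ == "__main__":
    # Toy verifier (illustration only): rewards statements whose tokens
    # overlap with the linearized table text.
    def toy_verifier(table: str, statement: str) -> float:
        cells = set(table.lower().split())
        toks = statement.lower().split()
        return sum(t in cells for t in toks) / max(len(toks), 1)

    table = "player goals messi 30 ronaldo 25"
    statement = "messi scored 30 goals".split()
    for tok, score in token_salience(table, statement, toy_verifier):
        print(f"{tok:10s} {score:+.3f}")
```

In this sketch, tokens whose masking most reduces the entailment probability would be treated as salient (used for masked salient token prediction), while low-salience tokens would be candidates for replacement during data augmentation.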
