AnswerFact: Fact Checking in Product Question Answering

Product question answering platforms are now widely deployed on e-commerce sites, giving potential customers a convenient way to resolve their concerns while shopping online. However, misinformation in the answers on these platforms makes it difficult for users to obtain reliable and truthful product information, and may even cause commercial losses for e-commerce businesses. To tackle this issue, we study the problem of predicting answer veracity and introduce AnswerFact, a large-scale fact checking dataset built from product question answering forums. Each answer is accompanied by a veracity label and associated evidence sentences, providing a valuable testbed for evidence-based fact checking in QA settings. We further propose a novel neural model with tailored evidence ranking components for answer veracity prediction. Extensive experiments with our proposed model and various existing fact checking methods show that our method outperforms all baselines on this task.
