Hierarchical Graph Network for Multi-hop Question Answering

In this paper, we present Hierarchical Graph Network (HGN) for multi-hop question answering. To aggregate clues scattered across multiple paragraphs, a hierarchical graph is constructed from nodes at different levels of granularity (questions, paragraphs, sentences, and entities), whose representations are initialized with a RoBERTa-based context encoder. Given this hierarchical graph, the initial node representations are updated through graph propagation, and multi-hop reasoning is performed by traversing the graph edges for each subsequent sub-task (e.g., paragraph selection, supporting-facts extraction, answer prediction). By weaving heterogeneous nodes into a single unified graph, this hierarchical differentiation of node granularity enables HGN to support the different question answering sub-tasks simultaneously. Experiments on the HotpotQA benchmark demonstrate that the proposed model achieves new state-of-the-art results in both the Distractor and Fullwiki settings.
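
As a rough illustration of the architecture described above, the sketch below builds a toy hierarchical graph over question, paragraph, sentence, and entity nodes, runs one GAT-style propagation step, and attaches a separate prediction head to each node granularity. This is a minimal sketch, not the authors' implementation: the names (ToyHGN, GraphAttentionLayer), the hidden size, the toy edges, and the random node features are all illustrative assumptions; in the paper, node representations are initialized from a RoBERTa-based context encoder and the graph and heads are trained jointly on HotpotQA.

```python
# Minimal sketch of the HGN idea (not the authors' code): a toy hierarchical
# graph over question / paragraph / sentence / entity nodes, one
# graph-attention update, and per-granularity heads for the sub-tasks.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GraphAttentionLayer(nn.Module):
    """Single-head, GAT-style attention over a dense adjacency mask."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (num_nodes, dim); adj: (num_nodes, num_nodes), 1 where an edge exists.
        z = self.proj(h)
        n = z.size(0)
        zi = z.unsqueeze(1).expand(n, n, -1)          # z_i broadcast over columns
        zj = z.unsqueeze(0).expand(n, n, -1)          # z_j broadcast over rows
        e = F.leaky_relu(self.attn(torch.cat([zi, zj], dim=-1)).squeeze(-1))
        e = e.masked_fill(adj == 0, float("-inf"))    # attend only along graph edges
        alpha = F.softmax(e, dim=-1)
        return F.elu(alpha @ z)


class ToyHGN(nn.Module):
    """Hierarchical graph propagation + heads for the three sub-tasks."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.gat = GraphAttentionLayer(dim)
        self.para_head = nn.Linear(dim, 1)   # paragraph selection
        self.sent_head = nn.Linear(dim, 1)   # supporting-fact extraction
        self.ent_head = nn.Linear(dim, 1)    # answer (entity) scoring

    def forward(self, feats, adj, node_type):
        h = self.gat(feats, adj)
        return {
            "paragraph": self.para_head(h[node_type == 1]).squeeze(-1),
            "sentence": self.sent_head(h[node_type == 2]).squeeze(-1),
            "entity": self.ent_head(h[node_type == 3]).squeeze(-1),
        }


if __name__ == "__main__":
    # Toy hierarchy: 1 question, 2 paragraphs, 3 sentences, 2 entities.
    # node_type: 0 = question, 1 = paragraph, 2 = sentence, 3 = entity.
    node_type = torch.tensor([0, 1, 1, 2, 2, 2, 3, 3])
    edges = [(0, 1), (0, 2),            # question -> paragraphs
             (1, 3), (1, 4), (2, 5),    # paragraph -> sentences
             (3, 6), (5, 7)]            # sentence -> entities
    adj = torch.eye(8)                  # self-loops keep each node's own signal
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0     # hierarchical edges are bidirectional here

    feats = torch.randn(8, 64)          # stand-in for RoBERTa-initialized features
    scores = ToyHGN()(feats, adj, node_type)
    print({k: v.shape for k, v in scores.items()})
```

Running the script prints the shapes of the paragraph, sentence, and entity score vectors; in the full model these would be supervised by paragraph-selection, supporting-fact, and answer labels, respectively.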
