Multiresolution Graph Attention Networks for Relevance Matching

Many deep learning models have been proposed for text matching, a problem at the core of many natural language processing (NLP) tasks. However, existing deep models are mainly designed for semantic matching between a pair of short texts, as in paraphrase identification and question answering, and do not perform well on relevance matching between a short text and a long text. This is partly because these models do not account for the essential characteristics of short-long text matching. More specifically, they fail to handle the extreme length discrepancy between the two texts, nor can they fully capture the underlying structural information in long documents. In this paper, we focus on relevance matching between a short piece of text and a long document, which is critical to problems such as query-document matching in information retrieval and web search. To extract the structural information of a document, we construct an undirected graph in which each vertex represents a keyword and the weight of each edge indicates the degree of interaction between the two keywords it connects. Based on this keyword graph, we further propose a Multiresolution Graph Attention Network that learns multi-layered representations of the vertices through a Graph Convolutional Network (GCN) and then matches the short text against the graph representation of the document, with an attention mechanism applied over each layer of the GCN. Experimental results on two datasets demonstrate that our graph-based approach outperforms other state-of-the-art deep matching models.
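The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: edge weights are taken to be sentence-level co-occurrence counts (the abstract says only "degree of interaction"), propagation uses the standard symmetrically normalized GCN rule, attention is a plain dot product between a query vector and the vertex vectors, and all function names are hypothetical.

```python
import math
from itertools import combinations


def build_keyword_graph(sentences, keywords):
    """Undirected weighted graph over keywords. ASSUMPTION: an edge weight
    counts the sentences in which both keywords occur."""
    index = {k: i for i, k in enumerate(keywords)}
    n = len(keywords)
    adj = [[0.0] * n for _ in range(n)]
    for sent in sentences:
        present = sorted({index[t] for t in sent.lower().split() if t in index})
        for i, j in combinations(present, 2):
            adj[i][j] += 1.0
            adj[j][i] += 1.0
    return adj


def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with added self-loops."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, h), w)]


def attend(query, h):
    """ASSUMPTION: dot-product attention of the query over vertex vectors.
    Returns the attention-weighted sum of the rows of h and the weights."""
    scores = [sum(q * v for q, v in zip(query, row)) for row in h]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(h[0])
    pooled = [sum(wt * row[d] for wt, row in zip(weights, h)) for d in range(dim)]
    return pooled, weights


def multiresolution_match(query, a_hat, h0, layer_weights):
    """Attend over the output of every GCN layer and concatenate the
    pooled vectors into one matching feature vector."""
    h, features = h0, []
    for w in layer_weights:
        h = gcn_layer(a_hat, h, w)
        pooled, _ = attend(query, h)
        features.extend(pooled)
    return features


# Toy usage with identity features and weights (placeholders, not learned).
sentences = ["graph attention improves matching",
             "graph networks encode documents",
             "attention networks"]
keywords = ["graph", "attention", "networks"]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
a_hat = normalize(build_keyword_graph(sentences, keywords))
features = multiresolution_match([1.0, 0.0, 0.0], a_hat, identity, [identity, identity])
```

In a trained model, the weight matrices and the query encoding would be learned, and the concatenated per-layer features would feed a scoring layer. The sketch only shows how deeper layers summarize progressively larger neighborhoods of the keyword graph while the query attends to each resolution separately.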
