Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching

Semantic text matching models have been widely used in community question answering, information retrieval, and dialogue. However, these models struggle with the long-form text matching problem: long-form texts typically contain a large amount of noise, and it is difficult for existing semantic text matching models to capture the key matching signals amid this noisy information. Moreover, these models are computationally expensive because they process all textual content indiscriminately during matching. To tackle both the effectiveness and the efficiency problems, we propose a novel hierarchical noise filtering model in this paper, namely Match-Ignition. The basic idea is to plug the well-known PageRank algorithm into the Transformer to identify and filter both sentence-level and word-level noise in the matching process. Since the sentence is the basic unit of a long-form text, noisy sentences are relatively easy to detect, so we filter them directly by applying PageRank to a sentence similarity graph. Words, in contrast, rely on their contexts to express concrete meanings, so we jointly learn the filtering process and the matching process to reflect the contextual dependencies between words. Specifically, a word graph is first built from the attention scores in each self-attention block of the Transformer, and keywords are then selected by applying PageRank to this graph. In this way, noisy words are filtered out layer by layer during matching. Experimental results show that Match-Ignition outperforms both traditional short-text matching models and recent long-form text matching models. We also conduct detailed analyses showing that Match-Ignition efficiently captures the important sentences and words that are helpful for long-form text matching.

ACM Reference Format:
Liang Pang, Yanyan Lan, and Xueqi Cheng. 2021. Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching. In Proceedings of ACM Conference (Conference'17). ACM, New York, NY, USA, 9 pages.
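To make the word-level filtering step concrete, the following is a minimal sketch of PageRank over an attention-derived word graph: the attention matrix is treated as a weighted directed graph over tokens, PageRank scores are computed by power iteration, and only the top-scoring tokens are kept. This is an illustration under our own assumptions (the attention matrix is taken as already averaged over heads, and the helper names `pagerank_scores` and `keep_top_k` are hypothetical), not the paper's actual implementation.

```python
import numpy as np

def pagerank_scores(attn: np.ndarray, damping: float = 0.85,
                    iters: int = 50, tol: float = 1e-6) -> np.ndarray:
    """Power-iteration PageRank over a token-to-token weight matrix.

    attn[i, j] is read as the weight of edge i -> j (e.g., an attention
    score averaged over heads). Rows are renormalised so the matrix is
    row-stochastic before iterating.
    """
    n = attn.shape[0]
    # Each token distributes its "vote" over the tokens it attends to.
    trans = attn / np.maximum(attn.sum(axis=1, keepdims=True), 1e-12)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):
        new = (1.0 - damping) / n + damping * (trans.T @ scores)
        if np.abs(new - scores).sum() < tol:
            return new
        scores = new
    return scores

def keep_top_k(tokens, attn, k):
    """Keep the k highest-PageRank tokens, preserving their original order."""
    scores = pagerank_scores(attn)
    top = np.sort(np.argsort(-scores)[:k])
    return [tokens[i] for i in top], top

# Toy usage: a 5-token input with a random head-averaged attention matrix.
rng = np.random.default_rng(0)
tokens = ["the", "model", "filters", "noisy", "words"]
attn = rng.random((5, 5))
kept, idx = keep_top_k(tokens, attn, k=3)
print(kept, idx)
```

In the model described above, a step like this would be applied after each self-attention block, so the token sequence shrinks layer by layer as noisy words are dropped.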
