Multi-Level Matching Networks for Text Matching

Text matching aims to determine the matching relationship between two texts. It is an important operation in information-retrieval-related tasks such as duplicate question detection, question answering, and dialog systems. Bidirectional long short-term memory (BiLSTM) coupled with an attention mechanism has achieved state-of-the-art performance in text matching. A major limitation of existing works is that only high-level contextualized word representations are used to obtain word-level matching results, without considering other levels of word representations; this leads to incorrect matching decisions when two words with different meanings lie close together in the high-level contextualized representation space. Therefore, instead of making decisions from single-level word representations, this paper proposes a multi-level matching network (MMN) for text matching, which exploits multiple levels of word representations to obtain multiple word-level matching results that are combined into the final text-level matching decision. Experimental results on two widely used benchmarks, SNLI and SciTail, show that the proposed MMN achieves state-of-the-art performance.
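The core idea above, comparing two texts at several representation levels and aggregating the per-level word matching scores into one text-level decision, can be illustrated with a minimal pure-Python sketch. This is not the paper's actual MMN architecture (which would use trained BiLSTM layers and attention); the per-level word vectors, the max-cosine word matching, and the mean aggregation here are all simplifying assumptions chosen for clarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def word_level_match(words_a, words_b):
    """For each word vector in text A, its best match against text B.

    A stand-in for the paper's attention-based word-level matching.
    """
    return [max(cosine(wa, wb) for wb in words_b) for wa in words_a]

def multi_level_match(levels_a, levels_b):
    """Text-level score from matching at every representation level.

    levels_a / levels_b: one list of word vectors per level
    (e.g. embedding level, low BiLSTM layer, high BiLSTM layer).
    Here aggregation is a plain mean over words and levels; the
    actual model would learn this combination.
    """
    per_level = [word_level_match(a, b) for a, b in zip(levels_a, levels_b)]
    level_scores = [sum(scores) / len(scores) for scores in per_level]
    return sum(level_scores) / len(level_scores)

# Toy example: one representation level, two 2-d word vectors per text.
a = [[[1.0, 0.0], [0.0, 1.0]]]
b = [[[1.0, 0.0], [0.0, 1.0]]]
print(multi_level_match(a, b))  # identical texts -> 1.0
```

The point of the multi-level design is visible even in this toy: two words may be near-identical at one level yet dissimilar at another, and aggregating over levels keeps a single misleading level from dominating the decision.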
