Text matching (TM) is a fundamental natural language processing task widely used in applications such as information retrieval, question answering, machine translation, dialogue systems, and reading comprehension. In recent years, many deep neural networks have been applied to TM and have repeatedly advanced its benchmarks. Among these, the convolutional neural network (CNN) is one of the most popular, but it has difficulty handling small samples and preserving the relative structure of features. In this paper, we propose a novel deep learning architecture for TM based on the capsule network, called CapsTM; the capsule network is a new type of neural network architecture proposed to address some of the shortcomings of CNNs and has shown great potential in many tasks. CapsTM is a five-layer neural network consisting of an input layer, a representation layer, an aggregation layer, a capsule layer and a prediction layer. In CapsTM, the two pieces of text are first individually converted into sequences of embeddings and further transformed by a highway network in the input layer. In the representation layer, a Bidirectional Long Short-Term Memory (BiLSTM) network represents each piece of text, and an attention-based interaction matrix represents the interactive information between the two texts. The two kinds of representations are then fused by a BiLSTM in the aggregation layer and further represented as capsules (vectors) in the capsule layer. Finally, the prediction layer is a fully connected network used for classification. CapsTM is an extension of ESIM that adds a capsule layer before the prediction layer. We construct a corpus of Chinese medical question matching containing 36,360 question pairs.
This corpus is randomly split into three parts: a training set of 32,360 question pairs, a development set of 2,000 question pairs and a test set of 2,000 question pairs. On this corpus, we conduct a series of experiments to evaluate the proposed CapsTM and compare it with other state-of-the-art methods. CapsTM achieves the highest F-score of 0.8666. The experimental results demonstrate that CapsTM is effective for Chinese medical question matching and outperforms the other state-of-the-art methods.
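The capsule layer mentioned above groups features into vectors whose length encodes confidence. The abstract does not spell out the routing procedure, so the sketch below shows the standard squash nonlinearity and dynamic routing-by-agreement from the capsule-network literature (Sabour et al., 2017) as a minimal NumPy illustration; the function names, shapes, and iteration count are assumptions, not the paper's exact implementation.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squash nonlinearity: shrinks short vectors toward zero and
    # long vectors toward (but never reaching) unit length.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: prediction vectors, shape (num_in, num_out, dim_out).
    # Iteratively routes input capsules to output capsules by agreement.
    num_in, num_out, dim_out = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Coupling coefficients: softmax over output capsules.
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        # Weighted sum of predictions for each output capsule.
        s = np.einsum('ij,ijd->jd', c, u_hat)
        v = squash(s)  # (num_out, dim_out)
        # Increase logits where predictions agree with the output.
        b = b + np.einsum('ijd,jd->ij', u_hat, v)
    return v
```

Because of the squash scaling, every output capsule's norm lies strictly below 1 and can be read as the probability that the corresponding feature (here, a matching pattern between the two questions) is present.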