Interactive Self-Attentive Siamese Network for Biomedical Sentence Similarity

Determining the semantic similarity between sentences is an important component of natural language processing (NLP) tasks such as text retrieval and text summarization. Many approaches have been proposed for estimating sentence similarity, and Siamese neural networks (SNNs) are among the more effective. However, the sentence representation produced by the weight-sharing branches of an SNN without any attention mechanism ignores the fact that different words contribute differently to the overall sentence semantics. Furthermore, attention computed within a single sentence neglects the semantic influence that the two sentences have on each other during similarity estimation. To address these issues, this paper proposes an interactive self-attention (ISA) mechanism and integrates it with an SNN; the resulting interactive self-attentive Siamese neural network (ISA-SNN) is used to verify the effectiveness of ISA. The proposed model obtains the weights of words within a single sentence by means of self-attention and extracts the interactive semantic information between sentences via interactive attention, thereby enhancing the sentence semantic representation. Without feature engineering, it outperforms existing methods on three biomedical benchmark datasets, reaching Pearson correlation coefficients of 0.656 on DBMI and 0.713/0.658 on CDD-ful/-ref.
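The abstract gives no equations, so the following is only a minimal NumPy sketch of how self-attention and interactive attention could be combined in a weight-sharing (Siamese) setup; it is not the authors' implementation. Everything in it is an assumption made for illustration: the scoring vector `w`, the mean-pooled query used for interactive attention, the concatenation of the two attended vectors, and the exp(-L1) similarity. The `pearson` helper shows the evaluation metric named above.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def self_attention_weights(H, w):
    """Word weights computed from the sentence alone.
    H: (n_words, d) hidden states; w: (d,) hypothetical scoring vector."""
    return softmax(np.tanh(H) @ w)

def interactive_attention_weights(H_own, H_other):
    """Word weights for one sentence conditioned on the other sentence.
    Assumption: the other sentence is summarized by mean pooling."""
    query = H_other.mean(axis=0)
    return softmax(H_own @ query)

def encode(H_own, H_other, w):
    """Sentence vector: self-attended and interactively attended
    summaries, concatenated (an illustrative choice)."""
    alpha = self_attention_weights(H_own, w)
    beta = interactive_attention_weights(H_own, H_other)
    return np.concatenate([alpha @ H_own, beta @ H_own])

def similarity(H_a, H_b, w):
    """Siamese comparison: the same encoder and the same w are applied
    to both sentences; the score exp(-L1 distance) lies in (0, 1]."""
    v_a = encode(H_a, H_b, w)
    v_b = encode(H_b, H_a, w)
    return float(np.exp(-np.abs(v_a - v_b).sum()))

def pearson(predicted, gold):
    """Pearson correlation between predicted and gold similarity scores."""
    return float(np.corrcoef(predicted, gold)[0, 1])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    w = rng.normal(size=d)
    H_a = rng.normal(size=(5, d))  # stand-in hidden states, sentence A
    H_b = rng.normal(size=(7, d))  # stand-in hidden states, sentence B
    print(similarity(H_a, H_b, w))
```

In a trained model the hidden states would come from a shared encoder and `w` would be learned; here random arrays stand in for both so that the sketch runs on its own.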
