Attention guided capsule networks for chemical-protein interaction extraction

The biomedical literature contains a large number of chemical-protein interactions (CPIs). Automatic CPI extraction is a crucial task in the biomedical domain, with substantial benefits for precision medicine, drug discovery and basic biomedical research. In this study, we propose a novel model, BERT-based attention-guided capsule networks (BERT-Att-Capsule), for CPI extraction. Specifically, the approach first employs BERT (Bidirectional Encoder Representations from Transformers) to capture the long-range dependencies and bidirectional contextual information of the input tokens. The aggregation of these token representations is then treated as a routing problem: how to pass messages from source capsule nodes to target capsule nodes. This process enables the capsule networks to determine what information, and how much of it, needs to be transferred, as well as to identify sophisticated and interleaved features. Afterwards, multi-head attention is applied to guide the model in learning different contribution weights for the capsules obtained by dynamic routing. We evaluate our model on the CHEMPROT corpus, where it outperforms other state-of-the-art methods. The experimental results show that our approach can adequately capture the long-range dependencies and bidirectional contextual information of the input tokens and obtain more fine-grained aggregation information through attention-guided capsule networks, and that this improves performance.
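To make the routing step concrete, the following is a minimal NumPy sketch of routing-by-agreement in the general style of dynamic routing between capsules. It is an illustration under stated assumptions, not the model's actual implementation: the function names (`squash`, `dynamic_routing`), the tensor shapes, the three routing iterations, and the random input standing in for BERT-derived prediction vectors are all hypothetical, and the multi-head attention step is not covered.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Squash nonlinearity: keeps the direction of s and maps its norm into [0, 1)."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def dynamic_routing(u_hat, num_iters=3):
    """Route messages from source capsules to target capsules.

    u_hat: array of shape (n_source, n_target, dim), the prediction each
           source capsule makes for each target capsule (assumed layout).
    Returns the target capsules, shape (n_target, dim), and the coupling
    coefficients used to build them, shape (n_source, n_target).
    """
    n_source, n_target, _ = u_hat.shape
    b = np.zeros((n_source, n_target))  # routing logits, start uniform
    for _ in range(num_iters):
        # Each source capsule distributes its message over the targets.
        c = softmax(b, axis=1)
        # Weighted sum of predictions per target, then squash.
        s = np.einsum("ij,ijd->jd", c, u_hat)
        v = squash(s)
        # Raise the logit where a prediction agrees with the target output.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v, c


# Hypothetical usage: 5 source capsules (e.g. token-level features),
# 3 target capsules, 4-dimensional capsule vectors.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(5, 3, 4))
v, c = dynamic_routing(u_hat)
print(v.shape, c.shape)
```

The coupling coefficients `c` express how much of each source capsule's information is passed to each target capsule; they sum to one across targets for every source capsule.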
