Extracting postmarketing adverse events from safety reports in the vaccine adverse event reporting system (VAERS) using deep learning

OBJECTIVE Automated analysis of vaccine postmarketing surveillance narrative reports is important to understand the progression of rare but severe vaccine adverse events (AEs). This study implemented and evaluated state-of-the-art deep learning algorithms for named entity recognition to extract nervous system disorder-related events from vaccine safety reports.

MATERIALS AND METHODS We collected Guillain-Barré syndrome (GBS) related influenza vaccine safety reports from the Vaccine Adverse Event Reporting System (VAERS) from 1990 to 2016. VAERS reports were selected and manually annotated with major entities related to nervous system disorders, including investigation, nervous_AE, other_AE, procedure, social_circumstance, and temporal_expression. A variety of conventional machine learning and deep learning algorithms were then evaluated for the extraction of these entities. We further pretrained a domain-specific BERT (Bidirectional Encoder Representations from Transformers) model using VAERS reports (VAERS BERT) and compared its performance with existing models.

RESULTS AND CONCLUSIONS Ninety-one VAERS reports were annotated, resulting in 2512 entities. The corpus was made publicly available to promote community efforts on vaccine AE identification. Deep learning-based methods (eg, bidirectional long short-term memory and BERT models) outperformed conventional machine learning-based methods (ie, conditional random fields with extensive features). The BioBERT large model achieved the highest exact match F-1 scores on nervous_AE, procedure, social_circumstance, and temporal_expression, while the VAERS BERT large models achieved the highest exact match F-1 scores on investigation and other_AE. An ensemble of these 2 models achieved the highest exact match microaveraged F-1 score, 0.6802, and the second-highest lenient match microaveraged F-1 score, 0.8078, among peer models.
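The abstract does not spell out how the exact match and lenient match scores were computed, so the following is only a minimal sketch under commonly used definitions: an exact match requires identical span boundaries and entity type, a lenient match requires overlapping spans of the same type, and microaveraging pools the counts over all entity types before computing F-1. The `Entity` tuple layout, the function names, and the sample spans are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# Hypothetical representation: (start offset, end offset exclusive, entity type)
Entity = Tuple[int, int, str]


def spans_overlap(a: Entity, b: Entity) -> bool:
    """Lenient criterion (assumed): same type and at least one shared character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def micro_f1(gold: List[Entity], pred: List[Entity], lenient: bool = False) -> float:
    """Microaveraged F-1: counts are pooled over all entity types."""
    if lenient:
        match = spans_overlap
    else:
        # Exact criterion (assumed): identical boundaries and identical type.
        match = lambda a, b: a == b

    # Predicted entities that match some gold entity (numerator of precision).
    pred_hits = sum(any(match(g, p) for g in gold) for p in pred)
    # Gold entities that are matched by some prediction (numerator of recall).
    gold_hits = sum(any(match(g, p) for p in pred) for g in gold)

    precision = pred_hits / len(pred) if pred else 0.0
    recall = gold_hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up spans for illustration only; not data from the annotated corpus.
gold = [(0, 5, "nervous_AE"), (10, 20, "procedure"), (25, 30, "temporal_expression")]
pred = [(0, 5, "nervous_AE"), (12, 20, "procedure"), (40, 45, "other_AE")]

exact_score = micro_f1(gold, pred)
lenient_score = micro_f1(gold, pred, lenient=True)
```

In this toy case the second prediction has the right type but a shifted left boundary, so it counts only under the lenient criterion. That kind of boundary disagreement is one reason a lenient score can sit well above the exact score for the same model.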
