A BERT model generates diagnostically relevant semantic embeddings from pathology synopses with active learning

Pathology synopses consist of semi-structured or unstructured text summarizing visual information obtained by observing human tissue. Experts write and interpret these synopses using extensive domain-specific knowledge to extract tissue semantics and formulate a diagnosis in the context of ancillary testing and clinical information. The limited number of specialists available to interpret pathology synopses restricts the utility of the information they contain. Deep learning offers a tool for information extraction and automatic feature generation from complex datasets. Using an active learning approach, we developed a set of semantic labels for bone marrow aspirate pathology synopses. We then trained a transformer-based deep-learning model to map these synopses to one or more semantic labels, and extracted learned embeddings (i.e., meaningful attributes) from the model's hidden layer. Here we demonstrate that, with a small amount of training data, a transformer-based natural language model can extract embeddings from pathology synopses that capture diagnostically relevant information. On average, these embeddings can be used to generate semantic labels mapping patients to probable diagnostic groups with a micro-average F1 score of 0.779 ± 0.025. We provide a generalizable deep learning model and approach to unlock the semantic information inherent in pathology synopses toward improved diagnostics, biodiscovery and AI-assisted computational pathology.
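
The micro-average F1 score reported above pools true positives, false positives and false negatives over every (synopsis, label) pair before computing a single F1 value. The following is a minimal sketch of that calculation for multi-label predictions; the function name `micro_f1` and the toy label matrices are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]],
             y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 for binary multi-label indicator matrices.

    Each row is one synopsis; each column is one semantic label (0 or 1).
    Counts are pooled across all rows and labels before computing F1.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    # With no positive labels and no positive predictions, F1 is undefined;
    # returning 0.0 here is a convention chosen for this sketch.
    return 2 * tp / denom if denom else 0.0


# Hypothetical example: three synopses, three semantic labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 1, 0]]
score = micro_f1(y_true, y_pred)  # tp=4, fp=1, fn=1
```

Because counts are pooled, frequent labels influence the micro-average more than rare ones, which distinguishes it from a macro-average taken over per-label F1 scores.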
