Learning to ground medical text in a 3D human atlas

In this paper, we develop a method for grounding medical text in a physically meaningful and interpretable space corresponding to a human atlas. We build on text embedding architectures such as BERT and introduce a loss function that allows us to reason about the semantic and spatial relatedness of medical texts by learning a projection of the embedding into a 3D space representing the human body. We quantitatively and qualitatively demonstrate that our proposed method learns a context-sensitive and spatially aware mapping, in both the inter-organ and intra-organ sense, using a large-scale medical text dataset from the "Large-scale online biomedical semantic indexing" track of the 2020 BioASQ challenge. We extend our approach to a self-supervised setting and find it to be competitive with a classification-based method as well as a fully supervised variant of our approach.
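To make the overall idea concrete, below is a minimal sketch of projecting a BERT-style sentence embedding to a 3D point in an atlas coordinate frame and regressing it toward a target location. The encoder name, the mean-pooling step, the target coordinates, and the mean-squared-error objective are all illustrative assumptions, not the paper's actual architecture or loss.

```python
# Hedged sketch (not the paper's exact method): encode a medical text with a
# BERT-style model, project the pooled embedding to 3D atlas coordinates,
# and regress toward a (hypothetical) ground-truth location.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

encoder_name = "bert-base-uncased"  # assumption; the paper builds on BERT-style encoders
tokenizer = AutoTokenizer.from_pretrained(encoder_name)
encoder = AutoModel.from_pretrained(encoder_name)
project_to_atlas = nn.Linear(encoder.config.hidden_size, 3)  # map embedding -> (x, y, z)

def embed(texts):
    """Mean-pool token embeddings into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)       # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)        # (B, H)

texts = ["chest pain radiating to the left arm"]       # illustrative input
target_xyz = torch.tensor([[0.1, 0.4, 0.6]])           # hypothetical atlas coordinates
pred_xyz = project_to_atlas(embed(texts))              # (B, 3) predicted atlas location
loss = nn.functional.mse_loss(pred_xyz, target_xyz)    # stand-in for the paper's loss
loss.backward()
```

A metric-learning objective such as a triplet loss over atlas distances could be substituted for the MSE term here; the sketch only illustrates the embedding-to-3D-projection pipeline.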
