Evaluating Biomedical Word Embeddings for Vocabulary Alignment at Scale in the UMLS Metathesaurus Using Siamese Networks

Recent work uses a Siamese Network, initialized with BioWordVec embeddings (distributed word embeddings), to predict synonymy among biomedical terms and thereby automate part of the construction process of the UMLS (Unified Medical Language System) Metathesaurus. We evaluate contextualized word embeddings extracted from nine different biomedical BERT-based models for synonym prediction in the UMLS, replacing the BioWordVec embeddings with embeddings extracted from each biomedical BERT model using several feature extraction methods. Finally, we conduct a thorough grid search, which prior work lacks, to find the best set of hyperparameters. Surprisingly, we find that Siamese Networks initialized with BioWordVec embeddings still outperform Siamese Networks initialized with embeddings extracted from the biomedical BERT models.
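
A minimal sketch of the pipeline described above, assuming PyTorch and the HuggingFace transformers library: a frozen biomedical BERT model serves as a feature extractor, and a small Siamese encoder scores term pairs for synonymy. The model name (dmis-lab/biobert-base-cased-v1.1, one of several public biomedical BERT variants), the mean-pooling feature extraction, and the layer sizes are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

# Assumed model choice; the paper evaluates nine biomedical BERT variants.
MODEL_NAME = "dmis-lab/biobert-base-cased-v1.1"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
bert = AutoModel.from_pretrained(MODEL_NAME)
bert.eval()  # used as a frozen feature extractor, not fine-tuned

@torch.no_grad()
def embed(term: str) -> torch.Tensor:
    """One possible feature extraction method: mean-pool the last hidden layer."""
    inputs = tokenizer(term, return_tensors="pt")
    hidden = bert(**inputs).last_hidden_state        # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)    # zero out padding positions
    return (hidden * mask).sum(1) / mask.sum(1)      # (1, 768)

class SiameseScorer(nn.Module):
    """Shared encoder applied to both terms; layer sizes are illustrative."""
    def __init__(self, dim: int = 768, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(hidden, 1)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        ha, hb = self.encoder(a), self.encoder(b)    # same weights on both sides
        # Compare the two encodings and map to a synonymy probability.
        return torch.sigmoid(self.classifier(torch.abs(ha - hb)))

model = SiameseScorer()
score = model(embed("myocardial infarction"), embed("heart attack"))
print(f"synonymy score: {score.item():.3f}")
```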
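The grid search over hyperparameters could look like the following sketch; the grid values and the train_and_evaluate helper are hypothetical placeholders, not the paper's actual search space.

```python
from itertools import product

def train_and_evaluate(config: dict) -> float:
    """Hypothetical stand-in: train the Siamese network with `config`
    and return a dev-set score. Returns a dummy value here."""
    return 0.0

# Illustrative grid; the paper's search space is not reproduced here.
grid = {
    "learning_rate": [1e-4, 5e-4, 1e-3],
    "hidden_dim":    [128, 256, 512],
    "batch_size":    [256, 512],
}

best_score, best_config = float("-inf"), None
for lr, hidden, batch in product(*grid.values()):
    config = {"learning_rate": lr, "hidden_dim": hidden, "batch_size": batch}
    score = train_and_evaluate(config)
    if score > best_score:
        best_score, best_config = score, config

print(best_config, best_score)
```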
