Paper information - Understanding patient complaint characteristics using contextual clinical BERT embeddings

In clinical conversational applications, extracted entities tend to capture the main subject of a patient’s complaint, namely symptoms or diseases. However, they mostly fail to recognize the characterizations of a complaint, such as its time, onset, and severity. For example, given the input "I have a headache and it is extreme", state-of-the-art models recognize only the main symptom entity, headache, and ignore the severity factor, extreme, that characterizes the headache. In this paper, we design a two-fold approach to detect the characterizations of entities such as symptoms, as presented by general users in contexts where they would describe their symptoms to a clinician. We use Word2Vec and BERT models to encode clinical text given by the patients. We transform the output and re-frame the task as a multi-label classification problem. Finally, we combine the processed encodings with the Linear Discriminant Analysis (LDA) algorithm to classify the characterizations of the main entity. Experimental results demonstrate that our method achieves a 40-50% improvement in accuracy over state-of-the-art models.
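
The pipeline outlined in the abstract (encode the utterance, treat the characterizations as multiple labels, classify with LDA) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn and NumPy are available. The `embed` function is a hypothetical hashed bag-of-words placeholder standing in for BERT or Word2Vec encodings. The training sentences and the three label names are invented for illustration. The one-vs-rest wrapper and the shrinkage setting are assumptions made so that LDA behaves sensibly on a tiny toy set.

```python
import zlib

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

DIM = 32  # toy embedding size (assumption); real BERT encodings are much larger


def embed(text):
    """Hypothetical placeholder encoder (hashed bag of words), not BERT or Word2Vec."""
    vec = np.zeros(DIM)
    for token in text.lower().replace(",", " ").split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    return vec


# Invented utterances paired with the characterizations they mention.
TRAIN = [
    ("I have a headache and it is extreme", {"severity"}),
    ("my cough is mild", {"severity"}),
    ("severe pain in my knee", {"severity"}),
    ("I have had a fever since yesterday", {"time"}),
    ("my back has hurt for two weeks", {"time"}),
    ("the rash appeared three days ago", {"time"}),
    ("the dizziness started suddenly", {"onset"}),
    ("my chest pain came on gradually", {"onset"}),
    ("a sudden sharp earache", {"onset"}),
    ("extreme nausea since last night", {"severity", "time"}),
    ("mild swelling that began suddenly", {"severity", "onset"}),
    ("I have a sore throat", set()),
]

# Encode every utterance and turn the label sets into a binary indicator matrix.
X = np.stack([embed(text) for text, _ in TRAIN])
binarizer = MultiLabelBinarizer(classes=["onset", "severity", "time"])
Y = binarizer.fit_transform([labels for _, labels in TRAIN])

# One LDA classifier per label (one-vs-rest is an assumption of this sketch).
# Shrinkage keeps the covariance estimate stable when samples are scarce.
model = OneVsRestClassifier(
    LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
)
model.fit(X, Y)


def predict_characteristics(text):
    """Return a {label: bool} map of characterizations predicted for one utterance."""
    row = model.predict(embed(text).reshape(1, -1))[0]
    return {str(label): bool(flag) for label, flag in zip(binarizer.classes_, row)}


print(predict_characteristics("I have a headache and it is extreme"))
```

Swapping `embed` for a real sentence encoder would leave the rest of the sketch unchanged, since the classifier only sees fixed-length vectors.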
