SymptomGraph: Identifying Symptom Clusters from Narrative Clinical Notes using Graph Clustering

Patients with cancer or other chronic diseases often experience a range of symptoms before or after treatment. These symptoms may be physical, gastrointestinal, psychological, cognitive (such as memory loss), or of other types. Previous research has focused on understanding individual symptoms or symptom correlations by collecting data through symptom surveys and analyzing them with traditional statistical methods such as principal component analysis or factor analysis. This research proposes a computational system, SymptomGraph, to identify symptom clusters in the narrative text of clinical notes in electronic health records (EHR). SymptomGraph uses a set of natural language processing (NLP) and artificial intelligence (AI) methods to first extract the clinician-documented symptoms from clinical notes. Then, a semantic symptom expression clustering method is used to discover a set of typical symptoms. A symptom graph is built based on the co-occurrences of the symptoms. Finally, a graph clustering algorithm is developed to discover the symptom clusters. Although SymptomGraph is applied to narrative clinical notes, it can be adapted to analyze symptom survey data. We applied SymptomGraph to a data set of colorectal cancer patients with and without type 2 diabetes to detect patient symptom clusters one year after chemotherapy. Our results show that SymptomGraph can identify the typical symptom clusters of colorectal cancer patients post-chemotherapy. The results also show that colorectal cancer patients with diabetes often show more symptoms of peripheral neuropathy, younger patients show mental dysfunctions related to alcohol or tobacco abuse, and patients at later cancer stages show more memory loss symptoms. Our system can be generalized to extract and analyze symptom clusters of other chronic diseases or of acute diseases such as COVID-19.
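
The last two stages of the pipeline described above (building a symptom graph from co-occurrences, then clustering it) can be illustrated with a minimal sketch. It assumes that symptoms have already been extracted and normalized into one list per note. The function names, the toy notes, and the `min_weight` threshold are hypothetical, and the clustering step is a simple stand-in (connected components over thresholded edges), not the graph clustering algorithm developed for SymptomGraph.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_graph(notes):
    """Count how many notes mention each unordered pair of symptoms."""
    weights = Counter()
    for symptoms in notes:
        # Deduplicate within a note and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(symptoms)), 2):
            weights[(a, b)] += 1
    return weights


def cluster_symptoms(weights, min_weight=2):
    """Stand-in clustering: connected components over edges with weight >= min_weight."""
    adjacency = defaultdict(set)
    for (a, b), w in weights.items():
        if w >= min_weight:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects every symptom reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical toy input: one list of extracted symptoms per clinical note.
notes = [
    ["fatigue", "insomnia", "anxiety"],
    ["fatigue", "insomnia"],
    ["numbness", "tingling"],
    ["numbness", "tingling", "fatigue"],
    ["anxiety", "insomnia"],
]

weights = build_cooccurrence_graph(notes)
clusters = cluster_symptoms(weights, min_weight=2)
print(clusters)
# → [['anxiety', 'fatigue', 'insomnia'], ['numbness', 'tingling']]
```

Thresholding on raw co-occurrence counts keeps the sketch dependency-free; a real analysis would more likely weight edges by a normalized association measure and apply a community detection method.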
