Comparing machine learning clustering with latent class analysis on cancer symptoms' data

Symptom Cluster Research is a major topic in Cancer Symptom Science. Despite the many statistical and clinical approaches used in this domain, there is no consensus on which method performs best. Identifying a generally accepted analytical method is important for making full use of the available data. In this paper we report a secondary analysis of cancer symptom data in which we compare the performance of five Machine Learning (ML) clustering algorithms. Based on how well they separate specific subsets of symptom measurements, we select the best-performing algorithm and compare its performance with that of Latent Class Analysis (LCA). This analysis is part of an ongoing study to identify suitable Machine Learning algorithms for analysing and predicting cancer symptoms in cancer treatment.
