Exploring Medical Data Classification with Three-Way Decision Trees

Uncertainty is intrinsic to clinical practice and manifests itself in a variety of forms. Despite the growing popularity of Machine Learning-based Decision Support Systems (ML-DSS) in the clinical domain, the effects of the uncertainty inherent in the medical data used to train and optimize these systems remain largely under-considered in both the Machine Learning and the health informatics communities. A particularly common type of uncertainty in clinical decision-making is the ambiguity that results from either a lack of decisive information (lack of evidence) or an excess of discordant information (lack of consensus). Both types of uncertainty give clinicians the opportunity to abstain from making a clear-cut classification of the phenomenon under observation. In this work, we study a Machine Learning model that can work directly with both of these sources of imperfect information. To investigate the possible trade-off between accuracy and the uncertainty introduced by the possibility of abstention, we evaluated the model against a variety of standard Machine Learning algorithms on a real-world clinical classification problem. We report promising results in terms of commonly used performance metrics.
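
The trade-off between accuracy and abstention mentioned above can be illustrated with a minimal sketch. This is not the model studied in this work: it is a generic threshold-based three-way rule (accept, reject, or abstain) applied to the output of any probabilistic classifier. The function names, the thresholds `alpha` and `beta`, and the toy probabilities and labels are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def three_way_decide(p_positive: float, alpha: float, beta: float) -> Optional[int]:
    """Return 1 (accept), 0 (reject), or None (abstain).

    alpha and beta are illustrative thresholds, not values from the paper.
    """
    if not 0.0 <= beta <= alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta <= alpha <= 1")
    if p_positive >= alpha:
        return 1
    if p_positive <= beta:
        return 0
    return None  # evidence is too weak either way: abstain


def coverage_and_accuracy(
    probs: List[float], labels: List[int], alpha: float, beta: float
) -> Tuple[float, float]:
    """Fraction of cases decided, and accuracy on the decided cases only."""
    decisions = [three_way_decide(p, alpha, beta) for p in probs]
    decided = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    coverage = len(decided) / len(labels)
    accuracy = (
        sum(d == y for d, y in decided) / len(decided) if decided else float("nan")
    )
    return coverage, accuracy


# Toy, made-up predicted probabilities and true labels.
probs = [0.95, 0.80, 0.55, 0.45, 0.20, 0.05, 0.65, 0.35]
labels = [1, 1, 0, 1, 0, 0, 1, 1]

# No abstention region: every case receives a label.
print(coverage_and_accuracy(probs, labels, alpha=0.5, beta=0.5))  # (1.0, 0.625)
# Wider abstention region: fewer cases decided, but fewer errors among them.
print(coverage_and_accuracy(probs, labels, alpha=0.7, beta=0.3))  # (0.5, 1.0)
```

On this toy data, widening the abstention region halves coverage while removing all errors among the decided cases. Real data need not behave so cleanly, which is why the trade-off has to be measured empirically.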
