A patient-similarity-based model for diagnostic prediction

OBJECTIVE To simulate the clinical reasoning of doctors by automatically retrieving patients analogous to an index patient and predicting diagnoses from the similar and dissimilar patients. METHODS We propose a novel patient-similarity-based framework for diagnostic prediction, inspired by the structure-mapping theory of analogical reasoning in psychology. Patient similarity is defined as the similarity between two patients' diagnosis sets rather than as a dichotomous outcome (absence/presence of a single disease). The multilabel classification problem is converted to a single-value regression problem by integrating the clinical features of a pair of patients into one vector, taking that vector as the input and the patient similarity as the output. In contrast to the common k-NN method, which considers only the nearest neighbors, we use similar patients (positive analogy) to generate diagnostic hypotheses and dissimilar patients (negative analogy) to reject diagnostic hypotheses. RESULTS The patient-similarity-based models perform better than the one-vs-all baseline and traditional k-NN methods. The F1 score of positive-analogy-based prediction is 0.698, significantly higher than the baseline scores, which range from 0.368 to 0.661. It increases to 0.703 when the negative analogy method is applied to modify the prediction results of positive analogy. The method appears highly promising for larger datasets. CONCLUSION The patient-similarity-based model provides diagnostic decision support that is more accurate, generalizable, and interpretable than that of previous methods, while working with heterogeneous and incomplete data. The model also serves as a new application of clinical big data through artificial intelligence technology.
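The pipeline outlined in METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the set-similarity measure, the way pairwise features are combined, or the rule for rejecting hypotheses, so the Jaccard index, the feature concatenation, and the `k` and `reject_below` parameters below are all assumptions, and every function name is hypothetical.

```python
from typing import Dict, List, Sequence, Set


def diagnosis_similarity(a: Set[str], b: Set[str]) -> float:
    """Similarity of two diagnosis sets (Jaccard index; an assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pair_vector(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Combine two patients' features into one regression input.

    Plain concatenation is an assumption; a regressor would be trained to map
    this vector to diagnosis_similarity of the same pair.
    """
    return list(x) + list(y)


def predict_diagnoses(
    similarities: Dict[str, float],
    diagnoses: Dict[str, Set[str]],
    k: int = 2,
    reject_below: float = 0.1,
) -> Set[str]:
    """Predict an index patient's diagnoses from predicted similarities.

    similarities maps each reference patient id to its predicted similarity
    with the index patient; diagnoses maps the same ids to known diagnoses.
    """
    ranked = sorted(similarities, key=similarities.get, reverse=True)

    # Positive analogy: the k most similar patients propose hypotheses.
    hypotheses: Set[str] = set()
    for pid in ranked[:k]:
        hypotheses |= diagnoses[pid]

    # Negative analogy: clearly dissimilar patients veto their own diagnoses.
    for pid in ranked[k:]:
        if similarities[pid] < reject_below:
            hypotheses -= diagnoses[pid]
    return hypotheses


reference_diagnoses = {
    "p1": {"diabetes", "hypertension"},
    "p2": {"diabetes"},
    "p3": {"hypertension", "asthma"},
}
predicted_similarity = {"p1": 0.9, "p2": 0.7, "p3": 0.05}
predicted = predict_diagnoses(predicted_similarity, reference_diagnoses)
```

In this toy example the two most similar patients propose diabetes and hypertension, and the dissimilar patient `p3` removes hypertension, leaving diabetes as the only predicted diagnosis. In a full system the similarity values would come from the trained regressor rather than being supplied by hand.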
