Patient similarity: methods and applications

Patient similarity analysis is important in health care applications. It takes patient information, such as electronic medical records and genetic data, as input and computes the pairwise similarity between patients. A typical patient similarity study can be divided into several steps, including data integration, similarity measurement, and neighborhood identification. Based on a patient similarity analysis, doctors can more readily identify suitable treatments for a given patient. Many methods can be used to analyze similarity, such as cluster analysis. As machine learning has become increasingly popular, using neural networks such as convolutional neural networks (CNNs) has become an active research topic. This review summarizes representative methods used in each step and discusses applications of patient similarity networks, especially in the context of precision medicine.
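
As a rough illustration of the similarity measurement and neighborhood identification steps, the following minimal sketch computes pairwise similarity between patients and picks the most similar ones. It is not a method taken from the review: cosine similarity is used here only as one common choice, the feature matrix is hypothetical toy data standing in for already integrated patient records, and the function names `pairwise_cosine_similarity` and `nearest_neighbors` are invented for this example.

```python
import numpy as np


def pairwise_cosine_similarity(features):
    """Return the n x n cosine similarity matrix for n patient feature vectors."""
    X = np.asarray(features, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # avoid division by zero for all-zero rows
    unit = X / norms
    return unit @ unit.T


def nearest_neighbors(similarity, index, k):
    """Return the indices of the k patients most similar to patient `index`."""
    scores = similarity[index].copy()
    scores[index] = -np.inf  # exclude the patient itself
    order = np.argsort(-scores, kind="stable")
    return order[:k].tolist()


# Hypothetical toy data: rows are patients, columns are integrated features
# (assumed to be numeric and already on comparable scales).
patients = np.array([
    [1.0, 0.0, 2.0],
    [1.0, 0.1, 2.1],
    [0.0, 3.0, 0.0],
    [0.1, 2.9, 0.2],
])

sim = pairwise_cosine_similarity(patients)
neighbors_of_0 = nearest_neighbors(sim, index=0, k=1)
neighbors_of_2 = nearest_neighbors(sim, index=2, k=1)
```

In practice the similarity measure would be chosen to suit the data (for example, a learned metric for heterogeneous records), and the resulting neighborhood could then inform a downstream task such as treatment recommendation.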
