Using topological data analysis and pseudo time series to infer temporal phenotypes from electronic health records

Highlights
• Topological data analysis and pseudo time series are used to discover temporal phenotypes of type 2 diabetes.
• Temporal phenotypes are inferred from a state-space model based on hidden-state transitions.
• Continuous transitions between states are presented visually in an easily explainable way.
• The mined phenotypes are characterized by significant differences in disease deterioration.
