A methodology for interactive mining and visual analysis of clinical event patterns using electronic health record data

Patients' medical conditions often evolve in complex and seemingly unpredictable ways. Even within a relatively narrow and well-defined episode of care, variations between patients in both their progression and eventual outcome can be dramatic. Understanding the patterns of events observed within a population that most correlate with differences in outcome is therefore an important task in many types of studies using retrospective electronic health data. In this paper, we present a method for interactive pattern mining and analysis that supports ad hoc visual exploration of patterns mined from retrospective clinical patient data. Our approach combines (1) visual query capabilities to interactively specify episode definitions, (2) pattern mining techniques to help discover important intermediate events within an episode, and (3) interactive visualization techniques that help uncover event patterns that most impact outcome and how those associations change over time. In addition to presenting our methodology, we describe a prototype implementation and present use cases highlighting the types of insights or hypotheses that our approach can help uncover.
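As a rough illustration of steps (2) and (3), the sketch below mines simple ordered event pairs from a handful of episodes and compares outcome rates for patients with and without each pattern. It is not the mining technique used in the paper. The episode data, event names, the `mine_patterns` function, and the `min_support` threshold are all hypothetical, chosen only to show the general idea under these assumptions.

```python
from collections import Counter
from itertools import combinations


def ordered_pairs(events):
    """Return the set of (a, b) pairs where a occurs before b in the sequence."""
    return {(a, b) for a, b in combinations(events, 2) if a != b}


def mine_patterns(episodes, min_support=0.5):
    """Find ordered event pairs that meet a support threshold.

    `episodes` is a list of (event_sequence, outcome) tuples, with outcome a bool.
    Returns (pattern, support, outcome_rate_with, outcome_rate_without) tuples,
    sorted by descending support.
    """
    n = len(episodes)
    pair_sets = [(ordered_pairs(events), outcome) for events, outcome in episodes]

    # Count in how many episodes each ordered pair appears.
    counts = Counter()
    for pairs, _ in pair_sets:
        counts.update(pairs)

    results = []
    for pattern, count in counts.items():
        support = count / n
        if support < min_support:
            continue
        with_p = [outcome for pairs, outcome in pair_sets if pattern in pairs]
        without_p = [outcome for pairs, outcome in pair_sets if pattern not in pairs]
        rate_with = sum(with_p) / len(with_p)
        # No comparison group exists if every episode contains the pattern.
        rate_without = sum(without_p) / len(without_p) if without_p else None
        results.append((pattern, support, rate_with, rate_without))
    return sorted(results, key=lambda r: -r[1])


# Hypothetical episodes: event sequence plus a binary outcome (e.g. hospitalization).
episodes = [
    (["diagnosis", "diuretic", "admission"], True),
    (["diagnosis", "diuretic", "admission"], True),
    (["diagnosis", "beta_blocker"], False),
    (["diagnosis", "beta_blocker", "diuretic"], False),
]

results = mine_patterns(episodes)
for pattern, support, rate_with, rate_without in results:
    print(pattern, support, rate_with, rate_without)
```

A pattern whose outcome rate differs sharply between the two groups is a candidate for closer inspection. In an interactive setting, the episode definition and the support threshold would be the parameters a user adjusts.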
