Experiences with mining temporal event sequences from electronic medical records: initial successes and some challenges

The standardization and wider use of electronic medical records (EMRs) create opportunities to better understand patterns of illness and care within and across medical systems. Our interest is in the temporal history of event codes embedded in patients' records; specifically, we investigate sequences of event codes that occur frequently across patients. In studying data from more than 1.6 million patient histories at the University of Michigan Health System, we quickly realized that frequent sequences, while providing one level of data reduction, still constitute a serious analytical challenge because many involve alternate serializations of the same sets of codes. To analyze these sequences further, we designed an approach in which a partial order is mined from frequent sequences of codes. We demonstrate an EMR mining system called EMRView that enables exploration of the precedence relationships to quickly identify and visualize partial order information encoded in key classes of patients. We present some important insights gained through our approach and outline key challenges for future research based on our experiences.
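
To make the idea of mining a partial order from alternate serializations concrete, here is a minimal sketch. It is an illustration only and not necessarily the algorithm used in EMRView: it assumes every input sequence is a permutation of the same set of codes, and it keeps a precedence pair only when that pair holds in every sequence. The function name `common_partial_order` and the codes `A` through `D` are hypothetical.

```python
from itertools import combinations


def common_partial_order(serializations):
    """Return the pairs (a, b) such that a precedes b in every serialization.

    Assumption for this sketch: each serialization is a permutation of the
    same set of distinct codes. Pairs ordered differently in different
    serializations are left unordered, which is what makes the result a
    partial order rather than a total one.
    """
    if not serializations:
        return set()

    codes = set(serializations[0])
    for seq in serializations:
        if len(seq) != len(codes) or set(seq) != codes:
            raise ValueError("all serializations must permute the same codes")

    # Position of each code within each serialization.
    positions = [{code: i for i, code in enumerate(seq)} for seq in serializations]

    order = set()
    for a, b in combinations(sorted(codes), 2):
        if all(pos[a] < pos[b] for pos in positions):
            order.add((a, b))
        elif all(pos[b] < pos[a] for pos in positions):
            order.add((b, a))
    return order


# Two frequent sequences over the same four hypothetical event codes.
frequent = [["A", "B", "C", "D"], ["A", "C", "B", "D"]]
relations = common_partial_order(frequent)
```

In this example `A` precedes every other code and `D` follows every other code, while `B` and `C` stay unordered because the two sequences disagree about them. Collapsing both sequences into that single partial order is the kind of reduction the abstract describes.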
