Analysis of a Sequence of Events in Healthcare

The healthcare industry generates streams of data across many problem domains. Extracting useful insights from such data requires stream analytics tools and techniques. Stream analytics involves the analysis of time-variant events, and specific patterns in those events can signal imminent outcomes, such as the state of a patient's heart. Novel ways of finding specific patterns in events generated by multiple sources are therefore required. A key prerequisite for applying any such method is preparing and organizing the data so that this analysis is possible. In this paper, we extend the CRISP-DM process to include data preparation approaches for sequence mining. We present progression analysis, an approach for converting multidimensional, time-variant streams of health records into a form in which useful sequential signals can be detected. To illustrate the process, we use patient health histories stored in an Electronic Medical Record (EMR) system and present a healthcare application that compares the progression of diseases over time between patients diagnosed with Tobacco Use Disorder (TUD) and non-tobacco users. Interestingly, many diseases follow the same path for TUD and non-TUD patients. Finally, we discuss the generalizability of progression analysis.
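The kind of data preparation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the toy rows, and the helper names `build_sequences` and `transition_counts` are all assumptions made for illustration. The sketch orders each patient's diagnoses by date, keeps the first occurrence of each diagnosis, and counts consecutive diagnosis pairs per cohort so that paths shared by TUD and non-TUD patients can be compared.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical EMR rows: (patient_id, visit_date, diagnosis, cohort).
# The values are invented toy data, not results from the paper.
records = [
    ("p1", date(2015, 1, 10), "hypertension", "TUD"),
    ("p1", date(2016, 3, 2), "diabetes", "TUD"),
    ("p1", date(2016, 9, 5), "hypertension", "TUD"),
    ("p2", date(2014, 6, 1), "hypertension", "non-TUD"),
    ("p2", date(2015, 2, 20), "diabetes", "non-TUD"),
    ("p3", date(2013, 4, 4), "asthma", "TUD"),
    ("p3", date(2014, 8, 9), "hypertension", "TUD"),
]


def build_sequences(rows):
    """Turn time-stamped rows into one ordered diagnosis sequence per patient."""
    by_patient = defaultdict(list)
    cohort_of = {}
    for pid, when, dx, cohort in rows:
        by_patient[pid].append((when, dx))
        cohort_of[pid] = cohort

    sequences = {}
    for pid, events in by_patient.items():
        seen, ordered = set(), []
        # Sort by date, then keep only the first occurrence of each diagnosis.
        for _, dx in sorted(events):
            if dx not in seen:
                seen.add(dx)
                ordered.append(dx)
        sequences[pid] = ordered
    return sequences, cohort_of


def transition_counts(sequences, cohort_of):
    """Count consecutive diagnosis pairs (a -> b) separately for each cohort."""
    counts = defaultdict(Counter)
    for pid, seq in sequences.items():
        for a, b in zip(seq, seq[1:]):
            counts[cohort_of[pid]][(a, b)] += 1
    return counts


sequences, cohort_of = build_sequences(records)
counts = transition_counts(sequences, cohort_of)

# Transitions observed in both cohorts, i.e. paths the two groups share.
shared = set(counts["TUD"]) & set(counts["non-TUD"])
print(shared)
```

Keeping only the first occurrence of each diagnosis is one possible design choice; a real pipeline might instead retain repeat visits or group events into time windows before mining sequences.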
