Challenges in Managing Real-Time Data in Health Information System (HIS)

In this paper, we discuss the challenges of collecting and storing real-time medical big data in health information systems (HIS). Based on these challenges, we propose a model for real-time analysis of medical big data. We exemplify the approach with Spark Streaming and Apache Kafka, using them to process a stream of health big data. Apache Kafka works very well for transporting data among different systems such as relational databases, Apache Hadoop, and non-relational databases. However, Apache Kafka lacks the ability to analyze the stream, whereas the Spark Streaming framework can perform some operations on it. We identify the challenges in current real-time systems and propose our solution for coping with medical big data streams.
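
The division of labor described above, a transport layer that delivers readings and a stream processor that analyzes them in small batches, can be illustrated with a minimal sketch. It is a pure-Python stand-in, not the Spark Streaming or Apache Kafka API and not the model proposed in the paper. The `Reading` schema, the 2-second batch interval, the 120 bpm alert threshold, and all function names are assumptions made only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class Reading(NamedTuple):
    """One message as it might arrive from the transport layer (hypothetical schema)."""
    timestamp: float   # seconds since the start of the stream
    patient_id: str
    heart_rate: int    # beats per minute


def micro_batches(stream: Iterable[Reading], interval: float) -> Iterator[List[Reading]]:
    """Group a time-ordered stream into consecutive batches of `interval` seconds."""
    batch: List[Reading] = []
    batch_end = interval
    for reading in stream:
        # Close every batch window that ended before this reading arrived.
        while reading.timestamp >= batch_end:
            yield batch
            batch = []
            batch_end += interval
        batch.append(reading)
    if batch:
        yield batch


def mean_heart_rate(batch: List[Reading]) -> Dict[str, float]:
    """Per-patient mean heart rate within one batch."""
    by_patient: Dict[str, List[int]] = defaultdict(list)
    for reading in batch:
        by_patient[reading.patient_id].append(reading.heart_rate)
    return {pid: sum(values) / len(values) for pid, values in by_patient.items()}


def process(
    stream: Iterable[Reading],
    interval: float = 2.0,        # assumed batch interval, in seconds
    alert_above: float = 120.0,   # assumed alert threshold, in bpm
) -> List[Tuple[Dict[str, float], List[str]]]:
    """Return (per-patient means, patients over the threshold) for each batch."""
    results = []
    for batch in micro_batches(stream, interval):
        means = mean_heart_rate(batch)
        alerts = sorted(pid for pid, mean in means.items() if mean > alert_above)
        results.append((means, alerts))
    return results


# Simulated readings standing in for messages consumed from a topic.
SAMPLE = [
    Reading(0.1, "p1", 80), Reading(0.5, "p2", 130),
    Reading(1.2, "p1", 84), Reading(1.9, "p2", 126),
    Reading(2.3, "p1", 90), Reading(5.0, "p2", 100),
]

if __name__ == "__main__":
    for means, alerts in process(SAMPLE):
        print(means, alerts)
```

In a real deployment the `SAMPLE` list would be replaced by a consumer reading from the transport layer, and the per-batch function would run on a cluster rather than in a single process.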
