Towards a real-time big data analytics platform for health applications

We established a framework for building a big data analytics (BDA) platform using real volumes of health big data. An existing high-performance computing (HPC) architecture was utilised with HBase (a NoSQL database) and Hadoop (HDFS). The NoSQL database was generated by emulating the metadata and inpatient profiles of the Vancouver Island Health Authority's hospital system. Ingestion required special adjustments to Hadoop's ecosystem and to HBase, including the addition of 'salt buckets'. Results revealed that HBase took a week to generate ∼10 TB of data for one billion records via ingestion. Hadoop ingestion into HBase took only three seconds. Both simple and complex queries completed in less than two seconds, and all queries produced accurate patient data results. We discuss the data migration performance requirements of our BDA platform, which can capture large volumes of data while reducing data retrieval times, and their linkages to innovative processes and configurations that met patient data security/privacy standards.
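
The 'salt buckets' mentioned above can be illustrated with a minimal sketch. It assumes the general idea of key salting: a short prefix derived from a hash of the row key is prepended so that sequential keys spread across several buckets instead of being written to one region. The bucket count, the key format, and the function names `salt_bucket` and `salted_key` are hypothetical and are not taken from the platform described here.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 10  # hypothetical value; the source does not state a bucket count


def salt_bucket(row_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a row key to a bucket number using a stable hash of the key."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % num_buckets


def salted_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Prefix the row key with its zero-padded bucket number."""
    return f"{salt_bucket(row_key, num_buckets):02d}|{row_key}"


if __name__ == "__main__":
    # Sequential keys (an assumed format) would otherwise sort next to each other.
    keys = [f"encounter-{i:09d}" for i in range(10_000)]
    counts = Counter(salt_bucket(k) for k in keys)
    for bucket in sorted(counts):
        print(bucket, counts[bucket])
```

Because the salt is computed from the key itself, a reader who knows the original key can recompute the prefix and fetch the row directly; the trade-off is that a range scan over the original key order has to visit every bucket.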