Healthcare organizations are undergoing tremendous change in analytical computing. The vehicle for this change is the implementation of a Business Intelligence (BI) platform, which includes On-Line Analytical Processing (OLAP) cubes, data visualization tools, and an integrated data warehouse (DW). Extracting sensitive data (Personally Identifiable Information, PII; Personal Health Information, PHI) from operational databases and saving it in a consolidated central repository to facilitate efficient analysis is the best solution. This solution provides a good understanding of overall business operational performance and trends, and improves business processes through statistical and data mining methods. Nonetheless, this huge DW constitutes one of the most serious privacy breach threats a healthcare organization can face when many internal users with different security levels have access to the BI components. Data masking techniques are used to minimize the risk of inadvertent disclosure of sensitive data while preserving the basic quality of data analytics (data utility). However, traditional masking methods fail to maintain the utility of data analysis for reporting and research purposes. This paper proposes a practical classification component for a built-in data masking framework (IMETU: Identify, Map, Execute, Test, and Utilize), focusing on the first two modules. The analyzed health data attributes are based on the Discharge Abstract Database (DAD, Acute Inpatient in Canada). The proposed component identifies the sensitive data and selects the best masking format to provide more data privacy and protection at rest (i.e., the data can still be accurately drilled down to a monthly aggregated level).
This module allows sensitive data attributes to be used safely within the healthcare data warehouse, in compliance with privacy regulatory requirements, by mapping them to the appropriate irreversible or reversible masking techniques. Native encryption techniques are avoided because of their complex calculations and increased storage requirements.
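As an illustration of the Identify and Map steps described above, the following is a minimal Python sketch, not the paper's implementation. The attribute names, the sensitivity classes, the salt value, and the choice of a salted hash and month-level date truncation are all assumptions made for the example; they are not taken from the DAD specification or from the IMETU framework itself.

```python
import hashlib
from datetime import date

# Identify step (assumed): hypothetical attribute names and sensitivity classes.
CLASSIFICATION = {
    "health_card_number": "direct_identifier",
    "admission_date": "quasi_identifier",
    "length_of_stay_days": "non_sensitive",
}


def mask_irreversible(value, salt="demo-salt"):
    """One-way mask: salted SHA-256 digest, truncated to 16 hex characters."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:16]


def mask_date_to_month(d):
    """Generalize a date to the first day of its month, so monthly
    aggregation still works on the masked value."""
    return d.replace(day=1)


def keep(value):
    """Pass non-sensitive values through unchanged."""
    return value


# Map step (assumed): each sensitivity class is mapped to one masking format.
MASKING_MAP = {
    "direct_identifier": mask_irreversible,
    "quasi_identifier": mask_date_to_month,
    "non_sensitive": keep,
}


def mask_record(record):
    """Mask every attribute of a record according to its class.
    Unclassified attributes are treated as direct identifiers."""
    masked = {}
    for name, value in record.items():
        sensitivity = CLASSIFICATION.get(name, "direct_identifier")
        masked[name] = MASKING_MAP[sensitivity](value)
    return masked


record = {
    "health_card_number": "1234567890",
    "admission_date": date(2013, 5, 17),
    "length_of_stay_days": 4,
}
masked_record = mask_record(record)
```

In this sketch the admission date becomes `date(2013, 5, 1)`, so reports grouped by month are unaffected, while the identifier is replaced by a digest that cannot be reversed without guessing the input. A real deployment would need a secret salt or key, and a reversible technique for attributes that must be recoverable by authorized users.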