PRELIMINARY DATA ANALYSIS IN HEALTHCARE MULTICENTRIC DATA MINING: A PRIVACY-PRESERVING DISTRIBUTED APPROACH

The new era of cognitive health care systems offers a large amount of patient data that can be used to develop prediction models and clinical decision support systems. In this context, a multi-institutional approach is strongly encouraged in order to obtain larger samples for data mining and more reliable statistics. For these purposes, shared ontologies need to be developed for data management, to ensure the semantic coherence of the databases in accordance with the ethical and legal policies of the various centers. We therefore propose a privacy-preserving distributed approach as a preliminary data analysis tool, used before training a clinical prediction model, to identify possible non-compliance with the agreed multi-institutional research protocol and heterogeneity among the centers' data. This kind of preliminary analysis proved fast and reliable, and its results corresponded to those obtained using the traditional centralized approach. A real-time interactive dashboard is also presented to show the analysis results and make the workflow swifter and easier.
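The correspondence between distributed and centralized results can be illustrated with a minimal sketch. It is not the method of the paper, only one simple way such a preliminary analysis could work: each center computes aggregate statistics locally, and a coordinator combines them without ever seeing patient-level records. The names `LocalSummary`, `summarize`, `pooled_mean_std`, the center labels, and the example values are all hypothetical.

```python
import statistics
from dataclasses import dataclass
from math import sqrt


@dataclass
class LocalSummary:
    """Aggregates that leave a center; no patient-level values are included."""
    n: int
    total: float
    total_sq: float


def summarize(values):
    """Runs inside one center on its own data (hypothetical helper)."""
    return LocalSummary(len(values), sum(values), sum(v * v for v in values))


def pooled_mean_std(summaries):
    """Runs at the coordinator: combines per-center aggregates only."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from sums; clamp tiny negative rounding errors to zero.
    var = max((total_sq - n * mean * mean) / (n - 1), 0.0)
    return mean, sqrt(var)


# Made-up values of one numeric variable at three hypothetical centers.
centers = {
    "center_a": [61.0, 70.0, 58.0, 66.0],
    "center_b": [72.0, 64.0, 69.0],
    "center_c": [55.0, 63.0, 60.0, 67.0, 71.0],
}

summaries = {name: summarize(vals) for name, vals in centers.items()}
mean, std = pooled_mean_std(list(summaries.values()))

# Per-center means hint at heterogeneity between sites.
center_means = {name: s.total / s.n for name, s in summaries.items()}

# Centralized reference, computed here only to check the distributed result.
pooled = [v for vals in centers.values() for v in vals]
central_mean = statistics.mean(pooled)
central_std = statistics.stdev(pooled)
```

In this sketch the distributed mean and standard deviation agree with the centralized ones up to floating-point rounding, because both are functions of the same sums. Comparing `center_means` across sites is one simple way to flag heterogeneity before model training.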
