On Sharing Models Instead of Data Using Mimic Learning for Smart Health Applications

Electronic health record (EHR) systems contain vast amounts of medical information about patients. These data can be used to train machine learning models that predict health status and help prevent future diseases or disabilities. However, obtaining patients' medical data to train such models is challenging, because sharing patients' medical records is prohibited by law in most countries due to patient privacy concerns. In this paper, we tackle this problem by sharing models instead of the original sensitive data, using the mimic learning approach. The idea is first to train a model, called the teacher model, on the original sensitive data. The teacher's knowledge is then transferred to another model, called the student model, which never needs access to the original data used to train the teacher. The student model is then shared with the public and can be used to make accurate predictions. To assess the mimic learning approach, we evaluated our scheme on different medical datasets. The results indicate that the student model matches the teacher model's prediction accuracy without needing access to the patients' original data records.
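The teacher and student roles described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scheme evaluated in the paper: the synthetic two-class data, the `NearestCentroid` classifier, and the use of an unlabeled public pool that the teacher annotates are all hypothetical choices made here to show one common way knowledge can be transferred without the student seeing the private records.

```python
import random


def make_data(rng, n):
    """Synthetic two-class points: class 0 near (0, 0), class 1 near (3, 3)."""
    xs, ys = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 3.0 * label
        xs.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        ys.append(label)
    return xs, ys


class NearestCentroid:
    """Toy classifier (an assumption for this sketch): predict the closest class mean."""

    def fit(self, xs, ys):
        self.centroids = {}
        for label in set(ys):
            pts = [x for x, y in zip(xs, ys) if y == label]
            self.centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
        return self

    def predict(self, xs):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda lab: dist2(x, self.centroids[lab]))
            for x in xs
        ]


def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


rng = random.Random(0)
private_x, private_y = make_data(rng, 500)  # stands in for sensitive records
public_x, _ = make_data(rng, 500)           # public pool; true labels are discarded
test_x, test_y = make_data(rng, 200)        # held-out evaluation set

# Step 1: the teacher is trained on the private data and never leaves the data owner.
teacher = NearestCentroid().fit(private_x, private_y)

# Step 2: the teacher annotates the public, unlabeled pool.
pseudo_labels = teacher.predict(public_x)

# Step 3: the student learns only from public points and teacher-provided labels.
student = NearestCentroid().fit(public_x, pseudo_labels)

teacher_acc = accuracy(teacher.predict(test_x), test_y)
student_acc = accuracy(student.predict(test_x), test_y)
agreement = accuracy(student.predict(test_x), teacher.predict(test_x))
print(f"teacher={teacher_acc:.3f} student={student_acc:.3f} agreement={agreement:.3f}")
```

Only `student` would be released in this sketch. Comparing `student_acc` against `teacher_acc` on the held-out set mirrors the kind of accuracy comparison the abstract reports, while `agreement` measures how closely the student reproduces the teacher's decisions.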
