Machine Learning Models Disclosure from Trusted Research Environments (TRE), Challenges and Opportunities

Trusted Research Environments (TREs) are safe and secure environments in which researchers can access sensitive data. As medical data such as Electronic Health Records (EHR), medical imaging and genomic data grow in volume and diversity, the use of Artificial Intelligence (AI) in general, and of its subfield Machine Learning (ML) in particular, is increasing in the healthcare domain. This creates demand to disclose new types of outputs from TREs, such as trained machine learning models. Although specific guidelines and policies exist for statistical disclosure control in TREs, they do not satisfactorily cover these new types of output requests. In this paper, we define some of the challenges around the application and disclosure of machine learning for healthcare within TREs. We describe various vulnerabilities that the introduction of AI brings to TREs. We also introduce the different types and levels of risk associated with the disclosure of trained ML models. Finally, we describe the new research opportunities in developing and adapting policies and tools for safely disclosing machine learning outputs from TREs.
