Security and privacy requirements for a multi-institutional cancer research data grid: an interview-based study

BackgroundData protection is important for all information systems that deal with human-subjects data. Grid-based systems – such as the cancer Biomedical Informatics Grid (caBIG) – seek to develop new mechanisms to facilitate real-time federation of cancer-relevant data sources, including sources protected under a variety of regulatory laws, such as HIPAA and 21CFR11. These systems embody new models for data sharing, and hence pose new challenges to the regulatory community, and to those who would develop or adopt them. These challenges must be understood by both systems developers and system adopters. In this paper, we describe our work collecting policy statements, expectations, and requirements from regulatory decision makers at academic cancer centers in the United States. We use these statements to examine fundamental assumptions regarding data sharing using data federations and grid computing.MethodsAn interview-based study of key stakeholders from a sample of US cancer centers. Interviews were structured, and used an instrument that was developed for the purpose of this study. The instrument included a set of problem scenarios – difficult policy situations that were derived during a full-day discussion of potentially problematic issues by a set of project participants with diverse expertise. Each problem scenario included a set of open-ended questions that were designed to elucidate stakeholder opinions and concerns. Interviews were transcribed verbatim and used for both qualitative and quantitative analysis. For quantitative analysis, data was aggregated at the individual or institutional unit of analysis, depending on the specific interview question.ResultsThirty-one (31) individuals at six cancer centers were contacted to participate. Twenty-four out of thirty-one (24/31) individuals responded to our request- yielding a total response rate of 77%. Respondents included IRB directors and policy-makers, privacy and security officers, directors of offices of research, information security officers and university legal counsel. Nineteen total interviews were conducted over a period of 16 weeks. Respondents provided answers for all four scenarios (a total of 87 questions). Results were grouped by broad themes, including among others: governance, legal and financial issues, partnership agreements, de-identification, institutional technical infrastructure for security and privacy protection, training, risk management, auditing, IRB issues, and patient/subject consent.ConclusionThe findings suggest that with additional work, large scale federated sharing of data within a regulated environment is possible. A key challenge is developing suitable models for authentication and authorization practices within a federated environment. Authentication – the recognition and validation of a person's identity – is in fact a global property of such systems, while authorization – the permission to access data or resources – mimics data sharing agreements in being best served at a local level. Nine specific recommendations result from the work and are discussed in detail. These include: (1) the necessity to construct separate legal or corporate entities for governance of federated sharing initiatives on this scale; (2) consensus on the treatment of foreign and commercial partnerships; (3) the development of risk models and risk management processes; (4) development of technical infrastructure to support the credentialing process associated with research including human subjects; (5) exploring the feasibility of developing large-scale, federated honest broker approaches; (6) the development of suitable, federated identity provisioning processes to support federated authentication and authorization; (7) community development of requisite HIPAA and research ethics training modules by federation members; (8) the recognition of the need for central auditing requirements and authority, and; (9) use of two-protocol data exchange models where possible in the federation.

[1]  R. V. van Nieuwpoort,et al.  The Grid 2: Blueprint for a New Computing Infrastructure , 2003 .

[2]  Marianne M. Swanson,et al.  Standards for Security Categorization of Federal Information and Information Systems , 2004 .

[3]  Socrates H. Tuch Health Insurance Portability and Accountability Act of 1996. Public Law 104-191. , 1996, United States statutes at large.

[4]  Nikolaus Forgó,et al.  The ACGT ethical and legal requirements , 2007 .

[5]  Ian T. Foster,et al.  Globus Toolkit Version 4: Software for Service-Oriented Systems , 2005, Journal of Computer Science and Technology.

[6]  Ian T. Foster Globus Toolkit Version 4: Software for Service-Oriented Systems , 2005, NPC.

[7]  Carole A. Goble,et al.  Taverna: a tool for building and running workflows of services , 2006, Nucleic Acids Res..

[8]  Ann Zimmerman,et al.  The Biomedical Informatics Research Network , 2008 .

[9]  Lynn A. Karoly,et al.  Health Insurance Portability and Accountability Act of 1996 (HIPAA) Administrative Simplification , 2010, Practice Management Consultant.

[10]  Joel H. Saltz,et al.  Model Formulation: caGrid 1.0: An Enterprise Grid Infrastructure for Biomedical Research , 2008, J. Am. Medical Informatics Assoc..

[11]  Rajiv Dhir,et al.  A multidisciplinary approach to honest broker services for tissue banks and clinical data , 2008, Cancer.

[12]  Xiaoyang Sean Wang,et al.  Authorization in trust management: Features and foundations , 2008, CSUR.

[13]  Heather A. Piwowar,et al.  Towards a Data Sharing Culture: Recommendations for Leadership from Academic Health Centers , 2008, PLoS medicine.

[14]  Jorge Luis Rodriguez,et al.  The Open Science Grid , 2005 .

[15]  JRobert Beck,et al.  The Cancer Biomedical Informatics Grid (caBIGTM): Infrastructure and Applications for a Worldwide Research Community , 2007, MedInfo.