Image Data Sharing for Biomedical Research—Meeting HIPAA Requirements for De-identification

Data sharing is increasingly recognized as critical to cross-disciplinary research and to assuring scientific validity. Although National Institutes of Health and National Science Foundation policies encourage grantees to share data, little clinical data has in fact been shared. A principal reason often given is the risk of inadvertently violating the Health Insurance Portability and Accountability Act (HIPAA) privacy regulations. While the regulations specify the components of private health information that should be protected, there are no commonly accepted methods for de-identifying clinical data objects such as images. This leads institutions to take conservative, risk-averse positions on data sharing. In imaging trials, where images are coded according to the Digital Imaging and Communications in Medicine (DICOM) standard, the complexity of the data objects and the flexibility of the standard have made it especially difficult to meet privacy protection objectives. The recent release of DICOM Supplement 142 on image de-identification has removed much of this impediment. This article describes the development of an open-source software suite that implements DICOM Supplement 142 as part of the National Biomedical Imaging Archive (NBIA). It also describes the lessons the authors learned as NBIA acquired more than 20 image collections encompassing over 30 million images.