A Flexible Database Architecture for Mining DICOM Objects: the DICOM Data Warehouse

Digital Imaging and Communications in Medicine (DICOM) has brought a very high level of standardization to medical images, allowing interoperability in many cases. However, challenges remain for the informaticist attempting to mine data from DICOM objects. Images (and other objects) from equipment of different vintages implement different levels of the standard, and proprietary “shadow” tags must also be taken into account. The database architecture described herein “flattens” such differences by compiling a knowledge base of specific DICOM implementations and mapping variable data elements to a common lexicon for subsequent queries. The project is open source, built on open infrastructure, and available on GitHub.
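
The mapping idea can be illustrated with a minimal sketch. It is not the project's actual schema: the table names, the vendor names, the common term `exposure_mas`, and the private tag (0019,1010) are all assumptions made for illustration. Only the standard tags (0018,1152) Exposure and (0008,0060) Modality come from the DICOM data dictionary. A knowledge-base table records which tag carries a given concept for each implementation, and a join lets one query on the common term reach objects from both vendors.

```python
import sqlite3

# Hypothetical knowledge base: for each implementation (vendor + version),
# which tag carries which common term. Vendor names and the private
# "shadow" tag (0019,1010) are invented for this sketch.
KNOWLEDGE_BASE = [
    ("VendorA", "1.0", "(0018,1152)", "exposure_mas"),  # standard Exposure tag
    ("VendorB", "2.3", "(0019,1010)", "exposure_mas"),  # made-up shadow tag
]


def make_db():
    """Create an in-memory database holding the knowledge base and raw elements."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tag_map ("
        "manufacturer TEXT, version TEXT, tag TEXT, common_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE element ("
        "object_id INTEGER, manufacturer TEXT, version TEXT, tag TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO tag_map VALUES (?, ?, ?, ?)", KNOWLEDGE_BASE)
    return conn


def store_object(conn, object_id, manufacturer, version, elements):
    """Store every data element of one object exactly as it was received."""
    conn.executemany(
        "INSERT INTO element VALUES (?, ?, ?, ?, ?)",
        [(object_id, manufacturer, version, tag, value)
         for tag, value in elements.items()],
    )


def query_common(conn, common_name):
    """Return (object_id, value) pairs for a common term, across implementations."""
    rows = conn.execute(
        "SELECT e.object_id, e.value "
        "FROM element AS e "
        "JOIN tag_map AS m "
        "  ON m.manufacturer = e.manufacturer "
        " AND m.version = e.version "
        " AND m.tag = e.tag "
        "WHERE m.common_name = ? "
        "ORDER BY e.object_id",
        (common_name,),
    )
    return rows.fetchall()
```

With one object stored per vendor, `query_common(conn, "exposure_mas")` returns a row for each, although the value sits under a different tag in each object. Keeping the raw elements untouched and placing the mapping in a separate table means the knowledge base can be extended without reloading the stored objects.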