DICOM re‐encoding of volumetrically annotated Lung Imaging Database Consortium (LIDC) nodules

Purpose: The dataset contains annotations for lung nodules collected by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC‐IDRI), stored as standard DICOM objects. The annotations accompany a collection of computed tomography (CT) scans for over 1000 subjects annotated by multiple expert readers, and correspond to “nodules ≥ 3 mm”, defined as any lesion considered to be a nodule with greatest in‐plane dimension in the range 3–30 mm regardless of presumed histology. The present dataset aims to simplify reuse of the data with readily available tools, and is targeted towards researchers interested in the analysis of lung CT images.

Acquisition and validation methods: Open source tools were used to parse the project‐specific XML representation of LIDC‐IDRI annotations and save the result as standard DICOM objects. Validation procedures focused on establishing compliance of the resulting objects with the standard, consistency of the data between the DICOM and project‐specific representations, and interoperability with existing tools.

Data format and usage notes: The dataset uses DICOM Segmentation objects for storing annotations of the lung nodules, and DICOM Structured Reporting objects for communicating qualitative evaluations (nine attributes) and quantitative measurements (three attributes) associated with the nodules. A total of 875 subjects contain 6859 nodule annotations. Clustering of the neighboring annotations resulted in 2651 distinct nodules. The data are available in TCIA at https://doi.org/10.7937/TCIA.2018.h7umfurq.

Potential applications: The standardized dataset maintains the content of the original contribution of the LIDC‐IDRI consortium, and should be helpful in developing automated tools for characterization of lung lesions and image phenotyping. In addition, the standardized representation makes the dataset more FAIR (Findable, Accessible, Interoperable, Reusable) for the research community, and enables its integration with other standardized data collections.
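
To make the relationship between per‐reader annotations and distinct nodules concrete, the following is a minimal Python sketch. It is not the tooling used to build the dataset. The record layout (`Annotation`, `subject_id`, `nodule_id`, `in_plane_mm`), the helper names, and the sample values are all hypothetical, and the cluster label is assumed to be already known rather than derived from the segmentations. Treating the 3–30 mm bounds as inclusive is also an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One reader's annotation of a nodule (hypothetical record layout)."""
    subject_id: str     # hypothetical subject identifier
    nodule_id: int      # hypothetical cluster label shared by annotations of one nodule
    in_plane_mm: float  # greatest in-plane dimension, in millimetres


def is_nodule_ge_3mm(in_plane_mm: float) -> bool:
    """Size range quoted in the abstract; inclusive bounds are an assumption."""
    return 3.0 <= in_plane_mm <= 30.0


def group_by_nodule(annotations):
    """Map (subject_id, nodule_id) to the in-range annotations of that nodule."""
    groups = defaultdict(list)
    for ann in annotations:
        if is_nodule_ge_3mm(ann.in_plane_mm):
            groups[(ann.subject_id, ann.nodule_id)].append(ann)
    return dict(groups)


# Illustrative values only; these are not taken from the dataset.
annotations = [
    Annotation("subject-A", 1, 7.5),
    Annotation("subject-A", 1, 8.1),   # second reader, same nodule
    Annotation("subject-A", 2, 12.0),
    Annotation("subject-B", 1, 2.0),   # outside the 3-30 mm range, excluded
]

nodules = group_by_nodule(annotations)
n_annotations = sum(len(group) for group in nodules.values())
print(f"{len(nodules)} distinct nodules from {n_annotations} annotations")
```

In the dataset itself, the same many‐to‐one relationship is what reduces 6859 annotations to 2651 distinct nodules.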
