Chest ImaGenome Dataset for Clinical Reasoning

Despite the progress in automatic detection of radiologic findings from chest Xray (CXR) images in recent years, a quantitative evaluation of the explainability of these models is hampered by the lack of locally labeled datasets for different findings. With the exception of a few expert-labeled small-scale datasets for specific findings, such as pneumonia and pneumothorax, most of the CXR deep learning models to date are trained on global "weak" labels extracted from text reports, or trained via a joint image and unstructured text learning strategy. Inspired by the Visual Genome effort in the computer vision community, we constructed the first Chest ImaGenome dataset with a scene graph data structure to describe 242, 072 images. Local annotations are automatically produced using a joint rule-based natural language processing (NLP) and atlas-based bounding box detection pipeline. Through a radiologist constructed CXR ontology, the annotations for each CXR are connected as an anatomy-centered scene graph, useful for image-level reasoning and multimodal fusion applications. Overall, we provide: i) 1, 256 combinations of relation annotations between 29 CXR anatomical locations (objects with bounding box coordinates) and their attributes, structured as a scene graph per image, ii) over 670, 000 localized comparison relations (for improved, worsened, or no change) between the anatomical locations across sequential exams, as well as ii) a manually annotated gold standard scene graph dataset from 500 unique patients.

[1]  Neal Lewis,et al.  SPOT the Drug! An Unsupervised Pattern Matching Method to Extract Drug Names from Very Large Clinical Corpora , 2012, 2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology.

[2]  Samy Bengio,et al.  Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Ronald M. Summers,et al.  TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-Rays , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4]  R. Chaisson,et al.  Official American Thoracic Society/Centers for Disease Control and Prevention/Infectious Diseases Society of America Clinical Practice Guidelines: Treatment of Drug-Susceptible Tuberculosis. , 2016, Clinical infectious diseases : an official publication of the Infectious Diseases Society of America.

[5]  Trevor Darrell,et al.  Object Hallucination in Image Captioning , 2018, EMNLP.

[6]  D. Lynch,et al.  CT staging and monitoring of fibrotic interstitial lung diseases in clinical practice and treatment trials: a position paper from the Fleischner Society. , 2015, The Lancet. Respiratory medicine.

[7]  Andrew Y. Ng,et al.  CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT , 2020, EMNLP.

[8]  Fei-Fei Li,et al.  Deep visual-semantic alignments for generating image descriptions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Arthur S Slutsky,et al.  Acute Respiratory Distress Syndrome The Berlin Definition , 2012 .

[10]  George Shih,et al.  Crowdsourcing pneumothorax annotations using machine learning annotations on the NIH chest X-ray dataset , 2019, Journal of Digital Imaging.

[11]  Stefan Lee,et al.  Graph R-CNN for Scene Graph Generation , 2018, ECCV.

[12]  Steven Horng,et al.  MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports , 2019, Scientific Data.

[13]  Kirk Roberts,et al.  A dataset of chest X-ray reports annotated with Spatial Role Labeling annotations , 2020, Data in brief.

[14]  寺田 泰蔵 BTS guidelines for the management of spontaneous pneumothorax (特集 救急診療ガイドライン) -- (海外のガイドライン) , 2008 .

[15]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Zhe Gan,et al.  Semantic Compositional Networks for Visual Captioning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  L. Cardinale,et al.  Effectiveness of chest radiography, lung ultrasound and thoracic computed tomography in the diagnosis of congestive heart failure. , 2014, World journal of radiology.

[18]  Peter Szolovits,et al.  Clinically Accurate Chest X-Ray Report Generation , 2019, MLHC.

[19]  Daguang Xu,et al.  When Radiology Report Generation Meets Knowledge Graph , 2020, AAAI.

[20]  Antonio Pertusa,et al.  PadChest: A large chest x-ray image dataset with multi-label annotated reports , 2019, Medical Image Anal..

[21]  Xiaogang Wang,et al.  Scene Graph Generation from Objects, Phrases and Region Captions , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[22]  Stefan Jaeger,et al.  Two public chest X-ray datasets for computer-aided screening of pulmonary diseases. , 2014, Quantitative imaging in medicine and surgery.

[23]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[24]  Olivier Bodenreider,et al.  The Unified Medical Language System (UMLS): integrating biomedical terminology , 2004, Nucleic Acids Res..

[25]  James M. Brown,et al.  Siamese neural networks for continuous disease severity evaluation and change detection in medical imaging. , 2020, NPJ digital medicine.

[26]  Svetlana Lazebnik,et al.  Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[27]  D. Naidich,et al.  Evaluation of individuals with pulmonary nodules: when is it lung cancer? Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. , 2013, Chest.

[28]  Clement J. McDonald,et al.  Preparing a collection of radiology examinations for distribution and retrieval , 2015, J. Am. Medical Informatics Assoc..

[29]  Yoshua Bengio,et al.  Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.

[30]  Danfei Xu,et al.  Scene Graph Generation by Iterative Message Passing , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  J. Harvey,et al.  BTS guidelines for the management of spontaneous pneumothorax , 2003, Thorax.

[32]  J. Gohagan,et al.  The Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial of the National Cancer Institute: history, organization, and status. , 2000, Controlled clinical trials.

[33]  Peggy Cruse,et al.  Management of Adults With Hospital-acquired and Ventilator-associated Pneumonia: 2016 Clinical Practice Guidelines by the Infectious Diseases Society of America and the American Thoracic Society. , 2016, Clinical infectious diseases : an official publication of the Infectious Diseases Society of America.

[34]  Ronald M. Summers,et al.  ChestX-ray: Hospital-Scale Chest X-ray Database and Benchmarks on Weakly Supervised Classification and Localization of Common Thorax Diseases , 2019, Deep Learning and Convolutional Neural Networks for Medical Imaging and Clinical Informatics.

[35]  Carol C Wu,et al.  Augmenting the National Institutes of Health Chest Radiograph Dataset with Expert Annotations of Possible Pneumonia. , 2019, Radiology. Artificial intelligence.

[36]  Mi Young Kim,et al.  Chest radiography surveillance for lung cancer: Results from a National Health Insurance database in South Korea. , 2019, Lung cancer.

[37]  Dhruv Batra,et al.  Analyzing the Behavior of Visual Question Answering Models , 2016, EMNLP.

[38]  Yifan Yu,et al.  CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison , 2019, AAAI.

[39]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  James M. Brown,et al.  Siamese neural networks for continuous disease severity evaluation and change detection in medical imaging , 2020, npj Digital Medicine.

[41]  Hugo Larochelle,et al.  GuessWhat?! Visual Object Discovery through Multi-modal Dialogue , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Margaret Mitchell,et al.  VQA: Visual Question Answering , 2015, International Journal of Computer Vision.

[43]  Linda Kato,et al.  AI Accelerated Human-in-the-loop Structuring of Radiology Reports , 2020, AMIA.

[44]  Mehdi Moradi,et al.  AnaXNet: Anatomy Aware Multi-label Finding Classification in Chest X-ray , 2021, MICCAI.

[45]  Michael S. Bernstein,et al.  Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations , 2016, International Journal of Computer Vision.

[46]  Jayashree Kalpathy-Cramer,et al.  Automated Assessment and Tracking of COVID-19 Pulmonary Disease Severity on Chest Radiographs using Convolutional Siamese Neural Networks , 2020, Radiology. Artificial intelligence.

[47]  Tanveer Syeda-Mahmood,et al.  Automatic Bounding Box Annotation of Chest X-Ray Data for Localization of Abnormalities , 2020, 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI).

[48]  M. Balaan,et al.  Acute Respiratory Distress Syndrome , 2016, Critical care nursing quarterly.

[49]  David Zhang,et al.  Label Co-Occurrence Learning With Graph Convolutional Networks for Multi-Label Chest X-Ray Image Classification , 2020, IEEE Journal of Biomedical and Health Informatics.

[50]  Eric P. Xing,et al.  Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation , 2018, NeurIPS.

[51]  Christopher D. Manning,et al.  Learning to Summarize Radiology Findings , 2018, Louhi@EMNLP.

[52]  José M. F. Moura,et al.  Visual Dialog , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).