RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction

We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a modification to the DyGIE++ framework, resulting in our model HGIE, which outperforms previous models on entity and relation extraction tasks. We demonstrate that RadGraph2 enables models to capture a wider variety of findings and to perform better at relation extraction than models trained on the original RadGraph dataset. Our work provides a foundation for developing automated systems that track disease progression over time and for developing information extraction models that leverage the natural hierarchy of labels in the medical domain.
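
To illustrate what using a label hierarchy during training can mean in practice, the following is a minimal sketch and not the HGIE implementation. It assumes a small, hypothetical label tree (the names in `PARENT` are invented for illustration and are not the RadGraph2 schema) and one common generic technique: expanding each gold label into a multi-hot target that also activates its ancestors, so that a prediction in the right branch is penalized less than one in an unrelated branch.

```python
import math

# Hypothetical label hierarchy (child -> parent). These names are
# illustrative assumptions, NOT the actual RadGraph2 schema.
PARENT = {
    "Change": None,
    "Change/Condition": "Change",
    "Change/Device": "Change",
    "Change/Condition/Worsened": "Change/Condition",
    "Change/Condition/Improved": "Change/Condition",
    "Change/Device/Placed": "Change/Device",
}
LABELS = list(PARENT)  # fixed label order used for all vectors below


def ancestors_and_self(label):
    """Return the label followed by its ancestors, up to the root."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path


def hierarchical_target(label):
    """Multi-hot target: 1.0 for the gold label and each ancestor, else 0.0."""
    active = set(ancestors_and_self(label))
    return [1.0 if name in active else 0.0 for name in LABELS]


def hierarchical_bce(probs, label):
    """Mean binary cross-entropy against the multi-hot hierarchical target.

    `probs` holds one independent probability per entry of LABELS.
    """
    target = hierarchical_target(label)
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for p, t in zip(probs, target):
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(LABELS)


gold = "Change/Condition/Worsened"
# Prediction that picks the sibling leaf (right branch, wrong leaf).
near_miss = [0.9, 0.9, 0.1, 0.1, 0.9, 0.1]
# Prediction that picks a leaf in the other branch.
far_miss = [0.9, 0.1, 0.9, 0.1, 0.1, 0.9]

print(hierarchical_bce(near_miss, gold) < hierarchical_bce(far_miss, gold))
```

Under these assumptions the sibling-leaf prediction receives the lower loss, because it still agrees with the gold label on the shared ancestors. How the hierarchy actually enters the training objective of HGIE is described in the paper itself, not in this sketch.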
