Eliminating biasing signals in lung cancer images for prognosis predictions with deep learning

Deep learning has shown remarkable results for image analysis and is expected to aid individual treatment decisions in health care. To achieve this, deep learning methods need to move beyond mere associations and be able to answer causal questions. We present a scenario with real-world medical images (CT scans of lung cancers) and simulated outcome data. Through the sampling scheme, the images contain two distinct factors of variation that represent a collider and a prognostic factor. We show that when this collider can be quantified, unbiased individual prognosis predictions are attainable with deep learning. This is achieved by (1) setting a dual task for the network to predict both the outcome and the collider and (2) enforcing independence of the activation distributions of the last layer with ordinary least squares. Our method provides an example of combining deep learning and structural causal models for unbiased individual prognosis predictions.
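The second step, enforcing independence with ordinary least squares, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that independence is enforced linearly by regressing the last-layer activations on a collider-predicting activation and keeping the residuals. The function name `residualize`, the synthetic arrays, and the mixing coefficients are all hypothetical.

```python
import numpy as np

def residualize(features, collider):
    """Remove the part of each feature column that OLS explains from the collider."""
    # Design matrix: intercept column plus the collider activation.
    design = np.column_stack([np.ones_like(collider), collider])
    # Ordinary least squares fit of every feature column on the design matrix.
    coef, *_ = np.linalg.lstsq(design, features, rcond=None)
    # OLS residuals are orthogonal to the design columns,
    # so they are linearly uncorrelated with the collider.
    return features - design @ coef

# Synthetic stand-ins (assumptions, not data from the paper).
rng = np.random.default_rng(0)
n = 500
collider = rng.normal(size=n)    # hypothetical collider-predicting activation
prognostic = rng.normal(size=n)  # hypothetical prognostic signal

# Hypothetical last-layer activations that mix both signals.
features = np.column_stack([
    prognostic + 0.8 * collider,
    0.5 * prognostic - 1.2 * collider,
])

clean = residualize(features, collider)

# Largest absolute sample correlation between any cleaned feature and the collider.
max_corr = max(
    abs(np.corrcoef(clean[:, j], collider)[0, 1]) for j in range(clean.shape[1])
)
```

Residualization of this kind removes only linear dependence on the collider; any nonlinear dependence would remain, so the sketch should be read as an illustration of the idea rather than a guarantee of full statistical independence.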
