The Impact of Digital Histopathology Batch Effect on Deep Learning Model Accuracy and Bias

The Cancer Genome Atlas (TCGA) is one of the largest biorepositories of digital histology. Deep learning (DL) models have been trained on TCGA to predict numerous features directly from histology, including survival, gene expression patterns, and driver mutations. However, we demonstrate that these features vary substantially across tissue-submitting sites in TCGA, in an analysis of over 3,000 patients with six cancer subtypes. Additionally, we show that DL can readily identify histologic image differences between submitting sites. This site detection remains possible despite commonly used color normalization and augmentation methods, and we quantify the digital image characteristics that constitute this histologic batch effect. As an example, we show that patient ethnicity within the TCGA breast cancer cohort can be inferred from histology due to site-level batch effect, which must be accounted for to ensure equitable application of DL. Batch effect also leads to overoptimistic estimates of model performance, and we propose a quadratic programming method to guide validation that abrogates this bias.
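To illustrate the idea of keeping submitting sites separate during validation, the following is a minimal sketch and not the quadratic programming method itself, whose formulation the abstract does not give. It uses a simple greedy heuristic as a stand-in: each site is assigned to exactly one cross-validation fold, and folds are kept roughly balanced by class. The function name `site_preserved_folds` and its inputs are hypothetical.

```python
from collections import defaultdict


def site_preserved_folds(sites, labels, n_folds=3):
    """Assign every site to exactly one fold, greedily balancing class counts.

    sites  : site identifier for each patient (hypothetical input format)
    labels : class label for each patient
    Returns a dict mapping site -> fold index.
    Note: a greedy stand-in, not the quadratic programming approach.
    """
    # Count patients per (site, class).
    counts = defaultdict(lambda: defaultdict(int))
    for site, label in zip(sites, labels):
        counts[site][label] += 1
    classes = sorted(set(labels))

    # Ideal share of each class per fold.
    totals = {c: sum(counts[s][c] for s in counts) for c in classes}
    target = {c: totals[c] / n_folds for c in classes}

    fold_counts = [{c: 0 for c in classes} for _ in range(n_folds)]
    assignment = {}

    # Place the largest sites first, since they are the hardest to balance.
    ordered = sorted(counts, key=lambda s: -sum(counts[s].values()))
    for site in ordered:
        def cost(k):
            # Squared deviation of fold k from the target if this site joined it.
            return sum(
                (fold_counts[k][c] + counts[site][c] - target[c]) ** 2
                for c in classes
            )

        best = min(range(n_folds), key=cost)
        assignment[site] = best
        for c in classes:
            fold_counts[best][c] += counts[site][c]
    return assignment


# Toy example: three sites, each with one positive and one negative patient.
example = site_preserved_folds(
    ["A", "A", "B", "B", "C", "C"], [1, 0, 1, 0, 1, 0], n_folds=3
)
```

Because no site contributes patients to more than one fold, a model cannot score well on a held-out fold merely by recognizing site-specific image characteristics it saw during training.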
