Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features

Lung cancer is the most prevalent cancer worldwide, and histopathological assessment is indispensable for its diagnosis. However, human evaluation of pathology slides cannot accurately predict patients' prognoses. In this study, we obtain 2,186 haematoxylin and eosin-stained histopathology whole-slide images of lung adenocarcinoma and squamous cell carcinoma patients from The Cancer Genome Atlas (TCGA), and 294 additional images from the Stanford Tissue Microarray (TMA) Database. We extract 9,879 quantitative image features and use regularized machine-learning methods to select the top features and to distinguish shorter-term survivors from longer-term survivors with stage I adenocarcinoma (P<0.003) or squamous cell carcinoma (P=0.023) in the TCGA data set. We validate the survival prediction framework with the TMA cohort (P<0.036 for both tumour types). Our results suggest that automatically derived image features can predict the prognosis of lung cancer patients and thereby contribute to precision oncology. Our methods are extensible to histopathology images of other organs.
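The abstract does not say which regularized method was used, so the following is only a minimal sketch of the general idea: a sparsity-inducing penalty that selects a few features out of many while separating shorter-term from longer-term survivors. It assumes L1-penalized logistic regression fitted by proximal gradient descent, and it runs on synthetic data, not on the TCGA or TMA images. All names (`fit_l1_logistic`, `soft_threshold`, `lam`, the feature counts) are hypothetical and chosen for illustration.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the L1 norm: shrink each weight toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_l1_logistic(X, y, lam=0.05, lr=0.1, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent (ISTA).

    lam is the penalty strength (an assumed value); the intercept b is not penalized.
    """
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad_w = X.T @ (prob - y) / n
        grad_b = np.mean(prob - y)
        # Gradient step on the logistic loss, then soft-thresholding for the L1 term.
        w = soft_threshold(w - lr * grad_w, lr * lam)
        b -= lr * grad_b
    return w, b

# Synthetic stand-in for image features: 300 "patients", 40 features,
# of which only the first three carry signal (an assumption for this demo).
rng = np.random.default_rng(0)
n_patients, n_features = 300, 40
X = rng.standard_normal((n_patients, n_features))
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -2.0, 1.5]
# y = 1 marks a "longer-term survivor" in this toy setup.
y = (X @ true_w + 0.5 * rng.standard_normal(n_patients) > 0).astype(float)

# Standardize features so the penalty treats them on a common scale.
X = (X - X.mean(axis=0)) / X.std(axis=0)

w, b = fit_l1_logistic(X, y)
selected = np.flatnonzero(w)  # indices of features with non-zero weight
pred = (X @ w + b > 0).astype(float)
accuracy = float(np.mean(pred == y))

print("selected feature indices:", selected.tolist())
print("training accuracy:", round(accuracy, 3))
```

In a real analysis the penalty strength would be tuned by cross-validation, and performance would be reported on held-out patients, as the abstract does with the independent TMA cohort, not on the training data as in this sketch.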
