Novel image markers for non-small cell lung cancer classification and survival prediction

Background

Non-small cell lung cancer (NSCLC), the most common type of lung cancer, is one of the most serious diseases causing death in both men and women. Computer-aided diagnosis and survival prediction of NSCLC are of great importance in assisting diagnosis and personalizing therapy planning for lung cancer patients.

Results

In this paper we propose an integrated framework for NSCLC computer-aided diagnosis and survival analysis using novel image markers. The entire biomedical imaging informatics framework consists of cell detection, segmentation, classification, discovery of image markers, and survival analysis. A robust seed detection-guided cell segmentation algorithm is proposed to accurately segment each individual cell in digital images. Based on the cell segmentation results, an extensive set of cellular morphological features is extracted using efficient feature descriptors. Next, eight classification techniques that can handle high-dimensional data are evaluated and compared for computer-aided diagnosis. The results show that random forest and AdaBoost offer the best classification performance for NSCLC. Finally, a Cox proportional hazards model is fitted by component-wise likelihood-based boosting. Significant image markers are discovered using bootstrap analysis, and the survival prediction performance of the model is also evaluated.

Conclusions

The proposed model has been applied to a lung cancer dataset that contains 122 cases with complete clinical information. The classification performance exhibits high correlations between the discovered image markers and the subtypes of NSCLC. The survival analysis demonstrates strong prediction power of the statistical model built from the discovered image markers.
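To make the survival-modelling step more concrete, the following is a minimal sketch of component-wise likelihood-based boosting for a Cox proportional hazards model. It is not the authors' implementation: the function names, the `penalty` and `n_steps` parameters, and the use of the Breslow form of the partial likelihood are all assumptions made for illustration. At each step, the sketch computes the score and observed information of the partial log-likelihood for every covariate, picks the single covariate with the largest penalized score statistic, and updates only that coefficient.

```python
import numpy as np


def cox_score_info(X, time, event, eta):
    """Per-covariate score and observed information of the Cox partial
    log-likelihood (Breslow form), evaluated at linear predictor eta."""
    # Subtracting the maximum only stabilizes exp(); the factor cancels in the ratios.
    w = np.exp(eta - eta.max())
    # at_risk[i, j] is 1 when subject j is still at risk at subject i's time.
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    s0 = at_risk @ w                        # risk-set sums of weights, shape (n,)
    s1 = at_risk @ (w[:, None] * X)         # weighted covariate sums, shape (n, p)
    s2 = at_risk @ (w[:, None] * X ** 2)    # weighted squared sums, shape (n, p)
    mean = s1 / s0[:, None]
    var = s2 / s0[:, None] - mean ** 2
    observed = event.astype(bool)           # only uncensored subjects contribute
    score = (X[observed] - mean[observed]).sum(axis=0)
    info = var[observed].sum(axis=0)
    return score, info


def componentwise_cox_boost(X, time, event, n_steps=50, penalty=10.0):
    """Hypothetical helper: update one coefficient per boosting step.

    `penalty` shrinks each one-step Newton update; `n_steps` controls how
    many covariates can enter the model (both are illustrative defaults)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        score, info = cox_score_info(X, time, event, X @ beta)
        gain = score ** 2 / (info + penalty)    # penalized score statistic
        k = int(np.argmax(gain))                # best single covariate
        beta[k] += score[k] / (info[k] + penalty)
    return beta
```

Because only one coefficient changes per step, covariates that are never selected keep a coefficient of exactly zero, which is what makes this style of fitting usable when the number of image features is large relative to the number of cases.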
