Joint DBN and Fuzzy C-Means unsupervised deep clustering for lung cancer patient stratification

Abstract Patient stratification contributes substantially to efficient, personalized medicine. A central task in patient stratification is to discover clearly distinct disease subtypes so that treatment can be targeted effectively. In this paper, we propose a deep clustering model that combines a Deep Belief Network (DBN) with Fuzzy C-Means (FCM), called the Unsupervised Deep Fuzzy C-Means clustering Network (UDFCMN), to cluster lung cancer patients from lung CT images. In our network, preprocessed images are first encoded into multiple layers of hidden variables, which extract hierarchical features and their distributions and form high-level representations. To address the problem of feature homogenization in the DBN, we introduce the Winner-Take-All (WTA) idea to improve the traditional DBN structure; we call the result WTADBN. FCM is then applied to the representations learnt by the stacked WTARBMs to produce initial cluster labels. Next, the FCM-generated cluster labels are used as ground-truth labels for fine-tuning the DBN. Alternating between these two steps yields a fully unsupervised image clustering and patient stratification process. We evaluated UDFCMN on both a public dataset from the internet and a private dataset from a collaborating hospital. For the latter, clinical and biological verification was also performed. Experimental results show that UDFCMN outperforms state-of-the-art unsupervised classification methods. These results also indicate that our approach may have practical applications in lung cancer pathogenesis studies and may provide useful guidelines for personalized cancer therapy.
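
To make the FCM labelling step concrete, here is a minimal sketch of standard Fuzzy C-Means in NumPy. It is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random membership initialization, and the stopping tolerance are all assumptions made for illustration. In UDFCMN the input `X` would be the representations produced by the WTADBN; the sketch accepts any feature matrix.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard FCM (illustrative sketch; all defaults are assumptions).

    X: (n_samples, n_features) feature matrix.
    Returns (centers, U), where U[i, j] is the membership of sample i
    in cluster j and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial memberships, normalized per sample.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Cluster centers: means weighted by memberships raised to m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero

        # Membership update: u_ij proportional to d_ij^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U

# Usage on toy data: two well-separated groups of 2-D points.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
               rng.normal(5.0, 0.1, (20, 2))])
centers, U = fuzzy_c_means(X, n_clusters=2)

# Hard labels, e.g. to serve as pseudo-labels for fine-tuning.
labels = U.argmax(axis=1)
```

Taking the arg-max of the membership matrix is one simple way to turn soft memberships into the hard cluster labels that the abstract describes feeding back into DBN fine-tuning; the paper may derive its labels differently.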
