Identification of discriminative genes for predicting breast cancer subtypes

Breast cancer is a widespread cancer in women and accounts for a large share of cancer cases and cancer deaths worldwide. Identifying the subtype of breast cancer plays a crucial role in selecting the best treatment. In this paper, we propose an optimized hierarchical model to predict the breast cancer subtype. The model uses suitable filter feature selection methods and new hybrid feature selection methods to find discriminative genes. The multi-class problem is handled by applying an appropriate classifier at each step of the hierarchy to separate one subtype from the others. The parameters of each classifier are optimized to improve performance. The proposed model achieves 100% accuracy in predicting breast cancer subtypes while using the same number of genes or even fewer.
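The hierarchical scheme described above can be illustrated with a minimal sketch. It is not the paper's implementation. The filter score (difference of class means over summed spreads), the nearest-centroid classifier, the subtype names, the peeling order, and the toy expression values are all assumptions chosen to keep the example self-contained. The abstract does not name the classifiers or filters used at each step, and the per-classifier parameter optimization it mentions is omitted here.

```python
from statistics import mean, pstdev


def filter_score(values, labels):
    """Illustrative filter score for one gene on a binary split:
    |difference of class means| divided by the summed class spreads."""
    pos = [v for v, is_pos in zip(values, labels) if is_pos]
    neg = [v for v, is_pos in zip(values, labels) if not is_pos]
    return abs(mean(pos) - mean(neg)) / (pstdev(pos) + pstdev(neg) + 1e-9)


def select_genes(X, labels, k):
    """Return the indices of the k highest-scoring genes."""
    n_genes = len(X[0])
    scores = [filter_score([row[g] for row in X], labels) for g in range(n_genes)]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def centroid(rows, genes):
    """Mean expression of the selected genes over the given samples."""
    return [mean(row[g] for row in rows) for g in genes]


def sq_dist(row, genes, center):
    """Squared Euclidean distance restricted to the selected genes."""
    return sum((row[g] - c) ** 2 for g, c in zip(genes, center))


def fit_hierarchy(X, y, order, k=1):
    """At each step, separate one subtype from the remaining samples using
    genes selected for that split, then drop that subtype and continue."""
    nodes = []
    for subtype in order[:-1]:
        labels = [label == subtype for label in y]
        genes = select_genes(X, labels, k)
        pos = centroid([r for r, is_pos in zip(X, labels) if is_pos], genes)
        neg = centroid([r for r, is_pos in zip(X, labels) if not is_pos], genes)
        nodes.append((subtype, genes, pos, neg))
        keep = [i for i, is_pos in enumerate(labels) if not is_pos]
        X = [X[i] for i in keep]
        y = [y[i] for i in keep]
    return nodes, order[-1]  # the last subtype is whatever remains


def predict(model, row):
    """Walk the hierarchy; stop at the first step that claims the sample."""
    nodes, leaf = model
    for subtype, genes, pos, neg in nodes:
        if sq_dist(row, genes, pos) < sq_dist(row, genes, neg):
            return subtype
    return leaf


# Toy data (hypothetical): 9 samples, 3 genes, 3 made-up subtypes.
X = [
    [9.0, 1.0, 5.0], [8.5, 1.2, 4.0], [9.2, 0.8, 6.0],  # subtype "A"
    [1.0, 7.0, 5.5], [1.2, 7.5, 4.5], [0.9, 6.8, 5.0],  # subtype "B"
    [1.1, 1.0, 5.2], [0.8, 1.3, 4.8], [1.0, 0.9, 5.9],  # subtype "C"
]
y = ["A"] * 3 + ["B"] * 3 + ["C"] * 3

model = fit_hierarchy(X, y, order=["A", "B", "C"], k=1)
for subtype, genes, _, _ in model[0]:
    print(subtype, "is separated using gene indices", genes)
print(predict(model, [1.0, 7.2, 5.0]))
```

Because genes are selected separately at each step, every binary split can rely on a different, small gene set, which is how a hierarchy of this kind can keep the total number of genes low.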
